focaccia-qemu - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	net: implement UDP tunnel features offloading	Paolo Abeni	2025-10-04	1	-8/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When any host or guest GSO over UDP tunnel offload is enabled the virtio net header includes the additional tunnel-related fields, update the size accordingly. Push the GSO over UDP tunnel offloads all the way down to the tap device extending the newly introduced NetFeatures struct, and eventually enable the associated features. As per virtio specification, to convert features bit to offload bit, map the extended features into the reserved range. Finally, make the vhost backend aware of the exact header layout, to copy it correctly. The tunnel-related field are present if either the guest or the host negotiated any UDP tunnel related feature: add them to the kernel supported features list, to allow qemu transfer to the backend the needed information. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <093b4bc68368046bffbcab2202227632d6e4e83b.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	net: implement tunnel probing	Paolo Abeni	2025-10-04	1	-0/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tap devices support GSO over UDP tunnel offload. Probe for such feature in a similar manner to other offloads. GSO over UDP tunnel needs to be enabled in addition to a "plain" offload (TSO or USO). No need to check separately for the outer header checksum offload: the kernel is going to support both of them or none. The new features are disabled by default to avoid compat issues, and could be enabled, after that hw_compat_10_1 will be added, together with the related compat entries. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <a987a8a7613cbf33bb2209c7c7f5889b512638a7.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-net: implement extended features support	Paolo Abeni	2025-10-04	1	-58/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the extended types and helpers to manipulate the virtio_net features. Note that offloads are still 64bits wide, as per specification, and extended offloads will be mapped into such range. Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <bc5afdc5c1cb1a37238dd2b36004db3d46cbf211.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	net: bundle all offloads in a single struct	Paolo Abeni	2025-10-04	1	-8/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The set_offload() argument list is already pretty long and we are going to introduce soon a bunch of additional offloads. Replace the offload arguments with a single struct and update all the relevant call-sites. No functional changes intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <a9d4dd043b8c71b791e9ff05e17ef06072d9714e.1758549625.git.pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	error: Kill @error_warn	Markus Armbruster	2025-10-01	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We added @error_warn some two years ago in commit 3ffef1a55ca (error: add global &error_warn destination). It has multiple issues: * error.h's big comment was not updated for it. * Function contracts were not updated for it. * ERRP_GUARD() is unaware of @error_warn, and fails to mask it from error_prepend() and such. These crash on @error_warn, as pointed out by Akihiko Odaki. All fixable. However, after more than two years, we had just of 15 uses, of which the last few patches removed seven as unclean or otherwise undesirable, adding back five elsewhere. I didn't look closely enough at the remaining seven to decide whether they are desirable or not. I don't think this feature earns its keep. Drop it. Thanks-to: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-ID: <20250923091000.3180122-14-armbru@redhat.com> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
*	virtio-net: Fix VLAN filter table reset timing	Akihiko Odaki	2025-08-01	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem ------- The expected initial state of the table depends on feature negotiation: With VIRTIO_NET_F_CTRL_VLAN: The table must be empty in accordance with the specification. Without VIRTIO_NET_F_CTRL_VLAN: The table must be filled to permit all VLAN traffic. Prior to commit 06b636a1e2ad ("virtio-net: do not reset vlan filtering at set_features"), virtio_net_set_features() always reset the VLAN table. That commit changed the behavior to skip table reset when VIRTIO_NET_F_CTRL_VLAN was negotiated, assuming the table would be properly cleared during device reset and remain stable. However, this assumption breaks when a driver renegotiates features: 1. Initial negotiation without VIRTIO_NET_F_CTRL_VLAN (table filled) 2. Renegotiation with VIRTIO_NET_F_CTRL_VLAN (table will not be cleared) The problem was exacerbated by commit 0caed25cd171 ("virtio: Call set_features during reset"), which triggered virtio_net_set_features() during device reset, exposing the bug whenever VIRTIO_NET_F_CTRL_VLAN was negotiated after a device reset. Solution -------- Fix the issue by initializing the table when virtio_net_set_features() is called to change the VIRTIO_NET_F_CTRL_VLAN bit of vdev->guest_features. This approach ensures the correct table state regardless of feature negotiation sequence by performing initialization in virtio_net_set_features() as QEMU did prior to commit 06b636a1e2ad ("virtio-net: do not reset vlan filtering at set_features"). This change still preserves the goal of the commit, which was to avoid resetting the table during migration, by checking whether the VIRTIO_NET_F_CTRL_VLAN bit of vdev->guest_features is being changed; vdev->guest_features is set before virtio_net_set_features() gets called during migration. It also avoids resetting the table when the driver sets a feature bitmask with no change for the VIRTIO_NET_F_CTRL_VLAN bit, which makes the operation idempotent and its semantics cleaner. Additionally, this change ensures the table is initialized after feature negotiation and before the DRIVER_OK status bit being set for compatibility with the Linux driver before commit 50c0ada627f5 ("virtio-net: fix race between ndo_open() and virtio_device_ready()"), which did not ensure to set the DRIVER_OK status bit before modifying the table. Fixes: 06b636a1e2ad ("virtio-net: do not reset vlan filtering at set_features") Cc: qemu-stable@nongnu.org Reported-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Tested-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20250727-vlan-v3-1-bbee738619b1@rsg.ci.i.u-tokyo.ac.jp> Tested-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵	Stefan Hajnoczi	2025-07-16	1	-84/+170
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	into staging virtio,pci,pc: features, fixes, tests SPCR acpi table can now be disabled vhost-vdpa can now report hashing capability to guest PPTT acpi table now tells guest vCPUs are identical vost-user-blk now shuts down faster loongarch64 now supports bios-tables-test intel_iommu now supports ATS cxl now supports DCD Fabric Management Command Set arm now supports acpi pci hotplug fixes, cleanups Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmh1+7APHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpcZ8H/2udpCZ49vjPB8IwQAGdFTw2TWVdxUQFHexQ # pOsCGyFBNAXqD1bmb8lwWyYVJ08WELyL6xWsQ5tfVPiXpKYYHPHl4rNr/SPoyNcv # joY++tagudmOki2DU7nfJ+rPIIuigOTUHbv4TZciwcHle6f65s0iKXhR1sL0cj4i # TS6iJlApSuJInrBBUxuxSUomXk79mFTNKRiXj1k58LRw6JOUEgYvtIW8i+mOUcTg # h1dZphxEQr/oG+a2pM8GOVJ1AFaBPSfgEnRM4kTX9QuTIDCeMAKUBo/mwOk6PV7z # ZhSrDPLrea27XKGL++EJm0fFJ/AsHF1dTks2+c0rDrSK+UV87Zc= # =sktm # -----END PGP SIGNATURE----- # gpg: Signature made Tue 15 Jul 2025 02:56:48 EDT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (97 commits) hw/cxl: mailbox-utils: 0x5605 - FMAPI Initiate DC Release hw/cxl: mailbox-utils: 0x5604 - FMAPI Initiate DC Add hw/cxl: Create helper function to create DC Event Records from extents hw/cxl: mailbox-utils: 0x5603 - FMAPI Get DC Region Extent Lists hw/cxl: mailbox-utils: 0x5602 - FMAPI Set DC Region Config hw/mem: cxl_type3: Add DC Region bitmap lock hw/cxl: Move definition for dynamic_capacity_uuid and enum for DC event types to header hw/cxl: mailbox-utils: 0x5601 - FMAPI Get Host Region Config hw/mem: cxl_type3: Add dsmas_flags to CXLDCRegion struct hw/cxl: mailbox-utils: 0x5600 - FMAPI Get DCD Info hw/cxl: fix DC extent capacity tracking tests: virt: Update expected ACPI tables for virt test hw/acpi/aml-build: Build a root node in the PPTT table hw/acpi/aml-build: Set identical implementation flag for PPTT processor nodes tests: virt: Allow changes to PPTT test table qtest/bios-tables-test: Generate reference blob for DSDT.acpipcihp qtest/bios-tables-test: Generate reference blob for DSDT.hpoffacpiindex tests/qtest/bios-tables-test: Add aarch64 ACPI PCI hotplug test tests/qtest/bios-tables-test: Prepare for addition of acpi pci hp tests hw/arm/virt: Let virt support pci hotplug/unplug GED event ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Conflicts: net/vhost-vdpa.c vhost_vdpa_set_steering_ebpf() was removed, resolve the context conflict.
\| *	virtio-net: Add hash type options	Akihiko Odaki	2025-07-14	1	-2/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	By default, virtio-net limits the hash types that will be advertised to the guest so that all hash types are covered by the offloading capability the client provides. This change allows to override this behavior and to advertise hash types that require user-space hash calculation by specifying "on" for the corresponding properties. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20250530-vdpa-v1-6-5af4109b1c19@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
\| *	virtio-net: Retrieve peer hashing capability	Akihiko Odaki	2025-07-14	1	-13/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Retrieve peer hashing capability instead of hardcoding. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20250530-vdpa-v1-4-5af4109b1c19@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
\| *	virtio-net: Move virtio_net_get_features() down	Akihiko Odaki	2025-07-14	1	-73/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move virtio_net_get_features() to the later part of the file so that it can call other functions. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20250530-vdpa-v1-3-5af4109b1c19@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* \|	net: Add is_vhost_user flag to vhost_net struct	Laurent Vivier	2025-07-14	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce a boolean is_vhost_user field to the vhost_net structure. This flag is initialized during vhost_net_init based on whether the backend is vhost-user. This refactoring simplifies checks for vhost-user specific behavior, replacing direct comparisons of 'net->nc->info->type' with the new flag. It improves readability and encapsulates the backend type information directly within the vhost_net instance. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* \|	net: Allow network backends to advertise max TX queue size	Laurent Vivier	2025-07-14	1	-12/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit refactors how the maximum transmit queue size for virtio-net devices is determined, making the mechanism more generic and extensible. Previously, virtio_net_max_tx_queue_size() contained hardcoded checks for specific network backend types (vhost-user and vhost-vdpa) to determine their supported maximum queue size. This created direct dependencies and would require modifications for every new backend that supports variable queue sizes. To improve flexibility, a new max_tx_queue_size field is added to the vhost_net structure. This allows each network backend to advertise its supported maximum transmit queue size directly. The virtio_net_max_tx_queue_size() function now retrieves the max TX queue size from the vhost_net struct, if available and set. Otherwise, it defaults to VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* \|	vhost_net: Rename vhost_set_vring_enable() for clarity	Laurent Vivier	2025-07-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a cosmetic change with no functional impact. The function vhost_set_vring_enable() is specific to vhost_net and is used outside of vhost_net.c (specifically, in hw/net/virtio-net.c). To prevent confusion with other similarly named vhost functions, such as the one found in cryptodev-vhost.c, it has been renamed to vhost_net_set_vring_enable(). This clarifies that the function belongs to the vhost_net module. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
* \|	virtio-net: Add queues for RSS during migration	Akihiko Odaki	2025-07-14	1	-7/+4
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	virtio_net_pre_load_queues() inspects vdev->guest_features to tell if VIRTIO_NET_F_RSS or VIRTIO_NET_F_MQ is enabled to infer the required number of queues. This works for VIRTIO_NET_F_MQ but it doesn't for VIRTIO_NET_F_RSS because only the lowest 32 bits of vdev->guest_features is set at the point and VIRTIO_NET_F_RSS uses bit 60 while VIRTIO_NET_F_MQ uses bit 22. Instead of inferring the required number of queues from vdev->guest_features, use the number loaded from the vm state. This change also has a nice side effect to remove a duplicate peer queue pair change by circumventing virtio_net_set_multiqueue(). Also update the comment in include/hw/virtio/virtio.h to prevent an implementation of pre_load_queues() from refering to any fields being loaded during migration by accident in the future. Fixes: 8c49756825da ("virtio-net: Add only one queue pair when realizing") Tested-by: Lei Yang <leiyang@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	hw/net/virtio-net: skip automatic zero-init of large arrays	Daniel P. Berrangé	2025-06-12	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 'virtio_net_receive_rcu' method has three arrays with VIRTQUEUE_MAX_SIZE elements, which are apprixmately 32k in size used for copying data between guest and host. Skip the automatic zero-init of these arrays to eliminate the performance overhead in the I/O hot path. The three arrays will be selectively initialized as required when processing network buffers. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20250610123709.835102-22-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
*	vhost-user: return failure if backend crash when live migration	Haoqian He	2025-05-14	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Live migration should be terminated if the vhost-user backend crashes before the migration completes. Specifically, since the vhost device will be stopped when VM is stopped before the end of the live migration, in current implementation if the backend crashes, vhost-user device set_status() won't return failure, live migration won't perceive the disconnection between QEMU and the backend. When the VM is migrated to the destination, the inflight IO will be resubmitted, and if the IO was completed out of order before, it will cause IO error. To fix this issue: 1. Add the return value to set_status() for VirtioDeviceClass. a. For the vhost-user device, return failure when the backend crashes. b. For other virtio devices, always return 0. 2. Return failure if vhost_dev_stop() failed for vhost-user device. If QEMU loses connection with the vhost-user backend, virtio set_status() can return failure to the upper layer, migration_completion() can handle the error, terminate the live migration, and restore the VM, so that inflight IO can be completed normally. Signed-off-by: Haoqian He <haoqian.he@smartx.com> Message-Id: <20250416024729.3289157-4-haoqian.he@smartx.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	qom: Have class_init() take a const data argument	Philippe Mathieu-Daudé	2025-04-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Mechanical change using gsed, then style manually adapted to pass checkpatch.pl script. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20250424194905.82506-4-philmd@linaro.org>
*	Revert "virtio-net: Copy received header to buffer"	Antoine Damhet	2025-04-15	1	-47/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 7987d2be5a8bc3a502f89ba8cf3ac3e09f64d1ce. The goal was to remove the need to patch the (const) input buffer with a recomputed UDP checksum by copying headers to a RW region and inject the checksum there. The patch computed the checksum only from the header fields (missing the rest of the payload) producing an invalid one and making guests fail to acquire a DHCP lease. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2727 Cc: qemu-stable@nongnu.org Signed-off-by: Antoine Damhet <adamhet@scaleway.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250408145345.142947-1-adamhet@scaleway.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
*	virtio-net: Fix num_buffers for version 1	Akihiko Odaki	2025-04-02	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The specification says the device MUST set num_buffers to 1 if VIRTIO_NET_F_MRG_RXBUF has not been negotiated. Fixes: df91055db5c9 ("virtio-net: enable virtio 1.0") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20250108-buffers-v1-1-a0c85ff31aeb@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu ↵	Stefan Hajnoczi	2025-02-22	1	-26/+17
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	into staging virtio,pc,pci: features, fixes, cleanups Features: SR-IOV emulation for pci virtio-mem-pci support for s390 interleave support for cxl big endian support for vdpa svq new QAPI events for vhost-user Also vIOMMU reset order fixups are in. Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAme4b8sPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpHKcIAKPJsVqPdda2dJ7b7FdyRT0Q+uwezXqaGHd4 # 7Lzih1wsxYNkwIAyPtEb76/21qiS7BluqlUCfCB66R9xWjP5/KfvAFj4/r4AEduE # fxAgYzotNpv55zcRbcflMyvQ42WGiZZHC+o5Lp7vDXUP3pIyHrl0Ydh5WmcD+hwS # BjXvda58TirQpPJ7rUL+sSfLih17zQkkDcfv5/AgorDy1wK09RBKwMx/gq7wG8yJ # twy8eBY2CmfmFD7eTM+EKqBD2T0kwLEeLfS/F/tl5Fyg6lAiYgYtCbGLpAmWErsg # XZvfZmwqL7CNzWexGvPFnnLyqwC33WUP0k0kT88Y5wh3/h98blw= # =tej8 # -----END PGP SIGNATURE----- # gpg: Signature made Fri 21 Feb 2025 20:21:31 HKT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (41 commits) docs/devel/reset: Document reset expectations for DMA and IOMMU hw/vfio/common: Add a trace point in vfio_reset_handler hw/arm/smmuv3: Move reset to exit phase hw/i386/intel-iommu: Migrate to 3-phase reset hw/virtio/virtio-iommu: Migrate to 3-phase reset vhost-user-snd: correct the calculation of config_size net: vhost-user: add QAPI events to report connection state hw/virtio/virtio-nsm: Respond with correct length vdpa: Fix endian bugs in shadow virtqueue MAINTAINERS: add more files to `vhost` cryptodev/vhost: allocate CryptoDevBackendVhost using g_mem0() vhost-iova-tree: Update documentation vhost-iova-tree, svq: Implement GPA->IOVA & partial IOVA->HVA trees vhost-iova-tree: Implement an IOVA-only tree amd_iommu: Use correct bitmask to set capability BAR amd_iommu: Use correct DTE field for interrupt passthrough hw/virtio: reset virtio balloon stats on machine reset mem/cxl_type3: support 3, 6, 12 and 16 interleave ways hw/mem/cxl_type3: Ensure errp is set on realization failure hw/mem/cxl_type3: Fix special_ops memory leak on msix_init_exclusive_bar() failure ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
\| *	hw/net: Fix NULL dereference with software RSS	Akihiko Odaki	2025-02-20	1	-26/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an eBPF program cannot be attached, virtio_net_load_ebpf() returns false, and virtio_net_device_realize() enters the code path to handle errors because of this, but it causes NULL dereference because no error is generated. Change virtio_net_load_ebpf() to return false only when a fatal error occurred. Fixes: b5900dff14e5 ("hw/net: report errors from failing to use eBPF RSS FDs") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20250116-software-v1-1-9e5161b534d8@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* \|	qapi: Move include/qapi/qmp/ to include/qobject/	Daniel P. Berrangé	2025-02-10	1	-1/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The general expectation is that header files should follow the same file/path naming scheme as the corresponding source file. There are various historical exceptions to this practice in QEMU, with one of the most notable being the include/qapi/qmp/ directory. Most of the headers there correspond to source files in qobject/. This patch corrects most of that inconsistency by creating include/qobject/ and moving the headers for qobject/ there. This also fixes MAINTAINERS for include/qapi/qmp/dispatch.h: scripts/get_maintainer.pl now reports "QAPI" instead of "No maintainers found". Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> #s390x Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20241118151235.2665921-2-armbru@redhat.com> [Rebased]
*	virtio-net: vhost-user: Implement internal migration	Laurent Vivier	2025-01-15	1	-23/+112
\| \| \| \| \| \| \| \| \| \| \|	Add support of VHOST_USER_PROTOCOL_F_DEVICE_STATE in virtio-net with vhost-user backend. Cc: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20250115135044.799698-3-lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	Merge tag 'exec-20241220' of https://github.com/philmd/qemu into staging	Stefan Hajnoczi	2024-12-21	1	-3/+3
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Accel & Exec patch queue - Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander) - Add '-d invalid_mem' logging option (Zoltan) - Create QOM containers explicitly (Peter) - Rename sysemu/ -> system/ (Philippe) - Re-orderning of include/exec/ headers (Philippe) Move a lot of declarations from these legacy mixed bag headers: . "exec/cpu-all.h" . "exec/cpu-common.h" . "exec/cpu-defs.h" . "exec/exec-all.h" . "exec/translate-all" to these more specific ones: . "exec/page-protection.h" . "exec/translation-block.h" . "user/cpu_loop.h" . "user/guest-host.h" . "user/page-protection.h" # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t # wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt # KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K # A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8 # 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe/// # 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r # xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl # VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay # ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP # 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd # +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6 # x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo= # =cjz8 # -----END PGP SIGNATURE----- # gpg: Signature made Fri 20 Dec 2024 11:45:20 EST # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE * tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits) util/qemu-timer: fix indentation meson: Do not define CONFIG_DEVICES on user emulation system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header system/numa: Remove unnecessary 'exec/cpu-common.h' header hw/xen: Remove unnecessary 'exec/cpu-common.h' header target/mips: Drop left-over comment about Jazz machine target/mips: Remove tswap() calls in semihosting uhi_fstat_cb() target/xtensa: Remove tswap() calls in semihosting simcall() helper accel/tcg: Un-inline translator_is_same_page() accel/tcg: Include missing 'exec/translation-block.h' header accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h' accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h' qemu/coroutine: Include missing 'qemu/atomic.h' header exec/translation-block: Include missing 'qemu/atomic.h' header accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h' exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined target/sparc: Move sparc_restore_state_to_opc() to cpu.c target/sparc: Uninline cpu_get_tb_cpu_state() target/loongarch: Declare loongarch_cpu_dump_state() locally user: Move various declarations out of 'exec/exec-all.h' ... Conflicts: hw/char/riscv_htif.c hw/intc/riscv_aplic.c target/s390x/cpu.c Apply sysemu header path changes to not in the pull request. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
\| *	include: Rename sysemu/ -> system/	Philippe Mathieu-Daudé	2024-12-20	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Headers in include/sysemu/ are not only related to system emulation, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>
* \|	include/hw/qdev-properties: Remove DEFINE_PROP_END_OF_LIST	Richard Henderson	2024-12-19	1	-1/+0
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that all of the Property arrays are counted, we can remove the terminator object from each array. Update the assertions in device_class_set_props to match. With struct Property being 88 bytes, this was a rather large form of terminator. Saves 30k from qemu-system-aarch64. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Link: https://lore.kernel.org/r/20241218134251.4724-21-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	hw/net: Constify all Property	Richard Henderson	2024-12-15	1	-1/+1
\| \| \| \| \| \|	Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	virtio-net: Add queues before loading them	Akihiko Odaki	2024-11-26	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	Call virtio_net_set_multiqueue() to add queues before loading their states. Otherwise the loaded queues will not have handlers and elements in them will not be processed. Cc: qemu-stable@nongnu.org Fixes: 8c49756825da ("virtio-net: Add only one queue pair when realizing") Reported-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Copy received header to buffer	Akihiko Odaki	2024-11-25	1	-39/+46
\| \| \| \| \| \| \| \| \|	receive_header() used to cast the const qualifier of the pointer to the received packet away to modify the header. Avoid this by copying the received header to buffer. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Initialize hash reporting values	Akihiko Odaki	2024-11-25	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The specification says hash_report should be set to VIRTIO_NET_HASH_REPORT_NONE if VIRTIO_NET_F_HASH_REPORT is negotiated but not configured with VIRTIO_NET_CTRL_MQ_RSS_CONFIG. However, virtio_net_receive_rcu() instead wrote out the content of the extra_hdr variable, which is not uninitialized in such a case. Fix this by zeroing the extra_hdr. Fixes: e22f0603fb2f ("virtio-net: reference implementation of hash report") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Fix hash reporting when the queue changes	Akihiko Odaki	2024-11-25	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \|	virtio_net_process_rss() fills the values used for hash reporting, but the values used to be thrown away with a recursive function call if the queue changes after RSS. Avoid the function call to keep the values. Fixes: a4c960eedcd2 ("virtio-net: Do not write hashes to peer buffer") Buglink: https://issues.redhat.com/browse/RHEL-59572 Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Do not check for the queue before RSS	Akihiko Odaki	2024-11-25	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \|	virtio_net_can_receive() checks if the queue is ready, but RSS will change the queue to use so, strictly speaking, we may still be able to receive the packet even if the queue initially provided is not ready. Perform RSS before virtio_net_can_receive() to cover such a case. Fixes: 4474e37a5b3a ("virtio-net: implement RX RSS processing") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Fix size check in dhclient workaround	Akihiko Odaki	2024-11-25	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \|	work_around_broken_dhclient() accesses IP and UDP headers to detect relevant packets and to calculate checksums, but it didn't check if the packet has size sufficient to accommodate them, causing out-of-bound access hazards. Fix this by correcting the size requirement. Fixes: 1d41b0c1ec66 ("Work around dhclient brokenness") Cc: qemu-stable@nongnu.org Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	hw/net/virtio-net.c: Don't assume IP length field is aligned	Peter Maydell	2024-11-18	1	-4/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In virtio-net.c we assume that the IP length field in the packet is aligned, and we copy its address into a uint16_t* in the VirtioNetRscUnit struct which we then dereference later. This isn't a safe assumption; it will also result in compilation failures if we mark the ip_header struct as QEMU_PACKED because the compiler will not let you take the address of an unaligned struct field. Make the ip_plen field in VirtioNetRscUnit a void*, and make all the places where we read or write through that pointer instead use some new accessor functions read_unit_ip_len() and write_unit_ip_len() which account for the pointer being potentially unaligned and also do the network-byte-order conversion we were previously using htons() to perform. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20241114141619.806652-2-peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
*	virtio-net: Avoid indirection_table_mask overflow	Akihiko Odaki	2024-10-29	1	-6/+6
\| \| \| \| \| \| \| \| \| \|	We computes indirections_len by adding 1 to indirection_table_mask, but it may overflow indirection_table_mask is UINT16_MAX. Check if indirection_table_mask is small enough before adding 1. Fixes: 590790297c0d ("virtio-net: implement RSS configuration command") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	hw/net: improve tracing of eBPF RSS setup	Daniel P. Berrangé	2024-10-28	1	-3/+6
\| \| \| \| \| \| \| \|	This adds more trace events to key eBPF RSS setup operations, and also distinguishes events from multiple NIC instances. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	hw/net: report errors from failing to use eBPF RSS FDs	Daniel P. Berrangé	2024-10-28	1	-12/+29
\| \| \| \| \| \| \| \| \| \| \| \|	If the user/mgmt app passed in a set of pre-opened FDs for eBPF RSS, then it is expecting QEMU to use them. Any failure to do so must be considered a fatal error and propagated back up the stack, otherwise deployment mistakes will not be detectable in a prompt manner. When not using pre-opened FDs, then eBPF RSS is tried on a "best effort" basis only and thus fallback to software RSS is valid. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	ebpf: add formal error reporting to all APIs	Daniel P. Berrangé	2024-10-28	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	The eBPF code is currently reporting error messages through trace events. Trace events are fine for debugging, but they are not to be considered the primary error reporting mechanism, as their output is inaccessible to callers. This adds an "Error **errp" parameter to all methods which have important error scenarios to report to the caller. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	hw/net: fix typo s/epbf/ebpf/ in virtio-net	Daniel P. Berrangé	2024-10-28	1	-5/+5
\| \| \| \| \| \|	Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio: Allow .get_vhost() without vhost_started	Hanna Czenczek	2024-09-10	1	-2/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Historically, .get_vhost() was probably only called when vdev->vhost_started is true. However, we now decidedly want to call it also when vhost_started is false, specifically so we can issue a reset to the vhost back-end while device operation is stopped. Some .get_vhost() implementations dereference some pointers (or return offsets from them) that are probably guaranteed to be non-NULL when vhost_started is true, but not necessarily otherwise. This patch makes all such implementations check all such pointers, returning NULL if any is NULL. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20240723163941.48775-2-hreitz@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
*	virtio-net: Use virtual time for RSC timers	Nicholas Piggin	2024-08-16	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Receive coalescing is visible to the target machine, so its timers should use virtual time like other timers in virtio-net, to be compatible with record-replay. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20240813050638.446172-10-npiggin@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20240813202329.1237572-18-alex.bennee@linaro.org>
*	virtio-net: Use replay_schedule_bh_event for bhs that affect machine state	Nicholas Piggin	2024-08-16	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \|	The regular qemu_bh_schedule() calls result in non-deterministic execution of the bh in record-replay mode, which causes replay failure. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20240813050638.446172-9-npiggin@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20240813202329.1237572-17-alex.bennee@linaro.org>
*	virtio-net: Fix network stall at the host side waiting for kick	thomas	2024-08-02	1	-12/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch 06b12970174 ("virtio-net: fix network stall under load") added double-check to test whether the available buffer size can satisfy the request or not, in case the guest has added some buffers to the avail ring simultaneously after the first check. It will be lucky if the available buffer size becomes okay after the double-check, then the host can send the packet to the guest. If the buffer size still can't satisfy the request, even if the guest has added some buffers, viritio-net would stall at the host side forever. The patch enables notification and checks whether the guest has added some buffers since last check of available buffers when the available buffers are insufficient. If no buffer is added, return false, else recheck the available buffers in the loop. If the available buffers are sufficient, disable notification and return true. Changes: 1. Change the return type of virtqueue_get_avail_bytes() from void to int, it returns an opaque that represents the shadow_avail_idx of the virtqueue on success, else -1 on error. 2. Add a new API: virtio_queue_enable_notification_and_check(), it takes an opaque as input arg which is returned from virtqueue_get_avail_bytes(). It enables notification firstly, then checks whether the guest has added some buffers since last check of available buffers or not by virtio_queue_poll(), return ture if yes. The patch also reverts patch "06b12970174". The case below can reproduce the stall. Guest 0 +--------+ \| iperf \| ---------------> \| server \| Host \| +--------+ +--------+ \| ... \| iperf \|---- \| client \|---- Guest n +--------+ \| +--------+ \| \| iperf \| ---------------> \| server \| +--------+ Boot many guests from qemu with virtio network: qemu ... -netdev tap,id=net_x \ -device virtio-net-pci-non-transitional,\ iommu_platform=on,mac=xx:xx:xx:xx:xx:xx,netdev=net_x Each guest acts as iperf server with commands below: iperf3 -s -D -i 10 -p 8001 iperf3 -s -D -i 10 -p 8002 The host as iperf client: iperf3 -c guest_IP -p 8001 -i 30 -w 256k -P 20 -t 40000 iperf3 -c guest_IP -p 8002 -i 30 -w 256k -P 20 -t 40000 After some time, the host loses connection to the guest, the guest can send packet to the host, but can't receive packet from the host. It's more likely to happen if SWIOTLB is enabled in the guest, allocating and freeing bounce buffer takes some CPU ticks, copying from/to bounce buffer takes more CPU ticks, compared with that there is no bounce buffer in the guest. Once the rate of producing packets from the host approximates the rate of receiveing packets in the guest, the guest would loop in NAPI. receive packets --- \| \| v \| free buf virtnet_poll \| \| v \| add buf to avail ring --- \| \| need kick the host? \| NAPI continues v receive packets --- \| \| v \| free buf virtnet_poll \| \| v \| add buf to avail ring --- \| v ... ... On the other hand, the host fetches free buf from avail ring, if the buf in the avail ring is not enough, the host notifies the guest the event by writing the avail idx read from avail ring to the event idx of used ring, then the host goes to sleep, waiting for the kick signal from the guest. Once the guest finds the host is waiting for kick singal (in virtqueue_kick_prepare_split()), it kicks the host. The host may stall forever at the sequences below: Host Guest ------------ ----------- fetch buf, send packet receive packet --- ... ... \| fetch buf, send packet add buf \| ... add buf virtnet_poll buf not enough avail idx-> add buf \| read avail idx add buf \| add buf --- receive packet --- write event idx ... \| wait for kick add buf virtnet_poll ... \| --- no more packet, exit NAPI In the first loop of NAPI above, indicated in the range of virtnet_poll above, the host is sending packets while the guest is receiving packets and adding buffers. step 1: The buf is not enough, for example, a big packet needs 5 buf, but the available buf count is 3. The host read current avail idx. step 2: The guest adds some buf, then checks whether the host is waiting for kick signal, not at this time. The used ring is not empty, the guest continues the second loop of NAPI. step 3: The host writes the avail idx read from avail ring to used ring as event idx via virtio_queue_set_notification(q->rx_vq, 1). step 4: At the end of the second loop of NAPI, recheck whether kick is needed, as the event idx in the used ring written by the host is beyound the range of kick condition, the guest will not send kick signal to the host. Fixes: 06b12970174 ("virtio-net: fix network stall under load") Cc: qemu-stable@nongnu.org Signed-off-by: Wencheng Yang <east.moutain.yang@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Ensure queue index fits with RSS	Akihiko Odaki	2024-08-02	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Ensure the queue index points to a valid queue when software RSS enabled. The new calculation matches with the behavior of Linux's TAP device with the RSS eBPF program. Fixes: 4474e37a5b3a ("virtio-net: implement RX RSS processing") Reported-by: Zhibin Hu <huzhibin5@huawei.com> Cc: qemu-stable@nongnu.org Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	hw/net/virtio-net.c: fix crash in iov_copy()	Dmitry Frolov	2024-07-01	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \|	A crash found while fuzzing device virtio-net-socket-check-used. Assertion "offset == 0" in iov_copy() fails if less than guest_hdr_len bytes were transmited. Signed-off-by: Dmitry Frolov <frolov@swemel.ru> Message-Id: <20240613143529.602591-2-frolov@swemel.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-net: drop too short packets early	Alexey Dobriyan	2024-06-04	1	-6/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reproducer from https://gitlab.com/qemu-project/qemu/-/issues/1451 creates small packet (1 segment, len = 10 == n->guest_hdr_len), then destroys queue. "if (n->host_hdr_len != n->guest_hdr_len)" is triggered, if body creates zero length/zero segment packet as there is nothing after guest header. qemu_sendv_packet_async() tries to send it. slirp discards it because it is smaller than Ethernet header, but returns 0 because tx hooks are supposed to return total length of data. 0 is propagated upwards and is interpreted as "packet has been sent" which is terrible because queue is being destroyed, nobody is waiting for TX to complete and assert it triggered. Fix is discard such empty packets instead of sending them. Length 1 packets will go via different codepath: virtqueue_push(q->tx_vq, elem, 0); virtio_notify(vdev, q->tx_vq); g_free(elem); and aren't problematic. Signed-off-by: Alexey Dobriyan <adobriyan@yandex-team.ru> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Do not write hashes to peer buffer	Akihiko Odaki	2024-06-04	1	-19/+17
\| \| \| \| \| \| \|	The peer buffer is qualified with const and not meant to be modified. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Always set populate_hash	Akihiko Odaki	2024-06-04	1	-0/+1
\| \| \| \| \| \| \|	The member is not cleared during reset so may have a stale value. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Unify the logic to update NIC state for RSS	Akihiko Odaki	2024-06-04	1	-54/+36
\| \| \| \| \| \| \| \|	The code to attach or detach the eBPF program to RSS were duplicated so unify them into one function to save some code. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
*	virtio-net: Disable RSS on reset	Akihiko Odaki	2024-06-04	1	-34/+36
\| \| \| \| \| \| \| \| \|	RSS is disabled by default. Fixes: 590790297c ("virtio-net: implement RSS configuration command") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Jason Wang <jasowang@redhat.com>