mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Eugenio Pérez	495fe3a787	vhost: Track descriptor chain in private at SVQ The device could have access to modify them, and it definitely have access when we implement packed vq. Harden SVQ maintaining a private copy of the descriptor chain. Other fields like buffer addresses are already maintained sepparatedly. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220512175747.142058-2-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 07:57:26 -04:00
Halil Pasic	06134e2bc3	virtio: fix feature negotiation for ACCESS_PLATFORM Unlike most virtio features ACCESS_PLATFORM is considered mandatory by QEMU, i.e. the driver must accept it if offered by the device. The virtio specification says that the driver SHOULD accept the ACCESS_PLATFORM feature if offered, and that the device MAY fail to operate if ACCESS_PLATFORM was offered but not negotiated. While a SHOULD ain't exactly a MUST, we are certainly allowed to fail the device when the driver fences ACCESS_PLATFORM. With commit `2943b53f68` ("virtio: force VIRTIO_F_IOMMU_PLATFORM") we already made the decision to do so whenever the get_dma_as() callback is implemented (by the bus), which in practice means for the entirety of virtio-pci. That means, if the device needs to translate I/O addresses, then ACCESS_PLATFORM is mandatory. The aforementioned commit tells us in the commit message that this is for security reasons. More precisely if we were to allow a less then trusted driver (e.g. an user-space driver, or a nested guest) to make the device bypass the IOMMU by not negotiating ACCESS_PLATFORM, then the guest kernel would have no ability to control/police (by programming the IOMMU) what pieces of guest memory the driver may manipulate using the device. Which would break security assumptions within the guest. If ACCESS_PLATFORM is offered not because we want the device to utilize an IOMMU and do address translation, but because the device does not have access to the entire guest RAM, and needs the driver to grant access to the bits it needs access to (e.g. confidential guest support), we still require the guest to have the corresponding logic and to accept ACCESS_PLATFORM. If the driver does not accept ACCESS_PLATFORM, then things are bound to go wrong, and we may see failures much less graceful than failing the device because the driver didn't negotiate ACCESS_PLATFORM. So let us make ACCESS_PLATFORM mandatory for the driver regardless of whether the get_dma_as() callback is implemented or not. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Fixes: `2943b53f68` ("virtio: force VIRTIO_F_IOMMU_PLATFORM") Message-Id: <20220307112939.2780117-1-pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>	2022-05-13 05:22:31 -04:00
Paolo Bonzini	d93e839ccd	vhost-backend: do not depend on CONFIG_VHOST_VSOCK The vsock callbacks .vhost_vsock_set_guest_cid and .vhost_vsock_set_running are the only ones to be conditional on #ifdef CONFIG_VHOST_VSOCK. This is different from any other device-dependent callbacks like .vhost_scsi_set_endpoint, and it also broke when CONFIG_VHOST_VSOCK was changed to a per-target symbol. It would be possible to also use the CONFIG_DEVICES include, but really there is no reason for most virtio files to be per-target so just remove the #ifdef to fix the issue. Reported-by: Dov Murik <dovmurik@linux.ibm.com> Fixes: `9972ae314f` ("build: move vhost-vsock configuration to Kconfig") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Stefan Hajnoczi	38738f7dbb	virtio-scsi: don't waste CPU polling the event virtqueue The virtio-scsi event virtqueue is not emptied by its handler function. This is typical for rx virtqueues where the device uses buffers when some event occurs (e.g. a packet is received, an error condition happens, etc). Polling non-empty virtqueues wastes CPU cycles. We are not waiting for new buffers to become available, we are waiting for an event to occur, so it's a misuse of CPU resources to poll for buffers. Introduce the new virtio_queue_aio_attach_host_notifier_no_poll() API, which is identical to virtio_queue_aio_attach_host_notifier() except that it does not poll the virtqueue. Before this patch the following command-line consumed 100% CPU in the IOThread polling and calling virtio_scsi_handle_event(): $ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \ --object iothread,id=iothread0 \ --device virtio-scsi-pci,iothread=iothread0 \ --blockdev file,filename=test.img,aio=native,cache.direct=on,node-name=drive0 \ --device scsi-hd,drive=drive0 After this patch CPU is no longer wasted. Reported-by: Nir Soffer <nsoffer@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Nir Soffer <nsoffer@redhat.com> Message-id: 20220427143541.119567-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-05-09 10:45:04 +01:00
Paolo Bonzini	43b6d7ee1f	meson: use have_vhost_* variables to pick sources Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-07 07:46:58 +02:00
Paolo Bonzini	5166dabf4f	build: move vhost-user-fs configuration to Kconfig vhost-user-fs is a device and it should be possible to enable/disable it with --without-default-devices, not --without-default-features. Compute its default value in Kconfig to obtain the more intuitive behavior. In this case the configure options were undocumented, too. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-07 07:46:58 +02:00
Paolo Bonzini	9972ae314f	build: move vhost-vsock configuration to Kconfig vhost-vsock and vhost-user-vsock are two devices of their own; it should be possible to enable/disable them with --without-default-devices, not --without-default-features. Compute their default value in Kconfig to obtain the more intuitive behavior. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-07 07:46:58 +02:00
Paolo Bonzini	10925d0617	meson, virtio: place all virtio-pci devices under virtio_pci_ss Since a sourceset already exists for this, avoid unnecessary repeat of CONFIG_VIRTIO_PCI. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-07 07:46:58 +02:00
Murilo Opsfelder Araujo	97252353c1	vhost-user: Use correct macro name TARGET_PPC64 The correct name of the macro is TARGET_PPC64. Fixes: `27598393a2` ("Lift max memory slots limit imposed by vhost-user") Reported-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Cc: Raphael Norwitz <raphael.norwitz@nutanix.com> Cc: Peter Turschmid <peter.turschm@nutanix.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20220503180108.34506-1-muriloo@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-05-05 15:36:16 -03:00
Kevin Wolf	a81d8d4a72	vhost-user: Don't pass file descriptor for VHOST_USER_REM_MEM_REG The spec clarifies now that QEMU should not send a file descriptor in a request to remove a memory region. Change it accordingly. For libvhost-user, this is a bug fix that makes it compatible with rust-vmm's implementation that doesn't send a file descriptor. Keep accepting, but ignoring a file descriptor for compatibility with older QEMU versions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220407133657.155281-4-kwolf@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-04 15:55:23 +02:00
Marc-André Lureau	ff5927baa7	util: rename qemu_block() socket functions The qemu_block() functions are meant to be be used with sockets (the win32 implementation expects SOCKET) Over time, those functions where used with Win32 SOCKET or file-descriptors interchangeably. But for portability, they must only be used with socket-like file-descriptors. FDs can use g_unix_set_fd_nonblocking() instead. Rename the functions with "socket" in the name to prevent bad usages. This is effectively reverting commit `f9e8cacc55` ("oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()"). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-05-03 15:53:20 +04:00
Marc-André Lureau	701544cfaf	hw: replace qemu_set_nonblock() Those calls are non-socket fd, or are POSIX-specific. Use the dedicated GLib API. (qemu_set_nonblock() is for socket-like) (this is a preliminary patch before renaming qemu_set_nonblock()) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2022-05-03 15:52:33 +04:00
Eugenio Pérez	5580b9f00e	vdpa: Add missing tracing to batch mapping functions These functions were not traced properly. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220405063628.853745-1-eperezma@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-04-26 12:32:47 +02:00
Thomas Huth	55d71e0b78	Don't include sysemu/tcg.h if it is not necessary This header only defines the tcg_allowed variable and the tcg_enabled() function - which are not required in many files that include this header. Drop the #include statement there. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20220315144107.1012530-1-thuth@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-04-20 12:12:47 -07:00
Richard Henderson	1be5a765c0	* Add cpu0-id to query-sev-capabilities * whpx support for breakpoints and stepping * initial support for Hyper-V Synthetic Debugging * use monotonic clock for QemuCond and QemuSemaphore * Remove qemu-common.h include from most units and lots of other clenaups * do not include headers for all virtio devices in virtio-ccw.h -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJXCQAUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNT6wf+NHDJUEdDiwaVGVTGXgHuiaycsymi FpNPiw/+XxSGN5xF3fkUGgqaDrcwIYwVfnXlghKSz8kp1cP3cjxa5CzNMLGTp5je N6BxFbD7yC6dhagGm3mj32jlsptv3M38OHqKc3t+RaUAotP5RF2VdCyfUBLG6vU0 aMzvMfMtB5aG0D8Fr5EV63t1JMTceFU0YxsG73UCFs2Yx4Z0cGBbNxMbHweRhd1q tPeVDS46MFPM3/2cGGHpeeqxkoCTU7A9j1VuNQI3k+Kg+6W5YVxiK/UP7bw77E/a yAHsmIVTNro8ajMBch73weuHtGtdfFLvCKc6QX6aVjzK4dF1voQ01E7gPQ== =rMle -----END PGP SIGNATURE----- Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging * Add cpu0-id to query-sev-capabilities * whpx support for breakpoints and stepping * initial support for Hyper-V Synthetic Debugging * use monotonic clock for QemuCond and QemuSemaphore * Remove qemu-common.h include from most units and lots of other clenaups * do not include headers for all virtio devices in virtio-ccw.h # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJXCQAUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroNT6wf+NHDJUEdDiwaVGVTGXgHuiaycsymi # FpNPiw/+XxSGN5xF3fkUGgqaDrcwIYwVfnXlghKSz8kp1cP3cjxa5CzNMLGTp5je # N6BxFbD7yC6dhagGm3mj32jlsptv3M38OHqKc3t+RaUAotP5RF2VdCyfUBLG6vU0 # aMzvMfMtB5aG0D8Fr5EV63t1JMTceFU0YxsG73UCFs2Yx4Z0cGBbNxMbHweRhd1q # tPeVDS46MFPM3/2cGGHpeeqxkoCTU7A9j1VuNQI3k+Kg+6W5YVxiK/UP7bw77E/a # yAHsmIVTNro8ajMBch73weuHtGtdfFLvCKc6QX6aVjzK4dF1voQ01E7gPQ== # =rMle # -----END PGP SIGNATURE----- # gpg: Signature made Wed 13 Apr 2022 10:31:44 AM PDT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (53 commits) target/i386: Remove unused XMMReg, YMMReg types and CPUState fields target/i386: do not access beyond the low 128 bits of SSE registers virtio-ccw: do not include headers for all virtio devices virtio-ccw: move device type declarations to .c files virtio-ccw: move vhost_ccw_scsi to a separate file s390x: follow qdev tree to detect SCSI device on a CCW bus hw: hyperv: Initial commit for Synthetic Debugging device hyperv: Add support to process syndbg commands hyperv: Add definitions for syndbg hyperv: SControl is optional to enable SynIc thread-posix: optimize qemu_sem_timedwait with zero timeout thread-posix: implement Semaphore with QemuCond and QemuMutex thread-posix: use monotonic clock for QemuCond and QemuSemaphore thread-posix: remove the posix semaphore support whpx: Added support for breakpoints and stepping build-sys: simplify AF_VSOCK check build-sys: drop ntddscsi.h check Remove qemu-common.h include from most units qga: remove explicit environ argument from exec/spawn Move fcntl_setfl() to oslib-posix ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-04-19 18:22:16 -07:00
Wentao Liang	4bf58c7213	virtio-iommu: use-after-free fix A potential Use-after-free was reported in virtio_iommu_handle_command when using virtio-iommu: > I find a potential Use-after-free in QEMU 6.2.0, which is in > virtio_iommu_handle_command() (./hw/virtio/virtio-iommu.c). > > > Specifically, in the loop body, the variable 'buf' allocated at line 639 can be > freed by g_free() at line 659. However, if the execution path enters the loop > body again and the if branch takes true at line 616, the control will directly > jump to 'out' at line 651. At this time, 'buf' is a freed pointer, which is not > assigned with an allocated memory but used at line 653. As a result, a UAF bug > is triggered. > > > > 599 for (;;) { > ... > 615 sz = iov_to_buf(iov, iov_cnt, 0, &head, sizeof(head)); > 616 if (unlikely(sz != sizeof(head))) { > 617 tail.status = VIRTIO_IOMMU_S_DEVERR; > 618 goto out; > 619 } > ... > 639 buf = g_malloc0(output_size); > ... > 651 out: > 652 sz = iov_from_buf(elem->in_sg, elem->in_num, 0, > 653 buf ? buf : &tail, output_size); > ... > 659 g_free(buf); > > We can fix it by set ‘buf‘ to NULL after freeing it: > > > 651 out: > 652 sz = iov_from_buf(elem->in_sg, elem->in_num, 0, > 653 buf ? buf : &tail, output_size); > ... > 659 g_free(buf); > +++ buf = NULL; > 660 } Fix as suggested by the reporter. Signed-off-by: Wentao Liang <Wentao_Liang_g@163.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 20220407095047.50371-1-mst@redhat.com Message-ID: <20220406040445-mutt-send-email-mst@kernel.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-04-08 15:02:09 +01:00
Marc-André Lureau	0f9668e0c1	Remove qemu-common.h include from most units Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:55 +02:00
Marc-André Lureau	415b732751	include: move C/util-related declarations to cutils.h Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-22-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:43 +02:00
Marc-André Lureau	8e3b0cbb72	Replace qemu_real_host_page variables with inlined functions Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 10:50:38 +02:00
Marc-André Lureau	e03b56863d	Replace config-time define HOST_WORDS_BIGENDIAN Replace a config-time define with a compile time condition define (compatible with clang and gcc) that must be declared prior to its usage. This avoids having a global configure time define, but also prevents from bad usage, if the config header wasn't included before. This can help to make some code independent from qemu too. gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> [ For the s390x parts I'm involved in ] Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220323155743.1585078-7-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 10:50:37 +02:00
Marc-André Lureau	f793dde091	Replace qemu_gettimeofday() with g_get_real_time() GLib g_get_real_time() is an alternative to gettimeofday() which allows to simplify our code. For semihosting, a few bits are lost on POSIX host, but this shouldn't be a big concern. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20220307070401.171986-5-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 10:50:37 +02:00
Stefano Garzarella	ef4ff56cf3	vhost-vdpa: fix typo in a comment Replace vpda with vdpa. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220328152022.73245-1-sgarzare@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-03-31 21:30:56 +02:00
Paolo Bonzini	14b61778bc	virtio: fix --enable-vhost-user build on non-Linux The vhost-shadow-virtqueue.c build requires include files from linux-headers/, so it cannot be built on non-Linux systems. Fortunately it is only needed by vhost-vdpa, so move it there. Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-29 15:58:06 +02:00
Marc-André Lureau	9edc6313da	Replace GCC_FMT_ATTR with G_GNUC_PRINTF One less qemu-specific macro. It also helps to make some headers/units only depend on glib, and thus moved in standalone projects eventually. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com>	2022-03-22 14:40:51 +04:00
Markus Armbruster	b21e238037	Use g_new() & friends where that makes obvious sense g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Patch created mechanically with: $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \ --macro-file scripts/cocci-macro-file.h FILES... Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220315144156.1595462-4-armbru@redhat.com> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>	2022-03-21 15:44:44 +01:00
Bernhard Beschow	a2d860bb54	virtio/virtio-balloon: Prefer Object* over void* parameter opaque is an alias to obj. Using the ladder makes the code consistent with with other devices, e.g. accel/kvm/kvm-all and accel/tcg/tcg-all. It also makes the cast more typesafe. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220301222301.103821-2-shentey@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-03-18 13:57:50 +01:00
Peter Maydell	dee3a86d54	* whpx fixes in preparation for GDB support (Ivan) * VSS header fixes (Marc-André) * 5-level EPT support (Vitaly) * AMX support (Jing Liu & Yang Zhong) * Bundle changes to MSI routes (Longpeng) * More precise emulation of #SS (Gareth) * Disable ASAN testing -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmIwb5QUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroOOUQf8DiNcq8XVVMdX946Qwa4pSxc4ZJtF X+RkNsscluuLJ2vGEFKwPVps6c6UPqAhXUruZOQmcLmma511MsyJrxyfd4iRgPD2 tL1+n4RpfsbnTEGT8c6TFWWMEIOjLTbKmR/SIxuxpeVG3xlk6tlCevykrIdc90gP vQIByTGFx3GwiPyDo0j92mA/CsWLnfq6zQ2Tox1xCyt8R+QDimqG0KGLc5RAyiyC ZmilN2yaqizDfkIzinwHG6gP1NGwVUsrUNl4X9C4mwEMFnsXiyKP5n/BlDZ7w4Wb QXalFpPg1hJxRGGvyta6OF9VmCfmK9Q0FNVWm1lPE5adn3ECHFo6FJKvfg== =LVgf -----END PGP SIGNATURE----- Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging * whpx fixes in preparation for GDB support (Ivan) * VSS header fixes (Marc-André) * 5-level EPT support (Vitaly) * AMX support (Jing Liu & Yang Zhong) * Bundle changes to MSI routes (Longpeng) * More precise emulation of #SS (Gareth) * Disable ASAN testing # gpg: Signature made Tue 15 Mar 2022 10:51:00 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (22 commits) gitlab-ci: do not run tests with address sanitizer KVM: SVM: always set MSR_AMD64_TSC_RATIO to default value i386: Add Icelake-Server-v6 CPU model with 5-level EPT support x86: Support XFD and AMX xsave data migration x86: add support for KVM_CAP_XSAVE2 and AMX state migration x86: Add AMX CPUIDs enumeration x86: Add XFD faulting bit for state components x86: Grant AMX permission for guest x86: Add AMX XTILECFG and XTILEDATA components x86: Fix the 64-byte boundary enumeration for extended state linux-headers: include missing changes from 5.17 target/i386: Throw a #SS when loading a non-canonical IST target/i386: only include bits in pg_mode if they are not ignored kvm/msi: do explicit commit when adding msi routes kvm-irqchip: introduce new API to support route change update meson-buildoptions.sh qga/vss: update informative message about MinGW qga/vss-win32: check old VSS SDK headers meson: fix generic location of vss headers vmxcap: Add 5-level EPT bit ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-03-15 14:41:16 +00:00
Longpeng(Mike)	def4c5570c	kvm/msi: do explicit commit when adding msi routes We invoke the kvm_irqchip_commit_routes() for each addition to MSI route table, which is not efficient if we are adding lots of routes in some cases. This patch lets callers invoke the kvm_irqchip_commit_routes(), so the callers can decide how to optimize. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00967.html Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20220222141116.2091-3-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:26:20 +01:00
Eugenio Pérez	12a195fa34	vdpa: Expose VHOST_F_LOG_ALL on SVQ SVQ is able to log the dirty bits by itself, so let's use it to not block migration. Also, ignore set and clear of VHOST_F_LOG_ALL on set_features if SVQ is enabled. Even if the device supports it, the reports would be nonsense because SVQ memory is in the qemu region. The log region is still allocated. Future changes might skip that, but this series is already long enough. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	773ebc952e	vdpa: Never set log_base addr if SVQ is enabled Setting the log address would make the device start reporting invalid dirty memory because the SVQ vrings are located in qemu's memory. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	6d0b222666	vdpa: Adapt vhost_vdpa_get_vring_base to SVQ This is needed to achieve migration, so the destination can restore its index. Setting base as last used idx, so destination will see as available all the entries that the device did not use, including the in-flight processing ones. This is ok for networking, but other kinds of devices might have problems with these retransmissions. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	34e3c94eda	vdpa: Add custom IOTLB translations to SVQ Use translations added in VhostIOVATree in SVQ. Only introduce usage here, not allocation and deallocation. As with previous patches, we use the dead code paths of shadow_vqs_enabled to avoid commiting too many changes at once. These are impossible to take at the moment. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	ec6122d882	vhost: Add VhostIOVATree This tree is able to look for a translated address from an IOVA address. At first glance it is similar to util/iova-tree. However, SVQ working on devices with limited IOVA space need more capabilities, like allocating IOVA chunks or performing reverse translations (qemu addresses to iova). The allocation capability, as "assign a free IOVA address to this chunk of memory in qemu's address space" allows shadow virtqueue to create a new address space that is not restricted by guest's addressable one, so we can allocate shadow vqs vrings outside of it. It duplicates the tree so it can search efficiently in both directions, and it will signal overlap if iova or the translated address is present in any tree. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	100890f7ca	vhost: Shadow virtqueue buffers forwarding Initial version of shadow virtqueue that actually forward buffers. There is no iommu support at the moment, and that will be addressed in future patches of this series. Since all vhost-vdpa devices use forced IOMMU, this means that SVQ is not usable at this point of the series on any device. For simplicity it only supports modern devices, that expects vring in little endian, with split ring and no event idx or indirect descriptors. Support for them will not be added in this series. It reuses the VirtQueue code for the device part. The driver part is based on Linux's virtio_ring driver, but with stripped functionality and optimizations so it's easier to review. However, forwarding buffers have some particular pieces: One of the most unexpected ones is that a guest's buffer can expand through more than one descriptor in SVQ. While this is handled gracefully by qemu's emulated virtio devices, it may cause unexpected SVQ queue full. This patch also solves it by checking for this condition at both guest's kicks and device's calls. The code may be more elegant in the future if SVQ code runs in its own iocontext. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	d96be4c894	vdpa: adapt vhost_ops callbacks to svq First half of the buffers forwarding part, preparing vhost-vdpa callbacks to SVQ to offer it. QEMU cannot enable it at this moment, so this is effectively dead code at the moment, but it helps to reduce patch size. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	dafb34c992	virtio: Add vhost_svq_get_vring_addr It reports the shadow virtqueue address from qemu virtual address space. Since this will be different from the guest's vaddr, but the device can access it, SVQ takes special care about its alignment & lack of garbage data. It assumes that IOMMU will work in host_page_size ranges for that. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	4725a4181b	vhost: Add vhost_svq_valid_features to shadow vq This allows SVQ to negotiate features with the guest and the device. For the device, SVQ is a driver. While this function bypasses all non-transport features, it needs to disable the features that SVQ does not support when forwarding buffers. This includes packed vq layout, indirect descriptors or event idx. Future changes can add support to offer more features to the guest, since the use of VirtQueue gives this for free. This is left out at the moment for simplicity. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	a8ac88585d	vhost: Add Shadow VirtQueue call forwarding capabilities This will make qemu aware of the device used buffers, allowing it to write the guest memory with its contents if needed. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	dff4426fa6	vhost: Add Shadow VirtQueue kick forwarding capabilities At this mode no buffer forwarding will be performed in SVQ mode: Qemu will just forward the guest's kicks to the device. Host memory notifiers regions are left out for simplicity, and they will not be addressed in this series. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Eugenio Pérez	10857ec0ad	vhost: Add VhostShadowVirtqueue Vhost shadow virtqueue (SVQ) is an intermediate jump for virtqueue notifications and buffers, allowing qemu to track them. While qemu is forwarding the buffers and virtqueue changes, it is able to commit the memory it's being dirtied, the same way regular qemu's VirtIO devices do. This commit only exposes basic SVQ allocation and free. Next patches of the series add functionality like notifications and buffers forwarding. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2022-03-15 13:57:44 +08:00
Sergio Lopez	ff5eb77b8a	vhost: use wfd on functions setting vring call fd When ioeventfd is emulated using qemu_pipe(), only EventNotifier's wfd can be used for writing. Use the recently introduced event_notifier_get_wfd() function to obtain the fd that our peer must use to signal the vring. Signed-off-by: Sergio Lopez <slp@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220304100854.14829-3-slp@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 06:19:47 -05:00
Stefano Garzarella	8d1b247f37	vhost-vsock: detach the virqueue element in case of error In vhost_vsock_common_send_transport_reset(), if an element popped from the virtqueue is invalid, we should call virtqueue_detach_element() to detach it from the virtqueue before freeing its memory. Fixes: `fc0b9b0e1c` ("vhost-vsock: add virtio sockets device") Fixes: CVE-2022-26354 Cc: qemu-stable@nongnu.org Reported-by: VictorV <vv474172261@gmail.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220228095058.27899-1-sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 05:08:23 -05:00
Jean-Philippe Brucker	d9c96f2425	virtio-iommu: Support bypass domain The driver can create a bypass domain by passing the VIRTIO_IOMMU_ATTACH_F_BYPASS flag on the ATTACH request. Bypass domains perform slightly better than domains with identity mappings since they skip translation. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20220214124356.872985-4-jean-philippe@linaro.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 05:08:23 -05:00
Jean-Philippe Brucker	448179e33e	virtio-iommu: Default to bypass during boot Currently the virtio-iommu device must be programmed before it allows DMA from any PCI device. This can make the VM entirely unusable when a virtio-iommu driver isn't present, for example in a bootloader that loads the OS from storage. Similarly to the other vIOMMU implementations, default to DMA bypassing the IOMMU during boot. Add a "boot-bypass" property, defaulting to true, that lets users change this behavior. Replace the VIRTIO_IOMMU_F_BYPASS feature, which didn't support bypass before feature negotiation, with VIRTIO_IOMMU_F_BYPASS_CONFIG. We add the bypass field to the migration stream without introducing subsections, based on the assumption that this virtio-iommu device isn't being used in production enough to require cross-version migration at the moment (all previous version required workarounds since they didn't support ACPI and boot-bypass). Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20220214124356.872985-3-jean-philippe@linaro.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 05:08:23 -05:00
Laurent Vivier	b1f030a0a2	vhost-vdpa: make notifiers _init()/_uninit() symmetric vhost_vdpa_host_notifiers_init() initializes queue notifiers for queues "dev->vq_index" to queue "dev->vq_index + dev->nvqs", whereas vhost_vdpa_host_notifiers_uninit() uninitializes the same notifiers for queue "0" to queue "dev->nvqs". This asymmetry seems buggy, fix that by using dev->vq_index as the base for both. Fixes: `d0416d487b` ("vhost-vdpa: map virtqueue notification area if possible") Cc: jasowang@redhat.com Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20220211161309.1385839-1-lvivier@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 05:08:23 -05:00
Laurent Vivier	98f7607ecd	hw/virtio: vdpa: Fix leak of host-notifier memory-region If call virtio_queue_set_host_notifier_mr fails, should free host-notifier memory-region. This problem can trigger a coredump with some vDPA drivers (mlx5, but not with the vdpasim), if we unplug the virtio-net card from the guest after a stop/start. The same fix has been done for vhost-user: `1f89d3b91e` ("hw/virtio: Fix leak of host-notifier memory-region") Fixes: `d0416d487b` ("vhost-vdpa: map virtqueue notification area if possible") Cc: jasowang@redhat.com Resolves: https://bugzilla.redhat.com/2027208 Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20220211170259.1388734-1-lvivier@redhat.com> Cc: qemu-stable@nongnu.org Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 05:08:23 -05:00
Viresh Kumar	0a24dd1fd5	hw/vhost-user-i2c: Add support for VIRTIO_I2C_F_ZERO_LENGTH_REQUEST VIRTIO_I2C_F_ZERO_LENGTH_REQUEST is a mandatory feature, that must be implemented by everyone. Add its support. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Message-Id: <fc47ab63b1cd414319c9201e8d6c7705b5ec3bd9.1644490993.git.viresh.kumar@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-04 08:30:52 -05:00
Halil Pasic	e65902a913	virtio: fix the condition for iommu_platform not supported The commit `04ceb61a40` ("virtio: Fail if iommu_platform is requested, but unsupported") claims to fail the device hotplug when iommu_platform is requested, but not supported by the (vhost) device. On the first glance the condition for detecting that situation looks perfect, but because a certain peculiarity of virtio_platform it ain't. In fact the aforementioned commit introduces a regression. It breaks virtio-fs support for Secure Execution, and most likely also for AMD SEV or any other confidential guest scenario that relies encrypted guest memory. The same also applies to any other vhost device that does not support _F_ACCESS_PLATFORM. The peculiarity is that iommu_platform and _F_ACCESS_PLATFORM collates "device can not access all of the guest RAM" and "iova != gpa, thus device needs to translate iova". Confidential guest technologies currently rely on the device/hypervisor offering _F_ACCESS_PLATFORM, so that, after the feature has been negotiated, the guest grants access to the portions of memory the device needs to see. So in for confidential guests, generally, _F_ACCESS_PLATFORM is about the restricted access to memory, but not about the addresses used being something else than guest physical addresses. This is the very reason for which commit `f7ef7e6e3b` ("vhost: correctly turn on VIRTIO_F_IOMMU_PLATFORM") fences _F_ACCESS_PLATFORM from the vhost device that does not need it, because on the vhost interface it only means "I/O address translation is needed". This patch takes inspiration from `f7ef7e6e3b` ("vhost: correctly turn on VIRTIO_F_IOMMU_PLATFORM"), and uses the same condition for detecting the situation when _F_ACCESS_PLATFORM is requested, but no I/O translation by the device, and thus no device capability is needed. In this situation claiming that the device does not support iommu_plattform=on is counter-productive. So let us stop doing that! Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reported-by: Jakob Naucke <Jakob.Naucke@ibm.com> Fixes: `04ceb61a40` ("virtio: Fail if iommu_platform is requested, but unsupported") Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: qemu-stable@nongnu.org Message-Id: <20220207112857.607829-1-pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2022-03-04 08:30:52 -05:00
Xueming Li	0b0af4d62f	vhost-user: fix VirtQ notifier cleanup When vhost-user device cleanup, remove notifier MR and munmaps notifier address in the event-handling thread, VM CPU thread writing the notifier in concurrent fails with an error of accessing invalid address. It happens because MR is still being referenced and accessed in another thread while the underlying notifier mmap address is being freed and becomes invalid. This patch calls RCU and munmap notifiers in the callback after the memory flatview update finish. Fixes: `44866521bd` ("vhost-user: support registering external host notifiers") Cc: qemu-stable@nongnu.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Message-Id: <20220207071929.527149-3-xuemingl@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-04 08:30:52 -05:00
Xueming Li	e867144b73	vhost-user: remove VirtQ notifier restore Notifier set when vhost-user backend asks qemu to mmap an FD and offset. When vhost-user backend restart or getting killed, VQ notifier FD and mmap addresses become invalid. After backend restart, MR contains the invalid address will be restored and fail on notifier access. On the other hand, qemu should munmap the notifier, release underlying hardware resources to enable backend restart and allocate hardware notifier resources correctly. Qemu shouldn't reference and use resources of disconnected backend. This patch removes VQ notifier restore, uses the default vhost-user notifier to avoid invalid address access. After backend restart, the backend should ask qemu to install a hardware notifier if needed. Fixes: `44866521bd` ("vhost-user: support registering external host notifiers") Cc: qemu-stable@nongnu.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Message-Id: <20220207071929.527149-2-xuemingl@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-04 08:30:52 -05:00
Peter Maydell	b85ea5fa2f	include: Move qemu_madvise() and related #defines to new qemu/madvise.h The function qemu_madvise() and the QEMU_MADV_* constants associated with it are used in only 10 files. Move them out of osdep.h to a new qemu/madvise.h header that is included where it is needed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org	2022-02-21 13:30:20 +00:00
Bernhard Beschow	5e78c98b7c	Mark remaining global TypeInfo instances as const More than 1k of TypeInfo instances are already marked as const. Mark the remaining ones, too. This commit was created with: git grep -z -l 'static TypeInfo' -- '*.c' \| \ xargs -0 sed -i 's/static TypeInfo/static const TypeInfo/' Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Corey Minyard <cminyard@mvista.com> Message-id: 20220117145805.173070-2-shentey@gmail.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-02-21 13:30:20 +00:00
Peter Maydell	17e3134061	Remove unnecessary minimum_version_id_old fields The migration code will not look at a VMStateDescription's minimum_version_id_old field unless that VMSD has set the load_state_old field to something non-NULL. (The purpose of minimum_version_id_old is to specify what migration version is needed for the code in the function pointed to by load_state_old to be able to handle it on incoming migration.) We have exactly one VMSD which still has a load_state_old, in the PPC CPU; every other VMSD which sets minimum_version_id_old is doing so unnecessarily. Delete all the unnecessary ones. Commit created with: sed -i '/\.minimum_version_id_old/d' $(git grep -l '\.minimum_version_id_old') with the one legitimate use then hand-edited back in. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> --- It missed vmstate_ppc_cpu.	2022-01-28 15:38:23 +01:00
Gavin Shan	b1b87327a9	hw/arm/virt: Support for virtio-mem-pci This supports virtio-mem-pci device on "virt" platform, by simply following the implementation on x86. * This implements the hotplug handlers to support virtio-mem-pci device hot-add, while the hot-remove isn't supported as we have on x86. * The block size is 512MB on ARM64 instead of 128MB on x86. * It has been passing the tests with various combinations like 64KB and 4KB page sizes on host and guest, different memory device backends like normal, transparent huge page and HugeTLB, plus migration. Co-developed-by: David Hildenbrand <david@redhat.com> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-id: 20220111063329.74447-3-gshan@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-01-20 11:47:52 +00:00
Gavin Shan	1263615efe	virtio-mem: Correct default THP size for ARM64 The default block size is same as to the THP size, which is either retrieved from "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size" or hardcoded to 2MB. There are flaws in both mechanisms and this intends to fix them up. * When "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size" is used to getting the THP size, 32MB and 512MB are valid values when we have 16KB and 64KB page size on ARM64. * When the hardcoded THP size is used, 2MB, 32MB and 512MB are valid values when we have 4KB, 16KB and 64KB page sizes on ARM64. Co-developed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-id: 20220111063329.74447-2-gshan@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-01-20 11:47:52 +00:00
Stefan Hajnoczi	db608fb784	virtio: unify dataplane and non-dataplane ->handle_output() Now that virtio-blk and virtio-scsi are ready, get rid of the handle_aio_output() callback. It's no longer needed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-7-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-01-12 17:09:39 +00:00
Stefan Hajnoczi	d6fbfe2b83	virtio: use ->handle_output() instead of ->handle_aio_output() The difference between ->handle_output() and ->handle_aio_output() was that ->handle_aio_output() returned a bool return value indicating progress. This was needed by the old polling API but now that the bool return value is gone, the two functions can be unified. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-6-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-01-12 17:09:39 +00:00
Stefan Hajnoczi	d93d16c045	virtio: get rid of VirtIOHandleAIOOutput The virtqueue host notifier API virtio_queue_aio_set_host_notifier_handler() polls the virtqueue for new buffers. AioContext previously required a bool progress return value indicating whether an event was handled or not. This is no longer necessary because the AioContext polling API has been split into a poll check function and an event handler function. The event handler is only run when we know there is work to do, so it doesn't return bool. The VirtIOHandleAIOOutput function signature is now the same as VirtIOHandleOutput. Get rid of the bool return value. Further simplifications will be made for virtio-blk and virtio-scsi in the next patch. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-01-12 17:09:39 +00:00
Stefan Hajnoczi	826cc32423	aio-posix: split poll check from ready handler Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significant amount of time, making it look like polling was running for a long time when in fact the event handler was running for a long time. For example, on Linux the io_submit(2) syscall invoked when a virtio-blk device's virtqueue becomes ready can take 10s of microseconds. This can exceed the default polling interval (32 microseconds) and cause adaptive polling to stop polling. By excluding the handler's execution time from the polling check we make the adaptive polling calculation more accurate. As a result, the event loop now stays in polling mode where previously it would have fallen back to file descriptor monitoring. The following data was collected with virtio-blk num-queues=2 event_idx=off using an IOThread. Before: 168k IOPS, IOThread syscalls: 9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16 9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8 9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8 9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3 9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32 174k IOPS (+3.6%), IOThread syscalls: 9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32 9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8 9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8 9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32 Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because the IOThread stays in polling mode instead of falling back to file descriptor monitoring. As usual, polling is not implemented on Windows so this patch ignores the new io_poll_read() callback in aio-win32.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-2-stefanha@redhat.com [Fixed up aio_set_event_notifier() calls in tests/unit/test-fdmon-epoll.c added after this series was queued. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-01-12 17:09:39 +00:00
Michael S. Tsirkin	a882b57123	Revert "virtio: introduce macro IRTIO_CONFIG_IRQ_IDX" This reverts commit `bf1d85c166`. Fixes: `bf1d85c166` ("virtio: introduce macro IRTIO_CONFIG_IRQ_IDX") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-10 16:02:54 -05:00
Michael S. Tsirkin	a20fa00ce1	Revert "virtio-pci: decouple notifier from interrupt process" This reverts commit `e3480ef81f`. Fixes: `e3480ef81f` ("virtio-pci: decouple notifier from interrupt process") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-10 16:02:36 -05:00
Michael S. Tsirkin	38ce405198	Revert "virtio-pci: decouple the single vector from the interrupt process" This reverts commit `316011b8a7`. Fixes: `316011b8a7` ("virtio-pci: decouple the single vector from the interrupt process") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-10 16:02:16 -05:00
Michael S. Tsirkin	73bd56abe1	Revert "vhost-vdpa: add support for config interrupt" This reverts commit `634f7c89fb`. Fixes: `634f7c89fb` ("vhost-vdpa: add support for config interrupt") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-10 16:01:44 -05:00
Michael S. Tsirkin	81c3ebc32f	Revert "virtio: add support for configure interrupt" This reverts commit `081f864f56`. Fixes: `081f864f56` ("virtio: add support for configure interrupt") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-10 16:01:28 -05:00
Michael S. Tsirkin	a86d1a0a93	Revert "vhost: add support for configure interrupt" This reverts commit `f7220a7ce2`. Fixes: `f7220a7ce2` ("vhost: add support for configure interrupt") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-10 16:01:11 -05:00
Michael S. Tsirkin	99478e5941	Revert "virtio-mmio: add support for configure interrupt" This reverts commit `d48185f1a4`. Fixes: `d48185f1a4` ("virtio-mmio: add support for configure interrupt") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-10 16:00:38 -05:00
Michael S. Tsirkin	847e9bc974	Revert "virtio-pci: add support for configure interrupt" This reverts commit `d5d24d859c`. Fixes: `d5d24d859c` ("virtio-pci: add support for configure interrupt") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-10 16:00:02 -05:00
Daniil Tatianin	d731ab3119	virtio/vhost-vsock: don't double close vhostfd, remove redundant cleanup In case of an error during initialization in vhost_dev_init, vhostfd is closed in vhost_dev_cleanup. Remove close from err_virtio as it's both redundant and causes a double close on vhostfd. Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Message-Id: <20211129125204.1108088-1-d-tatianin@yandex-team.ru> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 19:30:13 -05:00
David Hildenbrand	60f1f77cab	virtio-mem: Set "unplugged-inaccessible=auto" for the 7.0 machine on x86 Set the new default to "auto", keeping it set to "off" for compat machines. This property is only available for x86 targets. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134039.29670-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 19:30:13 -05:00
David Hildenbrand	23ad8dec8d	virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we signal the VM that reading unplugged memory is not supported. We have to fail feature negotiation in case the guest does not support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE. First, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE is required to properly handle memory backends (or architectures) without support for the shared zeropage in the hypervisor cleanly. Without the shared zeropage, even reading an unpopulated virtual memory location can populate real memory and consequently consume memory in the hypervisor. We have a guaranteed shared zeropage only on MAP_PRIVATE anonymous memory. Second, we want VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE to be the default long-term as even populating the shared zeropage can be problematic: for example, without THP support (possible) or without support for the shared huge zeropage with THP (unlikely), the PTE page tables to hold the shared zeropage entries can consume quite some memory that cannot be reclaimed easily. Third, there are other optimizations+features (e.g., protection of unplugged memory, reducing the total memory slot size and bitmap sizes) that will require VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE. We really only support x86 targets with virtio-mem for now (and Linux similarly only support x86), but that might change soon, so prepare for different targets already. Add a new "unplugged-inaccessible" tristate property for x86 targets: - "off" will keep VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE unset and legacy guests working. - "on" will set VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE and stop legacy guests from using the device. - "auto" selects the default based on support for the shared zeropage. Warn in case the property is set to "off" and we don't have support for the shared zeropage. For existing compat machines, the property will default to "off", to not change the behavior but eventually warn about a problematic setup. Short-term, we'll set the property default to "auto" for new QEMU machines. Mid-term, we'll set the property default to "on" for new QEMU machines. Long-term, we'll deprecate the parameter and disallow legacy guests completely. The property has to match on the migration source and destination. "auto" will result in the same VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE setting as long as the qemu command line (esp. memdev) match -- so "auto" is good enough for migration purposes and the parameter doesn't have to be migrated explicitly. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134039.29670-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 19:30:13 -05:00
Stefan Hajnoczi	750539c4c4	virtio: signal after wrapping packed used_idx Packed Virtqueues wrap used_idx instead of letting it run freely like Split Virtqueues do. If the used ring wraps more than once there is no way to compare vq->signalled_used and vq->used_idx in virtio_packed_should_notify() since they are modulo vq->vring.num. This causes the device to stop sending used buffer notifications when when virtio_packed_should_notify() is called less than once each time around the used ring. It is possible to trigger this with virtio-blk's dataplane notify_guest_bh() irq coalescing optimization. The call to virtio_notify_irqfd() (and virtio_packed_should_notify()) is deferred to a BH. If the guest driver is polling it can complete and submit more requests before the BH executes, causing the used ring to wrap more than once. The result is that the virtio-blk device ceases to raise interrupts and I/O hangs. Cc: Tiwei Bie <tiwei.bie@intel.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211130134510.267382-1-stefanha@redhat.com> Fixes: `86044b24e8` ("virtio: basic packed virtqueue support") Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 19:30:13 -05:00
David Hildenbrand	09b3b7e092	virtio-mem: Support "prealloc=on" option For scarce memory resources, such as hugetlb, we want to be able to prealloc such memory resources in order to not crash later on access. On simple user errors we could otherwise easily run out of memory resources an crash the VM -- pretty much undesired. For ordinary memory devices, such as DIMMs, we preallocate memory via the memory backend for such use cases; however, with virtio-mem we're dealing with sparse memory backends; preallocating the whole memory backend destroys the whole purpose of virtio-mem. Instead, we want to preallocate memory when actually exposing memory to the VM dynamically, and fail plugging memory gracefully + warn the user in case preallocation fails. A common use case for hugetlb will be using "reserve=off,prealloc=off" for the memory backend and "prealloc=on" for the virtio-mem device. This way, no huge pages will be reserved for the process, but we can recover if there are no actual huge pages when plugging memory. Libvirt is already prepared for this. Note that preallocation cannot protect from the OOM killer -- which holds true for any kind of preallocation in QEMU. It's primarily useful only for scarce memory resources such as hugetlb, or shared file-backed memory. It's of little use for ordinary anonymous memory that can be swapped, KSM merged, ... but we won't forbid it. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-9-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 19:30:13 -05:00
Roman Kagan	5d33ae4b7a	vhost: stick to -errno error return convention The generic vhost code expects that many of the VhostOps methods in the respective backends set errno on errors. However, none of the existing backends actually bothers to do so. In a number of those methods errno from the failed call is clobbered by successful later calls to some library functions; on a few code paths the generic vhost code then negates and returns that errno, thus making failures look as successes to the caller. As a result, in certain scenarios (e.g. live migration) the device doesn't notice the first failure and goes on through its state transitions as if everything is ok, instead of taking recovery actions (break and reestablish the vhost-user connection, cancel migration, etc) before it's too late. To fix this, consolidate on the convention to return negated errno on failures throughout generic vhost, and use it for error propagation. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-10-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
Roman Kagan	025faa872b	vhost-user: stick to -errno error return convention VhostOps methods in user_ops are not very consistent in their error returns: some return negated errno while others just -1. Make sure all of them consistently return negated errno. This also helps error propagation from the functions being called inside. Besides, this synchronizes the error return convention with the other two vhost backends, kernel and vdpa, and will therefore allow for consistent error propagation in the generic vhost code (in a followup patch). Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-9-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
Roman Kagan	3631151b3e	vhost-vdpa: stick to -errno error return convention Almost all VhostOps methods in vdpa_ops follow the convention of returning negated errno on error. Adjust the few that don't. To that end, rework vhost_vdpa_add_status to check if setting of the requested status bits has succeeded and return the respective error code it hasn't, and propagate the error codes wherever it's appropriate. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-8-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
Roman Kagan	2d88d9c65c	vhost-backend: stick to -errno error return convention Almost all VhostOps methods in kernel_ops follow the convention of returning negated errno on error. Adjust the only one that doesn't. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-7-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2022-01-07 05:19:55 -05:00
Roman Kagan	6dcae534e8	vhost-backend: avoid overflow on memslots_limit Fix the (hypothetical) potential problem when the value parsed out of the vhost module parameter in sysfs overflows the return value from vhost_kernel_memslots_limit. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-6-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
Cindy Lu	d5d24d859c	virtio-pci: add support for configure interrupt Add support for configure interrupt, The process is used kvm_irqfd_assign to set the gsi to kernel. When the configure notifier was signal by host, qemu will inject a msix interrupt to guest Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-11-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
Cindy Lu	d48185f1a4	virtio-mmio: add support for configure interrupt Add configure interrupt support for virtio-mmio bus. This interrupt will be working while the backend is vhost-vdpa Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-10-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
Cindy Lu	f7220a7ce2	vhost: add support for configure interrupt Add functions to support configure interrupt. The configure interrupt process will start in vhost_dev_start and stop in vhost_dev_stop. Also add the functions to support vhost_config_pending and vhost_config_mask, for masked_config_notifier, we only use the notifier saved in vq 0. Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-8-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-06 06:11:39 -05:00
Cindy Lu	081f864f56	virtio: add support for configure interrupt Add the functions to support the configure interrupt in virtio The function virtio_config_guest_notifier_read will notify the guest if there is an configure interrupt. The function virtio_config_set_guest_notifier_fd_handler is to set the fd hander for the notifier Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-7-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-06 06:11:39 -05:00
Cindy Lu	634f7c89fb	vhost-vdpa: add support for config interrupt Add new call back function in vhost-vdpa, this function will set the event fd to kernel. This function will be called in the vhost_dev_start and vhost_dev_stop Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-6-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-06 06:11:39 -05:00
Cindy Lu	316011b8a7	virtio-pci: decouple the single vector from the interrupt process To reuse the interrupt process in configure interrupt Need to decouple the single vector from the interrupt process. Add new function kvm_virtio_pci_vector_use_one and _release_one. These functions are use for the single vector, the whole process will finish in a loop for the vq number. Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-4-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-06 06:11:39 -05:00
Cindy Lu	e3480ef81f	virtio-pci: decouple notifier from interrupt process To reuse the notifier process in configure interrupt. Use the virtio_pci_get_notifier function to get the notifier. the INPUT of this function is the IDX, the OUTPUT is notifier and the vector Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-3-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-06 06:11:39 -05:00
Cindy Lu	bf1d85c166	virtio: introduce macro IRTIO_CONFIG_IRQ_IDX To support configure interrupt for vhost-vdpa Introduce VIRTIO_CONFIG_IRQ_IDX -1 as configure interrupt's queue index, Then we can reuse the functions guest_notifier_mask and guest_notifier_pending. Add the check of queue index in these drivers, if the driver does not support configure interrupt, the function will just return Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-2-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-06 06:11:39 -05:00
David Hildenbrand	7656d9ce09	virtio-mem: Don't skip alignment checks when warning about block size If we warn about the block size being smaller than the default, we skip some alignment checks. This can currently only fail on x86-64, when specifying a block size of 1 MiB, however, we detect the THP size of 2 MiB. Fixes: `228957fea3` ("virtio-mem: Probe THP size to determine default block size") Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211011173305.13778-1-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-06 04:16:58 -05:00
Philippe Mathieu-Daudé	a1d4b0a305	dma: Let dma_memory_map() take MemTxAttrs argument Let devices specify transaction attributes when calling dma_memory_map(). Patch created mechanically using spatch with this script: @@ expression E1, E2, E3, E4; @@ - dma_memory_map(E1, E2, E3, E4) + dma_memory_map(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20211223115554.3155328-7-philmd@redhat.com>	2021-12-30 17:16:32 +01:00
Leonardo Garcia	f71d31fa81	hw/virtio/vhost: Fix typo in comment. Signed-off-by: Leonardo Garcia <lagarcia@br.ibm.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <a10a0ddab65b474ebea1e1141abe0f4aa463909b.1637668012.git.lagarcia@br.ibm.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-12-17 10:46:08 +01:00
Richard Henderson	aab8cfd4c3	target-arm queue: * ITS: error reporting cleanup * aspeed: improve documentation * Fix STM32F2XX USART data register readout * allow emulated GICv3 to be disabled in non-TCG builds * fix exception priority for singlestep, misaligned PC, bp, etc * Correct calculation of tlb range invalidate length * npcm7xx_emc: fix missing queue_flush * virt: Add VIOT ACPI table for virtio-iommu * target/i386: Use assert() to sanity-check b1 in SSE decode * Don't include qemu-common unnecessarily -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmG5xekZHHBldGVyLm1h eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3ramD/0WL8YV70sW5B/tHdb+/em1 xTBuABUUj5QDvKnxNoPIBwJI0vgmzwhAonYzcKKEUvlbL97crkgt6xSPvVxv2nf5 wnSYMKTDEC11AuYVdEyIMm5KLc88mq1w78pTYkFSUJmujCpfqLAsyXdEastIPHfN MdrwkpQ3wVmMeMcNBTq2yCxiGlz7x/myeJtDU9ihgPTcsgXa8BzziK6qCZHAOGCL 0/ljXDbVTJtLYUki9IqptPs8QUtlqOBt3rLplxHfKRKpmjiuD+xFlQ4GuIOBX+AL tQWgEyyiR9FnYpY1t3fWVtuKgjYXzlbh1A6cwdsK3Q68+qfi7Yr+lPryjwrmOkx7 /Yupq+QB/xgK4nxF4ydDXLvqI3h6GjaF2U9qujK3H9DyMOEYJDpaX1TZMphtWI89 9u7kLO6DNE00oUoiX+6Aty0qQtXv12SSaNpJmFON87/WLJJamHuiS6NiZp/r4ORU 51ds2LPGJAKAy9duqmZJ/81WlNjmHmurq1v+FIl29XInc4a2SpwEUM0rsTrrQTaD 16Qh2OZCnlYEg9nh6B54FQe8xP+pp69Gn/BRFhcwW9fPq4/pHSrwKEkI6lE+Yuiq +Fe8r0DbZczfhjcGdoUlIgMj+WSVY9Q8Opztsmv/kjZqxt0VvfdmAVp0odl5KdB4 cKAeYciNSgq2bGd+N4kuHA== =KuTi -----END PGP SIGNATURE----- Merge tag 'pull-target-arm-20211215' of https://git.linaro.org/people/pmaydell/qemu-arm into staging target-arm queue: * ITS: error reporting cleanup * aspeed: improve documentation * Fix STM32F2XX USART data register readout * allow emulated GICv3 to be disabled in non-TCG builds * fix exception priority for singlestep, misaligned PC, bp, etc * Correct calculation of tlb range invalidate length * npcm7xx_emc: fix missing queue_flush * virt: Add VIOT ACPI table for virtio-iommu * target/i386: Use assert() to sanity-check b1 in SSE decode * Don't include qemu-common unnecessarily # gpg: Signature made Wed 15 Dec 2021 02:39:37 AM PST # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full] * tag 'pull-target-arm-20211215' of https://git.linaro.org/people/pmaydell/qemu-arm: (33 commits) tests/acpi: add expected blob for VIOT test on virt machine tests/acpi: add expected blobs for VIOT test on q35 machine tests/acpi: add test case for VIOT tests/acpi: allow updates of VIOT expected data files hw/arm/virt: Use object_property_set instead of qdev_prop_set hw/arm/virt: Reject instantiation of multiple IOMMUs hw/arm/virt: Remove device tree restriction for virtio-iommu hw/arm/virt-acpi-build: Add VIOT table for virtio-iommu hw/net: npcm7xx_emc fix missing queue_flush target/arm: Correct calculation of tlb range invalidate length hw/arm: Don't include qemu-common.h unnecessarily target/rx/cpu.h: Don't include qemu-common.h target/hexagon/cpu.h: don't include qemu-common.h include/hw/i386: Don't include qemu-common.h in .h files target/i386: Use assert() to sanity-check b1 in SSE decode tests/tcg: Add arm and aarch64 pc alignment tests target/arm: Suppress bp for exceptions with more priority target/arm: Assert thumb pc is aligned target/arm: Take an exception if PC is misaligned target/arm: Split compute_fsr_fsc out of arm_deliver_fault ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-12-15 10:33:45 -08:00
Jean-Philippe Brucker	092cba0350	hw/arm/virt: Remove device tree restriction for virtio-iommu virtio-iommu is now supported with ACPI VIOT as well as device tree. Remove the restriction that prevents from instantiating a virtio-iommu device under ACPI. Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-id: 20211210170415.583179-3-jean-philippe@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-12-15 10:35:26 +00:00
Eric Auger	6b77ae0531	virtio-iommu: Fix the domain_range end in old times the domain range was defined by a domain_bits le32. This was then converted into a domain_range struct. During the upgrade the original value of '32' (bits) has been kept while the end field now is the max value of the domain id (UINT32_MAX). Fix that and also use UINT64_MAX for the input_range.end. Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20211127072910.1261824-4-eric.auger@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-12-15 08:08:59 +01:00
Eric Auger	3a411b2d96	virtio-iommu: Fix endianness in get_config Endianess is not properly handled when populating the returned config. Use the cpu_to_le* primitives for each separate field. Also, while at it, trace the domain range start. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20211127072910.1261824-3-eric.auger@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-12-15 08:08:59 +01:00
Eric Auger	7b140d2359	virtio-iommu: Remove set_config callback The spec says "the driver must not write to device configuration fields". So remove the set_config() callback which anyway did not do anything. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20211127072910.1261824-2-eric.auger@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-12-15 08:08:59 +01:00
Jason Wang	d3f1f940eb	virtio-balloon: correct used length Spec said: "and len the total of bytes written into the buffer." For inflateq, deflateq and statsq, we don't process in_sg so the used length should be zero. For free_page_vq, tough the pages could be changed by the device (in the destination), spec said: "Note: len is particularly useful for drivers using untrusted buffers: if a driver does not know exactly how much has been written by the device, the driver would have to zero the buffer in advance to ensure no data leakage occurs." So 0 should be used as well here. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20211129030841.3611-2-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>	2021-11-29 08:49:36 -05:00
Jason Wang	0fe7245d8b	virtio-balloon: process all in sgs for free_page_vq We only process the first in sg which may lead to the bitmap of the pages belongs to following sgs were not cleared. This may result more pages to be migrated. Fixing this by process all in sgs for free_page_vq. Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20211129030841.3611-1-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-29 08:49:36 -05:00
Cindy Lu	7abba7c638	virtio-mmio : fix the crash in the vm shutdown The root cause for this crash is the ioeventfd not stopped while the VM stop. The callback for vmstate_change was not implement in virtio-mmio bus Reproduce step load the vm with -M microvm \ -netdev tap,id=net0,vhostforce,script=no,downscript=no \ -device virtio-net-device,netdev=net0\ After the VM boot, login the vm and then shutdown the vm System will crash [Current thread is 1 (Thread 0x7ffff6edde00 (LWP 374378))] (gdb) bt 0 0x00005555558f18b4 in qemu_flush_or_purge_queued_packets (purge=false, nc=0x55500252e850) at ../net/net.c:636 1 qemu_flush_queued_packets (nc=0x55500252e850) at ../net/net.c:656 2 0x0000555555b6c363 in virtio_queue_notify_vq (vq=0x7fffe7e2b010) at ../hw/virtio/virtio.c:2339 3 virtio_queue_host_notifier_read (n=0x7fffe7e2b08c) at ../hw/virtio/virtio.c:3583 4 0x0000555555de7b5a in aio_dispatch_handler (ctx=ctx@entry=0x5555567c5780, node=0x555556b83fd0) at ../util/aio-posix.c:329 5 0x0000555555de8454 in aio_dispatch_ready_handlers (ready_list=<optimized out>, ctx=<optimized out>) at ../util/aio-posix.c:359 6 aio_poll (ctx=0x5555567c5780, blocking=blocking@entry=false) at ../util/aio-posix.c:662 7 0x0000555555cce0cc in monitor_cleanup () at ../monitor/monitor.c:645 8 0x0000555555b06bd2 in qemu_cleanup () at ../softmmu/runstate.c:822 9 0x000055555586e693 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:51 Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211109023744.22387-1-lulu@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-28 17:03:52 -05:00
Jason Wang	d152cdd6f6	virtio: use virtio accessor to access packed event We used to access packed descriptor event and off_wrap via address_space_{write\|read}_cached(). When we hit the cache, memcpy() is used which is not atomic which may lead a wrong value to be read or wrote. This patch fixes this by switching to use virito_{stw\|lduw}_phys_cached() to make sure the access is atomic. Fixes: `683f766567` ("virtio: event suppression support for packed ring") Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20211111063854.29060-2-jasowang@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-15 09:44:46 -05:00
Jason Wang	f463e761a4	virtio: use virtio accessor to access packed descriptor flags We used to access packed descriptor flags via address_space_{write\|read}_cached(). When we hit the cache, memcpy() is used which is not an atomic operation which may lead a wrong value is read or wrote. So this patch switches to use virito_{stw\|lduw}_phys_cached() to make sure the aceess is atomic. Fixes: `86044b24e8` ("virtio: basic packed virtqueue support") Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20211111063854.29060-1-jasowang@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-15 09:44:46 -05:00
Eugenio Pérez	245cf2c24e	vhost: Rename last_index to vq_index_end The doc of this field pointed out that last_index is the last vq index. This is misleading, since it's actually one past the end of the vqs. Renaming and modifying comment. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20211104085625.2054959-2-eperezma@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-11 03:13:05 -05:00
Richard Henderson	7fa736595e	pc,pci,virtio: features, fixes virtio-iommu support for x86/ACPI. Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmGAefYPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpiCUH/2pIs3FmOGIasEqn4BnqXr4dHMReUO5Ghg0v cXle4ZUrbg7Qpnxh07CwMuUpJV3Qv+xtVK7hzbD13nnxrkTZuKzBRV1AthkA1Hly zIKOxnEgV497LaXoaSOtqAx48fuznk5XOHju91usgu4mehJ0qe2gcwb4H8uWGkQi hrsR7a9woP0M4H/jvb3+aQRCJKMscj8ReabM1ulOugNpPdNI/jIKtBvZBtTxAqtQ CH9/DJLfVmzDRYdeBpnF06A+tXm4uU1Q5BmpmF9qaymk/PzthN54gdnDd6zH405Z Tmjp9UA2xfEYDmKzuTCBdPmoUe6OI7mU9o0WbB5MGYx5RRRBETw= =R7DD -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging pc,pci,virtio: features, fixes virtio-iommu support for x86/ACPI. Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon 01 Nov 2021 07:36:22 PM EDT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] * remotes/mst/tags/for_upstream: hw/i386: fix vmmouse registration pci: Export pci_for_each_device_under_bus*() pci: Define pci_bus_dev_fn/pci_bus_fn/pci_bus_ret_fn hw/i386/pc: Allow instantiating a virtio-iommu device hw/i386/pc: Move IOMMU singleton into PCMachineState hw/i386/pc: Remove x86_iommu_get_type() hw/acpi: Add VIOT table vhost-vdpa: Set discarding of RAM broken when initializing the backend qtest: fix 'expression is always false' build failure in qtest_has_accel() Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-11-02 19:24:17 -04:00

1 2 3 4 5 ...

1301 Commits