It is always VHOST_IOTLB_MSG_V2. We can always make it back per
vhost_dev if needed.
This change makes easier for vhost_vdpa_map and unmap not to depend on
vhost_vdpa but only in VhostVDPAShared.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-10-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the backend_cap member to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-9-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iotlb_batch_begin_sent member to VhostVDPAShared so all
vhost_vdpa can use it, rather than always in the first / last
vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-8-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the file descriptor to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-7-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
By the end of this series dma_map and dma_unmap functions don't have the
vdpa device for tracing. Movinge trace function to shared member one.
Print it also in the vdpa initialization so log reader can relate them.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-6-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the shadow_data member to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first or last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-5-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iova range to VhostVDPAShared so all vhost_vdpa can use it,
rather than always in the first or last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-4-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iova tree to VhostVDPAShared so all vhost_vdpa can use it,
rather than always in the first or last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It will hold properties shared among all vhost_vdpa instances associated
with of the same device. For example, we just need one iova_tree or one
memory listener for the entire device.
Next patches will register the vhost_vdpa memory listener at the
beginning of the VM migration at the destination. This enables QEMU to
map the memory to the device before stopping the VM at the source,
instead of doing while both source and destination are stopped, thus
minimizing the downtime.
However, the destination QEMU is unaware of which vhost_vdpa struct will
register its memory_listener. If the source guest has CVQ enabled, it
will be the one associated with the CVQ. Otherwise, it will be the
first one.
Save the memory operations related members in a common place rather than
always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next commits will set DRIVER and ACKNOWLEDGE flags repeatedly in the
case of a migration destination. Let's save ioctls with this.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20231215172830.2540987-2-eperezma@redhat.com>
Virtio-gpu malloc memory for the queue when it realized, but the queues was not
released when it unrealized, which resulting in a memory leak. In addition,
vm_change_state_handler is not cleaned up, which is related to vdev and will
lead to segmentation fault when VM shutdown.
Signed-off-by: wangmeiling <wangmeiling21@huawei.com>
Signed-off-by: Binfeng Wu <wubinfeng@huawei.com>
Message-Id: <7bbbc0f3-2ad9-83ca-b39b-f976d0837daf@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It is required to use error_report() instead of error_reportf_err(), if the
prior function does not take local_err as the argument. As a result, the
local_err is always NULL and segment fault may happen.
vhost_scsi_start()
-> vhost_scsi_set_endpoint(s) --> does not allocate local_err
-> error_reportf_err()
-> error_vprepend()
-> g_string_append(newmsg, (*errp)->msg) --> (*errp) is NULL
In addition, add ": " at the end of other error_reportf_err() logs.
Fixes: 7962e432b4e4 ("vhost-user-scsi: support reconnect to backend")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20231214003117.43960-1-dongli.zhang@oracle.com>
Reviewed-by: Feng Li <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If a vcpu with an apic-id that is not supported by the legacy
interface (>255) is hot-plugged, the legacy code will dynamically switch
to the modern interface. However, the hotplug event is not forwarded to
the new interface resulting in the vcpu not being fully/properly added
to the machine config. This BUG is evidenced by OVMF when it
it attempts to count the vcpus and reports an inconsistent vcpu count
reported by the fw_cfg interface and the modern hotpug interface.
Fix is to propagate the hotplug event after making the switch from
the legacy interface to the modern interface.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Aaron Young <aaron.young@oracle.com>
Message-Id: <0e8a9baebbb29f2a6c87fd08e43dc2ac4019759a.1702398644.git.Aaron.Young@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds support for vhost-scsi to be able to create a worker thread
per virtqueue. Right now for vhost-net we get a worker thread per
tx/rx virtqueue pair which scales nicely as we add more virtqueues and
CPUs, but for scsi we get the single worker thread that's shared by all
virtqueues. When trying to send IO to more than 2 virtqueues the single
thread becomes a bottlneck.
This patch adds a new setting, worker_per_virtqueue, which can be set
to:
false: Existing behavior where we get the single worker thread.
true: Create a worker per IO virtqueue.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231204231618.21962-3-michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
This adds the vhost backend callouts for the worker ioctls added in the
6.4 linux kernel commit:
c1ecd8e95007 ("vhost: allow userspace to create workers")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231204231618.21962-2-michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In fact, type4-count, core-count, core-count2, thread-count and
thread-count2 are tested with KVM not TCG.
Rename these test functions to reflect KVM base instead of TCG.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231127160202.1037290-1-zhao1.liu@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since the driver doesn't support interrupts, we must return early when
index is set to VIRTIO_CONFIG_IRQ_IDX. Basically the same thing Viresh
did for "91208dd297f2 virtio: i2c: Check notifier helpers for
VIRTIO_CONFIG_IRQ_IDX".
Fixes: 544f0278afca ("virtio: introduce macro VIRTIO_CONFIG_IRQ_IDX")
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Message-Id: <20231025171841.3379663-1-mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vcpu_dirty_stat_collect() has an unused parameter so remove it.
Signed-off-by: Wafer <wafer@jaguarmicro.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
Message-Id: <20231204012230.4123-1-wafer@jaguarmicro.com>
Patch removes DPRINTF macro and adds multiple tracepoints
to capture different kvm events.
We also drop the DPRINTFs that don't add any additional
information than trace_kvm_run_exit already does.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1827
Signed-off-by: Jai Arora <arorajai2798@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Some options for -display cocoa were not described or not listed at all.
Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Apparently the help entries were not merged when the patches got in.
Fixes: f844cdb99714 ("ui/cocoa: capture all keys and combos when mouse is grabbed")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
strerrorname_np is non-portable and breaks building with musl libc.
Use strerror(errno) instead, like we do other places.
Cc: qemu-stable@nongnu.org
Fixes: commit 082e9e4a58ba (target/riscv/kvm: improve 'init_multiext_cfg' error msg)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2041
Buglink: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15541
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
These static memory regions are contained within the machine and do not need to
be dynamically allocated.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-12-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
These are now redundant with the scr2 and old_scr2 fields in NeXTPC. Rename
the function from nextscr2_write() to next_scr2_rtc_update() to better
reflect its purpose. At the same time replace the manual bit manipulation with
the extract32() and deposit32() functions.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-11-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Ensure that the LED status is updated by calling next_scr2_led_update() whenever
the SC2 register is written.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Move the old_scr2 variable to NeXTPC so that the old SCR2 register state is
stored along with the current SCR2 state.
Since the SCR2 register is 32-bits wide, convert old_scr2 to uint32_t and
update the SCR2 register access code to allow unaligned writes.
Note that this is a migration break, but as nothing will currently boot then
we do not need to worry about this now.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
The phase variable represents part of the state machine used to clock data out
of the NextRtc device.
Note that this is a migration break for the NeXTRtc struct, but as nothing will
currently boot then we simply bump the migration version for now.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
The state of the led is stored in the SCR2 register which is part of the NeXTPC
device.
Note that this is a migration break for the NeXTPC device, but as nothing will
currently boot then we simply bump the migration version for now.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Rename dma_ops to next_dma_ops and the read/write functions to next_dma_read()
and next_dma_write() respectively, mark next_dma_ops as DEVICE_BIG_ENDIAN and
also improve the consistency of the val variable in next_dma_read() and
next_dma_write().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
The old QEMU memory accessors used in the original NextCube patch series had
separate functions for 1, 2 and 4 byte accessors. When the series was finally
merged a simple wrapper function was written to dispatch the memory accesses
using the original functions.
Convert scr_ops to use the memory API directly renaming it to next_scr_ops,
marking it as DEVICE_BIG_ENDIAN, and handling any unaligned accesses.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
The old QEMU memory accessors used in the original NextCube patch series had
separate functions for 1, 2 and 4 byte accessors. When the series was finally
merged a simple wrapper function was written to dispatch the memory accesses
using the original functions.
Convert mmio_ops to use the memory API directly renaming it to next_mmio_ops,
marking it as DEVICE_BIG_ENDIAN, and handling any unaligned accesses.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Normally a DMA FLUSH command is used to ensure that data is completely written
to the device and/or memory, so remove the pulse of the SCSI DMA IRQ if a DMA
FLUSH command is received. This enables the NeXT ROM monitor to start to load
from a SCSI disk.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Add a dummy register at address 0x6000 in the MMIO memory region to allow the
initial diagnostic test to timeout rather than getting stuck in a loop
continuously writing "en_write: tx not ready" to the console.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20231220131641.592826-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Commit c2118e9e1ab ("configure: don't try a "native" cross for linux-user",
2023-11-23) sought to avoid issues with using the native compiler with a
cross-endian or cross-bitness setup. However, in doing so it ended up
requiring a cross compiler setup (and most likely a slow compiler setup)
even when building TCG tests that are native to the host architecture.
Always allow the host compiler in that case.
Cc: qemu-stable@nongnu.org
Fixes: c2118e9e1ab ("configure: don't try a "native" cross for linux-user", 2023-11-23)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZYPyvQAKCRBAov/yOSY+
38/vBADT3b+Wo/2AeXOO3OXOM1VBhIvzDjY1OWytuJpkF3JGW45cMLqgtIgMj8h7
NtzRS3JbFYbYuxITeeo1Ppl6dAD0pCZjIU6OCBxAJ6ADPsE/xD8nYWrMGqYVXg7E
hN0Cno2sf6dmJ0QxUxn7G+cUuvNtnGaDSZE+RAkjtzq1nvx7CQ==
=mte5
-----END PGP SIGNATURE-----
Merge tag 'pull-loongarch-20231221' of https://gitlab.com/gaosong/qemu into staging
pull-loongarch-20231221
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZYPyvQAKCRBAov/yOSY+
# 38/vBADT3b+Wo/2AeXOO3OXOM1VBhIvzDjY1OWytuJpkF3JGW45cMLqgtIgMj8h7
# NtzRS3JbFYbYuxITeeo1Ppl6dAD0pCZjIU6OCBxAJ6ADPsE/xD8nYWrMGqYVXg7E
# hN0Cno2sf6dmJ0QxUxn7G+cUuvNtnGaDSZE+RAkjtzq1nvx7CQ==
# =mte5
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 21 Dec 2023 03:09:33 EST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20231221' of https://gitlab.com/gaosong/qemu:
target/loongarch: Add timer information dump support
hw/loongarch/virt: Align high memory base address with super page size
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add the iothread-vq-mapping parameter to assign virtqueues to IOThreads.
Store the vq:AioContext mapping in the new struct
VirtIOBlockDataPlane->vq_aio_context[] field and refactor the code to
use the per-vq AioContext instead of the BlockDriverState's AioContext.
Reimplement --device virtio-blk-pci,iothread= and non-IOThread mode by
assigning all virtqueues to the IOThread and main loop's AioContext in
vq_aio_context[], respectively.
The comment in struct VirtIOBlockDataPlane about EventNotifiers is
stale. Remove it.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231220134755.814917-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio-blk and virtio-scsi devices will need a way to specify the
mapping between IOThreads and virtqueues. At the moment all virtqueues
are assigned to a single IOThread or the main loop. This single thread
can be a CPU bottleneck, so it is necessary to allow finer-grained
assignment to spread the load.
Introduce DEFINE_PROP_IOTHREAD_VQ_MAPPING_LIST() so devices can take a
parameter that maps virtqueues to IOThreads. The command-line syntax for
this new property is as follows:
--device '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]}'
IOThreads are specified by name and virtqueues are specified by 0-based
index.
It will be common to simply assign virtqueues round-robin across a set
of IOThreads. A convenient syntax that does not require specifying
individual virtqueue indices is available:
--device '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]}'
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231220134755.814917-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qdev_alias_all_properties() aliases a DeviceState's qdev properties onto
an Object. This is used for VirtioPCIProxy types so that --device
virtio-blk-pci has properties of its embedded --device virtio-blk-device
object.
Currently this function is implemented using qdev properties. Change the
function to use QOM object class properties instead. This works because
qdev properties create QOM object class properties, but it also catches
any QOM object class-only properties that have no qdev properties.
This change ensures that properties of devices are shown with --device
foo,\? even if they are QOM object class properties.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231220134755.814917-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
StringOutputVisitor crashes when it visits a struct because
->start_struct() is NULL.
Show "<omitted>" instead of crashing. This is necessary because the
virtio-blk-pci iothread-vq-mapping parameter that I'd like to introduce
soon is a list of IOThreadMapping structs.
This patch is a quick fix to solve the crash, but the long-term solution
is replacing StringOutputVisitor with something that can handle the full
gamut of values in QEMU.
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231212134934.500289-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use qemu_get_current_aio_context() in mixed wrappers and coroutine
wrappers so that code runs in the caller's AioContext instead of moving
to the BlockDriverState's AioContext. This change is necessary for the
multi-queue block layer where any thread can call into the block layer.
Most wrappers are IO_CODE where it's safe to use the current AioContext
nowadays. BlockDrivers and the core block layer use their own locks and
no longer depend on the AioContext lock for thread-safety.
The bdrv_create() wrapper invokes GLOBAL_STATE code. Using the current
AioContext is safe because this code is only called with the BQL held
from the main loop thread.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230912231037.826804-6-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The AioContext lock no longer exists.
There is one noteworthy change:
- * More specifically, these functions use BDRV_POLL_WHILE(bs), which
- * requires the caller to be either in the main thread and hold
- * the BlockdriverState (bs) AioContext lock, or directly in the
- * home thread that runs the bs AioContext. Calling them from
- * another thread in another AioContext would cause deadlocks.
+ * More specifically, these functions use BDRV_POLL_WHILE(bs), which requires
+ * the caller to be either in the main thread or directly in the home thread
+ * that runs the bs AioContext. Calling them from another thread in another
+ * AioContext would cause deadlocks.
I am not sure whether deadlocks are still possible. Maybe they have just
moved to the fine-grained locks that have replaced the AioContext. Since
I am not sure if the deadlocks are gone, I have kept the substance
unchanged and just removed mention of the AioContext.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-15-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The AioContext lock no longer exists.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-14-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The SCSI subsystem no longer uses the AioContext lock. Request
processing runs exclusively in the BlockBackend's AioContext since
"scsi: only access SCSIDevice->requests from one thread" and hence the
lock is unnecessary.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-13-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Encourage the use of locking primitives and stop mentioning the
AioContext lock since it is being removed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-12-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Delete these functions because nothing calls these functions anymore.
I introduced these APIs in commit 98563fc3ec44 ("aio: add
aio_context_acquire() and aio_context_release()") in 2014. It's with a
sigh of relief that I delete these APIs almost 10 years later.
Thanks to Paolo Bonzini's vision for multi-queue QEMU, we got an
understanding of where the code needed to go in order to remove the
limitations that the original dataplane and the IOThread/AioContext
approach that followed it.
Emanuele Giuseppe Esposito had the splendid determination to convert
large parts of the codebase so that they no longer needed the AioContext
lock. This was a painstaking process, both in the actual code changes
required and the iterations of code review that Emanuele eked out of
Kevin and me over many months.
Kevin Wolf tackled multitudes of graph locking conversions to protect
in-flight I/O from run-time changes to the block graph as well as the
clang Thread Safety Analysis annotations that allow the compiler to
check whether the graph lock is being used correctly.
And me, well, I'm just here to add some pizzazz to the QEMU multi-queue
block layer :). Thank you to everyone who helped with this effort,
including Eric Blake, code reviewer extraordinaire, and others who I've
forgotten to mention.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-11-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that the AioContext lock no longer exists, AIO_WAIT_WHILE() and
AIO_WAIT_WHILE_UNLOCKED() are equivalent.
A future patch will get rid of AIO_WAIT_WHILE_UNLOCKED().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-10-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>