When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.
The reason is:
When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
immediately.
vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
The reconnect path is:
vhost_user_blk_event
vhost_user_async_close(.. vhost_user_blk_disconnect ..)
qemu_chr_fe_set_handlers <----- clear the notifier callback
schedule vhost_user_async_close_bh
The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
called, then the event fd callback will not be reinstalled.
We need to ensure that even if vhost_dev_init initialization fails, the event
handler still needs to be reinstalled when s->connected is false.
All vhost-user devices have this issue, including vhost-user-blk/scsi.
Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20240516025753.130171-3-fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit f02a4b8e64.
Since the current patch cannot completely fix the lost reconnect
problem, there is a scenario that is not considered:
- When the virtio-blk driver is removed from the guest os,
s->connected has no chance to be set to false, resulting in
subsequent reconnection not being executed.
The next patch will completely fix this issue with a better approach.
Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20240516025753.130171-2-fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support for the VIRTIO_F_NOTIFICATION_DATA feature across a variety
of vhost devices.
The inclusion of VIRTIO_F_NOTIFICATION_DATA in the feature bits arrays
for these devices ensures that the backend is capable of offering and
providing support for this feature, and that it can be disabled if the
backend does not support it.
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240315165557.26942-6-jonah.palmer@oracle.com>
Acked-by: Srujana Challa <schalla@marvell.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The legacy SCSI passthrough functionality has never been enabled for
VIRTIO 1.0 and was deprecated more than four years ago.
Get rid of it---almost, because QEMU is advertising it unconditionally
for legacy virtio-blk devices. Just parse the header and return a
nonzero status.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the pflash_blk_write_start() call. We need the offset of the
first data write, not the offset for the setup (number-of-bytes)
write. Without this fix u-boot can do block writes to the first
flash block only.
While being at it drop a leftover FIXME.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2343
Fixes: 284a7ee2e2 ("hw/pflash: implement update buffer for block writes")
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240516121237.534875-1-kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
nand_command() and nand_getio() don't check @offset points
into the block, nor the available data length (s->iolen) is
not negative.
In order to fix:
- check the offset is in range in nand_blk_load_NAND_PAGE_SIZE(),
- do not set @iolen if blk_load() failed.
Reproducer:
$ cat << EOF | qemu-system-arm -machine tosa \
-monitor none -serial none \
-display none -qtest stdio
write 0x10000111 0x1 0xca
write 0x10000104 0x1 0x47
write 0x1000ca04 0x1 0xd7
write 0x1000ca01 0x1 0xe0
write 0x1000ca04 0x1 0x71
write 0x1000ca00 0x1 0x50
write 0x1000ca04 0x1 0xd7
read 0x1000ca02 0x1
write 0x1000ca01 0x1 0x10
EOF
=================================================================
==15750==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61f000000de0
at pc 0x560e61557210 bp 0x7ffcfc4a59f0 sp 0x7ffcfc4a59e8
READ of size 1 at 0x61f000000de0 thread T0
#0 0x560e6155720f in mem_and hw/block/nand.c:101:20
#1 0x560e6155ac9c in nand_blk_write_512 hw/block/nand.c:663:9
#2 0x560e61544200 in nand_command hw/block/nand.c:293:13
#3 0x560e6153cc83 in nand_setio hw/block/nand.c:520:13
#4 0x560e61a0a69e in tc6393xb_nand_writeb hw/display/tc6393xb.c:380:13
#5 0x560e619f9bf7 in tc6393xb_writeb hw/display/tc6393xb.c:524:9
#6 0x560e647c7d03 in memory_region_write_accessor softmmu/memory.c:492:5
#7 0x560e647c7641 in access_with_adjusted_size softmmu/memory.c:554:18
#8 0x560e647c5f66 in memory_region_dispatch_write softmmu/memory.c:1514:16
#9 0x560e6485409e in flatview_write_continue softmmu/physmem.c:2825:23
#10 0x560e648421eb in flatview_write softmmu/physmem.c:2867:12
#11 0x560e64841ca8 in address_space_write softmmu/physmem.c:2963:18
#12 0x560e61170162 in qemu_writeb tests/qtest/videzzo/videzzo_qemu.c:1080:5
#13 0x560e6116eef7 in dispatch_mmio_write tests/qtest/videzzo/videzzo_qemu.c:1227:28
0x61f000000de0 is located 0 bytes to the right of 3424-byte region [0x61f000000080,0x61f000000de0)
allocated by thread T0 here:
#0 0x560e611276cf in malloc /root/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x7f7959a87e98 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98)
#2 0x560e64b98871 in object_new qom/object.c:749:12
#3 0x560e64b5d1a1 in qdev_new hw/core/qdev.c:153:19
#4 0x560e61547ea5 in nand_init hw/block/nand.c:639:11
#5 0x560e619f8772 in tc6393xb_init hw/display/tc6393xb.c:558:16
#6 0x560e6390bad2 in tosa_init hw/arm/tosa.c:250:12
SUMMARY: AddressSanitizer: heap-buffer-overflow hw/block/nand.c:101:20 in mem_and
==15750==ABORTING
Broken since introduction in commit 3e3d5815cb ("NAND Flash memory
emulation and ECC calculation helpers for use by NAND controllers").
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1445
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1446
Reported-by: Qiang Liu <cyruscyliu@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240409135944.24997-4-philmd@linaro.org>
Negative offset is meaningless, use unsigned type.
Return a boolean value indicating success.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240409135944.24997-3-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240409135944.24997-2-philmd@linaro.org>
Let's not care about what was changed and update the whole config,
reasons:
1. config->geometry should be updated together with capacity, so we fix
a bug.
2. Vhost-user protocol doesn't say anything about config change
limitation. Silent ignore of changes doesn't seem to be correct.
3. vhost-user-vsock reads the whole config
4. on realize we don't do any checks on retrieved config, so no reason
to care here
Comment "valid for resize only" exists since introduction the whole
hw/block/vhost-user-blk.c in commit
00343e4b54
"vhost-user-blk: introduce a new vhost-user-blk host device",
seems it was just an extra limitation.
Also, let's notify guest unconditionally:
1. So does vhost-user-vsock
2. We are going to reuse the functionality in new cases when we do want
to notify the guest unconditionally. So, no reason to create extra
branches in the logic.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20240329183758.3360733-2-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This modification ensures that in scenarios where the buffer size is
insufficient for a zone report, the function will now properly set an
error status and proceed to a cleanup label, instead of merely
returning.
The following ASAN log reveals it:
==1767400==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 312 byte(s) in 1 object(s) allocated from:
#0 0x64ac7b3280cd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3
#1 0x735b02fb9738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738)
#2 0x64ac7d23be96 in virtqueue_split_pop hw/virtio/virtio.c:1612:12
#3 0x64ac7d23728a in virtqueue_pop hw/virtio/virtio.c:1783:16
#4 0x64ac7cfcaacd in virtio_blk_get_request hw/block/virtio-blk.c:228:27
#5 0x64ac7cfca7c7 in virtio_blk_handle_vq hw/block/virtio-blk.c:1123:23
#6 0x64ac7cfecb95 in virtio_blk_handle_output hw/block/virtio-blk.c:1157:5
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Message-id: 20240404120040.1951466-1-zheyuma97@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The Aspeed machines have many Static Memory Controllers (SMC), up to
8, which can only drive flash memory devices. Commit 27a2c66c92
("aspeed/smc: Wire CS lines at reset") tried to ease the definitions
of these devices by allowing flash devices from the command line to be
attached to a SSI bus. For that, the wiring of the CS lines of the
Aspeed SMC controller was moved at reset. Two assumptions are made
though, first that the device has a SSI_GPIO_CS GPIO line, which is
not always the case, and second that it is a flash device.
Correct this problem by ensuring that the devices attached to the bus
are of the correct flash type. This fixes a QEMU abort when devices
without a CS line, such as the max111x, are passed on the command
line.
While at it, export TYPE_M25P80 used in the Xilinx Versal Virtual
machine.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2228
Fixes: 27a2c66c92 ("aspeed/smc: Wire CS lines at reset")
Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
[ clg: minor fixes in the commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():
* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
* error_append_hint(), because that doesn't work with &error_fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user
can't see this additional information, because exit() happens in
error_setg earlier than information is added [1].
The virtio_blk_vq_aio_context_init() passes @errp to error_prepend().
Though its @errp points its caller's local @err variable, to follow the
requirement of @errp, add missing ERRP_GUARD() at the beginning of
virtio_blk_vq_aio_context_init().
[1]: Issue description in the commit message of commit ae7c80a7bd
("error: New macro ERRP_GUARD()").
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: "Michael S. Tsirkin" <mst@redhat.com>
Message-ID: <20240311033822.3142585-14-zhao1.liu@linux.intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Add Micro 2Gb OSPI flash part with sfdp data.
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@amd.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-id: 20240220091721.82954-2-sai.pavan.boddu@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Suppress the deprecation warning when we're running under qtest,
to avoid "make check" including warning messages in its output.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240206154151.155620-1-peter.maydell@linaro.org
The real SuperI/O chips emulated by QEMU allow for relocating and enabling or
disabling their SuperI/O functions via software. So far this is not implemented.
Prepare for that by adding isa_fdc_set_{enabled,iobase}.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240114123911.4877-8-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
FDCtrl::iomem isn't used inside FDCtrl context but only inside FDCtrlSysBus
context, so move it there.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20240114123911.4877-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
FDCtrl::portio_list isn't used inside FDCtrl context but only inside
FDCtrlISABus context, so move it there.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20240114123911.4877-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Requests that complete in an IOThread use irqfd to notify the guest
while requests that complete in the main loop thread use the traditional
qdev irq code path. The reason for this conditional is that the irq code
path requires the BQL:
if (s->ioeventfd_started && !s->ioeventfd_disabled) {
virtio_notify_irqfd(vdev, req->vq);
} else {
virtio_notify(vdev, req->vq);
}
There is a corner case where the conditional invokes the irq code path
instead of the irqfd code path:
static void virtio_blk_stop_ioeventfd(VirtIODevice *vdev)
{
...
/*
* Set ->ioeventfd_started to false before draining so that host notifiers
* are not detached/attached anymore.
*/
s->ioeventfd_started = false;
/* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */
blk_drain(s->conf.conf.blk);
During blk_drain() the conditional produces the wrong result because
ioeventfd_started is false.
Use qemu_in_iothread() instead of checking the ioeventfd state.
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-15394
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240122172625.415386-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit d3f6f294ae ("virtio-blk: always set
ioeventfd during startup") has made virtio_blk_start_ioeventfd() always
kick the virtqueue (set the ioeventfd), regardless of whether the BB is
drained. That is no longer necessary, because attaching the host
notifier will now set the ioeventfd, too; this happens either
immediately right here in virtio_blk_start_ioeventfd(), or later when
the drain ends, in virtio_blk_ioeventfd_attach().
With event_notifier_set() removed, the code becomes the same as the one
in virtio_blk_ioeventfd_attach(), so we can reuse that function.
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240202153158.788922-4-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QEMU's coding style generally forbids C99 mixed declarations.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240206140410.65650-1-stefanha@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Hanna Czenczek <hreitz@redhat.com> noted that the array index in
virtio_blk_dma_restart_cb() is not bounds-checked:
g_autofree VirtIOBlockReq **vq_rq = g_new0(VirtIOBlockReq *, num_queues);
...
while (rq) {
VirtIOBlockReq *next = rq->next;
uint16_t idx = virtio_get_queue_index(rq->vq);
rq->next = vq_rq[idx];
^^^^^^^^^^
The code is correct because both rq->vq and vq_rq[] depend on
num_queues, but this is indirect and not 100% obvious. Add an assertion.
Suggested-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240206190610.107963-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It is not possible to instantiate a virtio-blk device with 0 virtqueues.
The following check is located in ->realize():
if (!conf->num_queues) {
error_setg(errp, "num-queues property must be larger than 0");
return;
}
Later on we access s->vq_aio_context[0] under the assumption that there
is as least one virtqueue. Hanna Czenczek <hreitz@redhat.com> noted that
it would help to show that the array index is already valid.
Add an assertion to document that s->vq_aio_context[0] is always
safe...and catch future code changes that break this assumption.
Suggested-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240206190610.107963-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Hanna Czenczek <hreitz@redhat.com> noticed that the safety of
`vq_aio_context[vq->value] = ctx;` with user-defined vq->value inputs is
not obvious.
The code is structured in validate() + apply() steps so input validation
is there, but it happens way earlier and there is nothing that
guarantees apply() can only be called with validated inputs.
This patch moves the validate() call inside the apply() function so
validation is guaranteed. I also added the bounds checking assertion
that Hanna suggested.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20240206190610.107963-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In cases where a device tries to read more bytes than the block device
contains, the error is vague: "device requires X bytes, block backend
provides Y bytes".
This patch changes the errors of this function to include the block
backend name, the device id and device type name where appropriate.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-id: 7260eadff22c08457740117c1bb7bd2b4353acb9.1706598705.git.manos.pitsidianakis@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The following expression is incorrect because blk_pread_nonzeroes()
deals in units of bytes, not sectors:
bytes = MIN(size - offset, BDRV_REQUEST_MAX_SECTORS)
^^^^^^^
BDRV_REQUEST_MAX_BYTES is the appropriate constant.
Fixes: a4b15a8b9e ("pflash: Only read non-zero parts of backend image")
Cc: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240130002712.257815-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When starting ioeventfd it is common practice to set the event notifier
so that the ioeventfd handler is triggered to run immediately. There may
be no requests waiting to be processed, but the idea is that if a
request snuck in then we guarantee that it will be detected.
One scenario where self-triggering the ioeventfd is necessary is when
virtio_blk_handle_output() is called from a vCPU thread before the
VIRTIO Device Status transitions to DRIVER_OK. In that case we need to
self-trigger the ioeventfd so that the kick handled by the vCPU thread
causes the vq AioContext thread to take over handling the request(s).
Fixes: b6948ab01d ("virtio-blk: add iothread-vq-mapping parameter")
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-7-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We no longer rely on setting the AioContext since the block layer
IO_CODE APIs can be called from any thread. Now it's just a hint to help
block jobs and other operations co-locate themselves in a thread with
the guest I/O requests. Keep going if setting the AioContext fails.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-6-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A virtio-blk device with the iothread-vq-mapping parameter has
per-virtqueue AioContexts. It is not thread-safe to process s->rq
requests in the BlockBackend AioContext since that may be different from
the virtqueue's AioContext to which this request belongs. The code
currently races and could crash.
Adapt virtio_blk_dma_restart_cb() to first split s->rq into per-vq lists
and then schedule a BH each vq's AioContext as necessary. This way
requests are safely processed in their vq's AioContext.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-5-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The dataplane code is really about using ioeventfd. It's used both for
IOThreads (what we think of as dataplane) and for the core virtio-pci
code's ioeventfd feature (which is enabled by default and used when no
IOThread has been specified). Rename the code to reflect this.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio_blk_data_plane_create() and virtio_blk_data_plane_destroy() are
actually about s->vq_aio_context[] rather than managing
dataplane-specific state.
As a prerequisite to using s->vq_aio_context[] in all code paths (even
when dataplane is not used), rename these functions to reflect that they
just manage s->vq_aio_context and call them regardless of whether or not
dataplane is in use.
Note that virtio-blk supports running with -device
virtio-blk-pci,ioevent=off where the vCPU thread enters the device
emulation code. In this mode ioeventfd is not used for virtqueue
processing. However, we still want to initialize s->vq_aio_context[] to
qemu_aio_context in that case since I/O completion callbacks will be
invoked in the main loop thread.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The dataplane code used to be significantly different from the
non-dataplane code and therefore had a separate source file.
Over time the difference has gotten smaller because the I/O code paths
were unified. Nowadays the distinction between the VirtIOBlock and
VirtIOBlockDataPlane structs is more of an inconvenience that hinders
code simplification.
Move hw/block/dataplane/virtio-blk.c into hw/block/virtio-blk.c, merging
VirtIOBlockDataPlane's fields into VirtIOBlock.
hw/block/virtio-blk.c used VirtIOBlock->dataplane to check if
virtio_blk_data_plane_create() was successful. This is not necessary
because ->dataplane_started and ->dataplane_disabled can be used
instead. This patch makes those changes in order to drop
VirtIOBlock->dataplane.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240119135748.270944-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an update buffer where all block updates are staged.
Flush or discard updates properly, so we should never see
half-completed block writes in pflash storage.
Drop a bunch of FIXME comments ;)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240108160900.104835-4-kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Use the helper functions we have to read/write multi-byte values
in correct byte order.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240108160900.104835-3-kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Move the offset calculation, do it once at the start of the function and
let the 'p' variable point directly to the memory location which should
be updated. This makes it simpler to update other buffers than
pfl->storage in an upcoming patch. No functional change.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240108160900.104835-2-kraxel@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The 16MiB flash device is only used by the deprecated shix machine.
Its code it old and unmaintained, and has never been adapted to the
QOM architecture. It still contains debug statements and uses global
variables. It is time to deprecate it.
Signed-off-by: Samuel Tardieu <sam@rfc1149.net>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240109083053.2581588-3-sam@rfc1149.net>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The term "QEMU global mutex" is identical to the more widely used Big
QEMU Lock ("BQL"). Update the code comments and documentation to use
"BQL" instead of "QEMU global mutex".
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20240102153529.486531-6-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add the iothread-vq-mapping parameter to assign virtqueues to IOThreads.
Store the vq:AioContext mapping in the new struct
VirtIOBlockDataPlane->vq_aio_context[] field and refactor the code to
use the per-vq AioContext instead of the BlockDriverState's AioContext.
Reimplement --device virtio-blk-pci,iothread= and non-IOThread mode by
assigning all virtqueues to the IOThread and main loop's AioContext in
vq_aio_context[], respectively.
The comment in struct VirtIOBlockDataPlane about EventNotifiers is
stale. Remove it.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231220134755.814917-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is the big patch that removes
aio_context_acquire()/aio_context_release() from the block layer and
affected block layer users.
There isn't a clean way to split this patch and the reviewers are likely
the same group of people, so I decided to do it in one patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-ID: <20231205182011.1976568-7-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no need to acquire the AioContext lock around blk_aio_*() or
blk_get_geometry() anymore. I/O plugging (defer_call()) also does not
require the AioContext lock anymore.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230914140101.1065008-5-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Nothing in the completion code path relies on the AioContext lock
anymore. Virtqueues are only accessed from one thread at any moment and
the s->rq global state is protected by its own lock now.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230914140101.1065008-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
s->rq is accessed from IO_CODE and GLOBAL_STATE_CODE. Introduce a lock
to protect s->rq and eliminate reliance on the AioContext lock.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230914140101.1065008-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the error occurs in vhost_dev_init, the value of s->connected is set to true
in advance, and there is no chance to enter this function execution again
in the future.
Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20231123055431.217792-2-fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Coverity couldn't see that nr_existing was always going to be zero when
qemu_xen_xs_directory() returned NULL in the ENOENT case (CID 1523906).
Perhaps more to the point, neither could Peter at first glance. Improve
the code to hopefully make it clearer to Coverity and human reviewers
alike.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
There's no need to force the user to assign a vdev. We can automatically
assign one, starting at xvda and searching until we find the first disk
name that's unused.
This means we can now allow '-drive if=xen,file=xxx' to work without an
explicit separate -driver argument, just like if=virtio.
Rip out the legacy handling from the xenpv machine, which was scribbling
over any disks configured by the toolstack, and didn't work with anything
but raw images.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Even on x86_64 the default protocol is the x86-32 one if the guest doesn't
specifically ask for x86-64.
Cc: qemu-stable@nongnu.org
Fixes: b6af8926fb ("xen: add implementations of xen-block connect and disconnect functions...")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
There is a batching mechanism for virtio-blk Used Buffer Notifications
that is no longer needed because the previous commit added batching to
virtio_notify_irqfd().
Note that this mechanism was rarely used in practice because it is only
enabled when EVENT_IDX is not negotiated by the driver. Modern drivers
enable EVENT_IDX.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-5-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The networking subsystem may wish to use defer_call(), so move the code
to util/ where it can be reused.
As a reminder of what defer_call() does:
This API defers a function call within a defer_call_begin()/defer_call_end()
section, allowing multiple calls to batch up. This is a performance
optimization that is used in the block layer to submit several I/O requests
at once instead of individually:
defer_call_begin(); <-- start of section
...
defer_call(my_func, my_obj); <-- deferred my_func(my_obj) call
defer_call(my_func, my_obj); <-- another
defer_call(my_func, my_obj); <-- another
...
defer_call_end(); <-- end of section, my_func(my_obj) is called once
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-3-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Prepare to move the blk_io_plug_call() API out of the block layer so
that other subsystems call use this deferred call mechanism. Rename it
to defer_call() but leave the code in block/plug.c.
The next commit will move the code out of the block layer.
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-2-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>