qemu/tests/qemu-iotests/223.out

299 lines
13 KiB
Plaintext
Raw Normal View History

QA output created by 223
=== Create partially sparse image, then add dirty bitmaps ===
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4194304
wrote 2097152/2097152 bytes at offset 1048576
2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
Testing:
QMP_VERSION
{"return": {}}
{"return": {}}
{"return": {}}
{"return": {}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
monitor: only run coroutine commands in qemu_aio_context monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not polled during nested event loops. The coroutine currently reschedules itself in the main loop's qemu_aio_context AioContext, which is polled during nested event loops. One known problem is that QMP device-add calls drain_call_rcu(), which temporarily drops the BQL, leading to all sorts of havoc like other vCPU threads re-entering device emulation code while another vCPU thread is waiting in device emulation code with aio_poll(). Paolo Bonzini suggested running non-coroutine QMP handlers in the iohandler AioContext. This avoids trouble with nested event loops. His original idea was to move coroutine rescheduling to monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch() because we don't know if the QMP handler needs to run in coroutine context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have been nicer since it's associated with the monitor implementation and not as general as qmp_dispatch(), which is also used by qemu-ga. A number of qemu-iotests need updated .out files because the order of QMP events vs QMP responses has changed. Solves Issue #1933. Cc: qemu-stable@nongnu.org Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use drain_call_rcu in in qmp_device_add") Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192 Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985 Buglink: https://issues.redhat.com/browse/RHEL-17369 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240118144823.1497953-4-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-18 17:48:23 +03:00
{"return": {}}
=== Write part of the file under active bitmap ===
wrote 512/512 bytes at offset 512
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
wrote 2097152/2097152 bytes at offset 2097152
2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
=== End dirty bitmaps, and start serving image over NBD ===
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute":"blockdev-add",
"arguments":{"driver":"IMGFMT", "node-name":"n",
"file":{"driver":"file", "filename":"TEST_DIR/t.IMGFMT"}}}
{"return": {}}
{"execute":"block-dirty-bitmap-disable",
"arguments":{"node":"n", "name":"b"}}
{"return": {}}
{"execute":"blockdev-add",
"arguments":{"driver":"null-co", "node-name":"null",
"size": 4194304}}
{"return": {}}
{"execute":"block-dirty-bitmap-add",
"arguments":{"node":"null", "name":"b3"}}
{"return": {}}
=== Set up NBD with normal access ===
{"execute":"nbd-server-add",
"arguments":{"device":"n"}}
{"error": {"class": "GenericError", "desc": "NBD server not running"}}
{"execute":"nbd-server-start",
"arguments":{"addr":{"type":"unix",
"data":{"path":"SOCK_DIR/nbd"}}}}
{"return": {}}
{"execute":"nbd-server-start",
"arguments":{"addr":{"type":"unix",
"data":{"path":"SOCK_DIR/nbd1"}}}}
{"error": {"class": "GenericError", "desc": "NBD server already running"}}
exports available: 0
{"execute":"nbd-server-add",
"arguments":{"device":"n", "bitmap":"b"}}
{"return": {}}
{"execute":"nbd-server-add",
"arguments":{"device":"nosuch"}}
{"error": {"class": "GenericError", "desc": "Cannot find device='nosuch' nor node-name='nosuch'"}}
{"execute":"nbd-server-add",
"arguments":{"device":"n"}}
{"error": {"class": "GenericError", "desc": "Block export id 'n' is already in use"}}
{"execute":"nbd-server-add",
"arguments":{"device":"n", "name":"n2",
"bitmap":"b2"}}
{"error": {"class": "GenericError", "desc": "Enabled bitmap 'b2' incompatible with readonly export"}}
{"execute":"nbd-server-add",
"arguments":{"device":"n", "name":"n2",
"bitmap":"b3"}}
{"error": {"class": "GenericError", "desc": "Bitmap 'b3' is not found"}}
{"execute":"nbd-server-add",
"arguments":{"device":"n", "name":"n2", "writable":true,
"description":"some text", "bitmap":"b2"}}
{"return": {}}
{"execute":"block-export-add",
"arguments":{"type": "nbd", "node-name":"n", "id":"n3", "name": "n3",
"bitmaps":[{"node":"null","name":"b3"}]}}
{"return": {}}
exports available: 3
export: 'n'
size: 4194304
nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Allow a client to request a subset of negotiated meta contexts. For example, a client may ask to use a single connection to learn about both block status and dirty bitmaps, but where the dirty bitmap queries only need to be performed on a subset of the disk; forcing the server to compute that information on block status queries in the rest of the disk is wasted effort (both at the server, and on the amount of traffic sent over the wire to be parsed and ignored by the client). Qemu as an NBD client never requests to use more than one meta context, so it has no need to use block status payloads. Testing this instead requires support from libnbd, which CAN access multiple meta contexts in parallel from a single NBD connection; an interop test submitted to the libnbd project at the same time as this patch demonstrates the feature working, as well as testing some corner cases (for example, when the payload length is longer than the export length), although other corner cases (like passing the same id duplicated) requires a protocol fuzzer because libnbd is not wired up to break the protocol that badly. This also includes tweaks to 'qemu-nbd --list' to show when a server is advertising the capability, and to the testsuite to reflect the addition to that output. Of note: qemu will always advertise the new feature bit during NBD_OPT_INFO if extended headers have alreay been negotiated (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has occurred); but for NBD_OPT_GO, qemu only advertises the feature if block status is also enabled (that is, if the client does not negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so the feature is not advertised). Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230925192229.3186470-26-eblake@redhat.com> [eblake: fix logic to reject unnegotiated contexts] Signed-off-by: Eric Blake <eblake@redhat.com>
2023-09-25 22:22:42 +03:00
flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
min block: 1
opt block: 4096
max block: 33554432
transaction size: 64-bit
available meta contexts: 2
base:allocation
qemu:dirty-bitmap:b
export: 'n2'
description: some text
size: 4194304
nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Allow a client to request a subset of negotiated meta contexts. For example, a client may ask to use a single connection to learn about both block status and dirty bitmaps, but where the dirty bitmap queries only need to be performed on a subset of the disk; forcing the server to compute that information on block status queries in the rest of the disk is wasted effort (both at the server, and on the amount of traffic sent over the wire to be parsed and ignored by the client). Qemu as an NBD client never requests to use more than one meta context, so it has no need to use block status payloads. Testing this instead requires support from libnbd, which CAN access multiple meta contexts in parallel from a single NBD connection; an interop test submitted to the libnbd project at the same time as this patch demonstrates the feature working, as well as testing some corner cases (for example, when the payload length is longer than the export length), although other corner cases (like passing the same id duplicated) requires a protocol fuzzer because libnbd is not wired up to break the protocol that badly. This also includes tweaks to 'qemu-nbd --list' to show when a server is advertising the capability, and to the testsuite to reflect the addition to that output. Of note: qemu will always advertise the new feature bit during NBD_OPT_INFO if extended headers have alreay been negotiated (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has occurred); but for NBD_OPT_GO, qemu only advertises the feature if block status is also enabled (that is, if the client does not negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so the feature is not advertised). Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230925192229.3186470-26-eblake@redhat.com> [eblake: fix logic to reject unnegotiated contexts] Signed-off-by: Eric Blake <eblake@redhat.com>
2023-09-25 22:22:42 +03:00
flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
min block: 1
opt block: 4096
max block: 33554432
transaction size: 64-bit
available meta contexts: 2
base:allocation
qemu:dirty-bitmap:b2
export: 'n3'
size: 4194304
nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Allow a client to request a subset of negotiated meta contexts. For example, a client may ask to use a single connection to learn about both block status and dirty bitmaps, but where the dirty bitmap queries only need to be performed on a subset of the disk; forcing the server to compute that information on block status queries in the rest of the disk is wasted effort (both at the server, and on the amount of traffic sent over the wire to be parsed and ignored by the client). Qemu as an NBD client never requests to use more than one meta context, so it has no need to use block status payloads. Testing this instead requires support from libnbd, which CAN access multiple meta contexts in parallel from a single NBD connection; an interop test submitted to the libnbd project at the same time as this patch demonstrates the feature working, as well as testing some corner cases (for example, when the payload length is longer than the export length), although other corner cases (like passing the same id duplicated) requires a protocol fuzzer because libnbd is not wired up to break the protocol that badly. This also includes tweaks to 'qemu-nbd --list' to show when a server is advertising the capability, and to the testsuite to reflect the addition to that output. Of note: qemu will always advertise the new feature bit during NBD_OPT_INFO if extended headers have alreay been negotiated (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has occurred); but for NBD_OPT_GO, qemu only advertises the feature if block status is also enabled (that is, if the client does not negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so the feature is not advertised). Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230925192229.3186470-26-eblake@redhat.com> [eblake: fix logic to reject unnegotiated contexts] Signed-off-by: Eric Blake <eblake@redhat.com>
2023-09-25 22:22:42 +03:00
flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
min block: 1
opt block: 4096
max block: 33554432
transaction size: 64-bit
available meta contexts: 2
base:allocation
qemu:dirty-bitmap:b3
=== Contrast normal status to large granularity dirty-bitmap ===
read 512/512 bytes at offset 512
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
read 524288/524288 bytes at offset 524288
512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
read 1048576/1048576 bytes at offset 1048576
1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
read 2097152/2097152 bytes at offset 2097152
2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
[{ "start": 0, "length": 4096, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 4096, "length": 1044480, "depth": 0, "present": true, "zero": true, "data": false, "compressed": false, "offset": OFFSET},
{ "start": 1048576, "length": 3145728, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET}]
[{ "start": 0, "length": 65536, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false},
{ "start": 65536, "length": 2031616, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false}]
=== Contrast to small granularity dirty-bitmap ===
[{ "start": 0, "length": 512, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 512, "length": 512, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false},
{ "start": 1024, "length": 2096128, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false}]
=== Check bitmap taken from another node ===
[{ "start": 0, "length": 4194304, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET}]
=== End qemu NBD server ===
{"execute":"nbd-server-remove",
"arguments":{"name":"n"}}
monitor: only run coroutine commands in qemu_aio_context monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not polled during nested event loops. The coroutine currently reschedules itself in the main loop's qemu_aio_context AioContext, which is polled during nested event loops. One known problem is that QMP device-add calls drain_call_rcu(), which temporarily drops the BQL, leading to all sorts of havoc like other vCPU threads re-entering device emulation code while another vCPU thread is waiting in device emulation code with aio_poll(). Paolo Bonzini suggested running non-coroutine QMP handlers in the iohandler AioContext. This avoids trouble with nested event loops. His original idea was to move coroutine rescheduling to monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch() because we don't know if the QMP handler needs to run in coroutine context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have been nicer since it's associated with the monitor implementation and not as general as qmp_dispatch(), which is also used by qemu-ga. A number of qemu-iotests need updated .out files because the order of QMP events vs QMP responses has changed. Solves Issue #1933. Cc: qemu-stable@nongnu.org Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use drain_call_rcu in in qmp_device_add") Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192 Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985 Buglink: https://issues.redhat.com/browse/RHEL-17369 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240118144823.1497953-4-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-18 17:48:23 +03:00
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "n"}}
{"return": {}}
{"execute":"nbd-server-remove",
"arguments":{"name":"n2"}}
monitor: only run coroutine commands in qemu_aio_context monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not polled during nested event loops. The coroutine currently reschedules itself in the main loop's qemu_aio_context AioContext, which is polled during nested event loops. One known problem is that QMP device-add calls drain_call_rcu(), which temporarily drops the BQL, leading to all sorts of havoc like other vCPU threads re-entering device emulation code while another vCPU thread is waiting in device emulation code with aio_poll(). Paolo Bonzini suggested running non-coroutine QMP handlers in the iohandler AioContext. This avoids trouble with nested event loops. His original idea was to move coroutine rescheduling to monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch() because we don't know if the QMP handler needs to run in coroutine context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have been nicer since it's associated with the monitor implementation and not as general as qmp_dispatch(), which is also used by qemu-ga. A number of qemu-iotests need updated .out files because the order of QMP events vs QMP responses has changed. Solves Issue #1933. Cc: qemu-stable@nongnu.org Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use drain_call_rcu in in qmp_device_add") Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192 Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985 Buglink: https://issues.redhat.com/browse/RHEL-17369 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240118144823.1497953-4-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-18 17:48:23 +03:00
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "n2"}}
{"return": {}}
{"execute":"nbd-server-remove",
"arguments":{"name":"n2"}}
{"error": {"class": "GenericError", "desc": "Export 'n2' is not found"}}
{"execute":"nbd-server-stop"}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "n3"}}
{"return": {}}
{"execute":"nbd-server-stop"}
{"error": {"class": "GenericError", "desc": "NBD server not running"}}
=== Set up NBD with iothread access ===
{"execute":"x-blockdev-set-iothread",
"arguments":{"node-name":"n", "iothread":"io0"}}
{"return": {}}
{"execute":"nbd-server-add",
"arguments":{"device":"n"}}
{"error": {"class": "GenericError", "desc": "NBD server not running"}}
{"execute":"nbd-server-start",
"arguments":{"addr":{"type":"unix",
"data":{"path":"SOCK_DIR/nbd"}}}}
{"return": {}}
{"execute":"nbd-server-start",
"arguments":{"addr":{"type":"unix",
"data":{"path":"SOCK_DIR/nbd1"}}}}
{"error": {"class": "GenericError", "desc": "NBD server already running"}}
exports available: 0
{"execute":"nbd-server-add",
"arguments":{"device":"n", "bitmap":"b"}}
{"return": {}}
{"execute":"nbd-server-add",
"arguments":{"device":"nosuch"}}
{"error": {"class": "GenericError", "desc": "Cannot find device='nosuch' nor node-name='nosuch'"}}
{"execute":"nbd-server-add",
"arguments":{"device":"n"}}
{"error": {"class": "GenericError", "desc": "Block export id 'n' is already in use"}}
{"execute":"nbd-server-add",
"arguments":{"device":"n", "name":"n2",
"bitmap":"b2"}}
nbd: Only require disabled bitmap for read-only exports Our initial implementation of x-nbd-server-add-bitmap put in a restriction because of incremental backups: in that usage, we are exporting one qcow2 file (the temporary overlay target of a blockdev-backup sync:none job) and a dirty bitmap owned by a second qcow2 file (the source of the blockdev-backup, which is the backing file of the temporary). While both qcow2 files are still writable (the target in order to capture copy-on-write of old contents, and the source in order to track live guest writes in the meantime), the NBD client expects to see constant data, including the dirty bitmap. An enabled bitmap in the source would be modified by guest writes, which is at odds with the NBD export being a read-only constant view, hence the initial code choice of enforcing a disabled bitmap (the intent is that the exposed bitmap was disabled in the same transaction that started the blockdev-backup job, although we don't want to track enough state to actually enforce that). However, consider the case of a bitmap contained in a read-only node (including when the bitmap is found in a backing layer of the active image). Because the node can't be modified, the bitmap won't change due to writes, regardless of whether it is still enabled. Forbidding the export unless the bitmap is disabled is awkward, paritcularly since we can't change the bitmap to be disabled (because the node is read-only). Alternatively, consider the case of live storage migration, where management directs the destination to create a writable NBD server, then performs a drive-mirror from the source to the target, prior to doing the rest of the live migration. Since storage migration can be time-consuming, it may be wise to let the destination include a dirty bitmap to track which portions it has already received, where even if the migration is interrupted and restarted, the source can query the destination block status in order to potentially minimize re-sending data that has not changed in the meantime on a second attempt. Such code has not been written, and might not be trivial (after all, a cluster being marked dirty in the bitmap does not necessarily guarantee it has the desired contents), but it makes sense that letting an active dirty bitmap be exposed and changing alongside writes may prove useful in the future. Solve both issues by gating the restriction against a disabled bitmap to only happen when the caller has requested a read-only export, and where the BDS that owns the bitmap (whether or not it is the BDS handed to nbd_export_new() or from its backing chain) is still writable. We could drop the check altogether (if management apps are prepared to deal with a changing bitmap even on a read-only image), but for now keeping a check for the read-only case still stands a chance of preventing management errors. Update iotest 223 to show the looser behavior by leaving a bitmap enabled the whole run; note that we have to tear down and re-export a node when handling an error. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190111194720.15671-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2019-01-11 22:47:15 +03:00
{"error": {"class": "GenericError", "desc": "Enabled bitmap 'b2' incompatible with readonly export"}}
{"execute":"nbd-server-add",
"arguments":{"device":"n", "name":"n2",
"bitmap":"b3"}}
nbd: Allow bitmap export during QMP nbd-server-add With the experimental x-nbd-server-add-bitmap command, there was a window of time where an NBD client could see the export but not the associated dirty bitmap, which can cause a client that planned on using the dirty bitmap to be forced to treat the entire image as dirty as a safety fallback. Furthermore, if the QMP client successfully exports a disk but then fails to add the bitmap, it has to take on the burden of removing the export. Since we don't allow changing the exposed dirty bitmap (whether to a different bitmap, or removing advertisement of the bitmap), it is nicer to make the bitmap tied to the export at the time the export is created, with automatic failure to export if the bitmap is not available. The experimental command included an optional 'bitmap-export-name' field for remapping the name exposed over NBD to be different from the bitmap name stored on disk. However, my libvirt demo code for implementing differential backups on top of persistent bitmaps did not need to take advantage of that feature (it is instead possible to create a new temporary bitmap with the desired name, use block-dirty-bitmap-merge to merge one or more persistent bitmaps into the temporary, then associate the temporary with the NBD export, if control is needed over the exported bitmap name). Hence, I'm not copying that part of the experiment over to the stable addition. For more details on the libvirt demo, see https://www.redhat.com/archives/libvir-list/2018-October/msg01254.html, https://kvmforum2018.sched.com/event/FzuB/facilitating-incremental-backup-eric-blake-red-hat This patch focuses on the user interface, and reduces (but does not completely eliminate) the window where an NBD client can see the export but not the dirty bitmap, with less work to clean up after errors. Later patches will add further cleanups now that this interface is declared stable via a single QMP command, including removing the race window. Update test 223 to use the new interface. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190111194720.15671-6-eblake@redhat.com>
2019-01-11 22:47:17 +03:00
{"error": {"class": "GenericError", "desc": "Bitmap 'b3' is not found"}}
{"execute":"nbd-server-add",
"arguments":{"device":"n", "name":"n2", "writable":true,
"description":"some text", "bitmap":"b2"}}
{"return": {}}
{"execute":"block-export-add",
"arguments":{"type": "nbd", "node-name":"n", "id":"n3", "name": "n3",
"bitmaps":[{"node":"null","name":"b3"}]}}
{"return": {}}
exports available: 3
export: 'n'
size: 4194304
nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Allow a client to request a subset of negotiated meta contexts. For example, a client may ask to use a single connection to learn about both block status and dirty bitmaps, but where the dirty bitmap queries only need to be performed on a subset of the disk; forcing the server to compute that information on block status queries in the rest of the disk is wasted effort (both at the server, and on the amount of traffic sent over the wire to be parsed and ignored by the client). Qemu as an NBD client never requests to use more than one meta context, so it has no need to use block status payloads. Testing this instead requires support from libnbd, which CAN access multiple meta contexts in parallel from a single NBD connection; an interop test submitted to the libnbd project at the same time as this patch demonstrates the feature working, as well as testing some corner cases (for example, when the payload length is longer than the export length), although other corner cases (like passing the same id duplicated) requires a protocol fuzzer because libnbd is not wired up to break the protocol that badly. This also includes tweaks to 'qemu-nbd --list' to show when a server is advertising the capability, and to the testsuite to reflect the addition to that output. Of note: qemu will always advertise the new feature bit during NBD_OPT_INFO if extended headers have alreay been negotiated (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has occurred); but for NBD_OPT_GO, qemu only advertises the feature if block status is also enabled (that is, if the client does not negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so the feature is not advertised). Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230925192229.3186470-26-eblake@redhat.com> [eblake: fix logic to reject unnegotiated contexts] Signed-off-by: Eric Blake <eblake@redhat.com>
2023-09-25 22:22:42 +03:00
flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
nbd/server: Advertise actual minimum block size Both NBD_CMD_BLOCK_STATUS and structured NBD_CMD_READ will split their reply according to bdrv_block_status() boundaries. If the block device has a request_alignment smaller than 512, but we advertise a block alignment of 512 to the client, then this can result in the server reply violating client expectations by reporting a smaller region of the export than what the client is permitted to address (although this is less of an issue for qemu 4.0 clients, given recent client patches to overlook our non-compliance at EOF). Since it's always better to be strict in what we send, it is worth advertising the actual minimum block limit rather than blindly rounding it up to 512. Note that this patch is not foolproof - it is still possible to provoke non-compliant server behavior using: $ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file That is arguably a bug in the blkdebug driver (it should never pass back block status smaller than its alignment, even if it has to make multiple bdrv_get_status calls and determine the least-common-denominator status among the group to return). It may also be possible to observe issues with a backing layer with smaller alignment than the active layer, although so far I have been unable to write a reliable iotest for that scenario (but again, an issue like that could be argued to be a bug in the block layer, or something where we need a flag to bdrv_block_status() to state whether the result must be aligned to the current layer's limits or can be subdivided for accuracy when chasing backing files). Anyways, as blkdebug is not normally used, and as this patch makes our server more interoperable with qemu 3.1 clients, it is worth applying now, even while we still work on a larger patch series for the 4.1 timeframe to have byte-accurate file lengths. Note that the iotests output changes - for 223 and 233, we can see the server's better granularity advertisement; and for 241, the three test cases have the following effects: - natural alignment: the server's smaller alignment is now advertised, and the hole reported at EOF is now the right result; we've gotten rid of the server's non-compliance - forced server alignment: the server still advertises 512 bytes, but still sends a mid-sector hole. This is still a server compliance bug, which needs to be fixed in the block layer in a later patch; output does not change because the client is already being tolerant of the non-compliance - forced client alignment: the server's smaller alignment means that the client now sees the server's status change mid-sector without any protocol violations, but the fact that the map shows an unaligned mid-sector hole is evidence of the block layer problems with aligned block status, to be fixed in a later patch Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190329042750.14704-7-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: rebase to enhanced iotest 241 coverage]
2019-03-31 04:36:36 +03:00
min block: 1
opt block: 4096
max block: 33554432
transaction size: 64-bit
available meta contexts: 2
base:allocation
qemu:dirty-bitmap:b
export: 'n2'
description: some text
size: 4194304
nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Allow a client to request a subset of negotiated meta contexts. For example, a client may ask to use a single connection to learn about both block status and dirty bitmaps, but where the dirty bitmap queries only need to be performed on a subset of the disk; forcing the server to compute that information on block status queries in the rest of the disk is wasted effort (both at the server, and on the amount of traffic sent over the wire to be parsed and ignored by the client). Qemu as an NBD client never requests to use more than one meta context, so it has no need to use block status payloads. Testing this instead requires support from libnbd, which CAN access multiple meta contexts in parallel from a single NBD connection; an interop test submitted to the libnbd project at the same time as this patch demonstrates the feature working, as well as testing some corner cases (for example, when the payload length is longer than the export length), although other corner cases (like passing the same id duplicated) requires a protocol fuzzer because libnbd is not wired up to break the protocol that badly. This also includes tweaks to 'qemu-nbd --list' to show when a server is advertising the capability, and to the testsuite to reflect the addition to that output. Of note: qemu will always advertise the new feature bit during NBD_OPT_INFO if extended headers have alreay been negotiated (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has occurred); but for NBD_OPT_GO, qemu only advertises the feature if block status is also enabled (that is, if the client does not negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so the feature is not advertised). Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230925192229.3186470-26-eblake@redhat.com> [eblake: fix logic to reject unnegotiated contexts] Signed-off-by: Eric Blake <eblake@redhat.com>
2023-09-25 22:22:42 +03:00
flags: 0x1ded ( flush fua trim zeroes df multi cache fast-zero block-status-payload )
nbd/server: Advertise actual minimum block size Both NBD_CMD_BLOCK_STATUS and structured NBD_CMD_READ will split their reply according to bdrv_block_status() boundaries. If the block device has a request_alignment smaller than 512, but we advertise a block alignment of 512 to the client, then this can result in the server reply violating client expectations by reporting a smaller region of the export than what the client is permitted to address (although this is less of an issue for qemu 4.0 clients, given recent client patches to overlook our non-compliance at EOF). Since it's always better to be strict in what we send, it is worth advertising the actual minimum block limit rather than blindly rounding it up to 512. Note that this patch is not foolproof - it is still possible to provoke non-compliant server behavior using: $ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file That is arguably a bug in the blkdebug driver (it should never pass back block status smaller than its alignment, even if it has to make multiple bdrv_get_status calls and determine the least-common-denominator status among the group to return). It may also be possible to observe issues with a backing layer with smaller alignment than the active layer, although so far I have been unable to write a reliable iotest for that scenario (but again, an issue like that could be argued to be a bug in the block layer, or something where we need a flag to bdrv_block_status() to state whether the result must be aligned to the current layer's limits or can be subdivided for accuracy when chasing backing files). Anyways, as blkdebug is not normally used, and as this patch makes our server more interoperable with qemu 3.1 clients, it is worth applying now, even while we still work on a larger patch series for the 4.1 timeframe to have byte-accurate file lengths. Note that the iotests output changes - for 223 and 233, we can see the server's better granularity advertisement; and for 241, the three test cases have the following effects: - natural alignment: the server's smaller alignment is now advertised, and the hole reported at EOF is now the right result; we've gotten rid of the server's non-compliance - forced server alignment: the server still advertises 512 bytes, but still sends a mid-sector hole. This is still a server compliance bug, which needs to be fixed in the block layer in a later patch; output does not change because the client is already being tolerant of the non-compliance - forced client alignment: the server's smaller alignment means that the client now sees the server's status change mid-sector without any protocol violations, but the fact that the map shows an unaligned mid-sector hole is evidence of the block layer problems with aligned block status, to be fixed in a later patch Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190329042750.14704-7-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: rebase to enhanced iotest 241 coverage]
2019-03-31 04:36:36 +03:00
min block: 1
opt block: 4096
max block: 33554432
transaction size: 64-bit
available meta contexts: 2
base:allocation
qemu:dirty-bitmap:b2
export: 'n3'
size: 4194304
nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Allow a client to request a subset of negotiated meta contexts. For example, a client may ask to use a single connection to learn about both block status and dirty bitmaps, but where the dirty bitmap queries only need to be performed on a subset of the disk; forcing the server to compute that information on block status queries in the rest of the disk is wasted effort (both at the server, and on the amount of traffic sent over the wire to be parsed and ignored by the client). Qemu as an NBD client never requests to use more than one meta context, so it has no need to use block status payloads. Testing this instead requires support from libnbd, which CAN access multiple meta contexts in parallel from a single NBD connection; an interop test submitted to the libnbd project at the same time as this patch demonstrates the feature working, as well as testing some corner cases (for example, when the payload length is longer than the export length), although other corner cases (like passing the same id duplicated) requires a protocol fuzzer because libnbd is not wired up to break the protocol that badly. This also includes tweaks to 'qemu-nbd --list' to show when a server is advertising the capability, and to the testsuite to reflect the addition to that output. Of note: qemu will always advertise the new feature bit during NBD_OPT_INFO if extended headers have alreay been negotiated (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has occurred); but for NBD_OPT_GO, qemu only advertises the feature if block status is also enabled (that is, if the client does not negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so the feature is not advertised). Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230925192229.3186470-26-eblake@redhat.com> [eblake: fix logic to reject unnegotiated contexts] Signed-off-by: Eric Blake <eblake@redhat.com>
2023-09-25 22:22:42 +03:00
flags: 0x158f ( readonly flush fua df multi cache block-status-payload )
min block: 1
opt block: 4096
max block: 33554432
transaction size: 64-bit
available meta contexts: 2
base:allocation
qemu:dirty-bitmap:b3
=== Contrast normal status to large granularity dirty-bitmap ===
read 512/512 bytes at offset 512
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
read 524288/524288 bytes at offset 524288
512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
read 1048576/1048576 bytes at offset 1048576
1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
read 2097152/2097152 bytes at offset 2097152
2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
[{ "start": 0, "length": 4096, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 4096, "length": 1044480, "depth": 0, "present": true, "zero": true, "data": false, "compressed": false, "offset": OFFSET},
{ "start": 1048576, "length": 3145728, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET}]
[{ "start": 0, "length": 65536, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false},
{ "start": 65536, "length": 2031616, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false}]
=== Contrast to small granularity dirty-bitmap ===
[{ "start": 0, "length": 512, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 512, "length": 512, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false},
{ "start": 1024, "length": 2096128, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false}]
=== Check bitmap taken from another node ===
[{ "start": 0, "length": 4194304, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET}]
=== End qemu NBD server ===
{"execute":"nbd-server-remove",
"arguments":{"name":"n"}}
monitor: only run coroutine commands in qemu_aio_context monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not polled during nested event loops. The coroutine currently reschedules itself in the main loop's qemu_aio_context AioContext, which is polled during nested event loops. One known problem is that QMP device-add calls drain_call_rcu(), which temporarily drops the BQL, leading to all sorts of havoc like other vCPU threads re-entering device emulation code while another vCPU thread is waiting in device emulation code with aio_poll(). Paolo Bonzini suggested running non-coroutine QMP handlers in the iohandler AioContext. This avoids trouble with nested event loops. His original idea was to move coroutine rescheduling to monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch() because we don't know if the QMP handler needs to run in coroutine context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have been nicer since it's associated with the monitor implementation and not as general as qmp_dispatch(), which is also used by qemu-ga. A number of qemu-iotests need updated .out files because the order of QMP events vs QMP responses has changed. Solves Issue #1933. Cc: qemu-stable@nongnu.org Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use drain_call_rcu in in qmp_device_add") Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192 Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985 Buglink: https://issues.redhat.com/browse/RHEL-17369 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240118144823.1497953-4-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-18 17:48:23 +03:00
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "n"}}
{"return": {}}
{"execute":"nbd-server-remove",
"arguments":{"name":"n2"}}
monitor: only run coroutine commands in qemu_aio_context monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not polled during nested event loops. The coroutine currently reschedules itself in the main loop's qemu_aio_context AioContext, which is polled during nested event loops. One known problem is that QMP device-add calls drain_call_rcu(), which temporarily drops the BQL, leading to all sorts of havoc like other vCPU threads re-entering device emulation code while another vCPU thread is waiting in device emulation code with aio_poll(). Paolo Bonzini suggested running non-coroutine QMP handlers in the iohandler AioContext. This avoids trouble with nested event loops. His original idea was to move coroutine rescheduling to monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch() because we don't know if the QMP handler needs to run in coroutine context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have been nicer since it's associated with the monitor implementation and not as general as qmp_dispatch(), which is also used by qemu-ga. A number of qemu-iotests need updated .out files because the order of QMP events vs QMP responses has changed. Solves Issue #1933. Cc: qemu-stable@nongnu.org Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use drain_call_rcu in in qmp_device_add") Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192 Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985 Buglink: https://issues.redhat.com/browse/RHEL-17369 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240118144823.1497953-4-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-18 17:48:23 +03:00
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "n2"}}
{"return": {}}
{"execute":"nbd-server-remove",
"arguments":{"name":"n2"}}
{"error": {"class": "GenericError", "desc": "Export 'n2' is not found"}}
{"execute":"nbd-server-stop"}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_EXPORT_DELETED", "data": {"id": "n3"}}
{"return": {}}
{"execute":"nbd-server-stop"}
{"error": {"class": "GenericError", "desc": "NBD server not running"}}
{"execute":"quit"}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
monitor: only run coroutine commands in qemu_aio_context monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not polled during nested event loops. The coroutine currently reschedules itself in the main loop's qemu_aio_context AioContext, which is polled during nested event loops. One known problem is that QMP device-add calls drain_call_rcu(), which temporarily drops the BQL, leading to all sorts of havoc like other vCPU threads re-entering device emulation code while another vCPU thread is waiting in device emulation code with aio_poll(). Paolo Bonzini suggested running non-coroutine QMP handlers in the iohandler AioContext. This avoids trouble with nested event loops. His original idea was to move coroutine rescheduling to monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch() because we don't know if the QMP handler needs to run in coroutine context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have been nicer since it's associated with the monitor implementation and not as general as qmp_dispatch(), which is also used by qemu-ga. A number of qemu-iotests need updated .out files because the order of QMP events vs QMP responses has changed. Solves Issue #1933. Cc: qemu-stable@nongnu.org Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use drain_call_rcu in in qmp_device_add") Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192 Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985 Buglink: https://issues.redhat.com/browse/RHEL-17369 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20240118144823.1497953-4-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2024-01-18 17:48:23 +03:00
{"return": {}}
=== Use qemu-nbd as server ===
[{ "start": 0, "length": 65536, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false},
{ "start": 65536, "length": 2031616, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false}]
[{ "start": 0, "length": 512, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 512, "length": 512, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false},
{ "start": 1024, "length": 11321, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET}]
[{ "start": 12345, "length": 2084807, "depth": 0, "present": true, "zero": false, "data": true, "compressed": false, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "present": false, "zero": false, "data": false, "compressed": false}]
*** done