mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Eric Blake	7d31776d7f	nbd/server: CVE-2024-7409: Drop non-negotiating clients A client that opens a socket but does not negotiate is merely hogging qemu's resources (an open fd and a small amount of memory); and a malicious client that can access the port where NBD is listening can attempt a denial of service attack by intentionally opening and abandoning lots of unfinished connections. The previous patch put a default bound on the number of such ongoing connections, but once that limit is hit, no more clients can connect (including legitimate ones). The solution is to insist that clients complete handshake within a reasonable time limit, defaulting to 10 seconds. A client that has not successfully completed NBD_OPT_GO by then (including the case of where the client didn't know TLS credentials to even reach the point of NBD_OPT_GO) is wasting our time and does not deserve to stay connected. Later patches will allow fine-tuning the limit away from the default value (including disabling it for doing integration testing of the handshake process itself). Note that this patch in isolation actually makes it more likely to see qemu SEGV after nbd-server-stop, as any client socket still connected when the server shuts down will now be closed after 10 seconds rather than at the client's whims. That will be addressed in the next patch. For a demo of this patch in action: $ qemu-nbd -f raw -r -t -e 10 file & $ nbdsh --opt-mode -c ' H = list() for i in range(20): print(i) H.insert(i, nbd.NBD()) H[i].set_opt_mode(True) H[i].connect_uri("nbd://localhost") ' $ kill $! where later connections get to start progressing once earlier ones are forcefully dropped for taking too long, rather than hanging. Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20240807174943.771624-13-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [eblake: rebase to changes earlier in series, reduce scope of timer] Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit `b9b72cb3ce`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: context fixup in nbd/server.c: lack of WITH_QEMU_LOCK_GUARD in 7.2)	2024-08-11 11:01:20 +03:00
Eric Blake	cb48a7089b	nbd/server: Plumb in new args to nbd_client_add() Upcoming patches to fix a CVE need to track an opaque pointer passed in by the owner of a client object, as well as request for a time limit on how fast negotiation must complete. Prepare for that by changing the signature of nbd_client_new() and adding an accessor to get at the opaque pointer, although for now the two servers (qemu-nbd.c and blockdev-nbd.c) do not change behavior even though they pass in a new default timeout value. Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20240807174943.771624-11-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [eblake: s/LIMIT/MAX_SECS/ as suggested by Dan] Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit `fb1c2aaa98`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-08-11 10:57:00 +03:00
Eric Blake	f7b46e82ce	nbd/server: Mark negotiation functions as coroutine_fn nbd_negotiate() is already marked coroutine_fn. And given the fix in the previous patch to have nbd_negotiate_handle_starttls not create and wait on a g_main_loop (as that would violate coroutine constraints), it is worth marking the rest of the related static functions reachable only during option negotiation as also being coroutine_fn. Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20240408160214.1200629-6-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> [eblake: drop one spurious coroutine_fn marking] Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit `4fa333e08d`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: fixups in nbd/server.c for v8.1.0-1366-gfd358d83901d "nbd/server: Refactor list of negotiated meta contexts")	2024-04-28 15:16:47 +03:00
Zhu Yangyang	a0823c2766	nbd/server: do not poll within a coroutine context Coroutines are not supposed to block. Instead, they should yield. The client performs TLS upgrade outside of an AIOContext, during synchronous handshake; this still requires g_main_loop. But the server responds to TLS upgrade inside a coroutine, so a nested g_main_loop is wrong. Since the two callbacks no longer share more than the setting of data.complete and data.error, it's just as easy to use static helpers instead of trying to share a common code path. It is also possible to add assertions that no other code is interfering with the eventual path to qio reaching the callback, whether or not it required a yield or main loop. Fixes: `f95910f` ("nbd: implement TLS support in the protocol negotiation") Signed-off-by: Zhu Yangyang <zhuyangyang14@huawei.com> [eblake: move callbacks to their use point, add assertions] Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20240408160214.1200629-5-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> (cherry picked from commit `ae6d91a7e9`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-04-28 15:13:17 +03:00
Alberto Faria	a9262f551e	block: Change blk_{pread,pwrite}() param order Swap 'buf' and 'bytes' around for consistency with blk_co_{pread,pwrite}(), and in preparation to implement these functions using generated_co_wrapper. Callers were updated using this Coccinelle script: @@ expression blk, offset, buf, bytes, flags; @@ - blk_pread(blk, offset, buf, bytes, flags) + blk_pread(blk, offset, bytes, buf, flags) @@ expression blk, offset, buf, bytes, flags; @@ - blk_pwrite(blk, offset, buf, bytes, flags) + blk_pwrite(blk, offset, bytes, buf, flags) It had no effect on hw/block/nand.c, presumably due to the #if, so that file was updated manually. Overly-long lines were then fixed by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-4-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-07-12 12:14:56 +02:00
Alberto Faria	3b35d4542c	block: Add a 'flags' param to blk_pread() For consistency with other I/O functions, and in preparation to implement it using generated_co_wrapper. Callers were updated using this Coccinelle script: @@ expression blk, offset, buf, bytes; @@ - blk_pread(blk, offset, buf, bytes) + blk_pread(blk, offset, buf, bytes, 0) It had no effect on hw/block/nand.c, presumably due to the #if, so that file was updated manually. Overly-long lines were then fixed by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-3-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-07-12 12:14:56 +02:00
Denis V. Lunev	8bb100c9e2	nbd: trace long NBD operations At the moment there are 2 sources of lengthy operations if configured: * open connection, which could retry inside and * reconnect of already opened connection These operations could be quite lengthy and cumbersome to catch thus it would be quite natural to add trace points for them. This patch is based on the original downstream work made by Vladimir. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Eric Blake <eblake@redhat.com> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> CC: Kevin Wolf <kwolf@redhat.com> CC: Hanna Reitz <hreitz@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>	2022-06-29 10:57:02 +03:00
Eric Blake	58a6fdcc9e	nbd/server: Allow MULTI_CONN for shared writable exports According to the NBD spec, a server that advertises NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will not see any cache inconsistencies: when properly separated by a single flush, actions performed by one client will be visible to another client, regardless of which client did the flush. We always satisfy these conditions in qemu - even when we support multiple clients, ALL clients go through a single point of reference into the block layer, with no local caching. The effect of one client is instantly visible to the next client. Even if our backend were a network device, we argue that any multi-path caching effects that would cause inconsistencies in back-to-back actions not seeing the effect of previous actions would be a bug in that backend, and not the fault of caching in qemu. As such, it is safe to unconditionally advertise CAN_MULTI_CONN for any qemu NBD server situation that supports parallel clients. Note, however, that we don't want to advertise CAN_MULTI_CONN when we know that a second client cannot connect (for historical reasons, qemu-nbd defaults to a single connection while nbd-server-add and QMP commands default to unlimited connections; but we already have existing means to let either style of NBD server creation alter those defaults). This is visible by no longer advertising MULTI_CONN for 'qemu-nbd -r' without -e, as in the iotest nbd-qemu-allocation. The harder part of this patch is setting up an iotest to demonstrate behavior of multiple NBD clients to a single server. It might be possible with parallel qemu-io processes, but I found it easier to do in python with the help of libnbd, and help from Nir and Vladimir in writing the test. Signed-off-by: Eric Blake <eblake@redhat.com> Suggested-by: Nir Soffer <nsoffer@redhat.com> Suggested-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Message-Id: <20220512004924.417153-3-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 13:10:52 +02:00
Vladimir Sementsov-Ogievskiy	e5fb29d5d0	qapi: nbd-export: allow select bitmaps by node/name pair Hi all! Current logic of relying on search through backing chain is not safe neither convenient. Sometimes it leads to necessity of extra bitmap copying. Also, we are going to add "snapshot-access" driver, to access some snapshot state through NBD. And this driver is not formally a filter, and of course it's not a COW format driver. So, searching through backing chain will not work. Instead of widening the workaround of bitmap searching, let's extend the interface so that user can select bitmap precisely. Note, that checking for bitmap active status is not copied to the new API, I don't see a reason for it, user should understand the risks. And anyway, bitmap from other node is unrelated to this export being read-only or read-write. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org> Message-Id: <20220314213226.362217-3-v.sementsov-og@mail.ru> [eblake: Adjust S-o-b to Vladimir's new email, with permission] Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2022-04-26 13:15:19 -05:00
Marc-André Lureau	e0e7fe07e1	Remove trailing ; after G_DEFINE_AUTO macro The macro doesn't need it. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2022-03-22 14:46:18 +04:00
Marc-André Lureau	9edc6313da	Replace GCC_FMT_ATTR with G_GNUC_PRINTF One less qemu-specific macro. It also helps to make some headers/units only depend on glib, and thus moved in standalone projects eventually. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com>	2022-03-22 14:40:51 +04:00
Peter Maydell	fdee2c9692	nbd patches for 2022-03-07 - Dan Berrange: Allow qemu-nbd to support TLS over Unix sockets - Eric Blake: Minor cleanups related to 64-bit block operations -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAmImtE8ACgkQp6FrSiUn Q2ovmgf/aksDqf2eNcahs++fez+8Qi9ll5OY/qGyjnzBgsatYKjrK+xF7OnjoJox eRX026lh81Q4EQK7oZBUnr2UCY4bncDBTI7MTLh603EV/tId5ZLwx007ERhzvtC1 mIsQHXNuO9X25LQG2eWnfunY9YztQpiT5r/g3khD2yPBqJWIvBfblzPLx6FkF7px /WM8xEKCihmGr1Wr3b+zGYL083YkaBWCvHoR8mJt3tEFUj+Qie8XcdV0OVyI0XUj 5goIFRcpVwBE8P2nLtfUKNzEXz22cmdonOJUX7E5IvGO21k5F/HrWlHdo8JnuSUZ t0w5L9yCxBrRpY1burz30b77J0WMCw== =C8Dd -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2022-03-07' into staging nbd patches for 2022-03-07 - Dan Berrange: Allow qemu-nbd to support TLS over Unix sockets - Eric Blake: Minor cleanups related to 64-bit block operations # gpg: Signature made Tue 08 Mar 2022 01:41:35 GMT # gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2022-03-07: qemu-io: Allow larger write zeroes under no fallback qemu-io: Utilize 64-bit status during map nbd/server: Minor cleanups tests/qemu-iotests: validate NBD TLS with UNIX sockets and PSK tests/qemu-iotests: validate NBD TLS with UNIX sockets tests/qemu-iotests: validate NBD TLS with hostname mismatch tests/qemu-iotests: convert NBD TLS test to use standard filters tests/qemu-iotests: introduce filter for qemu-nbd export list tests/qemu-iotests: expand _filter_nbd rules tests/qemu-iotests: add QEMU_IOTESTS_REGEN=1 to update reference file block/nbd: don't restrict TLS usage to IP sockets qemu-nbd: add --tls-hostname option for TLS certificate validation block/nbd: support override of hostname for TLS certificate validation block: pass desired TLS hostname through from block driver client crypto: mandate a hostname when checking x509 creds on a client Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-03-09 11:38:29 +00:00
Eric Blake	314b902621	nbd/server: Minor cleanups Spelling fixes, grammar improvements and consistent spacing, noticed while preparing other patches in this file. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20211203231539.3900865-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2022-03-07 19:23:25 -06:00
Daniel P. Berrangé	046f98d075	block: pass desired TLS hostname through from block driver client In commit `a71d597b98` Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Date: Thu Jun 10 13:08:00 2021 +0300 block/nbd: reuse nbd_co_do_establish_connection() in nbd_open() the use of the 'hostname' field from the BDRVNBDState struct was lost, and 'nbd_connect' just hardcoded it to match the IP socket address. This was a harmless bug at the time since we block use with anything other than IP sockets. Shortly though, we want to allow the caller to override the hostname used in the TLS certificate checks. This is to allow for TLS when doing port forwarding or tunneling. Thus we need to reinstate the passing along of the 'hostname'. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220304193610.3293146-3-berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2022-03-07 15:58:42 -06:00
Peter Maydell	5df022cf2e	osdep: Move memalign-related functions to their own header Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org	2022-03-07 13:16:49 +00:00
Nir Soffer	523f5a9971	nbd/server.c: Remove unused field NBDRequestData struct has unused QSIMPLEQ_ENTRY field. It seems that this field exists since the first git commit and was never used. Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20220111194313.581486-1-nsoffer@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Fixes: `d9a73806` ("qemu-nbd: introduce NBDRequest", v1.1) Signed-off-by: Eric Blake <eblake@redhat.com>	2022-01-28 16:48:28 -06:00
Vladimir Sementsov-Ogievskiy	9e14491af4	nbd/client-connection: improve error message of cancelled attempt Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2021-12-21 14:52:19 +01:00
Vladimir Sementsov-Ogievskiy	169b9a94ed	nbd/client-connection: nbd_co_establish_connection(): return real error The only caller of nbd_do_establish_connection() that uses errp is nbd_open(). The only way to cancel this call is through open_timer timeout. And for this case, user will be more interested in description of last failed connect rather than in "Connection attempt cancelled by other operation". So, let's change behavior on cancel to return previous failure error if available. Do the same for non-blocking failure case. In this case we still don't have a caller that is interested in errp. But let's be consistent. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2021-12-21 14:52:10 +01:00
Eric Blake	e35574226a	nbd/server: Simplify zero and trim Now that the block layer supports 64-bit operations (see commit `2800637a` and friends, new to v6.2), we no longer have to self-fragment requests larger than 2G, reverting the workaround added in `890cbccb08` ("nbd: Fix large trim/zero requests", v5.1.0). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20211117170230.1128262-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2021-11-22 07:37:15 -06:00
Eric Blake	1644cccea5	nbd/server: Don't complain on certain client disconnects When a client disconnects abruptly, but did not have any pending requests (for example, when using nbdsh without calling h.shutdown), we used to output the following message: $ qemu-nbd -f raw file $ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)' qemu-nbd: Disconnect client, due to: Failed to read request: Unexpected end-of-file before all bytes were read Then in commit `f148ae7`, we refactored nbd_receive_request() to use nbd_read_eof(); when this returns 0, we regressed into tracing uninitialized memory (if tracing is enabled) and reporting a less-specific: qemu-nbd: Disconnect client, due to: Request handling failed in intermediate state Note that with Unix sockets, we have yet another error message, unchanged by the 6.0 regression: $ qemu-nbd -k /tmp/sock -f raw file $ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.trim(1,0)' qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to socket: Broken pipe But in all cases, the error message goes away if the client performs a soft shutdown by using NBD_CMD_DISC, rather than a hard shutdown by abrupt disconnect: $ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)' -c 'h.shutdown()' This patch fixes things to avoid uninitialized memory, and in general avoids warning about a client that does a hard shutdown when not in the middle of a packet. A client that aborts mid-request, or which does not read the full server's reply, can still result in warnings, but those are indeed much more unusual situations. CC: qemu-stable@nongnu.org Fixes: `f148ae7d36` ("nbd/server: Quiesce coroutines on context switch", v6.0.0) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: defer unrelated typo fixes to later patch] Message-Id: <20211117170230.1128262-2-eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-11-22 07:37:14 -06:00
Eric Blake	76df2b8d69	nbd/server: Silence clang sanitizer warning clang's sanitizer is picky: memset(NULL, x, 0) is technically undefined behavior, even though no sane implementation of memset() deferences the NULL. Caught by the nbd-qemu-allocation iotest. The alternative to checking before each memset is to instead force an allocation of 1 element instead of g_new0(type, 0)'s behavior of returning NULL for a 0-length array. Reported-by: Peter Maydell <peter.maydell@linaro.org> Fixes: `3b1f244c59` (nbd: Allow export of multiple bitmaps for one device) Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20211115223943.626416-1-eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2021-11-16 08:45:33 -06:00
Vladimir Sementsov-Ogievskiy	4ddb5d2fde	block/nbd: drop connection_co OK, that's a big rewrite of the logic. Pre-patch we have an always running coroutine - connection_co. It does reply receiving and reconnecting. And it leads to a lot of difficult and unobvious code around drained sections and context switch. We also abuse bs->in_flight counter which is increased for connection_co and temporary decreased in points where we want to allow drained section to begin. One of these place is in another file: in nbd_read_eof() in nbd/client.c. We also cancel reconnect and requests waiting for reconnect on drained begin which is not correct. And this patch fixes that. Let's finally drop this always running coroutine and go another way: do both reconnect and receiving in request coroutines. The detailed list of changes below (in the sequence of diff hunks). 1. receiving coroutines are woken directly from nbd_channel_error, when we change s->state 2. nbd_co_establish_connection_cancel(): we don't have drain_begin now, and in nbd_teardown_connection() all requests should already be finished (and reconnect is done from request). So nbd_co_establish_connection_cancel() is called from nbd_cancel_in_flight() (to cancel the request that is doing nbd_co_establish_connection()) and from reconnect_delay_timer_cb() (previously we didn't need it, as reconnect delay only should cancel active requests not the reconnection itself). But now reconnection itself is done in the separate thread (we now call nbd_client_connection_enable_retry() in nbd_open()), and we need to cancel the requests that wait in nbd_co_establish_connection() now). 2A. We do receive headers in request coroutine. But we also should dispatch replies for other pending requests. So, nbd_connection_entry() is turned into nbd_receive_replies(), which does reply dispatching while it receives other request headers, and returns when it receives the requested header. 3. All old staff around drained sections and context switch is dropped. In details: - we don't need to move connection_co to new aio context, as we don't have connection_co anymore - we don't have a fake "request" of connection_co (extra increasing in_flight), so don't care with it in drain_begin/end - we don't stop reconnection during drained section anymore. This means that drain_begin may wait for a long time (up to reconnect_delay). But that's an improvement and more correct behavior see below[] 4. In nbd_teardown_connection() we don't have to wait for connection_co, as it is dropped. And cleanup for s->ioc and nbd_yank is moved here from removed connection_co. 5. In nbd_co_do_establish_connection() we now should handle NBD_CLIENT_CONNECTING_NOWAIT: if new request comes when we are in NBD_CLIENT_CONNECTING_NOWAIT, it still should call nbd_co_establish_connection() (who knows, maybe the connection was already established by another thread in the background). But we shouldn't wait: if nbd_co_establish_connection() can't return new channel immediately the request should fail (we are in NBD_CLIENT_CONNECTING_NOWAIT state). 6. nbd_reconnect_attempt() is simplified: it's now easier to wait for other requests in the caller, so here we just assert that fact. Also delay time is now initialized here: we can easily detect first attempt and start a timer. 7. nbd_co_reconnect_loop() is dropped, we don't need it. Reconnect retries are fully handle by thread (nbd/client-connection.c), delay timer we initialize in nbd_reconnect_attempt(), we don't have to bother with s->drained and friends. nbd_reconnect_attempt() now called from nbd_co_send_request(). 8. nbd_connection_entry is dropped: reconnect is now handled by nbd_co_send_request(), receiving reply is now handled by nbd_receive_replies(): all handled from request coroutines. 9. So, welcome new nbd_receive_replies() called from request coroutine, that receives reply header instead of nbd_connection_entry(). Like with sending requests, only one coroutine may receive in a moment. So we introduce receive_mutex, which is locked around nbd_receive_reply(). It also protects some related fields. Still, full audit of thread-safety in nbd driver is a separate task. New function waits for a reply with specified handle being received and works rather simple: Under mutex: - if current handle is 0, do receive by hand. If another handle received - switch to other request coroutine, release mutex and yield. Otherwise return success - if current handle == requested handle, we are done - otherwise, release mutex and yield 10: in nbd_co_send_request() we now do nbd_reconnect_attempt() if needed. Also waiting in free_sema queue we now wait for one of two conditions: - connectED, in_flight < MAX_NBD_REQUESTS (so we can start new one) - connectING, in_flight == 0, so we can call nbd_reconnect_attempt() And this logic is protected by s->send_mutex Also, on failure we don't have to care of removed s->connection_co 11. nbd_co_do_receive_one_chunk(): now instead of yield() and wait for s->connection_co we just call new nbd_receive_replies(). 12. nbd_co_receive_one_chunk(): place where s->reply.handle becomes 0, which means that handling of the whole reply is finished. Here we need to wake one of coroutines sleeping in nbd_receive_replies(). If none are sleeping - do nothing. That's another behavior change: we don't have endless recv() in the idle time. It may be considered as a drawback. If so, it may be fixed later. 13. nbd_reply_chunk_iter_receive(): don't care about removed connection_co, just ping in_flight waiters. 14. Don't create connection_co, enable retry in the connection thread (we don't have own reconnect loop anymore) 15. We now need to add a nbd_co_establish_connection_cancel() call in nbd_cancel_in_flight(), to cancel the request that is doing a connection attempt. [], ok, now we don't cancel reconnect on drain begin. That's correct: reconnect feature leads to possibility of long-running requests (up to reconnect delay). Still, drain begin is not a reason to kill long requests. We should wait for them. This also means, that we can again reproduce a dead-lock, described in `8c517de24a`. Why we are OK with it: 1. Now this is not absolutely-dead dead-lock: the vm is unfrozen after reconnect delay. Actually `8c517de24a` fixed a bug in NBD logic, that was not described in `8c517de24a` and led to forever dead-lock. The problem was that nobody woke the free_sema queue, but drain_begin can't finish until there is a request in free_sema queue. Now we have a reconnect delay timer that works well. 2. It's not a problem of the NBD driver, but of the ide code, because it does drain_begin under the global mutex; the problem doesn't reproduce when using scsi instead of ide. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210902103805.25686-5-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: grammar and comment tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-09-29 13:46:33 -05:00
Vladimir Sementsov-Ogievskiy	f7ca4aadca	nbd/client-connection: nbd_co_establish_connection(): fix non set errp When we don't have a connection and blocking is false, we return NULL but don't set errp. That's wrong. We have two paths for calling nbd_co_establish_connection(): 1. nbd_open() -> nbd_do_establish_connection() -> ... but that will never set blocking=false 2. nbd_reconnect_attempt() -> nbd_co_do_establish_connection() -> ... but that uses errp=NULL So, we are safe with our wrong errp policy in nbd_co_establish_connection(). Still let's fix it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210906190654.183421-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-09-29 13:46:33 -05:00
Eric Blake	da24597dd3	nbd/server: Allow LIST_META_CONTEXT without STRUCTURED_REPLY The NBD protocol just relaxed the requirements on NBD_OPT_LIST_META_CONTEXT: https://github.com/NetworkBlockDevice/nbd/commit/13a4e33a87 Since listing is not stateful (unlike SET_META_CONTEXT), we don't care if a client asks for meta contexts without first requesting structured replies. Well-behaved clients will still ask for structured reply first (if for no other reason than for back-compat to older servers), but that's no reason to avoid this change. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20210907173505.1499709-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2021-09-29 13:46:32 -05:00
Richard Henderson	cd1675f8d7	nbd/server: Mark variable unused in nbd_negotiate_meta_queries From clang-13: nbd/server.c:976:22: error: variable 'bitmaps' set but not used \ [-Werror,-Wunused-but-set-variable] which is incorrect; see //bugs.llvm.org/show_bug.cgi?id=3888. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-07-26 07:06:25 -10:00
Vladimir Sementsov-Ogievskiy	97cf89259e	nbd/client-connection: add option for non-blocking connection attempt We'll need a possibility of non-blocking nbd_co_establish_connection(), so that it returns immediately, and it returns success only if a connections was previously established in background. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-30-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy	43cb34dede	nbd/client-connection: return only one io channel block/nbd doesn't need underlying sioc channel anymore. So, we can update nbd/client-connection interface to return only one top-most io channel, which is more straight forward. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-27-vsementsov@virtuozzo.com> [eblake: squash in Vladimir's fixes for uninit usage caught by clang] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 12:20:53 -05:00
Vladimir Sementsov-Ogievskiy	f58b2dfe3e	nbd/client-connection: shutdown connection on release Now, when a thread can do negotiation and retry, it may run relatively long. We need a mechanism to stop it, when the user is not interested in a result any more. So, on nbd_client_connection_release() let's shutdown the socket, and do not retry connection if thread is detached. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-22-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	e0e67cbe58	nbd/client-connection: implement connection retry Add an option for a thread to retry connecting until it succeeds. We'll use nbd/client-connection both for reconnect and for initial connection in nbd_open(), so we need a possibility to use same NBDClientConnection instance to connect once in nbd_open() and then use retry semantics for reconnect. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-21-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: grammar tweak] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	130d49baa5	nbd/client-connection: add possibility of negotiation Add arguments and logic to support nbd negotiation in the same thread after successful connection. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-20-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	e70da5ff64	nbd/client-connection: use QEMU_LOCK_GUARD We don't update connect_thread_func() to use QEMU_LOCK_GUARD, as it will get more complex critical sections logic in further commit, where QEMU_LOCK_GUARD doesn't help. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210610100802.5888-19-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy	5276c87c12	nbd: move connection code from block/nbd to nbd/client-connection We now have bs-independent connection API, which consists of four functions: nbd_client_connection_new() nbd_client_connection_release() nbd_co_establish_connection() nbd_co_establish_connection_cancel() Move them to a separate file together with NBDClientConnection structure which becomes private to the new API. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210610100802.5888-18-vsementsov@virtuozzo.com> [eblake: comment tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-06-18 10:59:53 -05:00
Peter Maydell	8e6dad2028	Block layer patches - NBD server: Fix crashes related to switching between AioContexts - file-posix: Workaround for discard/write_zeroes on buggy filesystems - Follow-up fixes for the reopen vs. permission changes - quorum: Fix error handling for flush - block-copy: Refactor copy_range handling - docs: Describe how to use 'null-co' block driver -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmC3iy8RHGt3b2xmQHJl ZGhhdC5jb20ACgkQfwmycsiPL9ZP0hAAuh07CFWLzHCcRC7PBzekSPfzRYYBLDSW EObJ1Ov4mvz8UZoP6BDJ5QVzLPhel6hXkxTd83B1D7t/Dq+yJYR0z8Kv3USpaVJ4 2U26SsoGQM8BmtVDL1Q8tQ5eDWQ4ykxNx6F2/lKBe1EH1lfaun04Xj1rNh7jpilo nNmKMDDI1UOkH0lKDR3tqfEV0XQE7o+ZKfPlIbvYMjXk9ZPKUHfjNPGVdCLQVnqH VJI01hF7eEx1ykSMdlC+TzNoVGG+mCBokGuW0JlUvOpX6FcGnAlxXQx8u53c1I8O lggZV8b2IbrNBUVwXQHLLrXxjOo+u54Ct9y/gXUTAj8qai+9jVRp60Y1pnCyeIeu DzFx10xwy04PGleb7AAZ4dT8du2+PTkuyQ9KmlvQ2U4IUcgW124CrDeO7XYr1aif hCTJPeEDrC9YNU6AQ8rLXrYUtkumSm2zUzU5nZ//i5WH41475/vsmgP5A+Jr457A Xu0yiI2Gqkr9CNsP9ZzMkNj03oIBhPFuGxiwibLQsy/6UVnaDYS0+rQ8FXYnF5+K iEpgXe3vKTWxM097kzJMBTDVRMXRa75NtK7KWXMDgVpHTbcv1t1otsn+6dfv+B55 ULJM1ETsyYS0T6BqNglvdytsraSt7JgSF+ZLHbYk3KVDshwnq/0ksgSqHNNA14ca kYTzHhMgo5w= =gSq/ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches - NBD server: Fix crashes related to switching between AioContexts - file-posix: Workaround for discard/write_zeroes on buggy filesystems - Follow-up fixes for the reopen vs. permission changes - quorum: Fix error handling for flush - block-copy: Refactor copy_range handling - docs: Describe how to use 'null-co' block driver # gpg: Signature made Wed 02 Jun 2021 14:44:15 BST # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: docs/secure-coding-practices: Describe how to use 'null-co' block driver block-copy: refactor copy_range handling block-copy: fix block_copy_task_entry() progress update nbd/server: Use drained block ops to quiesce the server block-backend: add drained_poll block: improve permission conflict error message block: simplify bdrv_child_user_desc() block/vvfat: inherit child_vvfat_qcow from child_of_bds block: improve bdrv_child_get_parent_desc() block-backend: improve blk_root_get_parent_desc() block: document child argument of bdrv_attach_child_common() block/file-posix: Try other fallbacks after invalid FALLOC_FL_ZERO_RANGE block/file-posix: Fix problem with fallocate(PUNCH_HOLE) on GPFS block: drop BlockBackendRootState::read_only block: drop BlockDriverState::read_only block: consistently use bdrv_is_read_only() block/vvfat: fix vvfat_child_perm crash block/vvfat: child_vvfat_qcow: add .get_parent_aio_context, fix crash qemu-io-cmds: assert that we don't have .perm requested in no-blk case block/quorum: Provide .bdrv_co_flush instead of .bdrv_co_flush_to_disk Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-06-02 19:34:03 +01:00
Sergio Lopez	fd6afc501a	nbd/server: Use drained block ops to quiesce the server Before switching between AioContexts we need to make sure that we're fully quiesced ("nb_requests == 0" for every client) when entering the drained section. To do this, we set "quiescing = true" for every client on ".drained_begin" to prevent new coroutines from being created, and check if "nb_requests == 0" on ".drained_poll". Finally, once we're exiting the drained section, on ".drained_end" we set "quiescing = false" and call "nbd_client_receive_next_request()" to resume the processing of new requests. With these changes, "blk_aio_attach()" and "blk_aio_detach()" can be reverted to be as simple as they were before `f148ae7d36`. RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1960137 Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20210602060552.17433-3-slp@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-06-02 14:23:20 +02:00
Stefano Garzarella	d0fb9657a3	docs: fix references to docs/devel/tracing.rst Commit `e50caf4a5c` ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-06-02 06:51:09 +02:00
Nir Soffer	0da9856851	nbd: server: Report holes for raw images When querying image extents for raw image, qemu-nbd reports holes as zero: $ qemu-nbd -t -r -f raw empty-6g.raw $ qemu-img map --output json nbd://localhost [{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": true, "offset": 0}] $ qemu-img map --output json empty-6g.raw [{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": false, "offset": 0}] Turns out that qemu-img map reports a hole based on BDRV_BLOCK_DATA, but nbd server reports a hole based on BDRV_BLOCK_ALLOCATED. The NBD protocol says: NBD_STATE_HOLE (bit 0): if set, the block represents a hole (and future writes to that area may cause fragmentation or encounter an NBD_ENOSPC error); if clear, the block is allocated or the server could not otherwise determine its status. qemu-img manual says: whether the sectors contain actual data or not (boolean field data; if false, the sectors are either unallocated or stored as optimized all-zero clusters); To me, data=false looks compatible with NBD_STATE_HOLE. From user point of view, getting same results from qemu-nbd and qemu-img is more important than being more correct about allocation status. Changing nbd server to report holes using BDRV_BLOCK_DATA makes qemu-nbd results compatible with qemu-img map: $ qemu-img map --output json nbd://localhost [{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": false, "offset": 0}] Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20210219160752.1826830-1-nsoffer@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-03-08 13:08:45 -06:00
Sergio Lopez	f148ae7d36	nbd/server: Quiesce coroutines on context switch When switching between AIO contexts we need to me make sure that both recv_coroutine and send_coroutine are not scheduled to run. Otherwise, QEMU may crash while attaching the new context with an error like this one: aio_co_schedule: Co-routine was already scheduled in 'aio_co_schedule' To achieve this we need a local implementation of 'qio_channel_readv_all_eof' named 'nbd_read_eof' (a trick already done by 'nbd/client.c') that allows us to interrupt the operation and to know when recv_coroutine is yielding. With this in place, we delegate detaching the AIO context to the owning context with a BH ('nbd_aio_detach_bh') scheduled using 'aio_wait_bh_oneshot'. This BH signals that we need to quiesce the channel by setting 'client->quiescing' to 'true', and either waits for the coroutine to finish using AIO_WAIT_WHILE or, if it's yielding in 'nbd_read_eof', actively enters the coroutine to interrupt it. RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900326 Signed-off-by: Sergio Lopez <slp@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20201214170519.223781-4-slp@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-01-20 14:48:54 -06:00
Peter Maydell	6b728efcb0	* Fixes for compiling on Haiku, and add Haiku VM for compile-testing * Update NetBSD VM to version 9.1 * Misc fixes (e.g. categorize some devices) -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAl+zld8RHHRodXRoQHJl ZGhhdC5jb20ACgkQLtnXdP5wLbWb8Q/+IRvGUGjGcPfbTvwoOqJVy4Hm7huW5i1s wHe/6nitNtpvaAqcxbQHBIvWX9xTzppWcFiEkIs8rPwLOUFKh5xJ+NbEdf4acQaJ m4G2mEY5bYt/o5e6p7ZK1RgS2EjD1eQ6BwMWQKeUHET7MTv0UabKtvWmBWpMqFxA vl/3SbVWsSwGB9gOA5oksYhKY5ZRcVaDxsGk89f7iwgaStcxWNxVFEXddbBmqhfW Q4ZPt0K7yod7NDBOaGEoc2hOjIfr0TvovHojDuAxt+2tKdYi1vwtnwKbFqTWp7Ca 7ttzoQUSsteiOjAhHRpa2PEbfrNs+loIm9fem5fQ9i7POlbS/Ozv2RnPCZm1X8pW n7Jvsh25V066AFnHat7PnjcBVBRFfmR3xtA61PqvAMGEKW8tortbZbpqXO18Pv5p 6P/GG9G3QE0v2rEsU5BNFWp/aD7fiWy/VPu3dGFUkI9/S3biatocldHn/9eyXz94 k75Xzhe5x6n5Jf8QYFQ/6BO0qSoidNbAVg1W8+QyRXIJJhWRnvW9eYa7tSx5ezJg 5+oCo4oh6Qd9nvrl5pIwvX6QMDf2kPxzp7PsHeemqt7+QNmXErAVsIi1HUVsLWRP Qb/BbKyKNeWJwvWWLAm/2kXVmNQfjLVNCwg04xa8tkQemhIDekVrCpMoX3cNHjWf EWa1vEtbq9k= =A3/B -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2020-11-17' into staging * Fixes for compiling on Haiku, and add Haiku VM for compile-testing * Update NetBSD VM to version 9.1 * Misc fixes (e.g. categorize some devices) # gpg: Signature made Tue 17 Nov 2020 09:20:31 GMT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * remotes/huth-gitlab/tags/pull-request-2020-11-17: max111x: put it into the 'misc' category nand: put it into the 'storage' category ads7846: put it into the 'input' category ssd0323: put it into the 'display' category gitlab-ci: Use $CI_REGISTRY instead of hard-coding registry.gitlab.com target/microblaze: Fix possible array out of bounds in mmu_write() tests/vm: update NetBSD to 9.1 tests/vm: Add Haiku test based on their vagrant images configure: Add a proper check for sys/ioccom.h and use it in tpm_ioctl.h configure: Do not build pc-bios/optionrom on Haiku configure: Fix the _BSD_SOURCE define for the Haiku build qemu/bswap: Remove unused qemu_bswap_len() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-11-17 14:12:21 +00:00
Thomas Huth	ded5d78c1e	configure: Add a proper check for sys/ioccom.h and use it in tpm_ioctl.h On Solaris and Haiku, the _IO() macros are defined in <sys/ioccom.h>. Add a proper check for this header to our build system, and make sure to include the header in tpm_ioctl.h to fix a build failure on Solaris and Haiku. Message-Id: <20201115152317.42752-1-thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2020-11-17 09:45:24 +01:00
Eric Blake	c0b21f2e22	nbd: Silence Coverity false positive Coverity noticed (CID 1436125) that we check the return value of nbd_extent_array_add in most places, but not at the end of bitmap_to_extents(). The return value exists to break loops before a future iteration, so there is nothing to check if we are already done iterating. Adding a cast to void, plus a comment why, pacifies Coverity. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201111163510.713855-1-eblake@redhat.com> [eblake: Prefer cast to void over odd && usage] Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2020-11-16 14:51:12 -06:00
Eric Blake	dbc7b01492	nbd: Add 'qemu-nbd -A' to expose allocation depth Allow the server to expose an additional metacontext to be requested by savvy clients. qemu-nbd adds a new option -A to expose the qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this can also be set via QMP when using block-export-add. qemu as client is hacked into viewing the key aspects of this new context by abusing the already-experimental x-dirty-bitmap option to collapse all depths greater than 2, which results in a tri-state value visible in the output of 'qemu-img map --output=json' (yes, that means x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like renaming it as it would introduce a needless break of back-compat, even though we make no compat guarantees with x- members): unallocated (depth 0) => "zero":false, "data":true local (depth 1) => "zero":false, "data":false backing (depth 2+) => "zero":true, "data":true libnbd as client is probably a nicer way to get at the information without having to decipher such hacks in qemu as client. ;) Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-11-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-10-30 15:22:00 -05:00
Eric Blake	71719cd57f	nbd: Add new qemu:allocation-depth metadata context 'qemu-img map' provides a way to determine which extents of an image come from the top layer vs. inherited from a backing chain. This is useful information worth exposing over NBD. There is a proposal to add a QMP command block-dirty-bitmap-populate which can create a dirty bitmap that reflects allocation information, at which point the qemu:dirty-bitmap:NAME metadata context can expose that information via the creation of a temporary bitmap, but we can shorten the effort by adding a new qemu:allocation-depth metadata context that does the same thing without an intermediate bitmap (this patch does not eliminate the need for that proposal, as it will have other uses as well). While documenting things, remember that although the NBD protocol has NBD_OPT_SET_META_CONTEXT, the rest of its documentation refers to 'metadata context', which is a more apt description of what is actually being used by NBD_CMD_BLOCK_STATUS: the user is requesting metadata by passing one or more context names. So I also touched up some existing wording to prefer the term 'metadata context' where it makes sense. Note that this patch does not actually enable any way to request a server to enable this context; that will come in the next patch. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-10-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-10-30 15:22:00 -05:00
Eric Blake	3b1f244c59	nbd: Allow export of multiple bitmaps for one device With this, 'qemu-nbd -B b0 -B b1 -f qcow2 img.qcow2' can let you sniff out multiple bitmaps from one server. qemu-img as client can still only read one bitmap per client connection, but other NBD clients (hello libnbd) can now read multiple bitmaps in a single pass. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201027050556.269064-8-eblake@redhat.com>	2020-10-30 15:10:15 -05:00
Eric Blake	47ec485e8d	nbd: Refactor counting of metadata contexts Rather than open-code the count of negotiated contexts at several sites, embed it directly into the struct. This will make it easier for upcoming commits to support even more simultaneous contexts. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-7-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-10-30 15:10:15 -05:00
Eric Blake	02e87e3b1c	nbd: Simplify qemu bitmap context name Each dirty bitmap already knows its name; by reducing the scope of the places where we construct "qemu:dirty-bitmap:NAME" strings, tracking the name is more localized, and there are fewer per-export fields to worry about. This in turn will make it easier for an upcoming patch to export more than one bitmap at once. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201027050556.269064-6-eblake@redhat.com>	2020-10-30 15:10:15 -05:00
Eric Blake	cbad81cef8	nbd: Update qapi to support exporting multiple bitmaps Since 'block-export-add' is new to 5.2, we can still tweak the interface; there, allowing 'bitmaps':['str'] is nicer than 'bitmap':'str'. This wires up the qapi and qemu-nbd changes to permit passing multiple bitmaps as distinct metadata contexts that the NBD client may request, but the actual support for more than one will require a further patch to the server. Note that there are no changes made to the existing deprecated 'nbd-server-add' command; this required splitting the QAPI type BlockExportOptionsNbd, which fortunately does not affect QMP introspection. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-5-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-10-30 15:10:15 -05:00
Stefan Hajnoczi	f51d23c80a	block/export: add iothread and fixed-iothread options Make it possible to specify the iothread where the export will run. By default the block node can be moved to other AioContexts later and the export will follow. The fixed-iothread option forces strict behavior that prevents changing AioContext while the export is active. See the QAPI docs for details. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200929125516.186715-5-stefanha@redhat.com [Fix stray '#' character in block-export.json and add missing "(since: 5.2)" as suggested by Eric Blake. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-10-23 13:42:16 +01:00
Stefan Hajnoczi	cbc20bfb8f	block: move block exports to libblockdev Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd. They are not used by other programs and are not otherwise needed in libblock. Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss. Since bdrv_close_all() (libblock) calls blk_exp_close_all() (libblockdev) a stub function is required.. Make qemu-nbd.c use signal handling utility functions instead of duplicating the code. This helps because os-posix.c is in libblockdev and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks. Once we use the signal handling utility functions we also end up providing the necessary symbol. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20200929125516.186715-4-stefanha@redhat.com [Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-10-23 13:42:16 +01:00
Eric Blake	ebd57062a1	nbd: Simplify meta-context parsing We had a premature optimization of trying to read as little from the wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases. But in reality, we HAVE to read the entire string from the client before we can get to the next command, and it is easier to just read it all at once than it is to read it in pieces. And once we do that, several functions end up no longer performing I/O, so they can drop length and errp parameters, and just return a bool instead of modifying through a pointer. Our iotests still pass; I also checked that libnbd's testsuite (which covers more corner cases of odd meta context requests) still passes. There are cases where the sequence of trace messages produced differs (for example, when no bitmap is exported, a query for "qemu:" now produces two trace lines instead of one), but trace points are for debug and have no effect on what the client sees. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200930121105.667049-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: enhance commit message] Signed-off-by: Eric Blake <eblake@redhat.com>	2020-10-09 15:05:08 -05:00
Eric Blake	d1e2c3e7bd	nbd/server: Reject embedded NUL in NBD strings The NBD spec is clear that any string sent from the client must not contain embedded NUL characters. If the client passes "a\0", we should reject that option request rather than act on "a". Testing this is not possible with a compliant client, but I was able to use gdb to coerce libnbd into temporarily behaving as such a client. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200930121105.667049-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-10-09 15:05:08 -05:00

1 2 3 4 5 ...

342 Commits