Commit Graph

339 Commits

Author SHA1 Message Date
Markus Armbruster
8461b4d601 nbd/server: Clean up abuse of BlockExportOptionsNbd member @arg
block-export-add argument @name defaults to the value of argument
@node-name.

nbd_export_create() implements this by copying @node_name to @name.
It leaves @has_node_name false, violating the "has_node_name ==
!!node_name" invariant.  Unclean.  Falls apart when we elide
@has_node_name (next commit): then QAPI frees the same value twice,
once for @node_name and once @name.  iotest 307 duly explodes.

Goes back to commit c62d24e906 "blockdev-nbd: Boxed argument type for
nbd-server-add" (v5.0.0).  Got moved from qmp_nbd_server_add() to
nbd_export_create() (commit 56ee86261e), then copied back (commit
b6076afcab).  Commit 8675cbd68b "nbd: Utilize QAPI_CLONE for type
conversion" (v5.2.0) cleaned up the copy in qmp_nbd_server_add()
noting

    Second, our assignment to arg->name is fishy: the generated QAPI code
    for qapi_free_NbdServerAddOptions does not visit arg->name if
    arg->has_name is false, but if it DID visit it, we would have
    introduced a double-free situation when arg is finally freed.

Exactly.  However, the copy in nbd_export_create() remained dirty.

Clean it up.  Since the value stored in member @name is not actually
used outside this function, use a local variable instead of modifying
the QAPI object.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: qemu-block@nongnu.org
Message-Id: <20221104160712.3005652-10-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2022-12-13 18:31:37 +01:00
Alberto Faria
a9262f551e block: Change blk_{pread,pwrite}() param order
Swap 'buf' and 'bytes' around for consistency with
blk_co_{pread,pwrite}(), and in preparation to implement these functions
using generated_co_wrapper.

Callers were updated using this Coccinelle script:

    @@ expression blk, offset, buf, bytes, flags; @@
    - blk_pread(blk, offset, buf, bytes, flags)
    + blk_pread(blk, offset, bytes, buf, flags)

    @@ expression blk, offset, buf, bytes, flags; @@
    - blk_pwrite(blk, offset, buf, bytes, flags)
    + blk_pwrite(blk, offset, bytes, buf, flags)

It had no effect on hw/block/nand.c, presumably due to the #if, so that
file was updated manually.

Overly-long lines were then fixed by hand.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-4-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:56 +02:00
Alberto Faria
3b35d4542c block: Add a 'flags' param to blk_pread()
For consistency with other I/O functions, and in preparation to
implement it using generated_co_wrapper.

Callers were updated using this Coccinelle script:

    @@ expression blk, offset, buf, bytes; @@
    - blk_pread(blk, offset, buf, bytes)
    + blk_pread(blk, offset, buf, bytes, 0)

It had no effect on hw/block/nand.c, presumably due to the #if, so that
file was updated manually.

Overly-long lines were then fixed by hand.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-3-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:56 +02:00
Denis V. Lunev
8bb100c9e2 nbd: trace long NBD operations
At the moment there are 2 sources of lengthy operations if configured:
* open connection, which could retry inside and
* reconnect of already opened connection
These operations could be quite lengthy and cumbersome to catch thus
it would be quite natural to add trace points for them.

This patch is based on the original downstream work made by Vladimir.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
2022-06-29 10:57:02 +03:00
Eric Blake
58a6fdcc9e nbd/server: Allow MULTI_CONN for shared writable exports
According to the NBD spec, a server that advertises
NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
not see any cache inconsistencies: when properly separated by a single
flush, actions performed by one client will be visible to another
client, regardless of which client did the flush.

We always satisfy these conditions in qemu - even when we support
multiple clients, ALL clients go through a single point of reference
into the block layer, with no local caching.  The effect of one client
is instantly visible to the next client.  Even if our backend were a
network device, we argue that any multi-path caching effects that
would cause inconsistencies in back-to-back actions not seeing the
effect of previous actions would be a bug in that backend, and not the
fault of caching in qemu.  As such, it is safe to unconditionally
advertise CAN_MULTI_CONN for any qemu NBD server situation that
supports parallel clients.

Note, however, that we don't want to advertise CAN_MULTI_CONN when we
know that a second client cannot connect (for historical reasons,
qemu-nbd defaults to a single connection while nbd-server-add and QMP
commands default to unlimited connections; but we already have
existing means to let either style of NBD server creation alter those
defaults).  This is visible by no longer advertising MULTI_CONN for
'qemu-nbd -r' without -e, as in the iotest nbd-qemu-allocation.

The harder part of this patch is setting up an iotest to demonstrate
behavior of multiple NBD clients to a single server.  It might be
possible with parallel qemu-io processes, but I found it easier to do
in python with the help of libnbd, and help from Nir and Vladimir in
writing the test.

Signed-off-by: Eric Blake <eblake@redhat.com>
Suggested-by: Nir Soffer <nsoffer@redhat.com>
Suggested-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
Message-Id: <20220512004924.417153-3-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-05-12 13:10:52 +02:00
Vladimir Sementsov-Ogievskiy
e5fb29d5d0 qapi: nbd-export: allow select bitmaps by node/name pair
Hi all! Current logic of relying on search through backing chain is not
safe neither convenient.

Sometimes it leads to necessity of extra bitmap copying. Also, we are
going to add "snapshot-access" driver, to access some snapshot state
through NBD. And this driver is not formally a filter, and of course
it's not a COW format driver. So, searching through backing chain will
not work. Instead of widening the workaround of bitmap searching, let's
extend the interface so that user can select bitmap precisely.

Note, that checking for bitmap active status is not copied to the new
API, I don't see a reason for it, user should understand the risks. And
anyway, bitmap from other node is unrelated to this export being
read-only or read-write.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
Message-Id: <20220314213226.362217-3-v.sementsov-og@mail.ru>
[eblake: Adjust S-o-b to Vladimir's new email, with permission]
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2022-04-26 13:15:19 -05:00
Marc-André Lureau
e0e7fe07e1 Remove trailing ; after G_DEFINE_AUTO macro
The macro doesn't need it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-22 14:46:18 +04:00
Marc-André Lureau
9edc6313da Replace GCC_FMT_ATTR with G_GNUC_PRINTF
One less qemu-specific macro. It also helps to make some headers/units
only depend on glib, and thus moved in standalone projects eventually.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2022-03-22 14:40:51 +04:00
Peter Maydell
fdee2c9692 nbd patches for 2022-03-07
- Dan Berrange: Allow qemu-nbd to support TLS over Unix sockets
 - Eric Blake: Minor cleanups related to 64-bit block operations
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAmImtE8ACgkQp6FrSiUn
 Q2ovmgf/aksDqf2eNcahs++fez+8Qi9ll5OY/qGyjnzBgsatYKjrK+xF7OnjoJox
 eRX026lh81Q4EQK7oZBUnr2UCY4bncDBTI7MTLh603EV/tId5ZLwx007ERhzvtC1
 mIsQHXNuO9X25LQG2eWnfunY9YztQpiT5r/g3khD2yPBqJWIvBfblzPLx6FkF7px
 /WM8xEKCihmGr1Wr3b+zGYL083YkaBWCvHoR8mJt3tEFUj+Qie8XcdV0OVyI0XUj
 5goIFRcpVwBE8P2nLtfUKNzEXz22cmdonOJUX7E5IvGO21k5F/HrWlHdo8JnuSUZ
 t0w5L9yCxBrRpY1burz30b77J0WMCw==
 =C8Dd
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2022-03-07' into staging

nbd patches for 2022-03-07

- Dan Berrange: Allow qemu-nbd to support TLS over Unix sockets
- Eric Blake: Minor cleanups related to 64-bit block operations

# gpg: Signature made Tue 08 Mar 2022 01:41:35 GMT
# gpg:                using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg:                 aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2022-03-07:
  qemu-io: Allow larger write zeroes under no fallback
  qemu-io: Utilize 64-bit status during map
  nbd/server: Minor cleanups
  tests/qemu-iotests: validate NBD TLS with UNIX sockets and PSK
  tests/qemu-iotests: validate NBD TLS with UNIX sockets
  tests/qemu-iotests: validate NBD TLS with hostname mismatch
  tests/qemu-iotests: convert NBD TLS test to use standard filters
  tests/qemu-iotests: introduce filter for qemu-nbd export list
  tests/qemu-iotests: expand _filter_nbd rules
  tests/qemu-iotests: add QEMU_IOTESTS_REGEN=1 to update reference file
  block/nbd: don't restrict TLS usage to IP sockets
  qemu-nbd: add --tls-hostname option for TLS certificate validation
  block/nbd: support override of hostname for TLS certificate validation
  block: pass desired TLS hostname through from block driver client
  crypto: mandate a hostname when checking x509 creds on a client

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-03-09 11:38:29 +00:00
Eric Blake
314b902621 nbd/server: Minor cleanups
Spelling fixes, grammar improvements and consistent spacing, noticed
while preparing other patches in this file.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20211203231539.3900865-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2022-03-07 19:23:25 -06:00
Daniel P. Berrangé
046f98d075 block: pass desired TLS hostname through from block driver client
In

  commit a71d597b98
  Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Date:   Thu Jun 10 13:08:00 2021 +0300

    block/nbd: reuse nbd_co_do_establish_connection() in nbd_open()

the use of the 'hostname' field from the BDRVNBDState struct was
lost, and 'nbd_connect' just hardcoded it to match the IP socket
address. This was a harmless bug at the time since we block use
with anything other than IP sockets.

Shortly though, we want to allow the caller to override the hostname
used in the TLS certificate checks. This is to allow for TLS
when doing port forwarding or tunneling. Thus we need to reinstate
the passing along of the 'hostname'.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220304193610.3293146-3-berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2022-03-07 15:58:42 -06:00
Peter Maydell
5df022cf2e osdep: Move memalign-related functions to their own header
Move the various memalign-related functions out of osdep.h and into
their own header, which we include only where they are used.
While we're doing this, add some brief documentation comments.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
2022-03-07 13:16:49 +00:00
Nir Soffer
523f5a9971 nbd/server.c: Remove unused field
NBDRequestData struct has unused QSIMPLEQ_ENTRY field. It seems that
this field exists since the first git commit and was never used.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20220111194313.581486-1-nsoffer@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Fixes: d9a73806 ("qemu-nbd: introduce NBDRequest", v1.1)
Signed-off-by: Eric Blake <eblake@redhat.com>
2022-01-28 16:48:28 -06:00
Vladimir Sementsov-Ogievskiy
9e14491af4 nbd/client-connection: improve error message of cancelled attempt
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2021-12-21 14:52:19 +01:00
Vladimir Sementsov-Ogievskiy
169b9a94ed nbd/client-connection: nbd_co_establish_connection(): return real error
The only caller of nbd_do_establish_connection() that uses errp is
nbd_open(). The only way to cancel this call is through open_timer
timeout. And for this case, user will be more interested in description
of last failed connect rather than in
"Connection attempt cancelled by other operation".

So, let's change behavior on cancel to return previous failure error if
available.

Do the same for non-blocking failure case. In this case we still don't
have a caller that is interested in errp. But let's be consistent.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2021-12-21 14:52:10 +01:00
Eric Blake
e35574226a nbd/server: Simplify zero and trim
Now that the block layer supports 64-bit operations (see commit
2800637a and friends, new to v6.2), we no longer have to self-fragment
requests larger than 2G, reverting the workaround added in 890cbccb08
("nbd: Fix large trim/zero requests", v5.1.0).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20211117170230.1128262-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-11-22 07:37:15 -06:00
Eric Blake
1644cccea5 nbd/server: Don't complain on certain client disconnects
When a client disconnects abruptly, but did not have any pending
requests (for example, when using nbdsh without calling h.shutdown),
we used to output the following message:

$ qemu-nbd -f raw file
$ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)'
qemu-nbd: Disconnect client, due to: Failed to read request: Unexpected end-of-file before all bytes were read

Then in commit f148ae7, we refactored nbd_receive_request() to use
nbd_read_eof(); when this returns 0, we regressed into tracing
uninitialized memory (if tracing is enabled) and reporting a
less-specific:

qemu-nbd: Disconnect client, due to: Request handling failed in intermediate state

Note that with Unix sockets, we have yet another error message,
unchanged by the 6.0 regression:

$ qemu-nbd -k /tmp/sock -f raw file
$ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.trim(1,0)'
qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to socket: Broken pipe

But in all cases, the error message goes away if the client performs a
soft shutdown by using NBD_CMD_DISC, rather than a hard shutdown by
abrupt disconnect:

$ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)' -c 'h.shutdown()'

This patch fixes things to avoid uninitialized memory, and in general
avoids warning about a client that does a hard shutdown when not in
the middle of a packet.  A client that aborts mid-request, or which
does not read the full server's reply, can still result in warnings,
but those are indeed much more unusual situations.

CC: qemu-stable@nongnu.org
Fixes: f148ae7d36 ("nbd/server: Quiesce coroutines on context switch", v6.0.0)
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: defer unrelated typo fixes to later patch]
Message-Id: <20211117170230.1128262-2-eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-11-22 07:37:14 -06:00
Eric Blake
76df2b8d69 nbd/server: Silence clang sanitizer warning
clang's sanitizer is picky: memset(NULL, x, 0) is technically
undefined behavior, even though no sane implementation of memset()
deferences the NULL.  Caught by the nbd-qemu-allocation iotest.

The alternative to checking before each memset is to instead force an
allocation of 1 element instead of g_new0(type, 0)'s behavior of
returning NULL for a 0-length array.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: 3b1f244c59 (nbd: Allow export of multiple bitmaps for one device)
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20211115223943.626416-1-eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-11-16 08:45:33 -06:00
Vladimir Sementsov-Ogievskiy
4ddb5d2fde block/nbd: drop connection_co
OK, that's a big rewrite of the logic.

Pre-patch we have an always running coroutine - connection_co. It does
reply receiving and reconnecting. And it leads to a lot of difficult
and unobvious code around drained sections and context switch. We also
abuse bs->in_flight counter which is increased for connection_co and
temporary decreased in points where we want to allow drained section to
begin. One of these place is in another file: in nbd_read_eof() in
nbd/client.c.

We also cancel reconnect and requests waiting for reconnect on drained
begin which is not correct. And this patch fixes that.

Let's finally drop this always running coroutine and go another way:
do both reconnect and receiving in request coroutines.

The detailed list of changes below (in the sequence of diff hunks).

1. receiving coroutines are woken directly from nbd_channel_error, when
   we change s->state

2. nbd_co_establish_connection_cancel(): we don't have drain_begin now,
   and in nbd_teardown_connection() all requests should already be
   finished (and reconnect is done from request). So
   nbd_co_establish_connection_cancel() is called from
   nbd_cancel_in_flight() (to cancel the request that is doing
   nbd_co_establish_connection()) and from reconnect_delay_timer_cb()
   (previously we didn't need it, as reconnect delay only should cancel
   active requests not the reconnection itself). But now reconnection
   itself is done in the separate thread (we now call
   nbd_client_connection_enable_retry() in nbd_open()), and we need to
   cancel the requests that wait in nbd_co_establish_connection()
   now).

2A. We do receive headers in request coroutine. But we also should
   dispatch replies for other pending requests. So,
   nbd_connection_entry() is turned into nbd_receive_replies(), which
   does reply dispatching while it receives other request headers, and
   returns when it receives the requested header.

3. All old staff around drained sections and context switch is dropped.
   In details:
   - we don't need to move connection_co to new aio context, as we
     don't have connection_co anymore
   - we don't have a fake "request" of connection_co (extra increasing
     in_flight), so don't care with it in drain_begin/end
   - we don't stop reconnection during drained section anymore. This
     means that drain_begin may wait for a long time (up to
     reconnect_delay). But that's an improvement and more correct
     behavior see below[*]

4. In nbd_teardown_connection() we don't have to wait for
   connection_co, as it is dropped. And cleanup for s->ioc and nbd_yank
   is moved here from removed connection_co.

5. In nbd_co_do_establish_connection() we now should handle
   NBD_CLIENT_CONNECTING_NOWAIT: if new request comes when we are in
   NBD_CLIENT_CONNECTING_NOWAIT, it still should call
   nbd_co_establish_connection() (who knows, maybe the connection was
   already established by another thread in the background). But we
   shouldn't wait: if nbd_co_establish_connection() can't return new
   channel immediately the request should fail (we are in
   NBD_CLIENT_CONNECTING_NOWAIT state).

6. nbd_reconnect_attempt() is simplified: it's now easier to wait for
   other requests in the caller, so here we just assert that fact.
   Also delay time is now initialized here: we can easily detect first
   attempt and start a timer.

7. nbd_co_reconnect_loop() is dropped, we don't need it. Reconnect
   retries are fully handle by thread (nbd/client-connection.c), delay
   timer we initialize in nbd_reconnect_attempt(), we don't have to
   bother with s->drained and friends. nbd_reconnect_attempt() now
   called from nbd_co_send_request().

8. nbd_connection_entry is dropped: reconnect is now handled by
   nbd_co_send_request(), receiving reply is now handled by
   nbd_receive_replies(): all handled from request coroutines.

9. So, welcome new nbd_receive_replies() called from request coroutine,
   that receives reply header instead of nbd_connection_entry().
   Like with sending requests, only one coroutine may receive in a
   moment. So we introduce receive_mutex, which is locked around
   nbd_receive_reply(). It also protects some related fields. Still,
   full audit of thread-safety in nbd driver is a separate task.
   New function waits for a reply with specified handle being received
   and works rather simple:

   Under mutex:
     - if current handle is 0, do receive by hand. If another handle
       received - switch to other request coroutine, release mutex and
       yield. Otherwise return success
     - if current handle == requested handle, we are done
     - otherwise, release mutex and yield

10: in nbd_co_send_request() we now do nbd_reconnect_attempt() if
    needed. Also waiting in free_sema queue we now wait for one of two
    conditions:
    - connectED, in_flight < MAX_NBD_REQUESTS (so we can start new one)
    - connectING, in_flight == 0, so we can call
      nbd_reconnect_attempt()
    And this logic is protected by s->send_mutex

    Also, on failure we don't have to care of removed s->connection_co

11. nbd_co_do_receive_one_chunk(): now instead of yield() and wait for
    s->connection_co we just call new nbd_receive_replies().

12. nbd_co_receive_one_chunk(): place where s->reply.handle becomes 0,
    which means that handling of the whole reply is finished. Here we
    need to wake one of coroutines sleeping in nbd_receive_replies().
    If none are sleeping - do nothing. That's another behavior change: we
    don't have endless recv() in the idle time. It may be considered as
    a drawback. If so, it may be fixed later.

13. nbd_reply_chunk_iter_receive(): don't care about removed
    connection_co, just ping in_flight waiters.

14. Don't create connection_co, enable retry in the connection thread
    (we don't have own reconnect loop anymore)

15. We now need to add a nbd_co_establish_connection_cancel() call in
    nbd_cancel_in_flight(), to cancel the request that is doing a
    connection attempt.

[*], ok, now we don't cancel reconnect on drain begin. That's correct:
    reconnect feature leads to possibility of long-running requests (up
    to reconnect delay). Still, drain begin is not a reason to kill
    long requests. We should wait for them.

    This also means, that we can again reproduce a dead-lock, described
    in 8c517de24a.
    Why we are OK with it:
    1. Now this is not absolutely-dead dead-lock: the vm is unfrozen
       after reconnect delay. Actually 8c517de24a fixed a bug in
       NBD logic, that was not described in 8c517de24a and led to
       forever dead-lock. The problem was that nobody woke the free_sema
       queue, but drain_begin can't finish until there is a request in
       free_sema queue. Now we have a reconnect delay timer that works
       well.
    2. It's not a problem of the NBD driver, but of the ide code,
       because it does drain_begin under the global mutex; the problem
       doesn't reproduce when using scsi instead of ide.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210902103805.25686-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar and comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-29 13:46:33 -05:00
Vladimir Sementsov-Ogievskiy
f7ca4aadca nbd/client-connection: nbd_co_establish_connection(): fix non set errp
When we don't have a connection and blocking is false, we return NULL
but don't set errp. That's wrong.

We have two paths for calling nbd_co_establish_connection():

1. nbd_open() -> nbd_do_establish_connection() -> ...
  but that will never set blocking=false

2. nbd_reconnect_attempt() -> nbd_co_do_establish_connection() -> ...
  but that uses errp=NULL

So, we are safe with our wrong errp policy in
nbd_co_establish_connection(). Still let's fix it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210906190654.183421-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-29 13:46:33 -05:00
Eric Blake
da24597dd3 nbd/server: Allow LIST_META_CONTEXT without STRUCTURED_REPLY
The NBD protocol just relaxed the requirements on
NBD_OPT_LIST_META_CONTEXT:

https://github.com/NetworkBlockDevice/nbd/commit/13a4e33a87

Since listing is not stateful (unlike SET_META_CONTEXT), we don't care
if a client asks for meta contexts without first requesting structured
replies.  Well-behaved clients will still ask for structured reply
first (if for no other reason than for back-compat to older servers),
but that's no reason to avoid this change.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210907173505.1499709-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-09-29 13:46:32 -05:00
Richard Henderson
cd1675f8d7 nbd/server: Mark variable unused in nbd_negotiate_meta_queries
From clang-13:
nbd/server.c:976:22: error: variable 'bitmaps' set but not used \
    [-Werror,-Wunused-but-set-variable]

which is incorrect; see //bugs.llvm.org/show_bug.cgi?id=3888.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-07-26 07:06:25 -10:00
Vladimir Sementsov-Ogievskiy
97cf89259e nbd/client-connection: add option for non-blocking connection attempt
We'll need a possibility of non-blocking nbd_co_establish_connection(),
so that it returns immediately, and it returns success only if a
connections was previously established in background.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-30-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 12:21:22 -05:00
Vladimir Sementsov-Ogievskiy
43cb34dede nbd/client-connection: return only one io channel
block/nbd doesn't need underlying sioc channel anymore. So, we can
update nbd/client-connection interface to return only one top-most io
channel, which is more straight forward.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-27-vsementsov@virtuozzo.com>
[eblake: squash in Vladimir's fixes for uninit usage caught by clang]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 12:20:53 -05:00
Vladimir Sementsov-Ogievskiy
f58b2dfe3e nbd/client-connection: shutdown connection on release
Now, when a thread can do negotiation and retry, it may run relatively
long. We need a mechanism to stop it, when the user is not interested
in a result any more. So, on nbd_client_connection_release() let's
shutdown the socket, and do not retry connection if thread is detached.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-22-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
e0e67cbe58 nbd/client-connection: implement connection retry
Add an option for a thread to retry connecting until it succeeds. We'll
use nbd/client-connection both for reconnect and for initial connection
in nbd_open(), so we need a possibility to use same NBDClientConnection
instance to connect once in nbd_open() and then use retry semantics for
reconnect.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-21-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
130d49baa5 nbd/client-connection: add possibility of negotiation
Add arguments and logic to support nbd negotiation in the same thread
after successful connection.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-20-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
e70da5ff64 nbd/client-connection: use QEMU_LOCK_GUARD
We don't update connect_thread_func() to use QEMU_LOCK_GUARD, as it
will get more complex critical sections logic in further commit, where
QEMU_LOCK_GUARD doesn't help.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-19-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Vladimir Sementsov-Ogievskiy
5276c87c12 nbd: move connection code from block/nbd to nbd/client-connection
We now have bs-independent connection API, which consists of four
functions:

  nbd_client_connection_new()
  nbd_client_connection_release()
  nbd_co_establish_connection()
  nbd_co_establish_connection_cancel()

Move them to a separate file together with NBDClientConnection
structure which becomes private to the new API.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-18-vsementsov@virtuozzo.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-06-18 10:59:53 -05:00
Peter Maydell
8e6dad2028 Block layer patches
- NBD server: Fix crashes related to switching between AioContexts
 - file-posix: Workaround for discard/write_zeroes on buggy filesystems
 - Follow-up fixes for the reopen vs. permission changes
 - quorum: Fix error handling for flush
 - block-copy: Refactor copy_range handling
 - docs: Describe how to use 'null-co' block driver
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmC3iy8RHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9ZP0hAAuh07CFWLzHCcRC7PBzekSPfzRYYBLDSW
 EObJ1Ov4mvz8UZoP6BDJ5QVzLPhel6hXkxTd83B1D7t/Dq+yJYR0z8Kv3USpaVJ4
 2U26SsoGQM8BmtVDL1Q8tQ5eDWQ4ykxNx6F2/lKBe1EH1lfaun04Xj1rNh7jpilo
 nNmKMDDI1UOkH0lKDR3tqfEV0XQE7o+ZKfPlIbvYMjXk9ZPKUHfjNPGVdCLQVnqH
 VJI01hF7eEx1ykSMdlC+TzNoVGG+mCBokGuW0JlUvOpX6FcGnAlxXQx8u53c1I8O
 lggZV8b2IbrNBUVwXQHLLrXxjOo+u54Ct9y/gXUTAj8qai+9jVRp60Y1pnCyeIeu
 DzFx10xwy04PGleb7AAZ4dT8du2+PTkuyQ9KmlvQ2U4IUcgW124CrDeO7XYr1aif
 hCTJPeEDrC9YNU6AQ8rLXrYUtkumSm2zUzU5nZ//i5WH41475/vsmgP5A+Jr457A
 Xu0yiI2Gqkr9CNsP9ZzMkNj03oIBhPFuGxiwibLQsy/6UVnaDYS0+rQ8FXYnF5+K
 iEpgXe3vKTWxM097kzJMBTDVRMXRa75NtK7KWXMDgVpHTbcv1t1otsn+6dfv+B55
 ULJM1ETsyYS0T6BqNglvdytsraSt7JgSF+ZLHbYk3KVDshwnq/0ksgSqHNNA14ca
 kYTzHhMgo5w=
 =gSq/
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

- NBD server: Fix crashes related to switching between AioContexts
- file-posix: Workaround for discard/write_zeroes on buggy filesystems
- Follow-up fixes for the reopen vs. permission changes
- quorum: Fix error handling for flush
- block-copy: Refactor copy_range handling
- docs: Describe how to use 'null-co' block driver

# gpg: Signature made Wed 02 Jun 2021 14:44:15 BST
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  docs/secure-coding-practices: Describe how to use 'null-co' block driver
  block-copy: refactor copy_range handling
  block-copy: fix block_copy_task_entry() progress update
  nbd/server: Use drained block ops to quiesce the server
  block-backend: add drained_poll
  block: improve permission conflict error message
  block: simplify bdrv_child_user_desc()
  block/vvfat: inherit child_vvfat_qcow from child_of_bds
  block: improve bdrv_child_get_parent_desc()
  block-backend: improve blk_root_get_parent_desc()
  block: document child argument of bdrv_attach_child_common()
  block/file-posix: Try other fallbacks after invalid FALLOC_FL_ZERO_RANGE
  block/file-posix: Fix problem with fallocate(PUNCH_HOLE) on GPFS
  block: drop BlockBackendRootState::read_only
  block: drop BlockDriverState::read_only
  block: consistently use bdrv_is_read_only()
  block/vvfat: fix vvfat_child_perm crash
  block/vvfat: child_vvfat_qcow: add .get_parent_aio_context, fix crash
  qemu-io-cmds: assert that we don't have .perm requested in no-blk case
  block/quorum: Provide .bdrv_co_flush instead of .bdrv_co_flush_to_disk

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-02 19:34:03 +01:00
Sergio Lopez
fd6afc501a nbd/server: Use drained block ops to quiesce the server
Before switching between AioContexts we need to make sure that we're
fully quiesced ("nb_requests == 0" for every client) when entering the
drained section.

To do this, we set "quiescing = true" for every client on
".drained_begin" to prevent new coroutines from being created, and
check if "nb_requests == 0" on ".drained_poll". Finally, once we're
exiting the drained section, on ".drained_end" we set "quiescing =
false" and call "nbd_client_receive_next_request()" to resume the
processing of new requests.

With these changes, "blk_aio_attach()" and "blk_aio_detach()" can be
reverted to be as simple as they were before f148ae7d36.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1960137
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20210602060552.17433-3-slp@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-02 14:23:20 +02:00
Stefano Garzarella
d0fb9657a3 docs: fix references to docs/devel/tracing.rst
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.

We still have several references to the old file, so let's fix them
with the following command:

  sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-06-02 06:51:09 +02:00
Nir Soffer
0da9856851 nbd: server: Report holes for raw images
When querying image extents for raw image, qemu-nbd reports holes as
zero:

$ qemu-nbd -t -r -f raw empty-6g.raw

$ qemu-img map --output json nbd://localhost
[{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": true, "offset": 0}]

$ qemu-img map --output json empty-6g.raw
[{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": false, "offset": 0}]

Turns out that qemu-img map reports a hole based on BDRV_BLOCK_DATA, but
nbd server reports a hole based on BDRV_BLOCK_ALLOCATED.

The NBD protocol says:

    NBD_STATE_HOLE (bit 0): if set, the block represents a hole (and
    future writes to that area may cause fragmentation or encounter an
    NBD_ENOSPC error); if clear, the block is allocated or the server
    could not otherwise determine its status.

qemu-img manual says:

    whether the sectors contain actual data or not (boolean field data;
    if false, the sectors are either unallocated or stored as
    optimized all-zero clusters);

To me, data=false looks compatible with NBD_STATE_HOLE. From user point
of view, getting same results from qemu-nbd and qemu-img is more
important than being more correct about allocation status.

Changing nbd server to report holes using BDRV_BLOCK_DATA makes qemu-nbd
results compatible with qemu-img map:

$ qemu-img map --output json nbd://localhost
[{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": false, "offset": 0}]

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20210219160752.1826830-1-nsoffer@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 13:08:45 -06:00
Sergio Lopez
f148ae7d36 nbd/server: Quiesce coroutines on context switch
When switching between AIO contexts we need to me make sure that both
recv_coroutine and send_coroutine are not scheduled to run. Otherwise,
QEMU may crash while attaching the new context with an error like
this one:

aio_co_schedule: Co-routine was already scheduled in 'aio_co_schedule'

To achieve this we need a local implementation of
'qio_channel_readv_all_eof' named 'nbd_read_eof' (a trick already done
by 'nbd/client.c') that allows us to interrupt the operation and to
know when recv_coroutine is yielding.

With this in place, we delegate detaching the AIO context to the
owning context with a BH ('nbd_aio_detach_bh') scheduled using
'aio_wait_bh_oneshot'. This BH signals that we need to quiesce the
channel by setting 'client->quiescing' to 'true', and either waits for
the coroutine to finish using AIO_WAIT_WHILE or, if it's yielding in
'nbd_read_eof', actively enters the coroutine to interrupt it.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900326
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201214170519.223781-4-slp@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-01-20 14:48:54 -06:00
Peter Maydell
6b728efcb0 * Fixes for compiling on Haiku, and add Haiku VM for compile-testing
* Update NetBSD VM to version 9.1
 * Misc fixes (e.g. categorize some devices)
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAl+zld8RHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbWb8Q/+IRvGUGjGcPfbTvwoOqJVy4Hm7huW5i1s
 wHe/6nitNtpvaAqcxbQHBIvWX9xTzppWcFiEkIs8rPwLOUFKh5xJ+NbEdf4acQaJ
 m4G2mEY5bYt/o5e6p7ZK1RgS2EjD1eQ6BwMWQKeUHET7MTv0UabKtvWmBWpMqFxA
 vl/3SbVWsSwGB9gOA5oksYhKY5ZRcVaDxsGk89f7iwgaStcxWNxVFEXddbBmqhfW
 Q4ZPt0K7yod7NDBOaGEoc2hOjIfr0TvovHojDuAxt+2tKdYi1vwtnwKbFqTWp7Ca
 7ttzoQUSsteiOjAhHRpa2PEbfrNs+loIm9fem5fQ9i7POlbS/Ozv2RnPCZm1X8pW
 n7Jvsh25V066AFnHat7PnjcBVBRFfmR3xtA61PqvAMGEKW8tortbZbpqXO18Pv5p
 6P/GG9G3QE0v2rEsU5BNFWp/aD7fiWy/VPu3dGFUkI9/S3biatocldHn/9eyXz94
 k75Xzhe5x6n5Jf8QYFQ/6BO0qSoidNbAVg1W8+QyRXIJJhWRnvW9eYa7tSx5ezJg
 5+oCo4oh6Qd9nvrl5pIwvX6QMDf2kPxzp7PsHeemqt7+QNmXErAVsIi1HUVsLWRP
 Qb/BbKyKNeWJwvWWLAm/2kXVmNQfjLVNCwg04xa8tkQemhIDekVrCpMoX3cNHjWf
 EWa1vEtbq9k=
 =A3/B
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2020-11-17' into staging

* Fixes for compiling on Haiku, and add Haiku VM for compile-testing
* Update NetBSD VM to version 9.1
* Misc fixes (e.g. categorize some devices)

# gpg: Signature made Tue 17 Nov 2020 09:20:31 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* remotes/huth-gitlab/tags/pull-request-2020-11-17:
  max111x: put it into the 'misc' category
  nand: put it into the 'storage' category
  ads7846: put it into the 'input' category
  ssd0323: put it into the 'display' category
  gitlab-ci: Use $CI_REGISTRY instead of hard-coding registry.gitlab.com
  target/microblaze: Fix possible array out of bounds in mmu_write()
  tests/vm: update NetBSD to 9.1
  tests/vm: Add Haiku test based on their vagrant images
  configure: Add a proper check for sys/ioccom.h and use it in tpm_ioctl.h
  configure: Do not build pc-bios/optionrom on Haiku
  configure: Fix the _BSD_SOURCE define for the Haiku build
  qemu/bswap: Remove unused qemu_bswap_len()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-11-17 14:12:21 +00:00
Thomas Huth
ded5d78c1e configure: Add a proper check for sys/ioccom.h and use it in tpm_ioctl.h
On Solaris and Haiku, the _IO() macros are defined in <sys/ioccom.h>.
Add a proper check for this header to our build system, and make sure
to include the header in tpm_ioctl.h to fix a build failure on Solaris
and Haiku.

Message-Id: <20201115152317.42752-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2020-11-17 09:45:24 +01:00
Eric Blake
c0b21f2e22 nbd: Silence Coverity false positive
Coverity noticed (CID 1436125) that we check the return value of
nbd_extent_array_add in most places, but not at the end of
bitmap_to_extents().  The return value exists to break loops before a
future iteration, so there is nothing to check if we are already done
iterating.  Adding a cast to void, plus a comment why, pacifies
Coverity.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201111163510.713855-1-eblake@redhat.com>
[eblake: Prefer cast to void over odd && usage]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2020-11-16 14:51:12 -06:00
Eric Blake
dbc7b01492 nbd: Add 'qemu-nbd -A' to expose allocation depth
Allow the server to expose an additional metacontext to be requested
by savvy clients.  qemu-nbd adds a new option -A to expose the
qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
can also be set via QMP when using block-export-add.

qemu as client is hacked into viewing the key aspects of this new
context by abusing the already-experimental x-dirty-bitmap option to
collapse all depths greater than 2, which results in a tri-state value
visible in the output of 'qemu-img map --output=json' (yes, that means
x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like
renaming it as it would introduce a needless break of back-compat,
even though we make no compat guarantees with x- members):

unallocated (depth 0) => "zero":false, "data":true
local (depth 1)       => "zero":false, "data":false
backing (depth 2+)    => "zero":true,  "data":true

libnbd as client is probably a nicer way to get at the information
without having to decipher such hacks in qemu as client. ;)

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-11-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2020-10-30 15:22:00 -05:00
Eric Blake
71719cd57f nbd: Add new qemu:allocation-depth metadata context
'qemu-img map' provides a way to determine which extents of an image
come from the top layer vs. inherited from a backing chain.  This is
useful information worth exposing over NBD.  There is a proposal to
add a QMP command block-dirty-bitmap-populate which can create a dirty
bitmap that reflects allocation information, at which point the
qemu:dirty-bitmap:NAME metadata context can expose that information
via the creation of a temporary bitmap, but we can shorten the effort
by adding a new qemu:allocation-depth metadata context that does the
same thing without an intermediate bitmap (this patch does not
eliminate the need for that proposal, as it will have other uses as
well).

While documenting things, remember that although the NBD protocol has
NBD_OPT_SET_META_CONTEXT, the rest of its documentation refers to
'metadata context', which is a more apt description of what is
actually being used by NBD_CMD_BLOCK_STATUS: the user is requesting
metadata by passing one or more context names.  So I also touched up
some existing wording to prefer the term 'metadata context' where it
makes sense.

Note that this patch does not actually enable any way to request a
server to enable this context; that will come in the next patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-10-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2020-10-30 15:22:00 -05:00
Eric Blake
3b1f244c59 nbd: Allow export of multiple bitmaps for one device
With this, 'qemu-nbd -B b0 -B b1 -f qcow2 img.qcow2' can let you sniff
out multiple bitmaps from one server.  qemu-img as client can still
only read one bitmap per client connection, but other NBD clients
(hello libnbd) can now read multiple bitmaps in a single pass.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201027050556.269064-8-eblake@redhat.com>
2020-10-30 15:10:15 -05:00
Eric Blake
47ec485e8d nbd: Refactor counting of metadata contexts
Rather than open-code the count of negotiated contexts at several
sites, embed it directly into the struct.  This will make it easier
for upcoming commits to support even more simultaneous contexts.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-7-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2020-10-30 15:10:15 -05:00
Eric Blake
02e87e3b1c nbd: Simplify qemu bitmap context name
Each dirty bitmap already knows its name; by reducing the scope of the
places where we construct "qemu:dirty-bitmap:NAME" strings, tracking
the name is more localized, and there are fewer per-export fields to
worry about.  This in turn will make it easier for an upcoming patch
to export more than one bitmap at once.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201027050556.269064-6-eblake@redhat.com>
2020-10-30 15:10:15 -05:00
Eric Blake
cbad81cef8 nbd: Update qapi to support exporting multiple bitmaps
Since 'block-export-add' is new to 5.2, we can still tweak the
interface; there, allowing 'bitmaps':['str'] is nicer than
'bitmap':'str'.  This wires up the qapi and qemu-nbd changes to permit
passing multiple bitmaps as distinct metadata contexts that the NBD
client may request, but the actual support for more than one will
require a further patch to the server.

Note that there are no changes made to the existing deprecated
'nbd-server-add' command; this required splitting the QAPI type
BlockExportOptionsNbd, which fortunately does not affect QMP
introspection.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-5-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-10-30 15:10:15 -05:00
Stefan Hajnoczi
f51d23c80a block/export: add iothread and fixed-iothread options
Make it possible to specify the iothread where the export will run. By
default the block node can be moved to other AioContexts later and the
export will follow. The fixed-iothread option forces strict behavior
that prevents changing AioContext while the export is active. See the
QAPI docs for details.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200929125516.186715-5-stefanha@redhat.com
[Fix stray '#' character in block-export.json and add missing "(since:
5.2)" as suggested by Eric Blake.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23 13:42:16 +01:00
Stefan Hajnoczi
cbc20bfb8f block: move block exports to libblockdev
Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd.
They are not used by other programs and are not otherwise needed in
libblock.

Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss.
Since bdrv_close_all() (libblock) calls blk_exp_close_all()
(libblockdev) a stub function is required..

Make qemu-nbd.c use signal handling utility functions instead of
duplicating the code. This helps because os-posix.c is in libblockdev
and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks.
Once we use the signal handling utility functions we also end up
providing the necessary symbol.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20200929125516.186715-4-stefanha@redhat.com
[Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-10-23 13:42:16 +01:00
Eric Blake
ebd57062a1 nbd: Simplify meta-context parsing
We had a premature optimization of trying to read as little from the
wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
But in reality, we HAVE to read the entire string from the client
before we can get to the next command, and it is easier to just read
it all at once than it is to read it in pieces.  And once we do that,
several functions end up no longer performing I/O, so they can drop
length and errp parameters, and just return a bool instead of
modifying through a pointer.

Our iotests still pass; I also checked that libnbd's testsuite (which
covers more corner cases of odd meta context requests) still passes.
There are cases where the sequence of trace messages produced differs
(for example, when no bitmap is exported, a query for "qemu:" now
produces two trace lines instead of one), but trace points are for
debug and have no effect on what the client sees.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: enhance commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:05:08 -05:00
Eric Blake
d1e2c3e7bd nbd/server: Reject embedded NUL in NBD strings
The NBD spec is clear that any string sent from the client must not
contain embedded NUL characters.  If the client passes "a\0", we
should reject that option request rather than act on "a".

Testing this is not possible with a compliant client, but I was able
to use gdb to coerce libnbd into temporarily behaving as such a
client.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:05:08 -05:00
Christian Borntraeger
bbc35fc20e nbd: silence maybe-uninitialized warnings
gcc 10 from Fedora 32 gives me:

Compiling C object libblock.fa.p/nbd_server.c.o
../nbd/server.c: In function ‘nbd_co_client_start’:
../nbd/server.c:625:14: error: ‘namelen’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  625 |         rc = nbd_negotiate_send_info(client, NBD_INFO_NAME, namelen, name,
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  626 |                                      errp);
      |                                      ~~~~~
../nbd/server.c:564:14: note: ‘namelen’ was declared here
  564 |     uint32_t namelen;
      |              ^~~~~~~
cc1: all warnings being treated as errors

As I cannot see how this can happen, let uns silence the warning.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20200930155859.303148-3-borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09 15:04:32 -05:00
Kevin Wolf
5b1cb49704 nbd: Merge nbd_export_new() and nbd_export_create()
There is no real reason any more why nbd_export_new() and
nbd_export_create() should be separate functions. The latter only
performs a few checks before it calls the former.

What makes the current state stand out is that it's the only function in
BlockExportDriver that is not a static function inside nbd/server.c, but
a small wrapper in blockdev-nbd.c that then calls back into nbd/server.c
for the real functionality.

Move all the checks to nbd/server.c and make the resulting function
static to improve readability.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-27-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-10-02 15:46:40 +02:00
Kevin Wolf
30dbc81d31 block/export: Move writable to BlockExportOptions
The 'writable' option is a basic option that will probably be applicable
to most if not all export types that we will implement. Move it from NBD
to the generic BlockExport layer.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-26-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-10-02 15:46:40 +02:00