QEMU supports the NBD protocol, and has an internal NBD client (see
block/nbd.c), an internal NBD server (see blockdev-nbd.c), and an
external NBD server tool (see qemu-nbd.c). The common code is placed
in nbd/*.

The NBD protocol is specified here:
https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md

The following paragraphs describe some specific properties of the NBD
protocol implementation in QEMU.

= Metadata namespaces =

QEMU supports the "base:allocation" metadata context as defined in the
NBD protocol specification, and also defines an additional metadata
namespace "qemu".

== "qemu" namespace ==

The "qemu" namespace currently contains two available metadata context
types. The first is related to exposing the contents of a dirty
bitmap alongside the associated disk contents. That metadata context
is named with the following form:

    qemu:dirty-bitmap:<dirty-bitmap-export-name>

Each dirty-bitmap metadata context defines only one flag for extents
in the reply to NBD_CMD_BLOCK_STATUS:

    bit 0: NBD_STATE_DIRTY, set when the extent is "dirty"
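As a rough illustration of how a client might consume this flag, the
sketch below parses the payload of a block status reply chunk in its
compact form, which per the NBD specification is a 32-bit context ID
followed by pairs of 32-bit length and 32-bit status flags, all
big-endian. The helper names and the sample payload are invented for
this example; they are not part of QEMU or of any NBD library.

```python
import struct

# Bit 0 of the status flags in a "qemu:dirty-bitmap:..." context.
NBD_STATE_DIRTY = 1 << 0


def parse_block_status_payload(payload):
    """Split a compact block status payload into (context_id, extents).

    Each extent is returned as a (length, flags) tuple.
    """
    if len(payload) < 4:
        raise ValueError("payload too short for a context ID")
    (context_id,) = struct.unpack_from(">I", payload, 0)
    body = payload[4:]
    if len(body) % 8 != 0:
        raise ValueError("truncated extent descriptor")
    extents = [struct.unpack_from(">II", body, off)
               for off in range(0, len(body), 8)]
    return context_id, extents


def dirty_ranges(start_offset, extents):
    """Return (offset, length) for every extent with the dirty bit set."""
    result = []
    offset = start_offset
    for length, flags in extents:
        if flags & NBD_STATE_DIRTY:
            result.append((offset, length))
        offset += length
    return result


# Hypothetical reply: context ID 7, a clean 64 KiB extent followed by a
# dirty 128 KiB extent.
sample_payload = (struct.pack(">I", 7)
                  + struct.pack(">II", 65536, 0)
                  + struct.pack(">II", 131072, NBD_STATE_DIRTY))
sample_context_id, sample_extents = parse_block_status_payload(sample_payload)
sample_dirty = dirty_ranges(0, sample_extents)
```

Connections that negotiated extended headers use a wider descriptor
layout, which this sketch does not handle.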

The second is related to exposing the source of various extents within
the image, with a single metadata context named:

    qemu:allocation-depth

In the allocation depth context, the entire 32-bit value represents
the depth of the layer in a thin-provisioned backing chain that
provided the data (0 for unallocated, 1 for the active layer, 2 for
the first backing layer, and so forth).
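Because the value is a depth rather than a set of flag bits, a client
has to compare it as an integer instead of masking individual bits.
The following minimal sketch shows one way to turn the value into a
readable description; the function name is hypothetical and chosen
only for this example.

```python
def describe_allocation_depth(value):
    """Describe a 32-bit "qemu:allocation-depth" status value."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("allocation depth must fit in 32 bits")
    if value == 0:
        return "unallocated"
    if value == 1:
        return "active layer"
    # Depth 2 is the first backing layer, depth 3 the second, and so on.
    return "backing layer %d" % (value - 1)
```

For example, a value of 3 means the data comes from the second backing
layer; it does not mean that bits 0 and 1 are both set.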

For NBD_OPT_LIST_META_CONTEXT, the following queries are supported
in addition to the specific "qemu:allocation-depth" and
"qemu:dirty-bitmap:<dirty-bitmap-export-name>":

* "qemu:" - returns a list of all available metadata contexts in the
  namespace.
* "qemu:dirty-bitmap:" - returns a list of all available dirty-bitmap
  metadata contexts.
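The query rules above can be modelled with a small sketch. This is
only an illustration of the matching behavior described in this
section, not the code QEMU uses; the function name and its parameters
(whether the export offers allocation depth, and which bitmap names it
exports) are assumptions made for the example.

```python
def list_qemu_contexts(query, has_allocation_depth, bitmap_names):
    """Return the "qemu" namespace contexts that match a list query."""
    available = []
    if has_allocation_depth:
        available.append("qemu:allocation-depth")
    available.extend("qemu:dirty-bitmap:" + name for name in bitmap_names)

    # The two wildcard-style queries select by prefix.
    if query in ("qemu:", "qemu:dirty-bitmap:"):
        return [ctx for ctx in available if ctx.startswith(query)]
    # Any other query must name one context exactly.
    return [ctx for ctx in available if ctx == query]
```

With hypothetical bitmaps "b0" and "b1" exported, the query "qemu:"
would list all three contexts, while "qemu:dirty-bitmap:" would list
only the two bitmap contexts.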

= Features by version =

The following list documents which QEMU version first implemented
various features (both as a server exposing the feature, and as a
client taking advantage of the feature when present), to make it
easier to plan for cross-version interoperability. Note that in
several cases, the initial release containing a feature may require
additional patches from the corresponding stable branch to fix bugs in
the operation of that feature.

* 2.6: NBD_OPT_STARTTLS with TLS X.509 Certificates
* 2.8: NBD_CMD_WRITE_ZEROES
* 2.10: NBD_OPT_GO, NBD_INFO_BLOCK
* 2.11: NBD_OPT_STRUCTURED_REPLY
* 2.12: NBD_CMD_BLOCK_STATUS for "base:allocation"
* 3.0: NBD_OPT_STARTTLS with TLS Pre-Shared Keys (PSK),
  NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
* 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports,
  NBD_CMD_FLAG_FAST_ZERO
* 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
* 7.1: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports
* 8.2: NBD_OPT_EXTENDED_HEADERS, NBD_FLAG_BLOCK_STATUS_PAYLOAD
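When planning interoperability between two QEMU versions, the list can
be treated as a lookup table. The sketch below is one possible way to
do that; the table simply transcribes the entries above, and the
helper names are hypothetical. Versions are compared as tuples of
integers, since comparing them as decimal numbers would wrongly order
2.10 before 2.8. The sketch ignores the caveat that some releases need
stable-branch fixes.

```python
# Transcribed from the list above: version -> features first implemented.
FEATURES_BY_VERSION = {
    "2.6": ["NBD_OPT_STARTTLS with TLS X.509 Certificates"],
    "2.8": ["NBD_CMD_WRITE_ZEROES"],
    "2.10": ["NBD_OPT_GO", "NBD_INFO_BLOCK"],
    "2.11": ["NBD_OPT_STRUCTURED_REPLY"],
    "2.12": ['NBD_CMD_BLOCK_STATUS for "base:allocation"'],
    "3.0": ["NBD_OPT_STARTTLS with TLS Pre-Shared Keys (PSK)",
            'NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:"',
            "NBD_CMD_CACHE"],
    "4.2": ["NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports",
            "NBD_CMD_FLAG_FAST_ZERO"],
    "5.2": ['NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"'],
    "7.1": ["NBD_FLAG_CAN_MULTI_CONN for shareable writable exports"],
    "8.2": ["NBD_OPT_EXTENDED_HEADERS", "NBD_FLAG_BLOCK_STATUS_PAYLOAD"],
}


def parse_version(text):
    """Turn a version string such as "2.10" into a tuple such as (2, 10)."""
    return tuple(int(part) for part in text.split("."))


def features_available(version):
    """Return the set of listed features implemented by the given version."""
    limit = parse_version(version)
    found = set()
    for introduced, features in FEATURES_BY_VERSION.items():
        if parse_version(introduced) <= limit:
            found.update(features)
    return found


def common_features(server_version, client_version):
    """Return the listed features that both endpoints implement."""
    return (features_available(server_version)
            & features_available(client_version))
```

For instance, common_features("8.2", "2.10") contains NBD_OPT_GO but
not NBD_OPT_STRUCTURED_REPLY, because the older endpoint predates 2.11.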