Another refactoring creating nbd_negotiate_finish_oldstyle()
for further reuse during 'qemu-nbd --list'.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20190117193658.16413-16-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
An upcoming patch will add the ability for qemu-nbd to list
the services provided by an NBD server. Share the common
code of the TLS handshake by splitting the initial exchange
into a separate function, leaving only the export handling
in the original function. Functionally, there should be no
change in behavior in this patch, although some of the code
motion may be difficult to follow due to indentation changes
(view with 'git diff -w' for a smaller changeset).
I considered an enum for the return code coordinating state
between the two functions, but in the end just settled with
ample comments.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190117193658.16413-15-eblake@redhat.com>
The function could only ever return 0 or -EINVAL; make this
clearer by dropping a useless 'fail:' label.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190117193658.16413-14-eblake@redhat.com>
Extract portions of nbd_negotiate_simple_meta_context() to
a new function nbd_receive_one_meta_context() that copies the
pattern of nbd_receive_list() for performing the argument
validation of one reply. The error message when the server
replies with more than one context changes slightly, but
that shouldn't happen in the common case.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190117193658.16413-13-eblake@redhat.com>
Refactor nbd_negotiate_simple_meta_context() to pull out the
code that can be reused to send a LIST request for 0 or 1 query.
No semantic change. The old comment about 'sizeof(uint32_t)'
being equivalent to '/* number of queries */' is no longer
needed, now that we are computing 'sizeof(queries)' instead.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20190117193658.16413-12-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Pass 'info' instead of three separate parameters related to info,
when requesting the server to set the meta context. Update the
NBDExportInfo struct to rename the received id field to match the
fact that we are currently overloading the field to match whatever
context the user supplied through the x-dirty-bitmap hack, as well
as adding a TODO comment to remind future patches about a desire
to request two contexts at once.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190117193658.16413-11-eblake@redhat.com>
Refactor the 'name' parameter of nbd_receive_negotiate() from
being a separate parameter into being part of the in-out 'info'.
This also spills over to a simplification of nbd_opt_go().
The main driver for this refactoring is that an upcoming patch
would like to add support to qemu-nbd to list information about
all exports available on a server, where the name(s) will be
provided by the server instead of the client. But another benefit
is that we can now allow the client to explicitly specify the
empty export name "" even when connecting to an oldstyle server
(even if qemu is no longer such a server after commit 7f7dfe2a).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190117193658.16413-10-eblake@redhat.com>
Right now, nbd_receive_list() is only called by
nbd_receive_query_exports(), which in turn is only called if the
server lacks NBD_OPT_GO but has working option negotiation, and is
merely used as a quality-of-implementation trick since servers
can't give decent errors for NBD_OPT_EXPORT_NAME. However, servers
that lack NBD_OPT_GO are becoming increasingly rare (nbdkit was a
latecomer, in Aug 2018, but qemu has been such a server since commit
f37708f6 in July 2017 and released in 2.10), so it no longer makes
sense to micro-optimize that function for performance.
Furthermore, when debugging a server's implementation, tracing the
full reply (both names and descriptions) is useful, not to mention
that upcoming patches adding 'qemu-nbd --list' will want to collect
that data. And when you consider that a server can send an export
name up to the NBD protocol length limit of 4k; but our current
NBD_MAX_NAME_SIZE is only 256, we can't trace all valid server
names without more storage, but 4k is large enough that the heap
is better than the stack for long names.
Thus, I'm changing the division of labor, with nbd_receive_list()
now always malloc'ing a result on success (the malloc is bounded
by the fact that we reject servers with a reply length larger
than 32M), and moving the comparison to 'wantname' to the caller.
There is a minor change in behavior where a server with 0 exports
(an immediate NBD_REP_ACK reply) is now no longer distinguished
from a server without LIST support (NBD_REP_ERR_UNSUP); this
information could be preserved with a complication to the calling
contract to provide a bit more information, but I didn't see the
point. After all, the worst that can happen if our guess at a
match is wrong is that the caller will get a cryptic disconnect
when NBD_OPT_EXPORT_NAME fails (which is no different from what
would happen if we had not tried LIST), while treating an empty
list as immediate failure would prevent connecting to really old
servers that really did lack LIST. Besides, NBD servers with 0
exports are rare (qemu can do it when using QMP nbd-server-start
without nbd-server-add - but qemu understands NBD_OPT_GO and
thus won't tickle this change in behavior).
Fix the spelling of foundExport to match coding standards while
in the area.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190117193658.16413-9-eblake@redhat.com>
Although our compile-time environment is set up so that we always
support long files with 64-bit off_t, we have no guarantee whether
off_t is the same type as int64_t. This requires casts when
printing values, and prevents us from directly using qemu_strtoi64()
(which will be done in the next patch). Let's just flip to uint64_t
where possible, and stick to int64_t for detecting failure of
blk_getlength(); we also keep the assertions added in the previous
patch that the resulting values fit in 63 bits. The overflow check
in nbd_co_receive_request() was already sane (request->from is
validated to fit in 63 bits, and request->len is 32 bits, so the
addition can't overflow 64 bits), but rewrite it in a form easier
to recognize as a typical overflow check.
Rename the variable 'description' to keep line lengths reasonable.
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190117193658.16413-7-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
We only had two callers to nbd_export_new; qemu-nbd.c always
passed a valid offset/length pair (because it already checked
the file length, to ensure that offset was in bounds), while
blockdev-nbd.c always passed 0/-1. Then nbd_export_new reduces
the size to a multiple of BDRV_SECTOR_SIZE (can only happen
when offset is not sector-aligned, since bdrv_getlength()
currently rounds up) (someday, it would be nice to have
byte-accurate lengths - but not today).
However, I'm finding it easier to work with the code if we are
consistent on having both callers pass in a valid length, and
just assert that things are sane in nbd_export_new, meaning
that no negative values were passed, and that offset+size does
not exceed 63 bits (as that really is a fundamental limit to
later operations, whether we use off_t or uint64_t).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190117193658.16413-6-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
We only have one caller that wants to export a bitmap name,
which it does right after creation of the export. But there is
still a brief window of time where an NBD client could see the
export but not the dirty bitmap, which a robust client would
have to interpret as meaning the entire image should be treated
as dirty. Better is to eliminate the window entirely, by
inlining nbd_export_bitmap() into nbd_export_new(), and refusing
to create the bitmap in the first place if the requested bitmap
can't be located.
We also no longer need logic for setting a different bitmap
name compared to the bitmap being exported.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190111194720.15671-8-eblake@redhat.com>
The existing NBD code had a weird split where nbd_export_new()
created an export but did not add it to the list of exported
names until a later nbd_export_set_name() came along and grabbed
a second reference on the object; later, the first call to
nbd_export_close() drops the second reference while removing
the export from the list. This is in part because the QAPI
NbdServerRemoveNode enum documents the possibility of adding a
mode where we could do a soft disconnect: preventing new clients,
but waiting for existing clients to gracefully quit, based on
the mode used when calling nbd_export_close().
But in spite of all that, note that we never change the name of
an NBD export while it is exposed, which means it is easier to
just inline the process of setting the name as part of creating
the export.
Inline the contents of nbd_export_set_name() and
nbd_export_set_description() into the two points in an export
lifecycle where they matter, then adjust both callers to pass
the name up front. Note that for creation, all callers pass a
non-NULL name, (passing NULL at creation was for old style
servers, but we removed support for that in commit 7f7dfe2a),
so we can add an assert and do things unconditionally; but for
cleanup, because of the dual nature of nbd_export_close(), we
still have to be careful to avoid use-after-free. Along the
way, add a comment reminding ourselves of the potential of
adding a middle mode disconnect.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190111194720.15671-5-eblake@redhat.com>
Our initial implementation of x-nbd-server-add-bitmap put
in a restriction because of incremental backups: in that
usage, we are exporting one qcow2 file (the temporary overlay
target of a blockdev-backup sync:none job) and a dirty bitmap
owned by a second qcow2 file (the source of the
blockdev-backup, which is the backing file of the temporary).
While both qcow2 files are still writable (the target in
order to capture copy-on-write of old contents, and the
source in order to track live guest writes in the meantime),
the NBD client expects to see constant data, including the
dirty bitmap. An enabled bitmap in the source would be
modified by guest writes, which is at odds with the NBD
export being a read-only constant view, hence the initial
code choice of enforcing a disabled bitmap (the intent is
that the exposed bitmap was disabled in the same transaction
that started the blockdev-backup job, although we don't want
to track enough state to actually enforce that).
However, consider the case of a bitmap contained in a read-only
node (including when the bitmap is found in a backing layer of
the active image). Because the node can't be modified, the
bitmap won't change due to writes, regardless of whether it is
still enabled. Forbidding the export unless the bitmap is
disabled is awkward, paritcularly since we can't change the
bitmap to be disabled (because the node is read-only).
Alternatively, consider the case of live storage migration,
where management directs the destination to create a writable
NBD server, then performs a drive-mirror from the source to
the target, prior to doing the rest of the live migration.
Since storage migration can be time-consuming, it may be wise
to let the destination include a dirty bitmap to track which
portions it has already received, where even if the migration
is interrupted and restarted, the source can query the
destination block status in order to potentially minimize
re-sending data that has not changed in the meantime on a
second attempt. Such code has not been written, and might not
be trivial (after all, a cluster being marked dirty in the
bitmap does not necessarily guarantee it has the desired
contents), but it makes sense that letting an active dirty
bitmap be exposed and changing alongside writes may prove
useful in the future.
Solve both issues by gating the restriction against a
disabled bitmap to only happen when the caller has requested
a read-only export, and where the BDS that owns the bitmap
(whether or not it is the BDS handed to nbd_export_new() or
from its backing chain) is still writable. We could drop
the check altogether (if management apps are prepared to
deal with a changing bitmap even on a read-only image), but
for now keeping a check for the read-only case still stands
a chance of preventing management errors.
Update iotest 223 to show the looser behavior by leaving
a bitmap enabled the whole run; note that we have to tear
down and re-export a node when handling an error.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190111194720.15671-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
There's no need to read into a temporary buffer (oversized
since commit 7d3123e1) followed by a byteswap into a uint64_t
to check for a magic number via memcmp(), when the code
immediately below demonstrates reading into the uint64_t then
byteswapping in place and checking for a magic number via
integer math. What's more, having a different error message
when the server's first reply byte is 0 is unusual - it's no
different from any other wrong magic number, and we already
detected short reads. That whole strlen() issue has been
present and useless since commit 1d45f8b5 in 2010; perhaps it
was leftover debugging (since the correct magic number happens
to be ASCII)? Make the error messages more consistent and
detailed while touching things.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20181215135324.152629-9-eblake@redhat.com>
Connecting to a /dev/nbdN device is a Linux-specific action.
We were already masking -c and -d from 'qemu-nbd --help' on
non-linux. However, while -d fails with a sensible error
message, it took hunting through a couple of files to prove
that. What's more, the code for -c doesn't fail until after
it has created a pthread and tried to open a device - possibly
even printing an error message with %m on a non-Linux platform
in spite of the comment that %m is glibc-specific. Make the
failure happen sooner, then get rid of stubs that are no
longer needed because of the early exits.
While at it: tweak the blank newlines in --help output to be
consistent, whether or not built on Linux.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181215135324.152629-7-eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Consolidate on using decimal (not hex), on outputting the
option reply name (not just value), and a consistent comma between
clauses, when the client reports protocol discrepancies from the
server. While it won't affect normal operation, it makes
debugging additions easier.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20181215135324.152629-6-eblake@redhat.com>
Not all servers send free-form text alongside option error replies, but
for servers that do (such as qemu), we pass the server's message as a
hint alongside our own error reporting. However, it would also be
useful to trace such server messages, since we can't guarantee how the
hint may be consumed.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181218225714.284495-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
These functions are used for formatting pretty trace points. We are
going to add some in block/nbd-client, so, let's publish all these
functions at once. Note, that nbd_reply_type_lookup is already
published, and constants, "named" by these functions live in
include/block/nbd.h too.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20181102151152.288399-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The NBD spec, and even our code comment, says that if the client
asks for NBD_OPT_LIST_META_CONTEXT with 0 queries, then we should
reply with (a possibly-compressed representation of) ALL contexts
that we are willing to let them try. But commit 3d068aff forgot
to advertise qemu:dirty-bitmap:FOO.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181130023232.3079982-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Commit 37ec36f6 intentionally ignores errors when trying to reply
to an NBD_OPT_ABORT request for plaintext clients, but did not make
the same change for a TLS server. Since NBD_OPT_ABORT is
documented as being a potential for an EPIPE when the client hangs
up without waiting for our reply, we don't need to pollute the
server's output with that failure.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181117223221.2198751-1-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
A space was missing after the option number was printed:
Option 0x8not permitted before TLS
becomes
Option 0x8 not permitted before TLS
This fixes
commit 3668328303
Author: Eric Blake <eblake@redhat.com>
Date: Fri Oct 14 13:33:09 2016 -0500
nbd: Send message along with server NBD_REP_ERR errors
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20181116155325.22428-2-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: move lone space to next line]
Signed-off-by: Eric Blake <eblake@redhat.com>
Whether it's "locked" or "frozen", it's in use and should
not be allowed for the purposes of this operation.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20181002230218.13949-7-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
After the previous commit, nbd_client_new's first parameter is always
NULL. Let's drop it with all corresponding old-style negotiation code
path which is unreachable now.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20181003170228.95973-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: re-wrap short line]
Signed-off-by: Eric Blake <eblake@redhat.com>
We should not go to structured-read branch on CACHE command, fix that.
Bug introduced in bc37b06a5c "nbd/server: introduce NBD_CMD_CACHE"
with the whole feature and affects 3.0.0 release.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
CC: qemu-stable@nongnu.org
Message-Id: <20181003144738.70670-1-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: commit message typo fix]
Signed-off-by: Eric Blake <eblake@redhat.com>
Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this. Avoid the bug by not using the
"modify in place" byte swapping functions.
This patch was produced with the following spatch script:
@@
expression E;
@@
-be16_to_cpus(&E);
+E = be16_to_cpu(E);
@@
expression E;
@@
-be32_to_cpus(&E);
+E = be32_to_cpu(E);
@@
expression E;
@@
-be64_to_cpus(&E);
+E = be64_to_cpu(E);
@@
expression E;
@@
-cpu_to_be16s(&E);
+E = cpu_to_be16(E);
@@
expression E;
@@
-cpu_to_be32s(&E);
+E = cpu_to_be32(E);
@@
expression E;
@@
-cpu_to_be64s(&E);
+E = cpu_to_be64(E);
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180927164200.15097-1-peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase, and squash in missed changes]
Signed-off-by: Eric Blake <eblake@redhat.com>
This is necessary for efficient block-status export, for clients which
support it. (qemu is not yet such a client, but could become one.)
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180704112302.471456-3-vsementsov@virtuozzo.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
bitmap_to_extents function is broken: it switches dirty variable after
every iteration, however it can process only part of dirty (or zero)
area during one iteration in case when this area is too large for one
extent.
Fortunately, the bug doesn't produce wrong extent flags: it just inserts
a zero-length extent between sequential extents representing large dirty
(or zero) area. However, zero-length extents are forbidden by the NBD
protocol. So, a careful client should consider such a reply as a server
fault, while a less-careful will likely ignore zero-length extents.
The bug can only be triggered by a client that requests block status
for nearly 4G at once (a request of 4G and larger is impossible per
the protocol, and requests smaller than 4G less the bitmap granularity
cause the loop to quit iterating rather than revisit the tail of the
large area); it also cannot trigger if the client used the
NBD_CMD_FLAG_REQ_ONE flag. Since qemu 3.0 as client (using the
x-dirty-bitmap extension) always passes the flag, it is immune; and
we are not aware of other open-source clients that know how to request
qemu:dirty-bitmap:FOO contexts. Clients that want to avoid the bug
could cap block status requests to a smaller length, such as 2G or 3G.
Fix this by more careful handling of dirty variable.
Bug was introduced in 3d068aff16
"nbd/server: implement dirty bitmap export", with the whole function.
and is present in v3.0.0 release.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180914165116.23182-1-vsementsov@virtuozzo.com>
CC: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: improved commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
Call nbd_co_send_extents() with correct length parameter
(extent.length may be smaller than original length).
Also, switch length parameter type to uint32_t, to correspond with
request->len and similar nbd_co_send_bitmap().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180704112302.471456-2-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to test that the NBD server is properly advertising
dirty bitmaps, we need a bare minimum client that can request
and read the context. Since feature freeze for 3.0 is imminent,
this is the smallest workable patch, which replaces the qemu
block status report with the results of the NBD server's dirty
bitmap (making it very easy to use 'qemu-img map --output=json'
to learn where the dirty portions are). Note that the NBD
protocol defines a dirty section with the same bit but opposite
sense that normal "base:allocation" uses to report an allocated
section; so in qemu-img map output, "data":true corresponds to
clean, "data":false corresponds to dirty.
A more complete solution that allows dirty bitmaps to be queried
at the same time as normal block status will be required before
this addition can lose the x- prefix. Until then, the fact that
this replaces normal status with dirty status means actions
like 'qemu-img convert' will likely misbehave due to treating
dirty regions of the file as if they are unallocated.
The next patch adds an iotest to exercise this new code.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180702191458.28741-2-eblake@redhat.com>
In my hurry to fix a build failure, I introduced a logic bug.
The assertion conditional is backwards, meaning that qemu will
now abort instead of reporting dirty bitmap status.
The bug can only be tickled by an NBD client using an exported
dirty bitmap (which is still an experimental QMP command), so
it's not the end of the world for supported usage (and neither
'make check' nor qemu-iotests fails); but it also shows that we
really want qemu-io support for reading dirty bitmaps if only
so that I can add iotests coverage to prevent future
brown-bag-of-shame events like this one.
Fixes: 45eb6fb6
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180622153509.375130-1-eblake@redhat.com>
The code has a while() loop that always initialized 'end', and
the loop always executes at least once (as evidenced by the assert()
just prior to the loop). But some versions of gcc still complain
that 'end' is used uninitialized, so silence them.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180622125814.345274-1-eblake@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Handle nbd CACHE command. Just do read, without sending read data back.
Cache mechanism should be done by exported node driver chain.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180413143156.11409-1-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: fix two missing case labels in switch statements]
Signed-off-by: Eric Blake <eblake@redhat.com>
Handle a new NBD meta namespace: "qemu", and corresponding queries:
"qemu:dirty-bitmap:<export bitmap name>".
With the new metadata context negotiated, BLOCK_STATUS query will reply
with dirty-bitmap data, converted to extents. The new public function
nbd_export_bitmap selects which bitmap to export. For now, only one bitmap
may be exported.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180609151758.17343-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: wording tweaks, minor cleanups, additional tracing]
Signed-off-by: Eric Blake <eblake@redhat.com>
Add nbd_meta_pattern() and nbd_meta_empty_or_pattern() helpers for
metadata query parsing. nbd_meta_pattern() will be reused for the
"qemu" namespace in following patches.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180609151758.17343-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
Use NBDExport pointer instead of just export name: there is no need to
store a duplicated name in the struct; moreover, NBDExport will be used
further.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180609151758.17343-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: commit message grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
Return code = 1 doesn't mean that we parsed base:allocation. Use
correct traces in both -parsed and -skipped cases.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180609151758.17343-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
The NBD spec says that behavior is unspecified if the client
requests 0 length for block status; but since the structured
reply is documenting as returning a non-zero length, it's
easier to just diagnose this with an EINVAL error than to
figure out what to return.
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180621124937.166549-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
A missing space makes for poor error messages, and sizes can't
go negative. Also, we missed diagnosing a server that sends
a maximum block size less than the minimum.
Fixes: 081dd1fe
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180501154654.943782-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Initialize received variable. Otherwise, is is possible for server to
answer without any contexts, but we will set context_id to something
random (received_id is not initialized too) and return 1, which is
wrong.
To solve it, just initialize received to false. Initialize received_id
too, just to make all possible checkers happy.
Bug was introduced in 78a33ab587 "nbd: BLOCK_STATUS for
standard get_block_status function: client part" with the whole
function.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180427142002.21930-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Having a more detailed log of the interaction between client and
server is invaluable in debugging how meta context negotiation
actually works.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180330130950.1931229-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
It's never a good idea to blindly read for size bytes as
returned by the server without first validating that the size
is within bounds; a malicious or buggy server could cause us
to hang or get out of sync from reading further messages.
It may be smarter to try and teach the client to cope with
unexpected context ids by silently ignoring them instead of
hanging up on the server, but for now, if the server doesn't
reply with exactly the one context we expect, it's easier to
just give up - however, if we give up for any reason other
than an I/O failure, we might as well try to politely tell
the server we are quitting rather than continuing.
Fix some typos in the process.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180329231837.1914680-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Minimal realization: only one extent in server answer is supported.
Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks, fix min_block check and 32-bit cap, use -1
instead of errno on failure in nbd_negotiate_simple_meta_context,
ensure that block status makes progress on success]
Signed-off-by: Eric Blake <eblake@redhat.com>
Minimal realization: only one extent in server answer is supported.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: tweak whitespace, move constant from .h to .c, improve
logic of check_meta_export_name, simplify nbd_negotiate_options
by doing more in nbd_negotiate_meta_queries]
Signed-off-by: Eric Blake <eblake@redhat.com>
Add helper to read name in format:
uint32 len (<= NBD_MAX_NAME_SIZE)
len bytes string (not 0-terminated)
The helper will be reused in following patch.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar fixes, actually check error]
Signed-off-by: Eric Blake <eblake@redhat.com>
NBD_REP_ERR_INVALID is often parameter to nbd_opt_drop and it would
be used more in following patches. So, let's add a helper.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180312152126.286890-2-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The NBD spec states that since trim requests can affect disk contents,
then they should allow for FUA semantics just like writes for ensuring
the disk has settled before returning. As bdrv_[co_]pdiscard() does
not support a flags argument, we can't pass FUA down the block layer
stack, and must therefore emulate it with a flush at the NBD layer.
Note that in all reality, generic well-behaved clients will never
send TRIM+FUA (in fact, qemu as a client never does, and we have no
intention to plumb flags into bdrv_pdiscard). This is because the
NBD protocol states that it is unspecified to READ a trimmed area
(you might read stale data, all zeroes, or even random unrelated
data) without first rewriting it, and even the experimental
BLOCK_STATUS extension states that TRIM need not affect reported
status. Thus, in the general case, a client cannot tell the
difference between an arbitrary server that ignores TRIM, a server
that had a power outage without flushing to disk, and a server that
actually affected the disk before returning; so waiting for the
trim actions to flush to disk makes little sense. However, for a
specific client and server pair, where the client knows the server
treats TRIM'd areas as guaranteed reads-zero, waiting for a flush
makes sense, hence why the protocol documents that FUA is valid on
trim. So, even though the NBD protocol doesn't have a way for the
server to advertise what effects (if any) TRIM will actually have,
and thus any client that relies on specific effects is probably
in error, we can at least support a client that requests TRIM+FUA.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180307225732.155835-1-eblake@redhat.com>
Split out request handling logic.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: touch up blank line placement]
Signed-off-by: Eric Blake <eblake@redhat.com>
nbd_trip has difficult logic when sending replies: it tries to use one
code path for all replies. It is ok for simple replies, but is not
comfortable for structured replies. Also, two types of error (and
corresponding messages in local_err) - fatal (leading to disconnect)
and not-fatal (just to be sent to the client) are difficult to follow.
To make things a bit clearer, the following is done:
- split CMD_READ logic to separate function. It is the most difficult
command for now, and it is definitely cramped inside nbd_trip. Also,
it is difficult to follow CMD_READ logic, shared between
"case NBD_CMD_READ" and "if"s under "reply:" label.
- create separate helper function nbd_send_generic_reply() and use it
both in new nbd_do_cmd_read and for other commands in nbd_trip instead
of common code-path under "reply:" label in nbd_trip. The helper
supports an error message, so logic with local_err in nbd_trip is
simplified.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks and blank line placement]
Signed-off-by: Eric Blake <eblake@redhat.com>
Since the unchanged code has just set client->recv_coroutine to
NULL before calling nbd_client_receive_next_request(), we are
spawning a new coroutine unconditionally, but the first thing
that coroutine will do is check for client->closing, making it
a no-op if we have already detected that the client is going
away. Furthermore, for any error other than EIO (where we
disconnect, which itself sets client->closing), if the client
has already gone away, we'll probably encounter EIO later
in the function and attempt disconnect at that point. Logically,
as soon as we know the connection is closing, there is no need
to try a likely-to-fail a response or spawn a no-op coroutine.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: squash in further reordering: hoist check before spawning
next coroutine, and document rationale in commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>