Commit Graph

29 Commits

Author SHA1 Message Date
Kevin Wolf
5ad81b4946 nbd: Restrict connection_co reentrance
nbd_client_attach_aio_context() schedules connection_co in the new
AioContext and this way reenters it in any arbitrary place that has
yielded. We can restrict this a bit to the function call where the
coroutine actually sits waiting when it's idle.

This doesn't solve any bug yet, but it shows where in the code we need
to support this random reentrance and where we don't have to care.

Add FIXME comments for the existing bugs that the rest of this series
will fix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-02-25 15:03:19 +01:00
Vladimir Sementsov-Ogievskiy
bc5a03350c block/nbd-client: rename read_reply_co to connection_co
This coroutine will serve nbd reconnects, so, rename it to be something
more generic.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190201130138.94525-7-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2019-02-04 15:11:27 -06:00
Vladimir Sementsov-Ogievskiy
d42f78e940 block/nbd: move connection code from block/nbd to block/nbd-client
Keep all connection code in one file, to be able to implement reconnect
in further patches.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190201130138.94525-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: format tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2019-02-04 15:11:27 -06:00
Eric Blake
216ee3657e nbd/client: Add x-dirty-bitmap to query bitmap from server
In order to test that the NBD server is properly advertising
dirty bitmaps, we need a bare minimum client that can request
and read the context.  Since feature freeze for 3.0 is imminent,
this is the smallest workable patch, which replaces the qemu
block status report with the results of the NBD server's dirty
bitmap (making it very easy to use 'qemu-img map --output=json'
to learn where the dirty portions are).  Note that the NBD
protocol defines a dirty section with the same bit but opposite
sense that normal "base:allocation" uses to report an allocated
section; so in qemu-img map output, "data":true corresponds to
clean, "data":false corresponds to dirty.

A more complete solution that allows dirty bitmaps to be queried
at the same time as normal block status will be required before
this addition can lose the x- prefix.  Until then, the fact that
this replaces normal status with dirty status means actions
like 'qemu-img convert' will likely misbehave due to treating
dirty regions of the file as if they are unallocated.

The next patch adds an iotest to exercise this new code.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180702191458.28741-2-eblake@redhat.com>
2018-07-02 15:27:38 -05:00
Vladimir Sementsov-Ogievskiy
78a33ab587 nbd: BLOCK_STATUS for standard get_block_status function: client part
Minimal realization: only one extent in server answer is supported.
Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks, fix min_block check and 32-bit cap, use -1
instead of errno on failure in nbd_negotiate_simple_meta_context,
ensure that block status makes progress on success]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:43:48 -05:00
Vladimir Sementsov-Ogievskiy
f140e30003 nbd: Minimal structured read for client
Minimal implementation: for structured error only error_report error
message.

Note that test 83 is now more verbose, because the implementation
prints more warnings about unexpected communication errors; perhaps
future patches should tone things down by using trace messages
instead of traces, but the common case of successful communication
is no noisier than before.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171027104037.8319-13-eblake@redhat.com>
2017-10-30 21:48:41 +01:00
Stefan Hajnoczi
40f4a21895 nbd-client: avoid spurious qio_channel_yield() re-entry
The following scenario leads to an assertion failure in
qio_channel_yield():

1. Request coroutine calls qio_channel_yield() successfully when sending
   would block on the socket.  It is now yielded.
2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because
   nbd_receive_reply() failed.
3. Request coroutine is entered and returns from qio_channel_yield().
   Note that the socket fd handler has not fired yet so
   ioc->write_coroutine is still set.
4. Request coroutine attempts to send the request body with nbd_rwv()
   but the socket would still block.  qio_channel_yield() is called
   again and assert(!ioc->write_coroutine) is hit.

The problem is that nbd_read_reply_entry() does not distinguish between
request coroutines that are waiting to receive a reply and those that
are not.

This patch adds a per-request bool receiving flag so
nbd_read_reply_entry() can avoid spurious aio_wake() calls.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170822125113.5025-1-stefanha@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23 11:22:15 -05:00
Eric Blake
72b6ffc766 nbd-client: Fix regression when server sends garbage
When we switched NBD to use coroutines for qemu 2.9 (in particular,
commit a12a712a), we introduced a regression: if a server sends us
garbage (such as a corrupted magic number), we quit the read loop
but do not stop sending further queued commands, resulting in the
client hanging when it never reads the response to those additional
commands.  In qemu 2.8, we properly detected that the server is no
longer reliable, and cancelled all existing pending commands with
EIO, then tore down the socket so that all further command attempts
get EPIPE.

Restore the proper behavior of quitting (almost) all communication
with a broken server: Once we know we are out of sync or otherwise
can't trust the server, we must assume that any further incoming
data is unreliable and therefore end all pending commands with EIO,
and quit trying to send any further commands.  As an exception, we
still (try to) send NBD_CMD_DISC to let the server know we are going
away (in part, because it is easier to do that than to further
refactor nbd_teardown_connection, and in part because it is the
only command where we do not have to wait for a reply).

Based on a patch by Vladimir Sementsov-Ogievskiy.

A malicious server can be created with the following hack,
followed by setting NBD_SERVER_DEBUG to a non-zero value in the
environment when running qemu-nbd:

| --- a/nbd/server.c
| +++ b/nbd/server.c
| @@ -919,6 +919,17 @@ static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
|      stl_be_p(buf + 4, reply->error);
|      stq_be_p(buf + 8, reply->handle);
|
| +    static int debug;
| +    static int count;
| +    if (!count++) {
| +        const char *str = getenv("NBD_SERVER_DEBUG");
| +        if (str) {
| +            debug = atoi(str);
| +        }
| +    }
| +    if (debug && !(count % debug)) {
| +        buf[0] = 0;
| +    }
|      return nbd_write(ioc, buf, sizeof(buf), errp);
|  }

Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170814213426.24681-1-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-15 10:03:28 -05:00
Eric Blake
004a89fce9 nbd: Create struct for tracking export info
The NBD Protocol is introducing some additional information
about exports, such as minimum request size and alignment, as
well as an advertised maximum request size.  It will be easier
to feed this information back to the block layer if we gather
all the information into a struct, rather than adding yet more
pointer parameters during negotiation.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170707203049.534-2-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
Manos Pitsidianakis
f5a5ca7969 block: change variable names in BlockDriverState
Change the 'int count' parameter in *pwrite_zeros, *pdiscard related
functions (and some others) to 'int bytes', as they both refer to bytes.
This helps with code legibility.

Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Message-id: 20170609101808.13506-1-el13635@mail.ntua.gr
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-06-26 14:54:46 +02:00
Stefan Hajnoczi
e4548bb640 nbd: drop unused NBDClientSession.is_unix field
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170327123223.1199-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 14:41:01 +02:00
Paolo Bonzini
ff82911cd3 nbd: convert to use qio_channel_yield
In the client, read the reply headers from a coroutine, switching the
read side between the "read header" coroutine and the I/O coroutine that
reads the body of the reply.

In the server, if the server can read more requests it will create a new
"read request" coroutine as soon as a request has been read.  Otherwise,
the new coroutine is created in nbd_request_put.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-8-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:08 +00:00
Eric Blake
fa778fffdf nbd: Implement NBD_CMD_WRITE_ZEROES on client
Upstream NBD protocol recently added the ability to efficiently
write zeroes without having to send the zeroes over the wire,
along with a flag to control whether the client wants a hole.

The generic block code takes care of falling back to the obvious
write of lots of zeroes if we return -ENOTSUP because the server
does not have WRITE_ZEROES.

Ideally, since NBD_CMD_WRITE_ZEROES does not involve any data
over the wire, we want to support transactions that are much
larger than the normal 32M limit imposed on NBD_CMD_WRITE.  But
the server may still have a limit smaller than UINT_MAX, so
until experimental NBD protocol additions for advertising various
command sizes is finalized (see [1], [2]), for now we just stick to
the same limits as normal writes.

[1] https://github.com/yoe/nbd/blob/extension-info/doc/proto.md
[2] https://sourceforge.net/p/nbd/mailman/message/35081223/

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-17-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-02 09:28:56 +01:00
Eric Blake
ed2dd91267 nbd: Rename struct nbd_request and nbd_reply
Our coding convention prefers CamelCase names, and we already
have other existing structs with NBDFoo naming.  Let's be
consistent, before later patches add even more structs.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-6-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-02 09:28:55 +01:00
Eric Blake
10676b81a9 nbd: Rename NbdClientSession to NBDClientSession
It's better to use consistent capitalization of the namespace
used for NBD functions; we have more instances of NBD* than
Nbd*.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-5-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-02 09:28:55 +01:00
Changlong Xie
9bc9732fae nbd: Use CoQueue for free_sema instead of CoMutex
NBD is using the CoMutex in a way that wasn't anticipated. For example, if there are
N(N=26, MAX_NBD_REQUESTS=16) nbd write requests, so we will invoke nbd_client_co_pwritev
N times.
----------------------------------------------------------------------------------------
time request Actions
1    1       in_flight=1, Coroutine=C1
2    2       in_flight=2, Coroutine=C2
...
15   15      in_flight=15, Coroutine=C15
16   16      in_flight=16, Coroutine=C16, free_sema->holder=C16, mutex->locked=true
17   17      in_flight=16, Coroutine=C17, queue C17 into free_sema->queue
18   18      in_flight=16, Coroutine=C18, queue C18 into free_sema->queue
...
26   N       in_flight=16, Coroutine=C26, queue C26 into free_sema->queue
----------------------------------------------------------------------------------------

Once nbd client recieves request No.16' reply, we will re-enter C16. It's ok, because
it's equal to 'free_sema->holder'.
----------------------------------------------------------------------------------------
time request Actions
27   16      in_flight=15, Coroutine=C16, free_sema->holder=C16, mutex->locked=false
----------------------------------------------------------------------------------------

Then nbd_coroutine_end invokes qemu_co_mutex_unlock what will pop coroutines from
free_sema->queue's head and enter C17. More free_sema->holder is C17 now.
----------------------------------------------------------------------------------------
time request Actions
28   17      in_flight=16, Coroutine=C17, free_sema->holder=C17, mutex->locked=true
----------------------------------------------------------------------------------------

In above scenario, we only recieves request No.16' reply. As time goes by, nbd client will
almostly recieves replies from requests 1 to 15 rather than request 17 who owns C17. In this
case, we will encounter assert "mutex->holder == self" failed since Kevin's commit 0e438cdc
"coroutine: Let CoMutex remember who holds it". For example, if nbd client recieves request
No.15' reply, qemu will stop unexpectedly:
----------------------------------------------------------------------------------------
time request       Actions
29   15(most case) in_flight=15, Coroutine=C15, free_sema->holder=C17, mutex->locked=false
----------------------------------------------------------------------------------------

Per Paolo's suggestion "The simplest fix is to change it to CoQueue, which is like a condition
variable", this patch replaces CoMutex with CoQueue.

Cc: Wen Congyang <wency@cn.fujitsu.com>
Reported-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Message-Id: <1476267508-19499-1-git-send-email-xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-01 16:06:57 +01:00
Eric Blake
7423f41782 nbd: Limit nbdflags to 16 bits
Rather than asserting that nbdflags is within range, just give
it the correct type to begin with :)  nbdflags corresponds to
the per-export portion of NBD Protocol "transmission flags", which
is 16 bits in response to NBD_OPT_EXPORT_NAME and NBD_OPT_GO.

Furthermore, upstream NBD has never passed the global flags to
the kernel via ioctl(NBD_SET_FLAGS) (the ioctl was first
introduced in NBD 2.9.22; then a latent bug in NBD 3.1 actually
tried to OR the global flags with the transmission flags, with
the disaster that the addition of NBD_FLAG_NO_ZEROES in 3.9
caused all earlier NBD 3.x clients to treat every export as
read-only; NBD 3.10 and later intentionally clip things to 16
bits to pass only transmission flags).  Qemu should follow suit,
since the current two global flags (NBD_FLAG_FIXED_NEWSTYLE
and NBD_FLAG_NO_ZEROES) have no impact on the kernel's behavior
during transmission.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>

Message-Id: <1469129688-22848-3-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-08-03 18:44:56 +02:00
Eric Blake
70c4fb2648 nbd: Convert to byte-based interface
The NBD protocol doesn't have any notion of sectors, so it is
a fairly easy conversion to use byte-based read and write.

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1468624988-423-19-git-send-email-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20 14:24:25 +01:00
Eric Blake
447e57c3b0 nbd: Switch .bdrv_co_discard() to byte-based
Another step towards killing off sector-based block APIs.

While at it, call directly into nbd-client.c instead of having
a pointless trivial wrapper in nbd.c.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1468624988-423-14-git-send-email-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-20 14:24:25 +01:00
Eric Blake
52a4650574 nbd: Simplify client FUA handling
Now that the block layer honors per-bds FUA support, we don't
have to duplicate the fallback flush at the NBD layer.  The
static function nbd_co_writev_flags() is no longer needed, and
the driver can just directly use nbd_client_co_writev().

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Kevin Wolf
2b556518c3 nbd: Support BDRV_REQ_FUA
The NBD server already used to send a FUA flag when the writethrough
mode was set. This code was a remnant from the times where protocol
drivers actually had to implement writethrough modes. Since nowadays the
block layer sends flushes in writethrough mode and non-root nodes are
always writeback, this was mostly dead code - only mostly because if NBD
was configured to be used without a format, we sent _both_ FUA and an
explicit flush afterwards, which makes the code not technically dead,
but useless overhead.

This patch changes the code so that the block layer's FUA flag is
recognised and translated into a NBD FUA flag. The additional flush is
avoided now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:02 +02:00
Daniel P. Berrange
75822a12c0 nbd: enable use of TLS with NBD block driver
This modifies the NBD driver so that it is possible to request
use of TLS. This is done by providing the 'tls-creds' parameter
with the ID of a previously created QCryptoTLSCreds object.

For example

  $QEMU -object tls-creds-x509,id=tls0,endpoint=client,\
                dir=/home/berrange/security/qemutls \
        -drive driver=nbd,host=localhost,port=9000,tls-creds=tls0

The client will drop the connection if the NBD server does not
provide TLS.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-15-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:16:33 +01:00
Daniel P. Berrange
064097d919 nbd: convert block client to use I/O channels for connection setup
This converts the NBD block driver client to use the QIOChannelSocket
class for initial connection setup. The NbdClientSession struct has
two pointers, one to the master QIOChannelSocket providing the raw
data channel, and one to a QIOChannel which is the current channel
used for I/O. Initially the two point to the same object, but when
TLS support is added, they will point to different objects.

The qemu-img & qemu-io tools now need to use MODULE_INIT_QOM to
ensure the QIOChannel object classes are registered. The qemu-nbd
tool already did this.

In this initial conversion though, all I/O is still actually done
using the raw POSIX sockets APIs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-4-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:13:22 +01:00
Max Reitz
3f4726596d nbd: Set block size to BDRV_SECTOR_SIZE
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <1424887718-10800-13-git-send-email-mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-18 12:07:01 +01:00
Max Reitz
f53a829bb9 nbd: Drop BDS backpointer
Before this patch, the "opaque" pointer in an NBD BDS points to a
BDRVNBDState, which contains an NbdClientSession object, which in turn
contains a pointer to the BDS. This pointer may become invalid due to
bdrv_swap(), so drop it, and instead pass the BDS directly to the
nbd-client.c functions which then retrieve the NbdClientSession object
from there.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1423256778-3340-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16 14:36:03 +00:00
Max Reitz
1ce52846d3 nbd: Improve error messages
This patch makes use of the Error object for nbd_receive_negotiate() so
that errors during negotiation look nicer.

Furthermore, this patch adds an additional error message if the received
magic was wrong, but would be correct for the other protocol version,
respectively: So if an export name was specified, but the NBD server
magic corresponds to an old handshake, this condition is explicitly
signaled to the user, and vice versa.

As these messages are now part of the "Could not open image" error
message, additional filtering has to be employed in iotest 083, which
this patch does as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:22 +01:00
Stefan Hajnoczi
69447cd8f3 nbd: implement .bdrv_detach/attach_aio_context()
Drop the assumption that we're using the main AioContext.  Convert
qemu_aio_set_fd_handler() calls to aio_set_fd_handler().

The .bdrv_detach/attach_aio_context() interfaces also need to be
implemented to move the socket fd handler from the old to the new
AioContext.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 09:56:11 +02:00
Marc-André Lureau
e2bc625f9b nbd: pass export name as init argument
There is no need to keep the export name around, and it seems a better
fit as an argument in the init() call.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau
2302c1cafb Split nbd block client code
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00