Commit Graph

3384 Commits

Author SHA1 Message Date
Vladimir Sementsov-Ogievskiy
f35dff7e13 block/nbd-client: refactor request send/receive
Add nbd_co_request, to remove code duplications in
nbd_client_co_{pwrite,pread,...} functions. Also this is
needed for further refactoring.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170804151440.320927-8-vsementsov@virtuozzo.com>
[eblake: make nbd_co_request a wrapper, rather than merging two
existing functions]
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-30 13:00:38 -05:00
Vladimir Sementsov-Ogievskiy
07b1b99c78 block/nbd-client: rename nbd_recv_coroutines_enter_all
Rename nbd_recv_coroutines_enter_all to nbd_recv_coroutines_wake_all,
as it most probably just adds all recv coroutines into co_queue_wakeup,
rather than directly enter them.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170804151440.320927-9-vsementsov@virtuozzo.com>
[eblake: tweak commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-30 13:00:38 -05:00
Vladimir Sementsov-Ogievskiy
6faa077772 block/nbd-client: get rid of ssize_t
Use int variable for nbd_co_send_request return value (as
nbd_co_send_request returns int).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170804151440.320927-6-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-30 13:00:38 -05:00
Stefan Hajnoczi
3c2d5183f9 nbd-client: avoid read_reply_co entry if send failed
The following segfault is encountered if the NBD server closes the UNIX
domain socket immediately after negotiation:

  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
  441       QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
  (gdb) bt
  #0  0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
  #1  0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207
  #2  0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237
  #3  0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836
  #4  0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, f
+lags=0) at block/io.c:1086
  #5  0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182
  #6  0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032
  #7  0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079
  #8  0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
  #9  0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6

The problem is that nbd_client_init() uses
nbd_client_attach_aio_context() -> aio_co_schedule(new_context,
client->read_reply_co).  Execution of read_reply_co is deferred to a BH
which doesn't run until later.

In the mean time blk_co_preadv() can be called and nbd_coroutine_end()
calls aio_wake() on read_reply_co.  At this point in time
read_reply_co's ctx isn't set because it has never been entered yet.

This patch simplifies the nbd_co_send_request() ->
nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just
nbd_co_send_request() -> nbd_co_receive_reply().  The request is "ended"
if an error occurs at any point.  Callers no longer have to invoke
nbd_coroutine_end().

This cleanup also eliminates the segfault because we don't call
aio_co_schedule() to wake up s->read_reply_co if sending the request
failed.  It is only necessary to wake up s->read_reply_co if a reply was
received.

Note this only happens with UNIX domain sockets on Linux.  It doesn't
seem possible to reproduce this with TCP sockets.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170829122745.14309-2-stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-30 13:00:37 -05:00
Stefan Hajnoczi
3e4c705212 qcow2: allocate cluster_cache/cluster_data on demand
Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1)
* cluster_size + 512 bytes upfront.  Allocate s->cluster_cache and
s->cluster_data when the first read operation is performance on a
compressed cluster.

The buffers are freed in .bdrv_close().  .bdrv_open() no longer has any
code paths that can allocate these buffers, so remove the free functions
in the error code path.

This patch can result in significant memory savings when many qcow2
disks are attached or backing file chains are long:

Before 12.81% (1,023,193,088B)
After   5.36% (393,893,888B)

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170821135530.32344-1-stefanha@redhat.com
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-30 18:02:10 +01:00
Alberto Garcia
c3a8fe331e misc: Remove unused Error variables
There's a few cases which we're passing an Error pointer to a function
only to discard it immediately afterwards without checking it. In
these cases we can simply remove the variable and pass NULL instead.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170829120836.16091-1-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-30 11:58:26 +01:00
Stefan Hajnoczi
40f4a21895 nbd-client: avoid spurious qio_channel_yield() re-entry
The following scenario leads to an assertion failure in
qio_channel_yield():

1. Request coroutine calls qio_channel_yield() successfully when sending
   would block on the socket.  It is now yielded.
2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because
   nbd_receive_reply() failed.
3. Request coroutine is entered and returns from qio_channel_yield().
   Note that the socket fd handler has not fired yet so
   ioc->write_coroutine is still set.
4. Request coroutine attempts to send the request body with nbd_rwv()
   but the socket would still block.  qio_channel_yield() is called
   again and assert(!ioc->write_coroutine) is hit.

The problem is that nbd_read_reply_entry() does not distinguish between
request coroutines that are waiting to receive a reply and those that
are not.

This patch adds a per-request bool receiving flag so
nbd_read_reply_entry() can avoid spurious aio_wake() calls.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170822125113.5025-1-stefanha@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23 11:22:15 -05:00
Fam Zheng
045a2f8254 mirror: Mark target BB as "force allow inactivate"
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170823134242.12080-4-famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23 10:21:55 -05:00
Fam Zheng
ca2e214411 block-backend: Allow more "can inactivate" cases
These two conditions corresponds to mirror job's source and target,
which need to be allowed as they are part of the non-shared storage
migration workflow: failing to inactivate either will result in a
failure during migration completion.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170823134242.12080-3-famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[eblake: improve comment grammar]
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23 10:21:55 -05:00
Fam Zheng
c16de8f59a block-backend: Refactor inactivate check
The logic will be fixed (extended), move it to a separate function.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170823134242.12080-2-famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23 10:21:55 -05:00
Igor Mammedov
d0a180131c fix build failure in nbd_read_reply_entry()
travis builds fail at HEAD at rc3 master with

  block/nbd-client.c: In function ‘nbd_read_reply_entry’:
  block/nbd-client.c:110:8: error: ‘ret’ may be used uninitialized in this function [-Werror=uninitialized]

fix it by initializing 'ret' to 0

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-08-23 12:24:41 +01:00
Eric Blake
72b6ffc766 nbd-client: Fix regression when server sends garbage
When we switched NBD to use coroutines for qemu 2.9 (in particular,
commit a12a712a), we introduced a regression: if a server sends us
garbage (such as a corrupted magic number), we quit the read loop
but do not stop sending further queued commands, resulting in the
client hanging when it never reads the response to those additional
commands.  In qemu 2.8, we properly detected that the server is no
longer reliable, and cancelled all existing pending commands with
EIO, then tore down the socket so that all further command attempts
get EPIPE.

Restore the proper behavior of quitting (almost) all communication
with a broken server: Once we know we are out of sync or otherwise
can't trust the server, we must assume that any further incoming
data is unreliable and therefore end all pending commands with EIO,
and quit trying to send any further commands.  As an exception, we
still (try to) send NBD_CMD_DISC to let the server know we are going
away (in part, because it is easier to do that than to further
refactor nbd_teardown_connection, and in part because it is the
only command where we do not have to wait for a reply).

Based on a patch by Vladimir Sementsov-Ogievskiy.

A malicious server can be created with the following hack,
followed by setting NBD_SERVER_DEBUG to a non-zero value in the
environment when running qemu-nbd:

| --- a/nbd/server.c
| +++ b/nbd/server.c
| @@ -919,6 +919,17 @@ static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
|      stl_be_p(buf + 4, reply->error);
|      stq_be_p(buf + 8, reply->handle);
|
| +    static int debug;
| +    static int count;
| +    if (!count++) {
| +        const char *str = getenv("NBD_SERVER_DEBUG");
| +        if (str) {
| +            debug = atoi(str);
| +        }
| +    }
| +    if (debug && !(count % debug)) {
| +        buf[0] = 0;
| +    }
|      return nbd_write(ioc, buf, sizeof(buf), errp);
|  }

Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170814213426.24681-1-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-15 10:03:28 -05:00
Fam Zheng
5f7772c4d0 block-backend: Defer shared_perm tightening migration completion
As in the case of nbd_export_new(), bdrv_invalidate_cache() can be
called when migration is still in progress. In this case we are not
ready to tighten the shared permissions fenced by blk->disable_perm.

Defer to a VM state change handler.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170815130740.31229-4-famz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-15 10:03:28 -05:00
Fam Zheng
2b218f5dbc file-posix: Do runtime check for ofd lock API
It is reported that on Windows Subsystem for Linux, ofd operations fail
with -EINVAL. In other words, QEMU binary built with system headers that
exports F_OFD_SETLK doesn't necessarily run in an environment that
actually supports it:

$ qemu-system-aarch64 ... -drive file=test.vhdx,if=none,id=hd0 \
    -device virtio-blk-pci,drive=hd0
qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100
qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100
qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to lock byte 100

As a matter of fact this is not WSL specific. It can happen when running
a QEMU compiled against a newer glibc on an older kernel, such as in
a containerized environment.

Let's do a runtime check to cope with that.

Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-11 14:12:44 +02:00
Eric Blake
d0d5d0e31a qcow2: Check failure of bdrv_getlength()
qcow2_co_pwritev_compressed() should not call bdrv_truncate()
if determining the size failed.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-11 13:23:47 +02:00
Eric Blake
c40fe9c06c qcow2: Drop debugging dump_refcounts()
It's been #if 0'd since its introduction in 2006, commit 585f8587.
We can revive dead code if we need it, but in the meantime, it has
bit-rotted (for example, not checking for failure in bdrv_getlength()).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-11 13:23:45 +02:00
Eric Blake
81caa3cc3b vpc: Check failure of bdrv_getlength()
vpc_open() was checking for bdrv_getlength() failure in one, but
not the other, location.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-11 13:23:40 +02:00
Jeff Cody
113fe792fd block/nfs: fix mutex assertion in nfs_file_close()
Commit c096358e74 introduced assertion
checks for when qemu_mutex() functions are called without the
corresponding qemu_mutex_init() having initialized the mutex.

This uncovered a latent bug in qemu's nfs driver - in
nfs_client_close(), the NFSClient structure is overwritten with zeros,
prior to the mutex being destroyed.

Go ahead and destroy the mutex in nfs_client_close(), and change where
we call qemu_mutex_init() so that it is correctly balanced.

There are also a couple of memory leaks obscured by the memset, so this
fixes those as well.

Finally, we should be able to get rid of the memset(), as it isn't
necessary.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08 15:19:16 +02:00
Denis V. Lunev
e5e6268348 parallels: drop check that bdrv_truncate() is working
This would be actually strange and error prone. If truncate() nowadays
will fail, there is something fatally wrong. Let's check for that during
the actual work.

The only fallback case is when the file is not zero initialized. In this
case we should switch to preallocation via fallocate().

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08 15:19:16 +02:00
Denis V. Lunev
d8b83e37c3 parallels: respect error code of bdrv_getlength() in allocate_clusters()
If we can not get the file length, the state of BDS is broken completely.
Return error to the caller.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-08-08 15:19:16 +02:00
Denis V. Lunev
70d9110b44 block: respect error code from bdrv_getlength in handle_aiocb_write_zeroes
Original idea beyond the code in question was the following: we have failed
to write zeroes with fallocate(FALLOC_FL_ZERO_RANGE) as the simplest
approach and via fallocate(FALLOC_FL_PUNCH_HOLE)/fallocate(0). We have the
only chance now: if the request comes beyond end of the file. Thus we
should calculate file length and respect the error code from that op.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-08-08 15:19:16 +02:00
Fam Zheng
0e51b9b7c7 vmdk: Fix error handling/reporting of vmdk_check
Errors from the callees must be captured and propagated to our caller,
ensure this for both find_extent() and bdrv_getlength().

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08 15:19:16 +02:00
Kevin Wolf
809eb70ed6 block/null: Remove 'filename' option
This option was only added to allow 'null-co://' and 'null-aio://' as
filenames, its value never served any actual purpose and was ignored.
Nevertheless it was accepted as '-drive driver=null,filename=foo'.

The correct way to enable the protocol prefixes (and that without adding
a useless -drive option) is implementing .bdrv_parse_filename. This is
what this patch does.

Technically, this is an incompatible change, but the null block driver
is only used for benchmarking, testing and debugging, and an option
without effect isn't likely to be used by anyone anyway, so no bad
effects are to be expected.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-08 15:19:16 +02:00
Jeff Cody
95d729835f block/vhdx: check error return of bdrv_truncate()
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08 14:37:00 +02:00
Jeff Cody
c6572fa0d2 block/vhdx: check error return of bdrv_flush()
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08 14:37:00 +02:00
Jeff Cody
27539ac531 block/vhdx: check for offset overflow to bdrv_truncate()
VHDX uses uint64_t types for most offsets, following the VHDX spec.
However, bdrv_truncate() takes an int64_t value for the truncating
offset.  Check for overflow before calling bdrv_truncate().

While we are here, replace the bit shifting with QEMU_ALIGN_UP as well.

N.B.: For a compliant image this is not an issue, as the maximum VHDX
image size is defined per the spec to be 64TB.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08 14:37:00 +02:00
Jeff Cody
3f910692c2 block/vhdx: check error return of bdrv_getlength()
Calls to bdrv_getlength() were not checking for error.  In vhdx.c, this
can lead to truncating an image file, so it is a definite bug.  In
vhdx-log.c, the path for improper behavior is less clear, but it is best
to check in any case.

Some minor code movement of the log_guid intialization, as well.

Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08 14:37:00 +02:00
Alberto Garcia
795be0621a quorum: Set sectors-count to 0 when reporting a flush error
The QUORUM_REPORT_BAD event has fields to report the sector in which
the error was detected and the number of affected sectors starting
from that one. This is important for read and write errors, but not
for flush errors.

For flush errors the current code reports the total size of the disk
image. That is however not useful information in this case. Moreover,
the bdrv_getlength() call can fail, and there's no good way of
handling that failure.

Since we're reporting useless information and we cannot even guarantee
to do it in a consistent way, this patch changes the code to report 0
instead in all cases.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08 14:37:00 +02:00
Daniel P. Berrange
f42cf447e2 block: move trace probes into bdrv_co_preadv|pwritev
There are trace probes in bdrv_co_readv|writev, however, the
block drivers are being gradually moved over to using the
bdrv_co_preadv|pwritev functions instead. As a result some
block drivers miss the current probes. Move the probes
into bdrv_co_preadv|pwritev instead, so that they are triggered
by more (all?) I/O code paths.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170804105036.11879-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-07 09:39:35 +01:00
Peter Maydell
3b64f272d3 Block layer patches for 2.10.0-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZgKgMAAoJEH8JsnLIjy/W8FQP/i8f6lf3kpmTyqfyG/d9rU+V
 jizaa3uHBmmctQ9Ib0k1woFudSc1Rxt1kXtG3Cj45nYjHXYzWZOA2p3LF0bpRVtX
 keq3yXTLE+Il0ED6WiklKMS+BjpzxlnnHJpqnP3axl2TWZQtlQFIqO/f51RYtrVe
 SOaNdXxqEFQgV3uJIwtPdo38BBZvbIwA99+gCHM1YURkCESUnmy5h8plpnovJtMT
 Ta8sF+LXAjHtusYzfghNJ/p0Rpg3DkUmvHgo5QTA1F54AAPa3TePewqxssHaNK1T
 cDfDocvq9/gEMCMca2uyWNFxeqDZzaExoNUo2EVYoCPXKWr+vPaEgs9+RjMv6XUw
 d6lXZ6F0Rpm1zdtYQI8R1/ZpYcf29oo13q6fW0EEDx8y+9LMRKZP7pRtWA+MpNzT
 9iRMVm3y0G4FOaoWD9W9cMVfD9aJknz8j3pggIY8nUhvh7BqkEmbgoaO230AmDYc
 dVDDmGL8544g0x/v0USqe2ed/XdBkZSScOeKVeRpuS/r2E4UCBhhJNSefxzvn2+p
 GYj+M6HLZ+biyKBkK3gwyk1fT74vOMpOBzysbBpIN8kg9ySDSkFhvj7qE24YJMKT
 6yuWQE7WzPmXMiUz6hpn/m5TSjtVAXvL3BDc2lMz0HW6tHWJ5asv9zJjku+Ze9P4
 FGBPYzwJ585Wxq/Z6y5b
 =DIex
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches for 2.10.0-rc1

# gpg: Signature made Tue 01 Aug 2017 17:10:52 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  block/qapi: Remove redundant NULL check to silence Coverity
  qemu-iotests/059: Fix leaked image files
  qemu-iotests/063: Fix leaked image
  qemu-iotests/162: Fix leaked temporary files
  qemu-iotests/153: Fix leaked scratch images
  qemu-iotests/141: Fix image cleanup
  qemu-iotests: Remove blkdebug.conf after tests
  qemu-iotests/041: Fix leaked scratch images
  block: fix leaks in bdrv_open_driver()
  block: fix dangling bs->explicit_options in block.c
  iotests: Add test of recent fix to 'qemu-img measure'
  iotests: Check dirty bitmap statistics in 124
  iotests: Redirect stderr to stdout in 186
  iotests: Fix test 156

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-08-01 17:27:36 +01:00
Kevin Wolf
8e8eb0a903 block/qapi: Remove redundant NULL check to silence Coverity
When skipping implicit nodes in bdrv_block_device_info(), we know that
bs0 is always non-NULL; initially, because it's taken from a BdrvChild
and a BdrvChild never has a NULL bs, and after the first iteration
because implicit nodes always have a backing file.

Remove the NULL check and add an assertion that the implicit node does
indeed have a backing file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2017-08-01 18:09:33 +02:00
Vladimir Sementsov-Ogievskiy
8908eb1a4a trace-events: fix code style: print 0x before hex numbers
The only exception are groups of numers separated by symbols
'.', ' ', ':', '/', like 'ab.09.7d'.

This patch is made by the following:

> find . -name trace-events | xargs python script.py

where script.py is the following python script:
=========================
 #!/usr/bin/env python

import sys
import re
import fileinput

rhex = '%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)'
rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')')
rbad = re.compile('(?<!0x)' + rhex)

files = sys.argv[1:]

for fname in files:
    for line in fileinput.input(fname, inplace=True):
        arr = re.split(rgroup, line)
        for i in range(0, len(arr), 2):
            arr[i] = re.sub(rbad, '0x\g<0>', arr[i])

        sys.stdout.write(''.join(arr))
=========================

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-id: 20170731160135.12101-5-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-01 12:13:07 +01:00
Vladimir Sementsov-Ogievskiy
db73ee4bc8 trace-events: fix code style: %# -> 0x%
In trace format '#' flag of printf is forbidden. Fix it to '0x%'.

This patch is created by the following:

check that we have a problem
> find . -name trace-events | xargs grep '%#' | wc -l
56

check that there are no cases with additional printf flags before '#'
> find . -name trace-events | xargs grep "%[-+ 0'I]+#" | wc -l
0

check that there are no wrong usage of '#' and '0x' together
> find . -name trace-events | xargs grep '0x%#' | wc -l
0

fix the problem
> find . -name trace-events | xargs sed -i 's/%#/0x%/g'

[Eric Blake noted that xargs grep '%[-+ 0'I]+#' should be xargs grep
"%[-+ 0'I]+#" instead so the shell quoting is correct.
--Stefan]

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170731160135.12101-3-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-01 12:13:07 +01:00
Philippe Mathieu-Daudé
87e0331c5a docs: fix broken paths to docs/devel/tracing.txt
With the move of some docs/ to docs/devel/ on ac06724a71,
no references were updated.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-07-31 13:12:53 +03:00
Philippe Mathieu-Daudé
f80ac75d0e qcow2: fix null pointer dereference
It seems this assert() was somehow misplaced.

block/qcow2-refcount.c:2193:42: warning: Array access (from variable 'on_disk_reftable') results in a null pointer dereference
        on_disk_reftable[refblock_index] = refblock_offset;
        ~~~~~~~~~~~~~~~~                 ^

Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-07-31 13:06:38 +03:00
Vladimir Sementsov-Ogievskiy
b6b75a99da qcow2-bitmap: fix bitmap_free
Fix possible crash on error path in
qcow2_remove_persistent_dirty_bitmap. Although bitmap_free was added in
88ddffae8f the bug was introduced later in commit 469c71edc7 (when
qcow2_remove_persistent_dirty_bitmap was added).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170714123341.373857-1-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-25 16:33:31 +02:00
Daniel P. Berrange
0696ae2c92 qcow: fix memory leaks related to encryption
Fix leak of the 'encryptopts' string, which was mistakenly
declared const.

Fix leak of QemuOpts entry which should not have been deleted
from the opts array.

Reported by: coverity
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170714103105.5781-1-berrange@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-25 16:33:31 +02:00
Kevin Wolf
d3c8c67469 block: Skip implicit nodes in query-block/blockstats
Commits 0db832f and 6cdbceb introduced the automatic insertion of filter
nodes above the top layer of mirror and commit block jobs. The
assumption made there was that since libvirt doesn't do node-level
management of the block layer yet, it shouldn't be affected by added
nodes.

This is true as far as commands issued by libvirt are concerned. It only
uses BlockBackend names to address nodes, so any operations it performs
still operate on the root of the tree as intended.

However, the assumption breaks down when you consider query commands,
which return data for the wrong node now. These commands also return
information on some child nodes (bs->file and/or bs->backing), which
libvirt does make use of, and which refer to the wrong nodes, too.

One of the consequences is that oVirt gets wrong information about the
image size and stops the VM in response as long as a mirror or commit
job is running:

https://bugzilla.redhat.com/show_bug.cgi?id=1470634

This patch fixes the problem by hiding the implicit nodes created
automatically by the mirror and commit block jobs in the output of
query-block and BlockBackend-based query-blockstats as long as the user
doesn't indicate that they are aware of those nodes by providing a node
name for them in the QMP command to start the block job.

The node-based commands query-named-block-nodes and query-blockstats
with query-nodes=true still show all nodes, including implicit ones.
This ensures that users that are capable of node-level management can
still access the full information; users that only know BlockBackends
won't use these commands.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
2017-07-24 15:06:04 +02:00
Eric Blake
24bae02b19 qcow2: Fix sector calculation in qcow2_measure()
We used MAX() instead of the intended MIN() when computing how many
sectors to view in the current loop iteration of qcow2_measure(),
and passed in a value of INT_MAX sectors instead of our more usual
limit of BDRV_REQUEST_MAX_SECTORS (the latter avoids 32-bit overflow
on conversion to bytes).  For small files, the bug is harmless:
bdrv_get_block_status_above() clamps its *pnum answer to the BDS
size, regardless of any insanely larger input request.  However, for
any file at least 2T in size, we can very easily end up going into an
infinite loop (the maximum of 0x100000000 sectors and INT_MAX is a
64-bit quantity, which becomes 0 when assigned to int; once nb_sectors
is 0, we never make progress).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-24 15:06:04 +02:00
Eric Blake
6c98c57af3 dirty-bitmap: Report BlockDirtyInfo.count in bytes, as documented
We've been documenting the value in bytes since its introduction
in commit b9a9b3a4 (v1.3), where it was actually reported in bytes.

Commit e4654d2 (v2.0) then removed things from block/qapi.c, in
preparation for a rewrite to a list of dirty sectors in the next
commit 21b5683 in block.c, but the new code mistakenly started
reporting in sectors.

Fixes: https://bugzilla.redhat.com/1441460

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-24 15:06:04 +02:00
Mark Cave-Ayland
c8115f8eb8 block/vpc: fix uninitialised variable compiler warning
Since commit cfc87e00 "block/vpc.c: Handle write failures in
get_image_offset()" older versions of gcc (in this case 4.7) incorrectly
warn that "ret" can be used uninitialised in vpc_co_pwritev().

Setting ret to 0 at the start of vpc_co_pwritev() prevents the warning
in gcc 4.7 and enables compilation with -Werror to succeed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1500625265-23844-1-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-21 15:00:07 +01:00
Max Reitz
7c8730d45f block/vvfat: Fix compiler warning with gcc 7
gcc 7 complains that the sprintf() might write a null byte beyond the
end of the tail buffer.  That is wrong, but we can silence it by making
i unsigned (it can never be negative anyway, see the if condition right
before).  For some reason, this allows gcc to suddenly accurately
calculate the range of i so we can give the tail[] array the exact size
it needs to have (which is 8 bytes) without gcc complaining.

In addition, let us convert the sprintf() to snprintf(), because that is
always nicer, and add an assertion about the range of the return value
afterwards so we can see that "8 - len" will never be negative and thus
"entry->name + MIN(j, 8 - len)" will never be out of bounds.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18 15:14:36 +02:00
Hervé Poussineau
f80256b7ee vvfat: initialize memory after allocating it
This prevents some host to guest memory content leaks.

Fixes: https://bugs.launchpad.net/qemu/+bug/1599539

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18 15:14:36 +02:00
Hervé Poussineau
e03da26b71 vvfat: correctly parse non-ASCII short and long file names
Write support works again when image contains non-ASCII names. It is either the
case when user created a non-ASCII filename, or when initial directory contained
a non-ASCII filename (since 0c36111f57)

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18 15:14:36 +02:00
Hervé Poussineau
63d261cb0d vvfat: add a constant for bootsector name
Also add links to related compatibility problems.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18 15:14:36 +02:00
Hervé Poussineau
8c4517fd6e vvfat: add constants for special values of name[0]
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18 15:14:36 +02:00
Kevin Wolf
ec18b0a93a block: List anonymous device BBs in query-block
Instead of listing only monitor-owned BlockBackends in query-block, also
add those anonymous BlockBackends that are owned by a qdev device and as
such under the control of the user.

This allows using query-block to inspect BlockBackends for the modern
configuration syntax with -blockdev and -device.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2017-07-18 15:14:36 +02:00
Kevin Wolf
d5b68844e6 block/qapi: Use blk_all_next() for query-block
This patch replaces the blk_next() loop in query-block by a
blk_all_next() one so that we also get access to BlockBackends that
aren't owned by the monitor. For now, the next thing we do is check
whether each BB has a name, so there is no semantic difference.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2017-07-18 15:14:36 +02:00
Kevin Wolf
a429b9b5f4 block: Make blk_all_next() public
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2017-07-18 15:14:36 +02:00
Kevin Wolf
46eade7be8 block/qapi: Add qdev device name to query-block
With -blockdev/-device, users can indirectly create anonymous
BlockBackends, while the state of such backends is still of interest. As
a preparation for making such BBs visible in query-block, make sure that
they can be identified even without a name by adding the ID/QOM path of
their qdev device to BlockInfo.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2017-07-18 15:14:35 +02:00
Kevin Wolf
77beef8365 block: Make blk_get_attached_dev_id() public
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2017-07-18 15:14:35 +02:00
Peter Maydell
cfc87e00c2 block/vpc.c: Handle write failures in get_image_offset()
Coverity (CID 1355236) points out that get_image_offset() doesn't check that
it actually succeeded in writing the updated block bitmap to the file.
Check the error return from bdrv_pwrite_sync() and propagate an error
response back up to the function which calls get_image_offset() for
a write so that it can return the error to its caller.

get_sector_offset() is only used for reads, but we move it to the
same API for consistency.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18 15:14:35 +02:00
Peter Maydell
9877860e7b block/vmdk: Report failures in vmdk_read_cid()
The function vmdk_read_cid() can fail if the read on the underlying
block device fails, or if there's a format error in the VMDK file.
However its API doesn't provide a mechanism to report these errors,
and in some cases we were returning a CID of 0 and in some cases a
CID of 0xffffffff, either of which might potentially be valid values.

Change the function to return 0 on success or a negative errno, and
return the CID via a uint32_t* argument. Update the callsites to
handle and propagate the error appropriately.

This fixes in passing a Coverity-spotted issue (CID 1350038) where
we weren't checking the return value from sscanf().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18 15:14:35 +02:00
Manos Pitsidianakis
27e4cf1303 block: remove timer canceling in throttle_config()
throttle_config() cancels the timers of the calling BlockBackend. This
doesn't make sense because other BlockBackends in the group remain
untouched. There's no need to cancel the timers in the one specific
BlockBackend so let's not do that. Throttled requests will run as
scheduled and future requests will follow the new configuration. This
also allows a throttle group's configuration to be changed even when it
has no members.

Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18 15:14:35 +02:00
Manos Pitsidianakis
dbe824cc57 block: add clock_type field to ThrottleGroup
Clock type in throttling is currently inferred by the ThrottleTimer's
clock type even though it is a per-ThrottleGroup property; it doesn't
make sense to have different clock types in the same group. Moving this
to a field in ThrottleGroup can simplify some of the throttle functions.

Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18 15:14:35 +02:00
Kevin Wolf
b1e1fa0c3a commit: Add NULL check for overlay_bs
I can't see how overlay_bs could become NULL with the current code, but
other code in this function already checks it and we can make Coverity
happy with this check, so let's add it.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-18 15:14:35 +02:00
Peter Maydell
718d7f4f9c -----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZbNpiAAoJEJykq7OBq3PIKcIIAMEJpEiWQonSJZVV4fxqcbOF
 dXJSVxHqrrVUrcM8NY2zGXIwcGS8RBNZG+Yx/SEZgIljoYH4NFmbvKXWS2zgHSyr
 LUH0M6gYlQ/1vQTXwQrkJdtmgfc3xNVrQbanbynK3+aB1S5Y6pRGauDo8SqCBWu0
 uLWkhcSQbG+OHD8Go5X1kZUSdpP8yOqKrxcNLe980ghi4HPMUydL3lbs4SwNlnRt
 NJIpMTGzJrL+CqyakIL+/PT9RBGCo4hllPD0CgX6HETEkuojxxXaqJIG+Tzj2FeU
 fXkoK1YQHEHLdVXnPkpModoPylhRqIQcPXoXt+aMvoLSM+bLooYXMYryMPoPDGI=
 =VgCF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

# gpg: Signature made Mon 17 Jul 2017 16:40:18 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  block: fix shadowed variable in bdrv_co_pdiscard
  util/aio-win32: Only select on what we are actually waiting for

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-18 13:09:51 +01:00
Denis V. Lunev
593ed6f0a3 block: fix shadowed variable in bdrv_co_pdiscard
We've had a shadowed 'ret' variable, which risks returning the wrong
value, introduced in commit b9c64947.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170710150559.30163-1-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 15:58:37 +01:00
Paolo Bonzini
5aca18a4ff ssh: support I/O from any AioContext
The coroutine may run in a different AioContext, causing the
fd handler to busy wait.  Fix this by resetting the handler
in restart_coroutine, before the coroutine is restarted.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-12-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:34:20 +08:00
Paolo Bonzini
f1af3251f8 sheepdog: add queue_lock
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-11-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:34:20 +08:00
Paolo Bonzini
1f01e50b83 qed: protect table cache with CoMutex
This makes the driver thread-safe.  The CoMutex is dropped temporarily
while accessing the data clusters or the backing file.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-10-pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:34:11 +08:00
Paolo Bonzini
61c7887e0f qed: introduce bdrv_qed_init_state
This will be used in the next patch, which will call bdrv_qed_do_open
with a CoMutex taken.  bdrv_qed_init_state provides a nice place to
initialize it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-9-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:33:11 +08:00
Paolo Bonzini
61124f03ab block: invoke .bdrv_drain callback in coroutine context and from AioContext
This will let the callback take a CoMutex in the next patch.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-8-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:28:15 +08:00
Paolo Bonzini
e7569c1829 qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc}
This part is never called for in-place writes, move it away to avoid
the "backwards" coding style typical of callback-based code.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-7-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:28:15 +08:00
Paolo Bonzini
254aee4dbb vvfat: make it thread-safe
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-6-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:28:15 +08:00
Paolo Bonzini
778b087e51 vpc: make it thread-safe
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-5-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:28:15 +08:00
Paolo Bonzini
1e88663979 vdi: make it thread-safe
The VirtualBox driver is using a mutex to order all allocating writes,
but it is not protecting accesses to the bitmap because they implicitly
happen under the AioContext mutex.  Change this to use a CoRwlock
explicitly.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-4-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:28:15 +08:00
Paolo Bonzini
a8c57408cd qcow2: call CoQueue APIs under CoMutex
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170629132749.997-2-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-17 11:28:15 +08:00
Peter Maydell
6c6076662d * gdbstub fixes (Alex)
* IOMMU MemoryRegion subclass (Alexey)
 * Chardev hotswap (Anton)
 * NBD_OPT_GO support (Eric)
 * Misc bugfixes
 * DEFINE_PROP_LINK (minus the ARM patches - Fam)
 * MAINTAINERS updates (Philippe)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJZaJejAAoJEL/70l94x66DwQ4H/0NUvh/Zfs64wE1iuZJACc24
 1za02fFaB50vFDwQKWbM0GkHzDxoXBHk4Rvn92p+VSxpKtaAX4GRwCvxRA5GeUtm
 GAYbdIJUe0UELepKExrlUVzQcK9VfljoJpK3dZkP5Zzx83L2PAI/SexrZRibN2Uf
 yRI60uvlsMWU12nenzdVnYORd+TWDNKele7BhMrX/FX9wxaS1PlnsnKZggy6CU7G
 8dwZJAZJ/s5tRGXyXyAQzLm5JZQCLnA6jxya540TbPeciFgbvvS2ydIitZ54vSPO
 VtmZ1rSWfTEbNF5xGD1Ztu8aAENr5/I05l6IjxZd45BdUCW3HxeJkc+7lE0K4uk=
 =wnVs
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* gdbstub fixes (Alex)
* IOMMU MemoryRegion subclass (Alexey)
* Chardev hotswap (Anton)
* NBD_OPT_GO support (Eric)
* Misc bugfixes
* DEFINE_PROP_LINK (minus the ARM patches - Fam)
* MAINTAINERS updates (Philippe)

# gpg: Signature made Fri 14 Jul 2017 11:06:27 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (55 commits)
  spapr_rng: Convert to DEFINE_PROP_LINK
  cpu: Convert to DEFINE_PROP_LINK
  mips_cmgcr: Convert to DEFINE_PROP_LINK
  ivshmem: Convert to DEFINE_PROP_LINK
  dimm: Convert to DEFINE_PROP_LINK
  virtio-crypto: Convert to DEFINE_PROP_LINK
  virtio-rng: Convert to DEFINE_PROP_LINK
  virtio-scsi: Convert to DEFINE_PROP_LINK
  virtio-blk: Convert to DEFINE_PROP_LINK
  qdev: Add const qualifier to PropertyInfo definitions
  qmp: Use ObjectProperty.type if present
  qdev: Introduce DEFINE_PROP_LINK
  qdev: Introduce PropertyInfo.create
  qom: enforce readonly nature of link's check callback
  translate-all: remove redundant !tcg_enabled check in dump_exec_info
  vl: fix breakage of -tb-size
  nbd: Implement NBD_INFO_BLOCK_SIZE on client
  nbd: Implement NBD_INFO_BLOCK_SIZE on server
  nbd: Implement NBD_OPT_GO on client
  nbd: Implement NBD_OPT_GO on server
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-14 12:16:09 +01:00
Eric Blake
081dd1fe36 nbd: Implement NBD_INFO_BLOCK_SIZE on client
The upstream NBD Protocol has defined a new extension to allow
the server to advertise block sizes to the client, as well as
a way for the client to inform the server whether it intends to
obey block sizes.

When using the block layer as the client, we will obey block
sizes; but when used as 'qemu-nbd -c' to hand off to the
kernel nbd module as the client, we are still waiting for the
kernel to implement a way for us to learn if it will honor
block sizes (perhaps by an addition to sysfs, rather than an
ioctl), as well as any way to tell the kernel what additional
block sizes to obey (NBD_SET_BLKSIZE appears to be accurate
for the minimum size, but preferred and maximum sizes would
probably be new ioctl()s), so until then, we need to make our
request for block sizes conditional.

When using ioctl(NBD_SET_BLKSIZE) to hand off to the kernel,
use the minimum block size as the sector size if it is larger
than 512, which also has the nice effect of cooperating with
(non-qemu) servers that don't do read-modify-write when
exposing a block device with 4k sectors; it might also allow
us to visit a file larger than 2T on a 32-bit kernel.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170707203049.534-10-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:42 +02:00
Eric Blake
004a89fce9 nbd: Create struct for tracking export info
The NBD Protocol is introducing some additional information
about exports, such as minimum request size and alignment, as
well as an advertised maximum request size.  It will be easier
to feed this information back to the block layer if we gather
all the information into a struct, rather than adding yet more
pointer parameters during negotiation.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170707203049.534-2-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
Peter Maydell
a309b290aa Error reporting patches for 2017-07-13
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZZ1/BAAoJEDhwtADrkYZTo7oP+gLj4B4kkp/DJnkzfuMMD1Ce
 ZPddZ8Z9RyXE4fS66sq1ODBQo5U+aQQZO7K234+jf8V4cKWW98lpVzLc3YdAHm2U
 ZF6Z9Rji5K4414ZsUcg92Zlovvdaji+mY0ooINav+4mqlONYrz29ntApWc0e0tGc
 e3tj4XDLhJrOM+mIx8vzixFlgSYj+6HgEiybYwolEK5svQbIQao3Y2omyb+zy0w0
 RDT3XQnAAaZSOQAXcJGkhekkyMe0jMHOF0tULLx1uDQYctg9mUGlAGTZ5oTLgSve
 TCpSJwWCAx8XAJMkXyDRrdRFDLeUh6yGY7NTqAL3OuPSoAw9ygKrHyhTavxBJL+W
 rX7Qit3dmVrlZLviwNFQplAKYb10d08vBoKXmrnW5oVCmPEDvJIQfncbucpA/CNS
 ucdJ3RMLuDbbWdl+5tsL7jfiZAG7oSgAePTjN1rm0bDe5JN7NAU8WzHnKfE83iZq
 R+I3hofqGoiXSByYRLamZb+6nsURAxWPhcqcw7hdMsk7UI6dyZwWl9Fnm72w0BZK
 M5LHLkX0LYc+kZjiLKXlNK7Z50bXY0zKQpPCLH3nHA69iMiwVoozrjwa9iCKIxE+
 7ZlOfsu4ztExuicEyTr8b27CBrHjJjYDuFP0hroEOzqCKXUzegoq3oYMGP0doXxe
 o3xcwXVKT/1PudddyR4z
 =tByN
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-07-13' into staging

Error reporting patches for 2017-07-13

# gpg: Signature made Thu 13 Jul 2017 12:55:45 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-error-2017-07-13:
  Convert error_report*_err() to warn_report*_err()
  error: Implement the warn and free Error functions
  char-socket: Report TCP socket waiting as information
  Convert error_report() to warn_report()
  error: Functions to report warnings and informational messages
  util/qemu-error: Rename error_print_loc() to be more generic
  websock: Don't try to set *errp directly
  block: Don't try to set *errp directly
  xilinx: Fix latent error handling bug

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-14 09:36:40 +01:00
Alistair Francis
3dc6f86936 Convert error_report() to warn_report()
Convert all uses of error_report("warning:"... to use warn_report()
instead. This helps standardise on a single method of printing warnings
to the user.

All of the warnings were changed using these two commands:
    find ./* -type f -exec sed -i \
      's|error_report(".*warning[,:] |warn_report("|Ig' {} +

Indentation fixed up manually afterwards.

The test-qdev-global-props test case was manually updated to ensure that
this patch passes make check (as the test cases are case sensitive).

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Suggested-by: Thomas Huth <thuth@redhat.com>
Cc: Jeff Cody <jcody@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Lieven <pl@kamp.de>
Cc: Josh Durgin <jdurgin@redhat.com>
Cc: "Richard W.M. Jones" <rjones@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Greg Kurz <groug@kaod.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Peter Chubb <peter.chubb@nicta.com.au>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alexander Graf <agraf@suse.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Greg Kurz <groug@kaod.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed by: Peter Chubb <peter.chubb@data61.csiro.au>
Acked-by: Max Reitz <mreitz@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Message-Id: <e1cfa2cd47087c248dd24caca9c33d9af0c499b0.1499866456.git.alistair.francis@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-07-13 13:49:58 +02:00
Max Reitz
772d1f973f block/qcow2: falloc/full preallocating growth
Implement the preallocation modes falloc and full for growing qcow2
images.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613202107.10125-15-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz
60c48a29b7 block/qcow2: Rename "fail_block" to just "fail"
Now alloc_refcount_block() only contains a single fail label, so it
makes more sense to just name it "fail" instead of "fail_block".

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-14-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz
12cc30a8cb block/qcow2: Add qcow2_refcount_area()
This function creates a collection of self-describing refcount
structures (including a new refcount table) at the end of a qcow2 image
file. Optionally, these structures can also describe a number of
additional clusters beyond themselves; this will be important for
preallocated truncation, which will place the data clusters and L2
tables there.

For now, we can use this function to replace the part of
alloc_refcount_block() that grows the refcount table (from which it is
actually derived).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-13-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz
95b98f343b block/qcow2: Metadata preallocation for truncate
We can support PREALLOC_MODE_METADATA by invoking preallocate() in
qcow2_truncate().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613202107.10125-12-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz
652fecd005 block/qcow2: Lock s->lock in preallocate()
preallocate() is and will be called only from places that do not
otherwise need to lock s->lock: Currently that is qcow2_create2(), as of
a future patch it will be called from qcow2_truncate(), too.

It therefore makes sense to move locking that mutex into preallocate()
itself.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-11-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz
7bc45dc172 block/qcow2: Generalize preallocate()
This patch adds two new parameters to the preallocate() function so we
will be able to use it not just for preallocating a new image but also
for preallocated image growth.

The offset parameter allows the caller to specify a virtual offset from
which to start preallocating. For newly created images this is always 0,
but for preallocating growth this will be the old image length.

The new_length parameter specifies the supposed new length of the image
(basically the "end offset" for preallocation). During image truncation,
bdrv_getlength() will return the old image length so we cannot rely on
its return value then.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz
35d72602ec block/file-posix: Preallocation for truncate
By using raw_regular_truncate() in raw_truncate(), we can now easily
support preallocation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-9-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz
d0bc9e5d5e block/file-posix: Generalize raw_regular_truncate
Currently, raw_regular_truncate() is intended for setting the size of a
newly created file. However, we also want to use it for truncating an
existing file in which case only the newly added space (when growing)
should be preallocated.

This also means that if resizing failed, we should try to restore the
original file size. This is important when using preallocation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz
9f63b07ee7 block/file-posix: Extract raw_regular_truncate()
This functionality is part of raw_create() which we will be able to
reuse nicely in raw_truncate().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170613202107.10125-7-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz
7dacd8bd3d block/file-posix: Small fixes in raw_create()
Variables should be declared at the start of a block, and if a certain
parameter value is not supported it may be better to return -ENOTSUP
instead of -EINVAL.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170613202107.10125-6-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz
3a691c50f1 block: Add PreallocMode to blk_truncate()
blk_truncate() itself will pass that value to bdrv_truncate(), and all
callers of blk_truncate() just set the parameter to PREALLOC_MODE_OFF
for now.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz
7ea37c3066 block: Add PreallocMode to bdrv_truncate()
For block drivers that just pass a truncate request to the underlying
protocol, we can now pass the preallocation mode instead of aborting if
it is not PREALLOC_MODE_OFF.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz
8243ccb743 block: Add PreallocMode to BD.bdrv_truncate()
Add a PreallocMode parameter to the bdrv_truncate() function implemented
by each block driver. Currently, we always pass PREALLOC_MODE_OFF and no
driver accepts anything else.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Stefan Hajnoczi
c501c35220 qcow2: add bdrv_measure() support
Use qcow2_calc_prealloc_size() to get the required file size.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-7-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Stefan Hajnoczi
0eb4a8c1df qcow2: extract image creation option parsing
The image creation options parsed by qcow2_create() are also needed to
implement .bdrv_measure().  Extract the parsing code, including input
validation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-6-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Stefan Hajnoczi
7c5bcc4212 qcow2: make refcount size calculation conservative
The refcount metadata size calculation is inaccurate and can produce
numbers that are too small.  This is bad because we should calculate a
conservative number - one that is guaranteed to be large enough.

This patch switches the approach to a fixed point calculation because
the existing equation is hard to solve when inaccuracies are taken care
of.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-5-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Stefan Hajnoczi
95c67e3bd7 qcow2: extract preallocation calculation function
Calculating the preallocated image size will be needed to implement
.bdrv_measure().  Extract the code out into a separate function.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-4-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Stefan Hajnoczi
a843a22a82 raw-format: add bdrv_measure() support
Maximum size calculation is trivial for the raw format: it's just the
requested image size (because there is no metadata).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-3-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Vladimir Sementsov-Ogievskiy
615b5dcf2d block: release persistent bitmaps on inactivate
We should release them here to reload on invalidate cache.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-31-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy
469c71edc7 qcow2: add .bdrv_remove_persistent_dirty_bitmap
Realize .bdrv_remove_persistent_dirty_bitmap interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-29-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy
56f364e6d7 block/dirty-bitmap: add bdrv_remove_persistent_dirty_bitmap
Interface for removing persistent bitmap from its storage.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-28-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy
a3b52535e8 qmp: add x-debug-block-dirty-bitmap-sha256
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170628120530.31251-26-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy
da0eb242ad qcow2: add .bdrv_can_store_new_dirty_bitmap
Realize .bdrv_can_store_new_dirty_bitmap interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-23-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy
169b879359 qcow2: store bitmaps on reopening image as read-only
Store bitmaps and mark them read-only on reopening image as read-only.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-21-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy
5f72826e7f qcow2: add persistent dirty bitmaps support
Store persistent dirty bitmaps in qcow2 image.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-20-vsementsov@virtuozzo.com
[mreitz: Always assign ret in store_bitmap() in case of an error]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy
3dd10a06d1 block/dirty-bitmap: add bdrv_dirty_bitmap_next()
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-19-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy
a88b179fdb block: introduce persistent dirty bitmaps
New field BdrvDirtyBitmap.persistent means, that bitmap should be saved
by format driver in .bdrv_close and .bdrv_inactivate. No format driver
supports it for now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170628120530.31251-18-vsementsov@virtuozzo.com
[mreitz: Fixed indentation]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00