mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Marc-André Lureau	6fb0022b48	dmg: use DIV_ROUND_UP I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-08-31 12:29:07 +02:00
Marc-André Lureau	cf7a09c1e4	vhdx: use QEMU_ALIGN_DOWN I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-08-31 12:29:07 +02:00
Vladimir Sementsov-Ogievskiy	f35dff7e13	block/nbd-client: refactor request send/receive Add nbd_co_request, to remove code duplications in nbd_client_co_{pwrite,pread,...} functions. Also this is needed for further refactoring. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-8-vsementsov@virtuozzo.com> [eblake: make nbd_co_request a wrapper, rather than merging two existing functions] Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-30 13:00:38 -05:00
Vladimir Sementsov-Ogievskiy	07b1b99c78	block/nbd-client: rename nbd_recv_coroutines_enter_all Rename nbd_recv_coroutines_enter_all to nbd_recv_coroutines_wake_all, as it most probably just adds all recv coroutines into co_queue_wakeup, rather than directly enter them. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-9-vsementsov@virtuozzo.com> [eblake: tweak commit message] Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-30 13:00:38 -05:00
Vladimir Sementsov-Ogievskiy	6faa077772	block/nbd-client: get rid of ssize_t Use int variable for nbd_co_send_request return value (as nbd_co_send_request returns int). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-6-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-30 13:00:38 -05:00
Stefan Hajnoczi	3c2d5183f9	nbd-client: avoid read_reply_co entry if send failed The following segfault is encountered if the NBD server closes the UNIX domain socket immediately after negotiation: Program terminated with signal SIGSEGV, Segmentation fault. #0 aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 441 QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines, (gdb) bt #0 0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 #1 0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207 #2 0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237 #3 0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836 #4 0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, f +lags=0) at block/io.c:1086 #5 0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182 #6 0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032 #7 0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079 #8 0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79 #9 0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6 The problem is that nbd_client_init() uses nbd_client_attach_aio_context() -> aio_co_schedule(new_context, client->read_reply_co). Execution of read_reply_co is deferred to a BH which doesn't run until later. In the mean time blk_co_preadv() can be called and nbd_coroutine_end() calls aio_wake() on read_reply_co. At this point in time read_reply_co's ctx isn't set because it has never been entered yet. This patch simplifies the nbd_co_send_request() -> nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just nbd_co_send_request() -> nbd_co_receive_reply(). The request is "ended" if an error occurs at any point. Callers no longer have to invoke nbd_coroutine_end(). This cleanup also eliminates the segfault because we don't call aio_co_schedule() to wake up s->read_reply_co if sending the request failed. It is only necessary to wake up s->read_reply_co if a reply was received. Note this only happens with UNIX domain sockets on Linux. It doesn't seem possible to reproduce this with TCP sockets. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170829122745.14309-2-stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-30 13:00:37 -05:00
Stefan Hajnoczi	3e4c705212	qcow2: allocate cluster_cache/cluster_data on demand Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1) * cluster_size + 512 bytes upfront. Allocate s->cluster_cache and s->cluster_data when the first read operation is performance on a compressed cluster. The buffers are freed in .bdrv_close(). .bdrv_open() no longer has any code paths that can allocate these buffers, so remove the free functions in the error code path. This patch can result in significant memory savings when many qcow2 disks are attached or backing file chains are long: Before 12.81% (1,023,193,088B) After 5.36% (393,893,888B) Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170821135530.32344-1-stefanha@redhat.com Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-30 18:02:10 +01:00
Alberto Garcia	c3a8fe331e	misc: Remove unused Error variables There's a few cases which we're passing an Error pointer to a function only to discard it immediately afterwards without checking it. In these cases we can simply remove the variable and pass NULL instead. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170829120836.16091-1-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-30 11:58:26 +01:00
Stefan Hajnoczi	40f4a21895	nbd-client: avoid spurious qio_channel_yield() re-entry The following scenario leads to an assertion failure in qio_channel_yield(): 1. Request coroutine calls qio_channel_yield() successfully when sending would block on the socket. It is now yielded. 2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because nbd_receive_reply() failed. 3. Request coroutine is entered and returns from qio_channel_yield(). Note that the socket fd handler has not fired yet so ioc->write_coroutine is still set. 4. Request coroutine attempts to send the request body with nbd_rwv() but the socket would still block. qio_channel_yield() is called again and assert(!ioc->write_coroutine) is hit. The problem is that nbd_read_reply_entry() does not distinguish between request coroutines that are waiting to receive a reply and those that are not. This patch adds a per-request bool receiving flag so nbd_read_reply_entry() can avoid spurious aio_wake() calls. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170822125113.5025-1-stefanha@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-23 11:22:15 -05:00
Fam Zheng	045a2f8254	mirror: Mark target BB as "force allow inactivate" Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-4-famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-23 10:21:55 -05:00
Fam Zheng	ca2e214411	block-backend: Allow more "can inactivate" cases These two conditions corresponds to mirror job's source and target, which need to be allowed as they are part of the non-shared storage migration workflow: failing to inactivate either will result in a failure during migration completion. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-3-famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [eblake: improve comment grammar] Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-23 10:21:55 -05:00
Fam Zheng	c16de8f59a	block-backend: Refactor inactivate check The logic will be fixed (extended), move it to a separate function. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-2-famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-23 10:21:55 -05:00
Igor Mammedov	d0a180131c	fix build failure in nbd_read_reply_entry() travis builds fail at HEAD at rc3 master with block/nbd-client.c: In function ‘nbd_read_reply_entry’: block/nbd-client.c:110:8: error: ‘ret’ may be used uninitialized in this function [-Werror=uninitialized] fix it by initializing 'ret' to 0 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-08-23 12:24:41 +01:00
Eric Blake	72b6ffc766	nbd-client: Fix regression when server sends garbage When we switched NBD to use coroutines for qemu 2.9 (in particular, commit `a12a712a`), we introduced a regression: if a server sends us garbage (such as a corrupted magic number), we quit the read loop but do not stop sending further queued commands, resulting in the client hanging when it never reads the response to those additional commands. In qemu 2.8, we properly detected that the server is no longer reliable, and cancelled all existing pending commands with EIO, then tore down the socket so that all further command attempts get EPIPE. Restore the proper behavior of quitting (almost) all communication with a broken server: Once we know we are out of sync or otherwise can't trust the server, we must assume that any further incoming data is unreliable and therefore end all pending commands with EIO, and quit trying to send any further commands. As an exception, we still (try to) send NBD_CMD_DISC to let the server know we are going away (in part, because it is easier to do that than to further refactor nbd_teardown_connection, and in part because it is the only command where we do not have to wait for a reply). Based on a patch by Vladimir Sementsov-Ogievskiy. A malicious server can be created with the following hack, followed by setting NBD_SERVER_DEBUG to a non-zero value in the environment when running qemu-nbd: \| --- a/nbd/server.c \| +++ b/nbd/server.c \| @@ -919,6 +919,17 @@ static int nbd_send_reply(QIOChannel ioc, NBDReply reply, Error *errp) \| stl_be_p(buf + 4, reply->error); \| stq_be_p(buf + 8, reply->handle); \| \| + static int debug; \| + static int count; \| + if (!count++) { \| + const char str = getenv("NBD_SERVER_DEBUG"); \| + if (str) { \| + debug = atoi(str); \| + } \| + } \| + if (debug && !(count % debug)) { \| + buf[0] = 0; \| + } \| return nbd_write(ioc, buf, sizeof(buf), errp); \| } Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170814213426.24681-1-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-15 10:03:28 -05:00
Fam Zheng	5f7772c4d0	block-backend: Defer shared_perm tightening migration completion As in the case of nbd_export_new(), bdrv_invalidate_cache() can be called when migration is still in progress. In this case we are not ready to tighten the shared permissions fenced by blk->disable_perm. Defer to a VM state change handler. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170815130740.31229-4-famz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-15 10:03:28 -05:00
Fam Zheng	2b218f5dbc	file-posix: Do runtime check for ofd lock API It is reported that on Windows Subsystem for Linux, ofd operations fail with -EINVAL. In other words, QEMU binary built with system headers that exports F_OFD_SETLK doesn't necessarily run in an environment that actually supports it: $ qemu-system-aarch64 ... -drive file=test.vhdx,if=none,id=hd0 \ -device virtio-blk-pci,drive=hd0 qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100 qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100 qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to lock byte 100 As a matter of fact this is not WSL specific. It can happen when running a QEMU compiled against a newer glibc on an older kernel, such as in a containerized environment. Let's do a runtime check to cope with that. Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-11 14:12:44 +02:00
Eric Blake	d0d5d0e31a	qcow2: Check failure of bdrv_getlength() qcow2_co_pwritev_compressed() should not call bdrv_truncate() if determining the size failed. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-11 13:23:47 +02:00
Eric Blake	c40fe9c06c	qcow2: Drop debugging dump_refcounts() It's been #if 0'd since its introduction in 2006, commit `585f8587`. We can revive dead code if we need it, but in the meantime, it has bit-rotted (for example, not checking for failure in bdrv_getlength()). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-11 13:23:45 +02:00
Eric Blake	81caa3cc3b	vpc: Check failure of bdrv_getlength() vpc_open() was checking for bdrv_getlength() failure in one, but not the other, location. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-11 13:23:40 +02:00
Jeff Cody	113fe792fd	block/nfs: fix mutex assertion in nfs_file_close() Commit `c096358e74` introduced assertion checks for when qemu_mutex() functions are called without the corresponding qemu_mutex_init() having initialized the mutex. This uncovered a latent bug in qemu's nfs driver - in nfs_client_close(), the NFSClient structure is overwritten with zeros, prior to the mutex being destroyed. Go ahead and destroy the mutex in nfs_client_close(), and change where we call qemu_mutex_init() so that it is correctly balanced. There are also a couple of memory leaks obscured by the memset, so this fixes those as well. Finally, we should be able to get rid of the memset(), as it isn't necessary. Cc: qemu-stable@nongnu.org Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 15:19:16 +02:00
Denis V. Lunev	e5e6268348	parallels: drop check that bdrv_truncate() is working This would be actually strange and error prone. If truncate() nowadays will fail, there is something fatally wrong. Let's check for that during the actual work. The only fallback case is when the file is not zero initialized. In this case we should switch to preallocation via fallocate(). Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Markus Armbruster <armbru@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 15:19:16 +02:00
Denis V. Lunev	d8b83e37c3	parallels: respect error code of bdrv_getlength() in allocate_clusters() If we can not get the file length, the state of BDS is broken completely. Return error to the caller. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Markus Armbruster <armbru@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-08-08 15:19:16 +02:00
Denis V. Lunev	70d9110b44	block: respect error code from bdrv_getlength in handle_aiocb_write_zeroes Original idea beyond the code in question was the following: we have failed to write zeroes with fallocate(FALLOC_FL_ZERO_RANGE) as the simplest approach and via fallocate(FALLOC_FL_PUNCH_HOLE)/fallocate(0). We have the only chance now: if the request comes beyond end of the file. Thus we should calculate file length and respect the error code from that op. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Markus Armbruster <armbru@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-08-08 15:19:16 +02:00
Fam Zheng	0e51b9b7c7	vmdk: Fix error handling/reporting of vmdk_check Errors from the callees must be captured and propagated to our caller, ensure this for both find_extent() and bdrv_getlength(). Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 15:19:16 +02:00
Kevin Wolf	809eb70ed6	block/null: Remove 'filename' option This option was only added to allow 'null-co://' and 'null-aio://' as filenames, its value never served any actual purpose and was ignored. Nevertheless it was accepted as '-drive driver=null,filename=foo'. The correct way to enable the protocol prefixes (and that without adding a useless -drive option) is implementing .bdrv_parse_filename. This is what this patch does. Technically, this is an incompatible change, but the null block driver is only used for benchmarking, testing and debugging, and an option without effect isn't likely to be used by anyone anyway, so no bad effects are to be expected. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-08 15:19:16 +02:00
Jeff Cody	95d729835f	block/vhdx: check error return of bdrv_truncate() Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
Jeff Cody	c6572fa0d2	block/vhdx: check error return of bdrv_flush() Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
Jeff Cody	27539ac531	block/vhdx: check for offset overflow to bdrv_truncate() VHDX uses uint64_t types for most offsets, following the VHDX spec. However, bdrv_truncate() takes an int64_t value for the truncating offset. Check for overflow before calling bdrv_truncate(). While we are here, replace the bit shifting with QEMU_ALIGN_UP as well. N.B.: For a compliant image this is not an issue, as the maximum VHDX image size is defined per the spec to be 64TB. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
Jeff Cody	3f910692c2	block/vhdx: check error return of bdrv_getlength() Calls to bdrv_getlength() were not checking for error. In vhdx.c, this can lead to truncating an image file, so it is a definite bug. In vhdx-log.c, the path for improper behavior is less clear, but it is best to check in any case. Some minor code movement of the log_guid intialization, as well. Reported-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
Alberto Garcia	795be0621a	quorum: Set sectors-count to 0 when reporting a flush error The QUORUM_REPORT_BAD event has fields to report the sector in which the error was detected and the number of affected sectors starting from that one. This is important for read and write errors, but not for flush errors. For flush errors the current code reports the total size of the disk image. That is however not useful information in this case. Moreover, the bdrv_getlength() call can fail, and there's no good way of handling that failure. Since we're reporting useless information and we cannot even guarantee to do it in a consistent way, this patch changes the code to report 0 instead in all cases. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
Daniel P. Berrange	f42cf447e2	block: move trace probes into bdrv_co_preadv\|pwritev There are trace probes in bdrv_co_readv\|writev, however, the block drivers are being gradually moved over to using the bdrv_co_preadv\|pwritev functions instead. As a result some block drivers miss the current probes. Move the probes into bdrv_co_preadv\|pwritev instead, so that they are triggered by more (all?) I/O code paths. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170804105036.11879-1-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-07 09:39:35 +01:00
Peter Maydell	3b64f272d3	Block layer patches for 2.10.0-rc1 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZgKgMAAoJEH8JsnLIjy/W8FQP/i8f6lf3kpmTyqfyG/d9rU+V jizaa3uHBmmctQ9Ib0k1woFudSc1Rxt1kXtG3Cj45nYjHXYzWZOA2p3LF0bpRVtX keq3yXTLE+Il0ED6WiklKMS+BjpzxlnnHJpqnP3axl2TWZQtlQFIqO/f51RYtrVe SOaNdXxqEFQgV3uJIwtPdo38BBZvbIwA99+gCHM1YURkCESUnmy5h8plpnovJtMT Ta8sF+LXAjHtusYzfghNJ/p0Rpg3DkUmvHgo5QTA1F54AAPa3TePewqxssHaNK1T cDfDocvq9/gEMCMca2uyWNFxeqDZzaExoNUo2EVYoCPXKWr+vPaEgs9+RjMv6XUw d6lXZ6F0Rpm1zdtYQI8R1/ZpYcf29oo13q6fW0EEDx8y+9LMRKZP7pRtWA+MpNzT 9iRMVm3y0G4FOaoWD9W9cMVfD9aJknz8j3pggIY8nUhvh7BqkEmbgoaO230AmDYc dVDDmGL8544g0x/v0USqe2ed/XdBkZSScOeKVeRpuS/r2E4UCBhhJNSefxzvn2+p GYj+M6HLZ+biyKBkK3gwyk1fT74vOMpOBzysbBpIN8kg9ySDSkFhvj7qE24YJMKT 6yuWQE7WzPmXMiUz6hpn/m5TSjtVAXvL3BDc2lMz0HW6tHWJ5asv9zJjku+Ze9P4 FGBPYzwJ585Wxq/Z6y5b =DIex -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches for 2.10.0-rc1 # gpg: Signature made Tue 01 Aug 2017 17:10:52 BST # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: block/qapi: Remove redundant NULL check to silence Coverity qemu-iotests/059: Fix leaked image files qemu-iotests/063: Fix leaked image qemu-iotests/162: Fix leaked temporary files qemu-iotests/153: Fix leaked scratch images qemu-iotests/141: Fix image cleanup qemu-iotests: Remove blkdebug.conf after tests qemu-iotests/041: Fix leaked scratch images block: fix leaks in bdrv_open_driver() block: fix dangling bs->explicit_options in block.c iotests: Add test of recent fix to 'qemu-img measure' iotests: Check dirty bitmap statistics in 124 iotests: Redirect stderr to stdout in 186 iotests: Fix test 156 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-08-01 17:27:36 +01:00
Kevin Wolf	8e8eb0a903	block/qapi: Remove redundant NULL check to silence Coverity When skipping implicit nodes in bdrv_block_device_info(), we know that bs0 is always non-NULL; initially, because it's taken from a BdrvChild and a BdrvChild never has a NULL bs, and after the first iteration because implicit nodes always have a backing file. Remove the NULL check and add an assertion that the implicit node does indeed have a backing file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2017-08-01 18:09:33 +02:00
Vladimir Sementsov-Ogievskiy	8908eb1a4a	trace-events: fix code style: print 0x before hex numbers The only exception are groups of numers separated by symbols '.', ' ', ':', '/', like 'ab.09.7d'. This patch is made by the following: > find . -name trace-events \| xargs python script.py where script.py is the following python script: ========================= #!/usr/bin/env python import sys import re import fileinput rhex = '%[-+ .0-9](?:[hljztL]\|ll\|hh)?(?:x\|X\|"\sPRI[xX][^"]"?)' rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')') rbad = re.compile('(?<!0x)' + rhex) files = sys.argv[1:] for fname in files: for line in fileinput.input(fname, inplace=True): arr = re.split(rgroup, line) for i in range(0, len(arr), 2): arr[i] = re.sub(rbad, '0x\g<0>', arr[i]) sys.stdout.write(''.join(arr)) ========================= Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Message-id: 20170731160135.12101-5-vsementsov@virtuozzo.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-01 12:13:07 +01:00
Vladimir Sementsov-Ogievskiy	db73ee4bc8	trace-events: fix code style: %# -> 0x% In trace format '#' flag of printf is forbidden. Fix it to '0x%'. This patch is created by the following: check that we have a problem > find . -name trace-events \| xargs grep '%#' \| wc -l 56 check that there are no cases with additional printf flags before '#' > find . -name trace-events \| xargs grep "%[-+ 0'I]+#" \| wc -l 0 check that there are no wrong usage of '#' and '0x' together > find . -name trace-events \| xargs grep '0x%#' \| wc -l 0 fix the problem > find . -name trace-events \| xargs sed -i 's/%#/0x%/g' [Eric Blake noted that xargs grep '%[-+ 0'I]+#' should be xargs grep "%[-+ 0'I]+#" instead so the shell quoting is correct. --Stefan] Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170731160135.12101-3-vsementsov@virtuozzo.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-01 12:13:07 +01:00
Philippe Mathieu-Daudé	87e0331c5a	docs: fix broken paths to docs/devel/tracing.txt With the move of some docs/ to docs/devel/ on `ac06724a71`, no references were updated. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-07-31 13:12:53 +03:00
Philippe Mathieu-Daudé	f80ac75d0e	qcow2: fix null pointer dereference It seems this assert() was somehow misplaced. block/qcow2-refcount.c:2193:42: warning: Array access (from variable 'on_disk_reftable') results in a null pointer dereference on_disk_reftable[refblock_index] = refblock_offset; ~~~~~~~~~~~~~~~~ ^ Reported-by: Clang Static Analyzer Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-07-31 13:06:38 +03:00
Vladimir Sementsov-Ogievskiy	b6b75a99da	qcow2-bitmap: fix bitmap_free Fix possible crash on error path in qcow2_remove_persistent_dirty_bitmap. Although bitmap_free was added in `88ddffae8f` the bug was introduced later in commit `469c71edc7` (when qcow2_remove_persistent_dirty_bitmap was added). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170714123341.373857-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-25 16:33:31 +02:00
Daniel P. Berrange	0696ae2c92	qcow: fix memory leaks related to encryption Fix leak of the 'encryptopts' string, which was mistakenly declared const. Fix leak of QemuOpts entry which should not have been deleted from the opts array. Reported by: coverity Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170714103105.5781-1-berrange@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-25 16:33:31 +02:00
Kevin Wolf	d3c8c67469	block: Skip implicit nodes in query-block/blockstats Commits `0db832f` and `6cdbceb` introduced the automatic insertion of filter nodes above the top layer of mirror and commit block jobs. The assumption made there was that since libvirt doesn't do node-level management of the block layer yet, it shouldn't be affected by added nodes. This is true as far as commands issued by libvirt are concerned. It only uses BlockBackend names to address nodes, so any operations it performs still operate on the root of the tree as intended. However, the assumption breaks down when you consider query commands, which return data for the wrong node now. These commands also return information on some child nodes (bs->file and/or bs->backing), which libvirt does make use of, and which refer to the wrong nodes, too. One of the consequences is that oVirt gets wrong information about the image size and stops the VM in response as long as a mirror or commit job is running: https://bugzilla.redhat.com/show_bug.cgi?id=1470634 This patch fixes the problem by hiding the implicit nodes created automatically by the mirror and commit block jobs in the output of query-block and BlockBackend-based query-blockstats as long as the user doesn't indicate that they are aware of those nodes by providing a node name for them in the QMP command to start the block job. The node-based commands query-named-block-nodes and query-blockstats with query-nodes=true still show all nodes, including implicit ones. This ensures that users that are capable of node-level management can still access the full information; users that only know BlockBackends won't use these commands. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Tested-by: Eric Blake <eblake@redhat.com>	2017-07-24 15:06:04 +02:00
Eric Blake	24bae02b19	qcow2: Fix sector calculation in qcow2_measure() We used MAX() instead of the intended MIN() when computing how many sectors to view in the current loop iteration of qcow2_measure(), and passed in a value of INT_MAX sectors instead of our more usual limit of BDRV_REQUEST_MAX_SECTORS (the latter avoids 32-bit overflow on conversion to bytes). For small files, the bug is harmless: bdrv_get_block_status_above() clamps its *pnum answer to the BDS size, regardless of any insanely larger input request. However, for any file at least 2T in size, we can very easily end up going into an infinite loop (the maximum of 0x100000000 sectors and INT_MAX is a 64-bit quantity, which becomes 0 when assigned to int; once nb_sectors is 0, we never make progress). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-24 15:06:04 +02:00
Eric Blake	6c98c57af3	dirty-bitmap: Report BlockDirtyInfo.count in bytes, as documented We've been documenting the value in bytes since its introduction in commit `b9a9b3a4` (v1.3), where it was actually reported in bytes. Commit `e4654d2` (v2.0) then removed things from block/qapi.c, in preparation for a rewrite to a list of dirty sectors in the next commit `21b5683` in block.c, but the new code mistakenly started reporting in sectors. Fixes: https://bugzilla.redhat.com/1441460 CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-24 15:06:04 +02:00
Mark Cave-Ayland	c8115f8eb8	block/vpc: fix uninitialised variable compiler warning Since commit `cfc87e00` "block/vpc.c: Handle write failures in get_image_offset()" older versions of gcc (in this case 4.7) incorrectly warn that "ret" can be used uninitialised in vpc_co_pwritev(). Setting ret to 0 at the start of vpc_co_pwritev() prevents the warning in gcc 4.7 and enables compilation with -Werror to succeed. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1500625265-23844-1-git-send-email-mark.cave-ayland@ilande.co.uk Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-21 15:00:07 +01:00
Max Reitz	7c8730d45f	block/vvfat: Fix compiler warning with gcc 7 gcc 7 complains that the sprintf() might write a null byte beyond the end of the tail buffer. That is wrong, but we can silence it by making i unsigned (it can never be negative anyway, see the if condition right before). For some reason, this allows gcc to suddenly accurately calculate the range of i so we can give the tail[] array the exact size it needs to have (which is 8 bytes) without gcc complaining. In addition, let us convert the sprintf() to snprintf(), because that is always nicer, and add an assertion about the range of the return value afterwards so we can see that "8 - len" will never be negative and thus "entry->name + MIN(j, 8 - len)" will never be out of bounds. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Hervé Poussineau	f80256b7ee	vvfat: initialize memory after allocating it This prevents some host to guest memory content leaks. Fixes: https://bugs.launchpad.net/qemu/+bug/1599539 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Hervé Poussineau	e03da26b71	vvfat: correctly parse non-ASCII short and long file names Write support works again when image contains non-ASCII names. It is either the case when user created a non-ASCII filename, or when initial directory contained a non-ASCII filename (since `0c36111f57`) Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Hervé Poussineau	63d261cb0d	vvfat: add a constant for bootsector name Also add links to related compatibility problems. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Hervé Poussineau	8c4517fd6e	vvfat: add constants for special values of name[0] Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Kevin Wolf	ec18b0a93a	block: List anonymous device BBs in query-block Instead of listing only monitor-owned BlockBackends in query-block, also add those anonymous BlockBackends that are owned by a qdev device and as such under the control of the user. This allows using query-block to inspect BlockBackends for the modern configuration syntax with -blockdev and -device. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:36 +02:00
Kevin Wolf	d5b68844e6	block/qapi: Use blk_all_next() for query-block This patch replaces the blk_next() loop in query-block by a blk_all_next() one so that we also get access to BlockBackends that aren't owned by the monitor. For now, the next thing we do is check whether each BB has a name, so there is no semantic difference. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:36 +02:00
Kevin Wolf	a429b9b5f4	block: Make blk_all_next() public Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:36 +02:00
Kevin Wolf	46eade7be8	block/qapi: Add qdev device name to query-block With -blockdev/-device, users can indirectly create anonymous BlockBackends, while the state of such backends is still of interest. As a preparation for making such BBs visible in query-block, make sure that they can be identified even without a name by adding the ID/QOM path of their qdev device to BlockInfo. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:35 +02:00
Kevin Wolf	77beef8365	block: Make blk_get_attached_dev_id() public Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:35 +02:00
Peter Maydell	cfc87e00c2	block/vpc.c: Handle write failures in get_image_offset() Coverity (CID 1355236) points out that get_image_offset() doesn't check that it actually succeeded in writing the updated block bitmap to the file. Check the error return from bdrv_pwrite_sync() and propagate an error response back up to the function which calls get_image_offset() for a write so that it can return the error to its caller. get_sector_offset() is only used for reads, but we move it to the same API for consistency. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:35 +02:00
Peter Maydell	9877860e7b	block/vmdk: Report failures in vmdk_read_cid() The function vmdk_read_cid() can fail if the read on the underlying block device fails, or if there's a format error in the VMDK file. However its API doesn't provide a mechanism to report these errors, and in some cases we were returning a CID of 0 and in some cases a CID of 0xffffffff, either of which might potentially be valid values. Change the function to return 0 on success or a negative errno, and return the CID via a uint32_t* argument. Update the callsites to handle and propagate the error appropriately. This fixes in passing a Coverity-spotted issue (CID 1350038) where we weren't checking the return value from sscanf(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:35 +02:00
Manos Pitsidianakis	27e4cf1303	block: remove timer canceling in throttle_config() throttle_config() cancels the timers of the calling BlockBackend. This doesn't make sense because other BlockBackends in the group remain untouched. There's no need to cancel the timers in the one specific BlockBackend so let's not do that. Throttled requests will run as scheduled and future requests will follow the new configuration. This also allows a throttle group's configuration to be changed even when it has no members. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:35 +02:00
Manos Pitsidianakis	dbe824cc57	block: add clock_type field to ThrottleGroup Clock type in throttling is currently inferred by the ThrottleTimer's clock type even though it is a per-ThrottleGroup property; it doesn't make sense to have different clock types in the same group. Moving this to a field in ThrottleGroup can simplify some of the throttle functions. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:35 +02:00
Kevin Wolf	b1e1fa0c3a	commit: Add NULL check for overlay_bs I can't see how overlay_bs could become NULL with the current code, but other code in this function already checks it and we can make Coverity happy with this check, so let's add it. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-07-18 15:14:35 +02:00
Peter Maydell	718d7f4f9c	-----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZbNpiAAoJEJykq7OBq3PIKcIIAMEJpEiWQonSJZVV4fxqcbOF dXJSVxHqrrVUrcM8NY2zGXIwcGS8RBNZG+Yx/SEZgIljoYH4NFmbvKXWS2zgHSyr LUH0M6gYlQ/1vQTXwQrkJdtmgfc3xNVrQbanbynK3+aB1S5Y6pRGauDo8SqCBWu0 uLWkhcSQbG+OHD8Go5X1kZUSdpP8yOqKrxcNLe980ghi4HPMUydL3lbs4SwNlnRt NJIpMTGzJrL+CqyakIL+/PT9RBGCo4hllPD0CgX6HETEkuojxxXaqJIG+Tzj2FeU fXkoK1YQHEHLdVXnPkpModoPylhRqIQcPXoXt+aMvoLSM+bLooYXMYryMPoPDGI= =VgCF -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging # gpg: Signature made Mon 17 Jul 2017 16:40:18 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: block: fix shadowed variable in bdrv_co_pdiscard util/aio-win32: Only select on what we are actually waiting for Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-18 13:09:51 +01:00
Denis V. Lunev	593ed6f0a3	block: fix shadowed variable in bdrv_co_pdiscard We've had a shadowed 'ret' variable, which risks returning the wrong value, introduced in commit `b9c64947`. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170710150559.30163-1-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-07-17 15:58:37 +01:00
Paolo Bonzini	5aca18a4ff	ssh: support I/O from any AioContext The coroutine may run in a different AioContext, causing the fd handler to busy wait. Fix this by resetting the handler in restart_coroutine, before the coroutine is restarted. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-12-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:34:20 +08:00
Paolo Bonzini	f1af3251f8	sheepdog: add queue_lock Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-11-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:34:20 +08:00
Paolo Bonzini	1f01e50b83	qed: protect table cache with CoMutex This makes the driver thread-safe. The CoMutex is dropped temporarily while accessing the data clusters or the backing file. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-10-pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:34:11 +08:00
Paolo Bonzini	61c7887e0f	qed: introduce bdrv_qed_init_state This will be used in the next patch, which will call bdrv_qed_do_open with a CoMutex taken. bdrv_qed_init_state provides a nice place to initialize it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-9-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:33:11 +08:00
Paolo Bonzini	61124f03ab	block: invoke .bdrv_drain callback in coroutine context and from AioContext This will let the callback take a CoMutex in the next patch. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-8-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	e7569c1829	qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc} This part is never called for in-place writes, move it away to avoid the "backwards" coding style typical of callback-based code. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-7-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	254aee4dbb	vvfat: make it thread-safe Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-6-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	778b087e51	vpc: make it thread-safe Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-5-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	1e88663979	vdi: make it thread-safe The VirtualBox driver is using a mutex to order all allocating writes, but it is not protecting accesses to the bitmap because they implicitly happen under the AioContext mutex. Change this to use a CoRwlock explicitly. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-4-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	a8c57408cd	qcow2: call CoQueue APIs under CoMutex Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-2-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Peter Maydell	6c6076662d	* gdbstub fixes (Alex) * IOMMU MemoryRegion subclass (Alexey) * Chardev hotswap (Anton) * NBD_OPT_GO support (Eric) * Misc bugfixes * DEFINE_PROP_LINK (minus the ARM patches - Fam) * MAINTAINERS updates (Philippe) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJZaJejAAoJEL/70l94x66DwQ4H/0NUvh/Zfs64wE1iuZJACc24 1za02fFaB50vFDwQKWbM0GkHzDxoXBHk4Rvn92p+VSxpKtaAX4GRwCvxRA5GeUtm GAYbdIJUe0UELepKExrlUVzQcK9VfljoJpK3dZkP5Zzx83L2PAI/SexrZRibN2Uf yRI60uvlsMWU12nenzdVnYORd+TWDNKele7BhMrX/FX9wxaS1PlnsnKZggy6CU7G 8dwZJAZJ/s5tRGXyXyAQzLm5JZQCLnA6jxya540TbPeciFgbvvS2ydIitZ54vSPO VtmZ1rSWfTEbNF5xGD1Ztu8aAENr5/I05l6IjxZd45BdUCW3HxeJkc+7lE0K4uk= =wnVs -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * gdbstub fixes (Alex) * IOMMU MemoryRegion subclass (Alexey) * Chardev hotswap (Anton) * NBD_OPT_GO support (Eric) * Misc bugfixes * DEFINE_PROP_LINK (minus the ARM patches - Fam) * MAINTAINERS updates (Philippe) # gpg: Signature made Fri 14 Jul 2017 11:06:27 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (55 commits) spapr_rng: Convert to DEFINE_PROP_LINK cpu: Convert to DEFINE_PROP_LINK mips_cmgcr: Convert to DEFINE_PROP_LINK ivshmem: Convert to DEFINE_PROP_LINK dimm: Convert to DEFINE_PROP_LINK virtio-crypto: Convert to DEFINE_PROP_LINK virtio-rng: Convert to DEFINE_PROP_LINK virtio-scsi: Convert to DEFINE_PROP_LINK virtio-blk: Convert to DEFINE_PROP_LINK qdev: Add const qualifier to PropertyInfo definitions qmp: Use ObjectProperty.type if present qdev: Introduce DEFINE_PROP_LINK qdev: Introduce PropertyInfo.create qom: enforce readonly nature of link's check callback translate-all: remove redundant !tcg_enabled check in dump_exec_info vl: fix breakage of -tb-size nbd: Implement NBD_INFO_BLOCK_SIZE on client nbd: Implement NBD_INFO_BLOCK_SIZE on server nbd: Implement NBD_OPT_GO on client nbd: Implement NBD_OPT_GO on server ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-14 12:16:09 +01:00
Eric Blake	081dd1fe36	nbd: Implement NBD_INFO_BLOCK_SIZE on client The upstream NBD Protocol has defined a new extension to allow the server to advertise block sizes to the client, as well as a way for the client to inform the server whether it intends to obey block sizes. When using the block layer as the client, we will obey block sizes; but when used as 'qemu-nbd -c' to hand off to the kernel nbd module as the client, we are still waiting for the kernel to implement a way for us to learn if it will honor block sizes (perhaps by an addition to sysfs, rather than an ioctl), as well as any way to tell the kernel what additional block sizes to obey (NBD_SET_BLKSIZE appears to be accurate for the minimum size, but preferred and maximum sizes would probably be new ioctl()s), so until then, we need to make our request for block sizes conditional. When using ioctl(NBD_SET_BLKSIZE) to hand off to the kernel, use the minimum block size as the sector size if it is larger than 512, which also has the nice effect of cooperating with (non-qemu) servers that don't do read-modify-write when exposing a block device with 4k sectors; it might also allow us to visit a file larger than 2T on a 32-bit kernel. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-10-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:42 +02:00
Eric Blake	004a89fce9	nbd: Create struct for tracking export info The NBD Protocol is introducing some additional information about exports, such as minimum request size and alignment, as well as an advertised maximum request size. It will be easier to feed this information back to the block layer if we gather all the information into a struct, rather than adding yet more pointer parameters during negotiation. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-2-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:41 +02:00
Peter Maydell	a309b290aa	Error reporting patches for 2017-07-13 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZZ1/BAAoJEDhwtADrkYZTo7oP+gLj4B4kkp/DJnkzfuMMD1Ce ZPddZ8Z9RyXE4fS66sq1ODBQo5U+aQQZO7K234+jf8V4cKWW98lpVzLc3YdAHm2U ZF6Z9Rji5K4414ZsUcg92Zlovvdaji+mY0ooINav+4mqlONYrz29ntApWc0e0tGc e3tj4XDLhJrOM+mIx8vzixFlgSYj+6HgEiybYwolEK5svQbIQao3Y2omyb+zy0w0 RDT3XQnAAaZSOQAXcJGkhekkyMe0jMHOF0tULLx1uDQYctg9mUGlAGTZ5oTLgSve TCpSJwWCAx8XAJMkXyDRrdRFDLeUh6yGY7NTqAL3OuPSoAw9ygKrHyhTavxBJL+W rX7Qit3dmVrlZLviwNFQplAKYb10d08vBoKXmrnW5oVCmPEDvJIQfncbucpA/CNS ucdJ3RMLuDbbWdl+5tsL7jfiZAG7oSgAePTjN1rm0bDe5JN7NAU8WzHnKfE83iZq R+I3hofqGoiXSByYRLamZb+6nsURAxWPhcqcw7hdMsk7UI6dyZwWl9Fnm72w0BZK M5LHLkX0LYc+kZjiLKXlNK7Z50bXY0zKQpPCLH3nHA69iMiwVoozrjwa9iCKIxE+ 7ZlOfsu4ztExuicEyTr8b27CBrHjJjYDuFP0hroEOzqCKXUzegoq3oYMGP0doXxe o3xcwXVKT/1PudddyR4z =tByN -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-07-13' into staging Error reporting patches for 2017-07-13 # gpg: Signature made Thu 13 Jul 2017 12:55:45 BST # gpg: using RSA key 0x3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-error-2017-07-13: Convert error_report_err() to warn_report_err() error: Implement the warn and free Error functions char-socket: Report TCP socket waiting as information Convert error_report() to warn_report() error: Functions to report warnings and informational messages util/qemu-error: Rename error_print_loc() to be more generic websock: Don't try to set errp directly block: Don't try to set errp directly xilinx: Fix latent error handling bug Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-14 09:36:40 +01:00
Alistair Francis	3dc6f86936	Convert error_report() to warn_report() Convert all uses of error_report("warning:"... to use warn_report() instead. This helps standardise on a single method of printing warnings to the user. All of the warnings were changed using these two commands: find ./* -type f -exec sed -i \ 's\|error_report(".*warning[,:] \|warn_report("\|Ig' {} + Indentation fixed up manually afterwards. The test-qdev-global-props test case was manually updated to ensure that this patch passes make check (as the test cases are case sensitive). Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Suggested-by: Thomas Huth <thuth@redhat.com> Cc: Jeff Cody <jcody@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Lieven <pl@kamp.de> Cc: Josh Durgin <jdurgin@redhat.com> Cc: "Richard W.M. Jones" <rjones@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Greg Kurz <groug@kaod.org> Cc: Rob Herring <robh@kernel.org> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Peter Chubb <peter.chubb@nicta.com.au> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Alexander Graf <agraf@suse.de> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Greg Kurz <groug@kaod.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed by: Peter Chubb <peter.chubb@data61.csiro.au> Acked-by: Max Reitz <mreitz@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Message-Id: <e1cfa2cd47087c248dd24caca9c33d9af0c499b0.1499866456.git.alistair.francis@xilinx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-07-13 13:49:58 +02:00
Max Reitz	772d1f973f	block/qcow2: falloc/full preallocating growth Implement the preallocation modes falloc and full for growing qcow2 images. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170613202107.10125-15-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	60c48a29b7	block/qcow2: Rename "fail_block" to just "fail" Now alloc_refcount_block() only contains a single fail label, so it makes more sense to just name it "fail" instead of "fail_block". Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-14-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	12cc30a8cb	block/qcow2: Add qcow2_refcount_area() This function creates a collection of self-describing refcount structures (including a new refcount table) at the end of a qcow2 image file. Optionally, these structures can also describe a number of additional clusters beyond themselves; this will be important for preallocated truncation, which will place the data clusters and L2 tables there. For now, we can use this function to replace the part of alloc_refcount_block() that grows the refcount table (from which it is actually derived). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-13-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	95b98f343b	block/qcow2: Metadata preallocation for truncate We can support PREALLOC_MODE_METADATA by invoking preallocate() in qcow2_truncate(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170613202107.10125-12-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	652fecd005	block/qcow2: Lock s->lock in preallocate() preallocate() is and will be called only from places that do not otherwise need to lock s->lock: Currently that is qcow2_create2(), as of a future patch it will be called from qcow2_truncate(), too. It therefore makes sense to move locking that mutex into preallocate() itself. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-11-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	7bc45dc172	block/qcow2: Generalize preallocate() This patch adds two new parameters to the preallocate() function so we will be able to use it not just for preallocating a new image but also for preallocated image growth. The offset parameter allows the caller to specify a virtual offset from which to start preallocating. For newly created images this is always 0, but for preallocating growth this will be the old image length. The new_length parameter specifies the supposed new length of the image (basically the "end offset" for preallocation). During image truncation, bdrv_getlength() will return the old image length so we cannot rely on its return value then. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-10-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	35d72602ec	block/file-posix: Preallocation for truncate By using raw_regular_truncate() in raw_truncate(), we can now easily support preallocation. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-9-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	d0bc9e5d5e	block/file-posix: Generalize raw_regular_truncate Currently, raw_regular_truncate() is intended for setting the size of a newly created file. However, we also want to use it for truncating an existing file in which case only the newly added space (when growing) should be preallocated. This also means that if resizing failed, we should try to restore the original file size. This is important when using preallocation. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-8-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	9f63b07ee7	block/file-posix: Extract raw_regular_truncate() This functionality is part of raw_create() which we will be able to reuse nicely in raw_truncate(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170613202107.10125-7-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	7dacd8bd3d	block/file-posix: Small fixes in raw_create() Variables should be declared at the start of a block, and if a certain parameter value is not supported it may be better to return -ENOTSUP instead of -EINVAL. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170613202107.10125-6-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	3a691c50f1	block: Add PreallocMode to blk_truncate() blk_truncate() itself will pass that value to bdrv_truncate(), and all callers of blk_truncate() just set the parameter to PREALLOC_MODE_OFF for now. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	7ea37c3066	block: Add PreallocMode to bdrv_truncate() For block drivers that just pass a truncate request to the underlying protocol, we can now pass the preallocation mode instead of aborting if it is not PREALLOC_MODE_OFF. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	8243ccb743	block: Add PreallocMode to BD.bdrv_truncate() Add a PreallocMode parameter to the bdrv_truncate() function implemented by each block driver. Currently, we always pass PREALLOC_MODE_OFF and no driver accepts anything else. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Stefan Hajnoczi	c501c35220	qcow2: add bdrv_measure() support Use qcow2_calc_prealloc_size() to get the required file size. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-7-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Stefan Hajnoczi	0eb4a8c1df	qcow2: extract image creation option parsing The image creation options parsed by qcow2_create() are also needed to implement .bdrv_measure(). Extract the parsing code, including input validation. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-6-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Stefan Hajnoczi	7c5bcc4212	qcow2: make refcount size calculation conservative The refcount metadata size calculation is inaccurate and can produce numbers that are too small. This is bad because we should calculate a conservative number - one that is guaranteed to be large enough. This patch switches the approach to a fixed point calculation because the existing equation is hard to solve when inaccuracies are taken care of. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-5-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Stefan Hajnoczi	95c67e3bd7	qcow2: extract preallocation calculation function Calculating the preallocated image size will be needed to implement .bdrv_measure(). Extract the code out into a separate function. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-4-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Stefan Hajnoczi	a843a22a82	raw-format: add bdrv_measure() support Maximum size calculation is trivial for the raw format: it's just the requested image size (because there is no metadata). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-3-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Vladimir Sementsov-Ogievskiy	615b5dcf2d	block: release persistent bitmaps on inactivate We should release them here to reload on invalidate cache. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-31-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	469c71edc7	qcow2: add .bdrv_remove_persistent_dirty_bitmap Realize .bdrv_remove_persistent_dirty_bitmap interface. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-29-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	56f364e6d7	block/dirty-bitmap: add bdrv_remove_persistent_dirty_bitmap Interface for removing persistent bitmap from its storage. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-28-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	a3b52535e8	qmp: add x-debug-block-dirty-bitmap-sha256 Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170628120530.31251-26-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	da0eb242ad	qcow2: add .bdrv_can_store_new_dirty_bitmap Realize .bdrv_can_store_new_dirty_bitmap interface. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-23-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	169b879359	qcow2: store bitmaps on reopening image as read-only Store bitmaps and mark them read-only on reopening image as read-only. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-21-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	5f72826e7f	qcow2: add persistent dirty bitmaps support Store persistent dirty bitmaps in qcow2 image. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-20-vsementsov@virtuozzo.com [mreitz: Always assign ret in store_bitmap() in case of an error] Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	3dd10a06d1	block/dirty-bitmap: add bdrv_dirty_bitmap_next() Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-19-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	a88b179fdb	block: introduce persistent dirty bitmaps New field BdrvDirtyBitmap.persistent means, that bitmap should be saved by format driver in .bdrv_close and .bdrv_inactivate. No format driver supports it for now. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170628120530.31251-18-vsementsov@virtuozzo.com [mreitz: Fixed indentation] Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	a0319aacd4	block/dirty-bitmap: add autoload field to BdrvDirtyBitmap Mirror AUTO flag from Qcow2 bitmap in BdrvDirtyBitmap. This will be needed in future, to save this flag back to Qcow2 for persistent bitmaps. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170628120530.31251-16-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	1b6b0562db	qcow2: support .bdrv_reopen_bitmaps_rw Realize bdrv_reopen_bitmaps_rw interface. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-15-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	d1258dd0c8	qcow2: autoloading dirty bitmaps Auto loading bitmaps are bitmaps in Qcow2, with the AUTO flag set. They are loaded when the image is opened and become BdrvDirtyBitmaps for the corresponding drive. Extra data in bitmaps is not supported for now. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-12-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	d6883bc968	block/dirty-bitmap: add readonly field to BdrvDirtyBitmap It will be needed in following commits for persistent bitmaps. If bitmap is loaded from read-only storage (and we can't mark it "in use" in this storage) corresponding BdrvDirtyBitmap should be read-only. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170628120530.31251-11-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	8bfc932e1e	block/dirty-bitmap: fix comment for BlockDirtyBitmap.disabled field Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-10-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	88ddffae8f	qcow2: add bitmaps extension Add bitmap extension as specified in docs/specs/qcow2.txt. For now, just mirror extension header into Qcow2 state and check constraints. Also, calculate refcounts for qcow2 bitmaps, to not break qemu-img check. For now, disable image resize if it has bitmaps. It will be fixed later. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-9-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Vladimir Sementsov-Ogievskiy	8a5bb1f114	qcow2-refcount: rename inc_refcounts() and make it public This is needed for the following patch, which will introduce refcounts checking for qcow2 bitmaps. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-8-vsementsov@virtuozzo.com [mreitz: s/inc_refcounts/qcow2_inc_refcounts_imrt/ in one more (new) place] Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Vladimir Sementsov-Ogievskiy	6bdc8b719a	block/dirty-bitmap: add deserialize_ones func Add bdrv_dirty_bitmap_deserialize_ones() function, which is needed for qcow2 bitmap loading, to handle unallocated bitmap parts, marked as all-ones. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-7-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Vladimir Sementsov-Ogievskiy	ba06ff1a5c	block: fix bdrv_dirty_bitmap_granularity signature Make getter signature const-correct. This allows other functions with const dirty bitmap parameter use bdrv_dirty_bitmap_granularity(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170628120530.31251-6-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Daniel P. Berrange	0a12f6f80e	qcow2: report encryption specific image information Currently 'qemu-img info' reports a simple "encrypted: yes" field. This is not very useful now that qcow2 can support multiple encryption formats. Users want to know which format is in use and some data related to it. Wire up usage of the qcrypto_block_get_info() method so that 'qemu-img info' can report about the encryption format and parameters in use $ qemu-img create \ --object secret,id=sec0,data=123456 \ -o encrypt.format=luks,encrypt.key-secret=sec0 \ -f qcow2 demo.qcow2 1G Formatting 'demo.qcow2', fmt=qcow2 size=1073741824 \ encryption=off encrypt.format=luks encrypt.key-secret=sec0 \ cluster_size=65536 lazy_refcounts=off refcount_bits=16 $ qemu-img info demo.qcow2 image: demo.qcow2 file format: qcow2 virtual size: 1.0G (1073741824 bytes) disk size: 480K encrypted: yes cluster_size: 65536 Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 encrypt: ivgen alg: plain64 hash alg: sha256 cipher alg: aes-256 uuid: 3fa930c4-58c8-4ef7-b3c5-314bb5af21f3 format: luks cipher mode: xts slots: [0]: active: true iters: 1839058 key offset: 4096 stripes: 4000 [1]: active: false key offset: 262144 [2]: active: false key offset: 520192 [3]: active: false key offset: 778240 [4]: active: false key offset: 1036288 [5]: active: false key offset: 1294336 [6]: active: false key offset: 1552384 [7]: active: false key offset: 1810432 payload offset: 2068480 master key iters: 438487 corrupt: false With the legacy "AES" encryption we just report the format name $ qemu-img create \ --object secret,id=sec0,data=123456 \ -o encrypt.format=aes,encrypt.key-secret=sec0 \ -f qcow2 demo.qcow2 1G Formatting 'demo.qcow2', fmt=qcow2 size=1073741824 \ encryption=off encrypt.format=aes encrypt.key-secret=sec0 \ cluster_size=65536 lazy_refcounts=off refcount_bits=16 $ ./qemu-img info demo.qcow2 image: demo.qcow2 file format: qcow2 virtual size: 1.0G (1073741824 bytes) disk size: 196K encrypted: yes cluster_size: 65536 Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 encrypt: format: aes corrupt: false Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-20-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Daniel P. Berrange	1cd9a787a2	block: pass option prefix down to crypto layer While the crypto layer uses a fixed option name "key-secret", the upper block layer may have a prefix on the options. e.g. "encrypt.key-secret", in order to avoid clashes between crypto option names & other block option names. To ensure the crypto layer can report accurate error messages, we must tell it what option name prefix was used. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-19-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	c01c214b69	block: remove all encryption handling APIs Now that all encryption keys must be provided upfront via the QCryptoSecret API and associated block driver properties there is no need for any explicit encryption handling APIs in the block layer. Encryption can be handled transparently within the block driver. We only retain an API for querying whether an image is encrypted or not, since that is a potentially useful piece of metadata to report to the user. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-18-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	4652b8f3e1	qcow2: add support for LUKS encryption format This adds support for using LUKS as an encryption format with the qcow2 file, using the new encrypt.format parameter to request "luks" format. e.g. # qemu-img create --object secret,data=123456,id=sec0 \ -f qcow2 -o encrypt.format=luks,encrypt.key-secret=sec0 \ test.qcow2 10G The legacy "encryption=on" parameter still results in creation of the old qcow2 AES format (and is equivalent to the new 'encryption-format=aes'). e.g. the following are equivalent: # qemu-img create --object secret,data=123456,id=sec0 \ -f qcow2 -o encryption=on,encrypt.key-secret=sec0 \ test.qcow2 10G # qemu-img create --object secret,data=123456,id=sec0 \ -f qcow2 -o encryption-format=aes,encrypt.key-secret=sec0 \ test.qcow2 10G With the LUKS format it is necessary to store the LUKS partition header and key material in the QCow2 file. This data can be many MB in size, so cannot go into the QCow2 header region directly. Thus the spec defines a FDE (Full Disk Encryption) header extension that specifies the offset of a set of clusters to hold the FDE headers, as well as the length of that region. The LUKS header is thus stored in these extra allocated clusters before the main image payload. Aside from all the cryptographic differences implied by use of the LUKS format, there is one further key difference between the use of legacy AES and LUKS encryption in qcow2. For LUKS, the initialiazation vectors are generated using the host physical sector as the input, rather than the guest virtual sector. This guarantees unique initialization vectors for all sectors when qcow2 internal snapshots are used, thus giving stronger protection against watermarking attacks. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-14-berrange@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	b25b387fa5	qcow2: convert QCow2 to use QCryptoBlock for encryption This converts the qcow2 driver to make use of the QCryptoBlock APIs for encrypting image content, using the legacy QCow2 AES scheme. With this change it is now required to use the QCryptoSecret object for providing passwords, instead of the current block password APIs / interactive prompting. $QEMU \ -object secret,id=sec0,file=/home/berrange/encrypted.pw \ -drive file=/home/berrange/encrypted.qcow2,encrypt.key-secret=sec0 The test 087 could be simplified since there is no longer a difference in behaviour when using blockdev_add with encrypted images for the running vs stopped CPU state. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-12-berrange@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	446d306d23	qcow2: make qcow2_encrypt_sectors encrypt in place Instead of requiring separate input/output buffers for encrypting data, change qcow2_encrypt_sectors() to assume use of a single buffer, encrypting in place. The current callers all used the same buffer for input/output already. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-11-berrange@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	d85f4222b4	qcow: convert QCow to use QCryptoBlock for encryption This converts the qcow driver to make use of the QCryptoBlock APIs for encrypting image content. This is only wired up to permit use of the legacy QCow encryption format. Users who wish to have the strong LUKS format should switch to qcow2 instead. With this change it is now required to use the QCryptoSecret object for providing passwords, instead of the current block password APIs / interactive prompting. $QEMU \ -object secret,id=sec0,file=/home/berrange/encrypted.pw \ -drive file=/home/berrange/encrypted.qcow,encrypt.format=aes,\ encrypt.key-secret=sec0 Though note that running QEMU system emulators with the AES encryption is no longer supported, so while the above syntax is valid, QEMU will refuse to actually run the VM in this particular example. Likewise when creating images with the legacy AES-CBC format qemu-img create -f qcow \ --object secret,id=sec0,file=/home/berrange/encrypted.pw \ -o encrypt.format=aes,encrypt.key-secret=sec0 \ /home/berrange/encrypted.qcow 64M Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-10-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	1fad1f9400	qcow: make encrypt_sectors encrypt in place Instead of requiring separate input/output buffers for encrypting data, change encrypt_sectors() to assume use of a single buffer, encrypting in place. One current caller uses the same buffer for input/output already and the other two callers are easily converted to do so. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-9-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	0cb8d47ba9	block: deprecate "encryption=on" in favor of "encrypt.format=aes" Historically the qcow & qcow2 image formats supported a property "encryption=on" to enable their built-in AES encryption. We'll soon be supporting LUKS for qcow2, so need a more general purpose way to enable encryption, with a choice of formats. This introduces an "encrypt.format" option, which will later be joined by a number of other "encrypt.XXX" options. The use of a "encrypt." prefix instead of "encrypt-" is done to facilitate mapping to a nested QAPI schema at later date. e.g. the preferred syntax is now qemu-img create -f qcow2 -o encrypt.format=aes demo.qcow2 Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-8-berrange@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:55 +02:00
Daniel P. Berrange	6aa837f7bd	qcow: require image size to be > 1 for new images The qcow driver refuses to open images which are less than 2 bytes in size, but will happily create such images. Add a check in the create path to avoid this discrepancy. Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-5-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:55 +02:00
Daniel P. Berrange	4a47f85431	block: add ability to set a prefix for opt names When integrating the crypto support with qcow/qcow2, we don't want to use the bare LUKS option names "hash-alg", "key-secret", etc. We need to namespace them to match the nested QAPI schema. e.g. "encrypt.hash-alg", "encrypt.key-secret" so that they don't clash with any general qcow options at a later date. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-3-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:55 +02:00
Daniel P. Berrange	306a06e5f7	block: expose crypto option names / defs to other drivers The block/crypto.c defines a set of QemuOpts that provide parameters for encryption. This will also be needed by the qcow/qcow2 integration, so expose the relevant pieces in a new block/crypto.h header. Some helper methods taking QemuOpts are changed to take QDict to simplify usage in other places. Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-2-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:55 +02:00
Eric Blake	51b0a48888	block: Make bdrv_is_allocated_above() byte-based We are gradually moving away from sector-based interfaces, towards byte-based. In the common case, allocation is unlikely to ever use values that are not naturally sector-aligned, but it is possible that byte-based values will let us be more precise about allocation at the end of an unaligned file that can do byte-based access. Changing the signature of the function to use int64_t *pnum ensures that the compiler enforces that all callers are updated. For now, the io.c layer still assert()s that all callers are sector-aligned, but that can be relaxed when a later patch implements byte-based block status. Therefore, for the most part this patch is just the addition of scaling at the callers followed by inverse scaling at bdrv_is_allocated(). But some code, particularly stream_run(), gets a lot simpler because it no longer has to mess with sectors. Leave comments where we can further simplify by switching to byte-based iterations, once later patches eliminate the need for sector-aligned operations. For ease of review, bdrv_is_allocated() was tackled separately. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:07 +02:00
Eric Blake	c00716beb3	block: Minimize raw use of bds->total_sectors bdrv_is_allocated_above() was relying on intermediate->total_sectors, which is a field that can have stale contents depending on the value of intermediate->has_variable_length. An audit shows that we are safe (we were first calling through bdrv_co_get_block_status() which in turn calls bdrv_nb_sectors() and therefore just refreshed the current length), but it's nicer to favor our accessor functions to avoid having to repeat such an audit, even if it means refresh_total_sectors() is called more frequently. Suggested-by: John Snow <jsnow@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:07 +02:00
Eric Blake	d6a644bbfe	block: Make bdrv_is_allocated() byte-based We are gradually moving away from sector-based interfaces, towards byte-based. In the common case, allocation is unlikely to ever use values that are not naturally sector-aligned, but it is possible that byte-based values will let us be more precise about allocation at the end of an unaligned file that can do byte-based access. Changing the signature of the function to use int64_t pnum ensures that the compiler enforces that all callers are updated. For now, the io.c layer still assert()s that all callers are sector-aligned on input and that pnum is sector-aligned on return to the caller, but that can be relaxed when a later patch implements byte-based block status. Therefore, this code adds usages like DIV_ROUND_UP(,BDRV_SECTOR_SIZE) to callers that still want aligned values, where the call might reasonbly give non-aligned results in the future; on the other hand, no rounding is needed for callers that should just continue to work with byte alignment. For the most part this patch is just the addition of scaling at the callers followed by inverse scaling at bdrv_is_allocated(). But some code, particularly bdrv_commit(), gets a lot simpler because it no longer has to mess with sectors; also, it is now possible to pass NULL if the caller does not care how much of the image is allocated beyond the initial offset. Leave comments where we can further simplify once a later patch eliminates the need for sector-aligned requests through bdrv_is_allocated(). For ease of review, bdrv_is_allocated_above() will be tackled separately. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:07 +02:00
Eric Blake	6f8e35e241	backup: Switch backup_run() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Change the internal loop iteration of backups to track by bytes instead of sectors (although we are still guaranteed that we iterate by steps that are cluster-aligned). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	03f5d60bbf	backup: Switch backup_do_cow() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function (no semantic change). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	f6ac207893	backup: Switch block_backup.h to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Continue by converting the public interface to backup jobs (no semantic change), including a change to CowRequest to track by bytes instead of cluster indices. Note that this does not change the difference between the public interface (starting point, and size of the subsequent range) and the internal interface (starting and end points). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Xie Changlong <xiechanglong@cmss.chinamobile.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	cf79cdf662	backup: Switch BackupBlockJob to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Continue by converting an internal structure (no semantic change), and all references to tracking progress. Drop a redundant local variable bytes_per_cluster. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	e8a81e9cad	block: Drop unused bdrv_round_sectors_to_clusters() Now that the last user [mirror_iteration()] has converted to using bytes, we no longer need a function to round sectors to clusters. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	fb2ef7919b	mirror: Switch mirror_iteration() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Change the internal loop iteration of mirroring to track by bytes instead of sectors (although we are still guaranteed that we iterate by steps that are both sector-aligned and multiples of the granularity). Drop the now-unused mirror_clip_sectors(). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	ae4cc8777b	mirror: Switch mirror_do_read() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function, preserving all existing semantics, and adding one more assertion that things are still sector-aligned (so that conversions to sectors in mirror_read_complete don't need to round). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	782d97efec	mirror: Switch mirror_cow_align() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function (no semantic change), and add mirror_clip_bytes() as a counterpart to mirror_clip_sectors(). Some of the conversion is a bit tricky, requiring temporaries to convert between units; it will be cleared up in a following patch. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	931e52607f	mirror: Update signature of mirror_clip_sectors() Rather than having a void function that modifies its input in-place as the output, change the signature to reduce a layer of indirection and return the result. Suggested-by: John Snow <jsnow@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	e6f2419389	mirror: Switch mirror_do_zero_or_discard() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function (no semantic change). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	b436982f04	mirror: Switch MirrorBlockJob to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Continue by converting an internal structure (no semantic change), and all references to the buffer size. Add an assertion that our use of s->granularity >> BDRV_SECTOR_BITS (necessary for interaction with sector-based dirty bitmaps, until a later patch converts those to be byte-based) does not suffer from truncation problems. [checkpatch has a false positive on use of MIN() in this patch] Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	317a6676a2	commit: Switch commit_run() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Change the internal loop iteration of committing to track by bytes instead of sectors (although we are still guaranteed that we iterate by steps that are sector-aligned). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	d8a9858408	commit: Switch commit_populate() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Start by converting an internal function (no semantic change). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	d535435f4a	stream: Switch stream_run() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Change the internal loop iteration of streaming to track by bytes instead of sectors (although we are still guaranteed that we iterate by steps that are sector-aligned). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	158c649257	stream: Drop reached_end for stream_complete() stream_complete() skips the work of rewriting the backing file if the job was cancelled, if data->reached_end is false, or if there was an error detected (non-zero data->ret) during the streaming. But note that in stream_run(), data->reached_end is only set if the loop ran to completion, and data->ret is only 0 in two cases: either the loop ran to completion (possibly by cancellation, but stream_complete checks for that), or we took an early goto out because there is no bs->backing. Thus, we can preserve the same semantics without the use of reached_end, by merely checking for bs->backing (and logically, if there was no backing file, streaming is a no-op, so there is no backing file to rewrite). Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	8493211c02	stream: Switch stream_populate() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Start by converting an internal function (no semantic change). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	5cb1a49e01	trace: Show blockjob actions via bytes, not sectors Upcoming patches are going to switch to byte-based interfaces instead of sector-based. Even worse, trace_backup_do_cow_enter() had a weird mix of cluster and sector indices. The trace interface is low enough that there are no stability guarantees, and therefore nothing wrong with changing our units, even in cases like trace_backup_do_cow_skip() where we are not changing the trace output. So make the tracing uniformly use bytes. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	f3e4ce4af3	blockjob: Track job ratelimits via bytes, not sectors The user interface specifies job rate limits in bytes/second. It's pointless to have our internal representation track things in sectors/second, particularly since we want to move away from sector-based interfaces. Fix up a doc typo found while verifying that the ratelimit code handles the scaling difference. Repetition of expressions like 'n * BDRV_SECTOR_SIZE' will be cleaned up later when functions are converted to iterate over images by bytes rather than by sectors. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Hervé Poussineau	8b544293ef	vvfat: change OEM name to 'MSWIN4.1' According to specification: "'MSWIN4.1' is the recommanded setting, because it is the setting least likely to cause compatibility problems. If you want to put something else in here, that is your option, but the result may be that some FAT drivers might not recognize the volume." Specification: "FAT: General overview of on-disk format" v1.03, page 9 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Hervé Poussineau	78f002c901	vvfat: handle KANJI lead byte 0xe5 Specification: "FAT: General overview of on-disk format" v1.03, page 23 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	6817efea3a	vvfat: limit number of entries in root directory in FAT12/FAT16 FAT12/FAT16 root directory is two sectors in size, which allows only 512 directory entries. Prevent QEMU startup if too much files exist, instead of overflowing root directory. Also introduce variable root_entries, which will be required for FAT32. Fixes: https://bugs.launchpad.net/qemu/+bug/1599539/comments/4 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	339cebcc01	vvfat: correctly generate numeric-tail of short file names More specifically: - try without numeric-tail only if LFN didn't have invalid short chars - start at ~1 (instead of ~0) - handle case if numeric tail is more than one char (ie > 10) Windows 9x Scandisk doesn't see anymore mismatches between short file names and long file names for non-ASCII filenames. Specification: "FAT: General overview of on-disk format" v1.03, page 31 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	0c36111f57	vvfat: correctly create base short names for non-ASCII filenames More specifically, create short name from filename and change blacklist of invalid chars to whitelist of valid chars. Windows 9x also now correctly see long file names of filenames containing a space, but Scandisk still complains about mismatch between SFN and LFN. [kwolf: Build fix for this intermediate patch (it included declarations for variables that are only used in the next patch) ] Specification: "FAT: General overview of on-disk format" v1.03, pages 30-31 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	09ec4119fb	vvfat: correctly create long names for non-ASCII filenames Assume that input filename is encoded as UTF-8, so correctly create UTF-16 encoding. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	f82d92bb02	vvfat: always create . and .. entries at first and in that order readdir() doesn't always return . and .. entries at first and in that order. This leads to not creating them at first in the directory, which raises some errors on file system checking utilities like MS-DOS Scandisk. Specification: "FAT: General overview of on-disk format" v1.03, page 25 Fixes: https://bugs.launchpad.net/qemu/+bug/1599539 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	92e28d8220	vvfat: fix field names in FAT12/FAT16 and FAT32 boot sectors Specification: "FAT: General overview of on-disk format" v1.03, pages 11-13 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	4dc705dc7e	vvfat: introduce offset_to_bootsector, offset_to_fat and offset_to_root_dir - offset_to_bootsector is the number of sectors up to FAT bootsector - offset_to_fat is the number of sectors up to first File Allocation Table - offset_to_root_dir is the number of sectors up to root directory sector Replace first_sectors_number - 1 by offset_to_bootsector. Replace first_sectors_number by offset_to_fat. Replace faked_sectors by offset_to_rootdir. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	ad05b31857	vvfat: rename useless enumeration values MODE_FAKED and MODE_RENAMED are not and were never used. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	5f5b29dfce	vvfat: fix typos Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	d6a7e54ed3	vvfat: replace tabs by 8 spaces This was a complete mess. On 2299 indented lines: - 1329 were with spaces only - 617 with tabulations only - 353 with spaces and tabulations Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	139921aaa7	vvfat: fix qemu-img map and qemu-img convert - bs->total_sectors is the number of sectors of the whole disk - s->sector_count is the number of sectors of the FAT partition This fixes the following assert in qemu-img map: qemu-img.c:2641: get_block_status: Assertion `nb_sectors' failed. This also fixes an infinite loop in qemu-img convert. Fixes: `4480e0f924` Fixes: https://bugs.launchpad.net/qemu/+bug/1599539 Cc: qemu-stable@nongnu.org Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Eric Blake	544daf6679	blkdebug: Support .bdrv_co_get_block_status Without a passthrough status of BDRV_BLOCK_RAW, anything wrapped by blkdebug appears 100% allocated as data. Better is treating it the same as the underlying file being wrapped. Update iotest 177 for the new expected output. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Eric Blake	d5254033da	block: Simplify use of BDRV_BLOCK_RAW The lone caller that cares about a return of BDRV_BLOCK_RAW (namely, io.c:bdrv_co_get_block_status) completely replaces the return value, so there is no point in passing BDRV_BLOCK_DATA. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Eric Blake	81c219ac6c	block: Guarantee that file is set on bdrv_get_block_status() We document that file is valid if the return is not an error and includes BDRV_BLOCK_OFFSET_VALID, but forgot to obey this contract when a driver (such as blkdebug) lacks a callback. Messed up in commit `67a0fd2` (v2.6), when we added the file parameter. Enhance qemu-iotest 177 to cover this, using a sequence that would print garbage or even SEGV, because it was dererefencing through uninitialized memory. [The resulting test output shows that we have less-than-ideal block status from the blkdebug driver, but that's a separate fix coming up soon.] Setting file on all paths that return BDRV_BLOCK_OFFSET_VALID is enough to fix the crash, but we can go one step further: always setting file, even on error, means that a broken caller that blindly dereferences file without checking for error is now more likely to get a reliable SEGV instead of randomly acting on garbage, making it easier to diagnose such buggy callers. Adding an assertion that file is set where expected doesn't hurt either. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Paolo Bonzini	96d06835dc	nbd: fix NBD over TLS When attaching the NBD QIOChannel to an AioContext, the TLS channel should be used, not the underlying socket channel. This is because, trivially, the TLS channel will be the one that we read/write to and thus the one that will get the qio_channel_yield() call. Fixes: `ff82911cd3` Cc: qemu-stable@nongnu.org Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Tested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-04 14:30:03 +02:00
Eric Blake	c61e684e44	block: Exploit BDRV_BLOCK_EOF for larger zero blocks When we have a BDS with unallocated clusters, but asking the status of its underlying bs->file or backing layer encounters an end-of-file condition, we know that the rest of the unallocated area will read as zeroes. However, pre-patch, this required two separate calls to bdrv_get_block_status(), as the first call stops at the point where the underlying file ends. Thanks to BDRV_BLOCK_EOF, we can now widen the results of the primary status if the secondary status already includes BDRV_BLOCK_ZERO. In turn, this fixes a TODO mentioned in iotest 154, where we can now see that all sectors in a partial cluster at the end of a file read as zero when coupling the shorter backing file's status along with our knowledge that the remaining sectors came from an unallocated cluster. Also, note that the loop in bdrv_co_get_block_status_above() had an inefficent exit: in cases where the active layer sets BDRV_BLOCK_ZERO but does NOT set BDRV_BLOCK_ALLOCATED (namely, where we know we read zeroes merely because our unallocated clusters lie beyond the backing file's shorter length), we still ended up probing the backing layer even though we already had a good answer. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170505021500.19315-3-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-06-30 21:48:06 +08:00
Eric Blake	fb0d8654ff	block: Add BDRV_BLOCK_EOF to bdrv_get_block_status() Just as the block layer already sets BDRV_BLOCK_ALLOCATED as a shortcut for subsequent operations, there are also some optimizations that are made easier if we can quickly tell that *pnum will advance us to the end of a file, via a new BDRV_BLOCK_EOF which gets set by the block layer. This just plumbs up the new bit; subsequent patches will make use of it. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170505021500.19315-2-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-06-30 21:48:06 +08:00
Max Reitz	f69165a8fe	block: Do not strcmp() with NULL uri->scheme uri_parse(...)->scheme may be NULL. In fact, probably every field may be NULL, and the callers do test this for all of the other fields but not for scheme (except for block/gluster.c; block/vxhs.c does not access that field at all). We can easily fix this by using g_strcmp0() instead of strcmp(). Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170613205726.13544-1-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-06-26 14:54:46 +02:00
Max Reitz	05cc758a3d	blkverify: Catch bs->exact_filename overflow The bs->exact_filename field may not be sufficient to store the full blkverify node filename. In this case, we should not generate a filename at all instead of an unusable one. Cc: qemu-stable@nongnu.org Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170613172006.19685-3-mreitz@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-06-26 14:54:46 +02:00
Max Reitz	de81d72d3d	blkdebug: Catch bs->exact_filename overflow The bs->exact_filename field may not be sufficient to store the full blkdebug node filename. In this case, we should not generate a filename at all instead of an unusable one. Cc: qemu-stable@nongnu.org Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170613172006.19685-2-mreitz@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-06-26 14:54:46 +02:00
Manos Pitsidianakis	f5a5ca7969	block: change variable names in BlockDriverState Change the 'int count' parameter in pwrite_zeros, pdiscard related functions (and some others) to 'int bytes', as they both refer to bytes. This helps with code legibility. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Message-id: 20170609101808.13506-1-el13635@mail.ntua.gr Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-06-26 14:54:46 +02:00
Kevin Wolf	c5f1ad429c	block: Remove bdrv_aio_readv/writev/flush() These functions are unused now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:15 +02:00
Kevin Wolf	0f714ec706	qed: Use bdrv_co_* for coroutine_fns All functions that are marked coroutine_fn can directly call the bdrv_co_* version of functions instead of going through the wrapper. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:15 +02:00
Kevin Wolf	87f0d88261	qed: Add coroutine_fn to I/O path functions Now that we stay in coroutine context for the whole request when doing reads or writes, we can add coroutine_fn annotations to many functions that can do I/O or yield directly. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:15 +02:00
Kevin Wolf	c0e8f98927	qed: Use a coroutine for need_check_timer This fixes the last place where we degraded from AIO to actual blocking synchronous I/O requests. Putting it into a coroutine means that instead of blocking, the coroutine simply yields while doing I/O. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:15 +02:00
Kevin Wolf	48cc565e76	qed: Simplify request handling Now that we process a request in the same coroutine from beginning to end and don't drop out of it any more, we can look like a proper coroutine-based driver and simply call qed_aio_next_io() and get a return value from it instead of spawning an additional coroutine that reenters the parent when it's done. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:15 +02:00
Kevin Wolf	0806c3b5dd	qed: Use CoQueue for serialising allocations Now that we're running in coroutine context, the ad-hoc serialisation code (which drops a request that has to wait out of coroutine context) can be replaced by a CoQueue. This means that when we resume a serialised request, it is running in coroutine context again and its I/O isn't blocking any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:15 +02:00
Kevin Wolf	89f89709c7	qed: Implement .bdrv_co_readv/writev Most of the qed code is now synchronous and matches the coroutine model. One notable exception is the serialisation between requests which can still schedule a callback. Before we can replace this with coroutine locks, let's convert the driver's external interfaces to the coroutine versions. We need to be careful to handle both requests that call the completion callback directly from the calling coroutine (i.e. fully synchronous code) and requests that involve some callback, so that we need to yield and wait for the completion callback coming from outside the coroutine. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:15 +02:00
Kevin Wolf	018598747c	qed: Remove recursion in qed_aio_next_io() Instead of calling itself recursively as the last thing, just convert qed_aio_next_io() into a loop. This patch is best reviewed with 'git show -w' because most of it is just whitespace changes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:15 +02:00
Kevin Wolf	dddf8db10b	qed: Remove ret argument from qed_aio_next_io() All callers pass ret = 0, so we can just remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	0596be7e6a	qed: Add return value to qed_aio_read/write_data() Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but just return an error code and let the caller handle it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	d6daddcdeb	qed: Add return value to qed_aio_write_inplace/alloc() Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but just return an error code and let the caller handle it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	a101341aa0	qed: Add return value to qed_aio_write_cow() Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but just return an error code and let the caller handle it. While refactoring qed_aio_write_alloc() to accomodate the change, qed_aio_write_zero_cluster() ended up with a single line, so I chose to inline that line and remove the function completely. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	eaf0bc56f5	qed: Add return value to qed_aio_write_main() Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but just return an error code and let the caller handle it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	88d2dd72bc	qed: Add return value to qed_aio_write_l2_update() Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but just return an error code and let the caller handle it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	fb18de21e0	qed: Add return value to qed_aio_write_l1_update() Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but just return an error code and let the caller handle it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	fae25ac7bd	qed: Inline qed_commit_l2_update() qed_commit_l2_update() is unconditionally called at the end of qed_aio_write_l1_update(). Inline it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	a4d8f1aee1	qed: Make qed_aio_write_main() synchronous Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	3e248cdcd9	qed: Make qed_aio_read_data() synchronous Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	453e53e2a1	qed: Remove callback from qed_write_table() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	29470d11bf	qed: Remove GenericCB The GenericCB infrastructure isn't used any more. Remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	602b57fba4	qed: Make qed_write_table() synchronous Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	f13d712bb2	qed: Remove callback from qed_write_header() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	7076309aef	qed: Make qed_write_header() synchronous Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	b4ac32f34f	qed: Remove callback from qed_copy_from_backing_file() With this change, qed_aio_write_prefill() and qed_aio_write_postfill() collapse into a single function. This is reflected by a rename of the combined function to qed_aio_write_cow(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	0f7aa24d2c	qed: Make qed_copy_from_backing_file() synchronous Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	e85c528142	qed: Make qed_read_backing_file() synchronous Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	0f21b7a1b7	qed: Remove callback from qed_find_cluster() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	a8165d2d66	qed: Remove callback from qed_read_l2_table() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	f6513529c6	qed: Remove callback from qed_read_table() Instead of passing the return value to a callback, return it to the caller so that the callback can be inlined there. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	11273076e9	qed: Make qed_read_table() synchronous Note that this code is generally not running in coroutine context, so this is an actual blocking synchronous operation. We'll fix this in a moment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Kevin Wolf	3b7cd9fd8f	qed: Use bottom half to resume waiting requests The qed driver serialises allocating write requests. When the active allocation is finished, the AIO callback is called, but after this, the next allocating request is immediately processed instead of leaving the coroutine. Resuming another allocation request in the same request coroutine means that the request now runs in the wrong coroutine. The following is one of the possible effects of this: The completed request will generally reenter its request coroutine in a bottom half, expecting that it completes the request in bdrv_driver_pwritev(). However, if the second request actually yielded before leaving the coroutine, the reused request coroutine is in an entirely different place and is reentered prematurely. Not a good idea. Let's make sure that we exit the coroutine after completing the first request by resuming the next allocating request only with a bottom half. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-26 14:51:14 +02:00
Alberto Garcia	24990c5b95	qcow2: Use offset_into_cluster() and offset_to_l2_index() We already have functions for doing these calculations, so let's use them instead of doing everything by hand. This makes the code a bit more readable. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-06-26 14:51:13 +02:00
Alberto Garcia	ee22a9d869	qcow2: Merge the writing of the COW regions with the guest data If the guest tries to write data that results on the allocation of a new cluster, instead of writing the guest data first and then the data from the COW regions, write everything together using one single I/O operation. This can improve the write performance by 25% or more, depending on several factors such as the media type, the cluster size and the I/O request size. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-06-26 14:51:13 +02:00

... 2 3 4 5 6 ...

3486 Commits