mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Fam Zheng	ba3f0e2545	block: Add bdrv_get_block_status_above Like bdrv_is_allocated_above, this function follows the backing chain until seeing BDRV_BLOCK_ALLOCATED. Base is not included. Reimplement bdrv_is_allocated on top. [Initialized bdrv_co_get_block_status_above() ret to 0 to silence mingw64 compiler warning about the uninitialized variable. assert(bs != base) prevents that case but I suppose the program could be compiled with -DNDEBUG. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 10:03:50 +01:00
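The chain walk described in this entry can be pictured with a small stand-alone sketch. It is not the QEMU implementation: `Layer` and `allocated_above()` are hypothetical names, and each layer carries a single "allocated" flag instead of per-sector status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an image layer; not a QEMU type. */
typedef struct Layer {
    struct Layer *backing;   /* next layer down the backing chain */
    bool allocated;          /* does this layer hold the data itself? */
} Layer;

/*
 * Walk from 'top' down the backing chain and report whether any layer
 * above 'base' has the data allocated. 'base' itself is not examined,
 * matching "Base is not included" in the commit message.
 */
static bool allocated_above(const Layer *top, const Layer *base)
{
    for (const Layer *l = top; l != NULL && l != base; l = l->backing) {
        if (l->allocated) {
            return true;
        }
    }
    return false;
}
```

Passing `NULL` as `base` walks the whole chain in this sketch.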
John Snow	4b80ab2b7d	qapi: Rename 'dirty-bitmap' mode to 'incremental' If we wish to make differential backups a feature that's easy to access, it might be pertinent to rename the "dirty-bitmap" mode to "incremental" to make it clear what /type/ of backup the dirty-bitmap is helping us perform. This is an API breaking change, but 2.4 has not yet gone live, so we have this flexibility. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1433463642-21840-2-git-send-email-jsnow@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 09:20:18 +01:00
Jindřich Makovička	3e5feb6202	qcow2: Handle EAGAIN returned from update_refcount Fixes a crash during image compression Signed-off-by: Jindřich Makovička <makovick@gmail.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 09:20:18 +01:00
Peter Lieven	5dd7a535b7	block/iscsi: add support for request timeouts libiscsi starting with 1.15 will properly support timeout of iscsi commands. The default will remain no timeout, but this can be changed via cmdline parameters, e.g.: qemu -iscsi timeout=30 -drive file=iscsi://... If a timeout occurs a reconnect is scheduled and the timed out command will be requeued for processing after a successful reconnect. The required API call iscsi_set_timeout is present since libiscsi 1.10 which was released in October 2013. However, due to some bugs in the libiscsi code the use is not recommended before version 1.15. Please note that this patch bumps the libiscsi requirement to 1.10 to have all functions and macros defined. The patch also fixes an off-by-one error in the NOP timeout calculation, found while touching these code parts. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1434455107-19328-1-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 09:20:18 +01:00
Dimitris Aragiorgis	3307ed7b3f	raw-posix: Introduce hdev_is_sg() Until now, an SG device was identified only by checking if its path started with "/dev/sg". Then, hdev_open() would set the bs->sg flag accordingly. The patch relies on the actual properties of the device instead of the specified file path. To this end, test for an SG device (e.g. /dev/sg0) by ensuring that all of the following holds: - The specified file name corresponds to a character device - The device supports the SG_GET_VERSION_NUM ioctl - The device supports the SG_GET_SCSI_ID ioctl Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Message-id: 1435056300-14924-6-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
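The three conditions listed in this entry can be sketched as a stand-alone Linux check. This is an illustration under assumptions, not the actual hdev_is_sg() code: `looks_like_sg_device()` is a hypothetical name, and the open flags are a guess at something reasonable.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <scsi/sg.h>

/* Hypothetical helper (not the QEMU function): true only if 'path' is a
 * character device that answers both SG ioctls named in the commit. */
static bool looks_like_sg_device(const char *path)
{
    struct stat st;
    struct sg_scsi_id scsi_id;
    int version;
    int fd;
    bool ok;

    /* Condition 1: the file name must refer to a character device. */
    if (stat(path, &st) < 0 || !S_ISCHR(st.st_mode)) {
        return false;
    }

    fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0) {
        return false;
    }

    /* Conditions 2 and 3: both SG ioctls must succeed. */
    ok = ioctl(fd, SG_GET_VERSION_NUM, &version) >= 0 &&
         ioctl(fd, SG_GET_SCSI_ID, &scsi_id) >= 0;

    close(fd);
    return ok;
}
```

A character device that is not an SG node, such as /dev/null, passes the first test but fails the ioctls, which is the case a path-prefix check cannot distinguish.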
Dimitris Aragiorgis	a93a3982a6	raw-posix: Use DPRINTF for DEBUG_FLOPPY Get rid of several #ifdef DEBUG_FLOPPY and substitute them with DPRINTF. Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-5-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis	bcb225550d	raw-posix: DPRINTF instead of DEBUG_BLOCK_PRINT Building the QEMU tools fails if we #define DEBUG_BLOCK inside block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y so that DEBUG_BLOCK_PRINT can be used, we substitute the latter with a simple DPRINTF() (that does not cause bit-rot). Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-4-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis	1b6bc94d5d	Fix migration in case of scsi-generic During migration, QEMU uses fsync()/fdatasync() on the open file descriptor for read-write block devices to flush data just before stopping the VM. However, fsync() on a scsi-generic device returns -EINVAL which causes the migration to fail. This patch skips flushing data in case of an SG device, since submitting SCSI commands directly via an SG character device (e.g. /dev/sg0) bypasses the page cache completely, anyway. Note that fsync() not only flushes the page cache but also the disk cache. The scsi-generic device never sends flushes, and for migration it assumes that the same SCSI device is used by the destination host, so it does not issue any SCSI SYNCHRONIZE CACHE (10) command. Finally, remove the bdrv_is_sg() test from iscsi_co_flush() since this is now redundant (we flush the underlying protocol at the end of bdrv_co_flush() which, with this patch, we never reach). Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-3-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis	b192af8acc	block: Use bdrv_is_sg() everywhere Instead of checking bs->sg use bdrv_is_sg() consistently throughout the code. Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-2-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Wolfgang Bumiller	d5941ddae8	vvfat: add a label option Until now the vvfat volume label was hardcoded to be "QEMU VVFAT", now you can pass a file.label=labelname option to the -drive to change it. The FAT structure defines the volume label to be limited to 11 bytes and is padded with spaces when shorter than that. The trailing spaces however aren't exposed to the user by operating systems. [Added missing comment '#' characters in block-core.json to fix build errors. --Stefan] Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Message-id: 1434706529-13895-2-git-send-email-w.bumiller@proxmox.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:06:17 +01:00
Alexander Yarygin	97b0385a34	block-backend: Introduce blk_drain() This patch introduces the blk_drain() function which allows to replace blk_drain_all() when only one BlockDriverState needs to be drained. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1434537440-28236-2-git-send-email-yarygin@linux.vnet.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:06:16 +01:00
Alberto Garcia	2f388b93a1	throttle: Check current timers before updating any_timer_armed[] Calling throttle_group_config() cancels all timers from a particular BlockDriverState, so any_timer_armed[] should be updated accordingly. However, with the current code it may happen that a timer is armed in a different BlockDriverState from the same group, so any_timer_armed[] would be set to false in a situation where there is still a timer armed. The consequence is that we might end up with two timers armed. This should not have any noticeable impact however, since all accesses to the ThrottleGroup are protected by a lock, and the situation would become normal again shortly thereafter as soon as all timers have been fired. The correct way to solve this is to check that we're actually cancelling a timer before updating any_timer_armed[]. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 1434382875-3998-1-git-send-email-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:06:16 +01:00
Alexander Yarygin	f406c03c09	block: Let bdrv_drain_all() to call aio_poll() for each AioContext After the commit `9b536adc` ("block: acquire AioContext in bdrv_drain_all()") the aio_poll() function got called for every BlockDriverState, in assumption that every device may have its own AioContext. If we have thousands of disks attached, there are a lot of BlockDriverStates but only a few AioContexts, leading to tons of unnecessary aio_poll() calls. This patch changes the bdrv_drain_all() function allowing it find shared AioContexts and to call aio_poll() only for unique ones. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-id: 1433936297-7098-4-git-send-email-yarygin@linux.vnet.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:06:16 +01:00
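The idea of polling each shared context once, rather than once per device, can be sketched as a simple de-duplication pass. The names here are hypothetical and the contexts are opaque pointers; the real patch works on QEMU's own BlockDriverState and AioContext structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch: given the context pointer of each device, fill
 * 'unique' with every distinct context exactly once and return how many
 * were found. With thousands of devices but only a few contexts, the
 * inner loop stays short.
 */
static size_t collect_unique_contexts(void *const *ctx_of_device, size_t n,
                                      void **unique)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        bool seen = false;

        for (size_t j = 0; j < count; j++) {
            if (unique[j] == ctx_of_device[i]) {
                seen = true;
                break;
            }
        }
        if (!seen) {
            unique[count++] = ctx_of_device[i];
        }
    }
    return count;
}
```

A caller would then run its polling step once per entry in `unique` instead of once per device.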
Markus Armbruster	cc7a8ea740	Include qapi/qmp/qerror.h exactly where needed In particular, don't include it into headers. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:41 +02:00
Markus Armbruster	d49b683644	qerror: Move #include out of qerror.h Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:40 +02:00
Markus Armbruster	4629ed1e98	qerror: Finally unused, clean up Remove it except for two things in qerror.h: * Two #include to be cleaned up separately to avoid cluttering this patch. * The QERR_ macros. Mark as obsolete. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:40 +02:00
Markus Armbruster	c6bd8c706a	qerror: Clean up QERR_ macros to expand into a single string These macros expand into error class enumeration constant, comma, string. Unclean. Has been that way since commit `13f59ae`. The error class is always ERROR_CLASS_GENERIC_ERROR since the previous commit. Clean up as follows: * Prepend every use of a QERR_ macro by ERROR_CLASS_GENERIC_ERROR, and delete it from the QERR_ macro. No change after preprocessing. * Rewrite error_set(ERROR_CLASS_GENERIC_ERROR, ...) into error_setg(...). Again, no change after preprocessing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:40 +02:00
Eric Blake	fc48ffc39e	qobject: Use 'bool' for qbool We require a C99 compiler, so let's use 'bool' instead of 'int' when dealing with boolean values. There are few enough clients to fix them all in one pass. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2015-06-22 17:40:00 +02:00
Peter Maydell	f3e3b083d4	Block layer core and image format patches Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer core and image format patches # gpg: Signature made Fri Jun 12 16:08:53 2015 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (25 commits) block: Fix reopen flag inheritance; block: Add BlockDriverState.inherits_from; block: Add list of children to BlockDriverState; queue.h: Add QLIST_FIX_HEAD_PTR(); block: Drain requests before swapping nodes in bdrv_swap(); block: Move flag inheritance to bdrv_open_inherit(); block: Use QemuOpts in bdrv_open_common(); block: Use macro for cache option names; vmdk: Use bdrv_open_image(); quorum: Use bdrv_open_image(); check-qdict: Test cases for new functions; qdict: Add qdict_{set,copy}_default(); qdict: Add qdict_array_entries(); iotests: Add tests for overriding BDRV_O_PROTOCOL; block: driver should override flags in bdrv_open(); block: Change bitmap truncate conditional to assertion; block: record new size in bdrv_dirty_bitmap_truncate; raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size; vmdk: Use vmdk_find_index_in_cluster everywhere; vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status; ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-06-15 10:43:06 +01:00
Kevin Wolf	67251a3113	block: Fix reopen flag inheritance When reopening an image, the block layer already takes care to reopen bs->file as well with recalculated inherited flags. The same must happen for any other child (most notably missing before this patch: backing files). If bs->file (or any other child) didn't originally inherit from bs, e.g. because it was created separately and then only referenced, it must not inherit flags on reopen either, so check the inherited_from field before propagating the reopen down. VMDK already reopened its extents manually; this code can now be dropped. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2015-06-12 17:04:59 +02:00
Kevin Wolf	f3930ed0bb	block: Move flag inheritance to bdrv_open_inherit() Instead of letting every caller of bdrv_open() determine the right flags for its child node manually and pass them to the function, pass the parent node and the role of the newly opened child (like backing file, protocol layer, etc.). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2015-06-12 17:04:59 +02:00
Kevin Wolf	a646836784	vmdk: Use bdrv_open_image() Besides standardising on a single interface for opening child nodes, this patch allows the user to specify options to individual extent nodes. Overriding file names isn't possible with this yet, so it's of limited usefulness, but still a step forward. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2015-06-12 16:58:07 +02:00
Kevin Wolf	ea6828d81b	quorum: Use bdrv_open_image() Besides standardising on a single interface for opening child nodes, this simplifies the .bdrv_open() implementation of the quorum block driver by using block layer functionality for handling BlockdevRefs. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>	2015-06-12 16:58:07 +02:00
Kevin Wolf	b8684454e1	raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size Image files with an unaligned image size have a final hole that starts at EOF, i.e. in the middle of a sector. Currently, *pnum == 0 is returned when checking the status of this sector. In qemu-img, this triggers an assertion failure. In order to fix this, one type for the sector that contains EOF must be found. Treating a hole as data is safe, so this patch rounds the calculated number of data sectors up, so that a partial sector at EOF is treated as a full data sector. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1229394 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Cole Robinson <crobinso@redhat.com>	2015-06-12 15:54:01 +02:00
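The rounding described in this entry can be shown with a few lines of arithmetic. This is only a sketch: `SECTOR_SIZE` and `data_sectors()` are illustrative names, and the byte offsets stand in for whatever the driver obtains from the file system.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512   /* assumed sector size for this sketch */

/*
 * Number of sectors reported as data for the byte range
 * [start, data_end), rounded UP so that a partial sector at EOF counts
 * as one full data sector instead of yielding zero.
 */
static int64_t data_sectors(int64_t start, int64_t data_end)
{
    return (data_end - start + SECTOR_SIZE - 1) / SECTOR_SIZE;
}
```

For a 1000-byte image queried at byte 512, truncating division gives (1000 - 512) / 512 = 0 sectors, which is the `*pnum == 0` case from the message; rounding up gives 1.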
Fam Zheng	90df601f06	vmdk: Use vmdk_find_index_in_cluster everywhere Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-06-12 15:54:01 +02:00
Fam Zheng	61f0ed1d54	vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status It has the similar issue with `b1649fae49`. Since the calculation is repeated for a few times already, introduce a function so it can be reused. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-06-12 15:54:01 +02:00
Max Reitz	bc85ef265a	qcow2: Add DEFAULT_L2_CACHE_CLUSTERS If a relatively large cluster size is chosen, the default of 1 MB L2 cache is not really appropriate. In this case, unless overridden by the user, the default cache size should not be determined by its size in bytes but by the number of L2 tables (clusters) it is supposed to contain. Note that without this patch, MIN_L2_CACHE_SIZE will effectively take over the same role. However, providing space for just two L2 tables is not enough to be the default. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-06-12 15:54:01 +02:00
Max Reitz	57e2166959	qcow2: Set MIN_L2_CACHE_SIZE to 2 The L2 cache must cover at least two L2 tables, because during COW two L2 tables are accessed simultaneously. Reported-by: Alexander Graf <agraf@suse.de> Cc: qemu-stable <qemu-stable@nongnu.org> Signed-off-by: Max Reitz <mreitz@redhat.com> Tested-by: Alexander Graf <agraf@suse.de> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-06-12 15:54:00 +02:00
Alberto Garcia	b8fe1694e5	throttle: add the name of the ThrottleGroup to BlockDeviceInfo Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 172df91f09c69c6f0440a697bbd1b3f95b077ee4.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Alberto Garcia	db6283385c	throttle: acquire the ThrottleGroup lock in bdrv_swap() bdrv_swap() touches the fields of a BlockDriverState that are protected by the ThrottleGroup lock. Although those fields end up in their original place, they are temporarily swapped in the process, so there's a chance that an operation on a member of the same group happening on a different thread can try to use them. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: d92dc40d7c4f1fc5cda5cbbf4ffb7a4670b79d17.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Alberto Garcia	76f4afb40f	throttle: Add throttle group support The throttle group support uses a cooperative round robin scheduling algorithm. The principles of the algorithm are simple: - Each BDS of the group is used as a token in a circular way. - The active BDS computes if a wait must be done and arms the right timer. - If a wait must be done the token timer will be armed so the token will become the next active BDS. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: f0082a86f3ac01c46170f7eafe2101a92e8fde39.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Alberto Garcia	2ff1f2e3a3	throttle: Add throttle group infrastructure Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 2fdb4de17210b733a13eb472c33cd08b45f8fd21.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Benoît Canet	0e5b0a2d54	throttle: Extract timers from ThrottleState into a separate structure Group throttling will share ThrottleState between multiple bs. As a consequence the ThrottleState will be accessed by multiple aio context. Timers are tied to their aio context so they must go out of the ThrottleState structure. This commit paves the way for each bs of a common ThrottleState to have its own timer. Signed-off-by: Benoit Canet <benoit.canet@nodalink.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 6cf9ea96d8b32ae2f8769cead38f68a6a0c8c909.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Kevin Wolf	f4a769abaa	raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size Image files with an unaligned image size have a final hole that starts at EOF, i.e. in the middle of a sector. Currently, *pnum == 0 is returned when checking the status of this sector. In qemu-img, this triggers an assertion failure. In order to fix this, one type for the sector that contains EOF must be found. Treating a hole as data is safe, so this patch rounds the calculated number of data sectors up, so that a partial sector at EOF is treated as a full data sector. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1229394 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1433840108-9996-1-git-send-email-kwolf@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 13:58:33 +01:00
Markus Armbruster	8809cfc38e	blkdebug: Simplify passing of Error through qemu_opts_foreach() Cc: Kevin Wolf <kwolf@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>	2015-06-09 07:40:23 +02:00
Markus Armbruster	28d0de7a4f	QemuOpts: Convert qemu_opts_foreach() to Error Retain the function value for now, to permit selective conversion of its callers. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>	2015-06-09 07:37:37 +02:00
Markus Armbruster	a4c7367f7d	QemuOpts: Drop qemu_opts_foreach() parameter abort_on_failure When the argument is non-zero, qemu_opts_foreach() stops on callback returning non-zero, and returns that value. When the argument is zero, it doesn't stop, and returns the bit-wise inclusive or of all the return values. Funky :) The callers that pass zero could just as well pass one, because their callbacks can't return anything but zero: * qemu_add_globals()'s callback qdev_add_one_global() * qemu_config_write()'s callback config_write_opts() * main()'s callbacks default_driver_check(), drive_enable_snapshot(), vnc_init_func() Drop the parameter, and always stop. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>	2015-06-08 19:33:20 +02:00
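The behaviour that remains after this change, stopping at the first callback that returns non-zero and handing that value back, can be sketched generically. The iterator and callback below are hypothetical and work on plain strings rather than on QemuOpts.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback type: returns 0 to continue, non-zero to stop. */
typedef int (*visit_fn)(void *opaque, const char *id);

/* Stop at the first non-zero return value and propagate it. */
static int foreach_until_failure(const char *const *ids, int n,
                                 visit_fn fn, void *opaque)
{
    for (int i = 0; i < n; i++) {
        int rc = fn(opaque, ids[i]);
        if (rc) {
            return rc;
        }
    }
    return 0;
}

/* Example callback: counts visits and fails on the id "bad". */
static int count_and_check(void *opaque, const char *id)
{
    int *visited = opaque;

    (*visited)++;
    return strcmp(id, "bad") == 0 ? -1 : 0;
}
```

With the ids "a", "bad", "c", the third entry is never visited in this sketch.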
Fam Zheng	44f192f364	iscsi: Remove pointless runtime check of macro value raw_bsd already has QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512), so iscsi should relax. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-06-03 14:21:23 +03:00
Daniel P. Berrange	8336aafae1	qcow2/qcow: protect against uninitialized encryption key When a qcow[2] file is opened, if the header reports an encryption method, this is used to set the 'crypt_method_header' field on the BDRVQcow[2]State struct, and the 'encrypted' flag in the BDRVState struct. When doing I/O operations, the 'crypt_method' field on the BDRVQcow[2]State struct is checked to determine if encryption needs to be applied. The crypt_method_header value is copied into crypt_method when the bdrv_set_key() method is called. The QEMU code which opens a block device is expected to always do a check if (bdrv_is_encrypted(bs)) { bdrv_set_key(bs, ....key...); } If code forgets to do this, then 'crypt_method' is never set and so when I/O is performed, QEMU writes plain text data into a sector which is expected to contain cipher text, or when reading, will return cipher text instead of plain text. Change the qcow[2] code to consult bs->encrypted when deciding whether encryption is required, and assert(s->crypt_method) to protect against cases where the caller forgets to set the encryption key. Also put an assert in the set_key methods to protect against the case where the caller sets an encryption key on a block device that does not have encryption Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	d1b4efe5c4	qcow2: style fixes in qcow2-cache.c Fix pointer declaration to make it consistent with the rest of the code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	a3f1afb43a	qcow2: make qcow2_cache_put() a void function This function never receives an invalid table pointer, so we can make it void and remove all the error checking code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	812e4082ca	qcow2: use a hash to look for entries in the L2 cache The current cache algorithm traverses the array starting always from the beginning, so the average number of comparisons needed to perform a lookup is proportional to the size of the array. By using a hash of the offset as the starting point, lookups are faster and independent from the array size. The hash is computed using the cluster number of the table, multiplied by 4 to make it perform better when there are collisions. In my tests, using a cache with 2048 entries, this reduces the average number of comparisons per lookup from 430 to 2.5. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
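A lookup that starts at a hashed position instead of index 0 might look like the sketch below. Only the "cluster number times 4" starting point comes from the commit message; the entry layout, the wrap-around probe and the use of offset 0 as "empty" are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache entry; offset 0 marks an empty slot in this sketch. */
typedef struct {
    uint64_t offset;
} CacheEntry;

/*
 * Look for 'offset', starting at (cluster number * 4) modulo the cache
 * size and wrapping around. Returns the slot index or -1 on a miss, and
 * adds the number of comparisons made to *comparisons.
 */
static int cache_lookup(const CacheEntry *entries, int size,
                        uint64_t cluster_size, uint64_t offset,
                        int *comparisons)
{
    int start = (int)((offset / cluster_size * 4) % (uint64_t)size);
    int i = start;

    do {
        (*comparisons)++;
        if (entries[i].offset == offset) {
            return i;
        }
        if (++i == size) {
            i = 0;
        }
    } while (i != start);

    return -1;
}
```

When the table sits in or near its hashed slot, the scan ends after one or two comparisons regardless of the cache size.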
Alberto Garcia	fdfbca82a0	qcow2: remove qcow2_cache_find_entry_to_replace() A cache miss means that the whole array was traversed and the entry we were looking for was not found, so there's no need to traverse it again in order to select an entry to replace. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	2693310ecc	qcow2: use an LRU algorithm to replace entries from the L2 cache The current algorithm to evict entries from the cache gives always preference to those in the lowest positions. As the size of the cache increases, the chances of the later elements of being removed decrease exponentially. In a scenario with random I/O and lots of cache misses, entries in positions 8 and higher are rarely (if ever) evicted. This can be seen even with the default cache size, but with larger caches the problem becomes more obvious. Using an LRU algorithm makes the chances of being removed from the cache independent from the position. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
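One common way to get position-independent eviction is to stamp each entry with a counter on every use and evict the unreferenced entry with the smallest stamp. The sketch below illustrates that technique with hypothetical names; it is not claimed to match the qcow2 code line for line.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache entry for the sketch. */
typedef struct {
    uint64_t lru_counter;  /* value of the cache clock at the last use */
    int ref;               /* > 0 while the table is in use */
} LruEntry;

/* Record a use: the entry gets the newest stamp. */
static void lru_touch(LruEntry *e, uint64_t *clock)
{
    e->lru_counter = ++*clock;
}

/* Pick the least recently used entry that is not in use, or -1. */
static int lru_pick_victim(const LruEntry *entries, int size)
{
    uint64_t min = UINT64_MAX;
    int victim = -1;

    for (int i = 0; i < size; i++) {
        if (entries[i].ref > 0) {
            continue;   /* entries still in use cannot be evicted */
        }
        if (entries[i].lru_counter < min) {
            min = entries[i].lru_counter;
            victim = i;
        }
    }
    return victim;
}
```

The chosen victim depends only on when entries were last used, not on where they sit in the array.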
Alberto Garcia	baf07d60f5	qcow2: simplify qcow2_cache_put() and qcow2_cache_entry_mark_dirty() Since all tables are now stored together, it is possible to obtain the position of a particular table directly from its address, so the operation becomes O(1). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
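With all tables stored in one contiguous allocation, the index of a table follows from pointer arithmetic. A minimal sketch, assuming a hypothetical `TableCache` layout with equally sized tables:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: num_tables tables of table_size bytes each,
 * stored back to back in a single allocation. */
typedef struct {
    size_t table_size;
    int num_tables;
    uint8_t *table_array;
} TableCache;

static int cache_init(TableCache *c, int num_tables, size_t table_size)
{
    c->table_array = calloc((size_t)num_tables, table_size);
    if (!c->table_array) {
        return -1;
    }
    c->num_tables = num_tables;
    c->table_size = table_size;
    return 0;
}

/* Address of table i inside the single block. */
static void *table_addr(const TableCache *c, int i)
{
    return c->table_array + (size_t)i * c->table_size;
}

/* O(1) reverse mapping: table address back to its index. */
static int table_index(const TableCache *c, const void *table)
{
    ptrdiff_t off = (const uint8_t *)table - c->table_array;
    int idx;

    assert(off >= 0 && (size_t)off % c->table_size == 0);
    idx = (int)((size_t)off / c->table_size);
    assert(idx < c->num_tables);
    return idx;
}
```

No search is needed: the offset from the start of the block divided by the table size is the index.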
Alberto Garcia	72e80b8901	qcow2: use one single memory block for the L2/refcount cache tables The qcow2 L2/refcount cache contains one separate table for each cache entry. Doing one allocation per table adds unnecessary overhead and it also requires us to store the address of each table separately. Since the size of the cache is constant during its lifetime, it's better to have an array that contains all the tables using one single allocation. In my tests measuring freshly created caches with sizes 128MB (L2) and 32MB (refcount) this uses around 10MB of RAM less. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Fam Zheng	13c4941cdd	vmdk: Fix overflow if l1_size is 0x20000000 Richard Jones caught this bug with afl fuzzer. In fact, that's the only possible value to overflow (extent->l1_size = 0x20000000) l1_size: l1_size = extent->l1_size * sizeof(long) => 0x80000000; g_try_malloc returns NULL because l1_size is interpreted as negative during type casting from 'int' to 'gsize', which yields a enormous value. Hence, by coincidence, we get a "not too bad" behavior: qemu-img: Could not open '/tmp/afl6.img': Could not open '/tmp/afl6.img': Cannot allocate memory Values larger than 0x20000000 will be refused by the validation in vmdk_add_extent. Values smaller than 0x20000000 will not overflow l1_size. Cc: qemu-stable@nongnu.org Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
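The arithmetic in this entry can be checked with a short sketch that does the multiplication in 64 bits and rejects results that do not fit in an `int`. The helper name is hypothetical, and the 4-byte entry size is an assumption taken from the 0x20000000 to 0x80000000 example in the message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the L1 table size in bytes without
 * overflowing. The product is formed in 64 bits and refused if it does
 * not fit in an int.
 */
static bool l1_table_bytes(uint32_t l1_entries, size_t entry_size, int *out)
{
    uint64_t bytes = (uint64_t)l1_entries * entry_size;

    if (bytes > INT_MAX) {
        return false;   /* e.g. 0x20000000 * 4 == 0x80000000 */
    }
    *out = (int)bytes;
    return true;
}
```

With 4-byte entries, 0x20000000 entries is the first count whose byte size exceeds INT_MAX, which is consistent with the message calling it the only value that both passes validation and overflows.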
Fam Zheng	5e82a31eb9	vmdk: Fix next_cluster_sector for compressed write This fixes the bug introduced by commit `c6ac36e` (vmdk: Optimize cluster allocation). Sometimes, write_len could be larger than cluster size, because it contains both data and marker. We must advance next_cluster_sector in this case, otherwise the image gets corrupted. Cc: qemu-stable@nongnu.org Reported-by: Antoni Villalonga <qemu-list@friki.cat> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:00 +02:00
Kevin Wolf	ecbda7a225	qcow2: Flush pending discards before allocating cluster Before a freed cluster can be reused, pending discards for this cluster must be processed. The original assumption was that this was not a problem because discards are only cached during discard/write zeroes operations, which are synchronous so that no concurrent write requests can cause cluster allocations. However, the discard/write zeroes operation itself can allocate a new L2 table (and it has to in order to put zero flags there), so make sure we can cope with the situation. This fixes https://bugs.launchpad.net/bugs/1349972. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2015-05-22 17:08:00 +02:00
Paolo Bonzini	a53f1a95f9	block: get_block_status: use "else" when testing the opposite condition A bit of Boolean algebra (and common sense) tells us that the second "if" here is looking for blocks that are not allocated. This is the opposite of the "if" that sets BDRV_BLOCK_ALLOCATED, and thus it can use an "else". Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1431599702-10431-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Fam Zheng	9eeb6dd1b2	block: Fix NULL dereference for unaligned write if qiov is NULL For zero write, callers pass in NULL qiov (qemu-io "write -z" or scsi-disk "write same"). Commit `fc3959e466` fixed bdrv_co_write_zeroes which is the common case for this bug, but it still exists in bdrv_aio_write_zeroes. A simpler fix would be in bdrv_co_do_pwritev which is the NULL dereference point and covers both cases. So don't access it in bdrv_co_do_pwritev in this case, use three aligned writes. [Initialize ret to 0 in bdrv_co_do_zero_pwritev() to avoid uninitialized variable warning with gcc 4.9.2. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1431522721-3266-3-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Fam Zheng	d01c07f222	Revert "block: Fix unaligned zero write" This reverts commit `fc3959e466`. The core write code already handles the case, so remove this duplication. Because commit `61007b316` moved the touched code from block.c to block/io.c, the change is manually reverted. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431522721-3266-2-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Denis V. Lunev	459b4e6612	block: align bounce buffers to page The following sequence int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644); for (i = 0; i < 100000; i++) write(fd, buf, 4096); performs 5% better if buf is aligned to 4096 bytes. The difference is quite reliable. On the other hand we do not want at the moment to enforce bounce buffering if guest request is aligned to 512 bytes. The patch changes default bounce buffer optimal alignment to MAX(page size, 4k). 4k is chosen as maximal known sector size on real HDD. The justification of the performance improve is quite interesting. From the kernel point of view each request to the disk was split by two. This could be seen by blktrace like this: 9,0 11 1 0.000000000 11151 Q WS 312737792 + 1023 [qemu-img] 9,0 11 2 0.000007938 11151 Q WS 312738815 + 8 [qemu-img] 9,0 11 3 0.000030735 11151 Q WS 312738823 + 1016 [qemu-img] 9,0 11 4 0.000032482 11151 Q WS 312739839 + 8 [qemu-img] 9,0 11 5 0.000041379 11151 Q WS 312739847 + 1016 [qemu-img] 9,0 11 6 0.000042818 11151 Q WS 312740863 + 8 [qemu-img] 9,0 11 7 0.000051236 11151 Q WS 312740871 + 1017 [qemu-img] 9,0 5 1 0.169071519 11151 Q WS 312741888 + 1023 [qemu-img] After the patch the pattern becomes normal: 9,0 6 1 0.000000000 12422 Q WS 314834944 + 1024 [qemu-img] 9,0 6 2 0.000038527 12422 Q WS 314835968 + 1024 [qemu-img] 9,0 6 3 0.000072849 12422 Q WS 314836992 + 1024 [qemu-img] 9,0 6 4 0.000106276 12422 Q WS 314838016 + 1024 [qemu-img] and the amount of requests sent to disk (could be calculated counting number of lines in the output of blktrace) is reduced about 2 times. Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest does his job well and real requests comes properly aligned (to page). Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431441056-26198-3-git-send-email-den@openvz.org CC: Paolo Bonzini <pbonzini@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
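The MAX(page size, 4k) alignment rule from the bounce buffer commit above could look roughly like the following sketch on a POSIX system. The helper names `bounce_buffer_alignment` and `alloc_bounce_buffer` are hypothetical and are not QEMU functions; QEMU has its own allocation helpers, which are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: alignment for bounce buffers, chosen as
 * MAX(page size, 4 KiB) as the commit message describes. */
static size_t bounce_buffer_alignment(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t page_size = page > 0 ? (size_t)page : 4096;
    return page_size > 4096 ? page_size : 4096;
}

/* Hypothetical helper: allocate a buffer with that alignment.
 * Returns NULL on failure; the caller releases it with free(). */
static void *alloc_bounce_buffer(size_t size)
{
    void *buf = NULL;
    if (posix_memalign(&buf, bounce_buffer_alignment(), size) != 0) {
        return NULL;
    }
    return buf;
}
```

A buffer allocated this way satisfies O_DIRECT alignment for 4 KiB writes, which is the case the blktrace output above is measuring.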
Denis V. Lunev	4196d2f030	block: minimal bounce buffer alignment The patch introduces a new concept: minimal memory alignment for bounce buffers. Original so called "optimal" value is actually minimal required value for alignment. It should be used for validation that the IOVec is properly aligned and bounce buffer is not required. Though, from the performance point of view, it would be better if bounce buffer or IOVec allocated by QEMU will be aligned stricter. The patch does not change any alignment value yet. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431441056-26198-2-git-send-email-den@openvz.org CC: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Paolo Bonzini	eaf5fe2dd4	block: return EPERM on writes or discards to read-only devices This is the behavior in the operating system, for example Linux's blkdev_write_iter has the following: if (bdev_read_only(I_BDEV(bd_inode))) return -EPERM; This does not apply to opening a device for read/write, when the device only supports read-only operation. In this case any of EACCES, EPERM or EROFS is acceptable depending on why writing is not possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1431013548-22492-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Denis V. Lunev	ddd2ef2ce8	block/parallels: improve image writing performance further Try to perform IO for the biggest continuous block possible. All blocks absent in the image are accounted in the same type and preallocation is made for all of them at once. The performance for sequential write is increased from 200 Mb/sec to 235 Mb/sec on my SSD HDD. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-28-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	19f5dc1591	block/parallels: optimize linear image expansion Plain image expansion spends a lot of time to update image file size. This seriously affects the performance. The following simple test qemu_img create -f parallels -o cluster_size=64k ./1.hds 64G qemu_io -n -c "write -P 0x11 0 1024M" ./1.hds could be improved if the format driver will pre-allocate some space in the image file with a reasonable chunk. This patch preallocates 128 Mb using bdrv_write_zeroes, which should normally use fallocate() call inside. Fallback to older truncate() could be used as a fallback using image open options thanks to the previous patch. The benefit is around 15%. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Karan <rkagan@parallels.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-27-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	d61790112f	block/parallels: add prealloc-mode and prealloc-size open parameters This is a preparatory commit for tweaks in Parallels image expansion. The idea is that enlarge via truncate by one data block is slow. It would be much better to use fallocate via bdrv_write_zeroes and expand by some significant amount at once. Original idea with sequential file writing to the end of the file without fallocate/truncate would be slower than this approach if the image is expanded with several operations: - each image expanding means file metadata update, i.e. filesystem journal write. Truncate/write to newly truncated space update file metadata twice thus truncate removal helps. With fallocate call inside bdrv_write_zeroes file metadata is updated only once and this should happen infrequently thus this approach is the best one for the image expansion - tail writes are ordered, i.e. the guest IO queue could not be sent immediately to the host introducing additional IO delays This patch just adds proper parameters into BDRVParallelsState and performs options parsing in parallels_open. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-26-git-send-email-den@openvz.org CC: Roman Kagan <rkagan@parallels.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	0d31c7c200	block/parallels: delay writing to BAT till bdrv_co_flush_to_os The idea is that we do not need to immediately sync BAT to the image as from the guest point of view there is a possibility that IO is lost even in the physical controller until flush command was finished. bdrv_co_flush_to_os is exactly the right place for this purpose. Technically the patch uses loaded BAT data as a cache and performs actual on-disk metadata updates in parallels_co_flush_to_os callback. This patch speeds up qemu-img create -f parallels -o cluster_size=64k ./1.hds 64G qemu-io -f parallels -c "write -P 0x11 0 1024k" 1.hds writing from 50-60 Mb/sec to 80-90 Mb/sec on rotational media and from 160 Mb/sec to 190 Mb/sec on SSD disk. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-25-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	2d68e22e94	block/parallels: create bat_entry_off helper calculate offset of the BAT entry in the image file. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-24-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
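The bat_entry_off helper named in the commit above computes where a BAT entry lives in the image file. A minimal sketch of such a calculation follows. The layout is an assumption made for illustration (a 64-byte header immediately followed by 32-bit entries, loosely suggested by the "magic constants 4, 64" commit further down this log), and the body is not copied from QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: a fixed 64-byte image header followed
 * directly by an array of 32-bit BAT entries. */
enum { HEADER_SIZE = 64 };

/* Sketch of a bat_entry_off-style helper: byte offset of BAT entry
 * number idx from the start of the image file. */
static uint32_t bat_entry_off(uint32_t idx)
{
    /* Guard against wrapping the 32-bit result. */
    assert(idx <= (UINT32_MAX - HEADER_SIZE) / sizeof(uint32_t));
    return HEADER_SIZE + (uint32_t)sizeof(uint32_t) * idx;
}
```

Under these assumptions entry 0 sits at offset 64 and entry 3 at offset 76.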
Denis V. Lunev	6953d92078	block/parallels: improve image reading performance Try to perform IO for the biggest continuous block possible. The performance for sequential read is increased from 220 Mb/sec to 360 Mb/sec for continuous image on my SSD HDD. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-23-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	6dd6b9f144	block/parallels: implement incorrect close detection The software driver must set inuse field in Parallels header to 0x746F6E59 when the image is opened in read-write mode. The presence of this magic in the header on open forces image consistency check. There is an unfortunate trick here. We can not check for inuse in parallels_check as this will happen too late. It is possible to do that for simple check, but during the fix this would always report an error as the image was opened in BDRV_O_RDWR mode. Thus we save the flag in BDRVParallelsState for this. On the other hand, nothing should be done to clear inuse in parallels_check. Generic close will do the job right. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-21-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	49ad646731	block/parallels: implement parallels_check method of block driver The check is very simple at the moment. It calculates necessary stats and fix only the following errors: - space leak at the end of the image. This would happens due to preallocation - clusters outside the image are zeroed. Nothing else could be done here Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-20-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	23d6bd3bd1	block/parallels: move parallels_open/probe to the very end of the file This will help to avoid forward declarations for upcoming parallels_check Some very obvious formatting fixes were made to the moved code to make checkpatch happy. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-19-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	9eae9cca95	block/parallels: read parallels image header and BAT into single buffer This metadata cache would allow to properly batch BAT updates to disk in next patches. These updates will be properly aligned to avoid read-modify-write transactions on block level. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-18-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	dd97cdc064	block/parallels: keep BAT bitmap data in little endian in memory This will allow to use this data as buffer to BAT update directly without any intermediate buffers. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-17-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	555cc9d9fc	block/parallels: create bat2sect helper deduplicate copy/paste arithmetic Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-16-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	369f7de9d5	block/parallels: rename catalog_ names to bat_ BAT means 'block allocation table'. Thus this name is clean and shorter on writing. Some obvious formatting fixes in the old code were made to make checkpatch happy. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-15-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	cc5690f20f	parallels: change copyright information in the image header Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-14-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	74cf6c5026	block/parallels: support parallels image creation Do not even care to create WithoutFreeSpace image, it is obsolete. Always create WithouFreSpacExt one. The code also does not spend a lot of efforts to fill cylinders and heads fields, they are not used actually in a real life neither in QEMU nor in Parallels products. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-12-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:31 +01:00
Denis V. Lunev	5a41e1fa95	block/parallels: _co_writev callback for Parallels format Support write on Parallels images. The code is almost the same as one in the previous patch implemented scatter-gather IO for read. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-10-git-send-email-den@openvz.org CC: Roman Kagan <rkagan@parallels.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:31 +01:00
Denis V. Lunev	d0e61ce56d	block/parallels: mark parallels format driver as zero inited From the guest point of view unallocated blocks are zeroed. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-9-git-send-email-den@openvz.org CC: Roman Kagan <rkagan@parallels.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:31 +01:00
Denis V. Lunev	912f31281a	block/parallels: replace magic constants 4, 64 with proper sizeofs simple purification.. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-8-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:31 +01:00
Denis V. Lunev	481fb9cf18	block/parallels: provide _co_readv routine for parallels format driver Main approach is taken from qcow2_co_readv. The patch drops coroutine lock for the duration of IO operation and performs normal scatter-gather IO using standard QEMU backend. The patch also adds comment about locking considerations in the driver. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-7-git-send-email-den@openvz.org CC: Roman Kagan <rkagan@parallels.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:31 +01:00
Roman Kagan	dd3bed16ff	block/parallels: add get_block_status Implement VFS method for get_block_status to Parallels format driver. qemu_co_mutex_lock is not necessary yet (the driver is read-only) but will be necessary very soon when write will be supported. Signed-off-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org> Message-id: 1430207220-24458-6-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:31 +01:00
Roman Kagan	9de9da17d8	block/parallels: read up to cluster end in one go Teach parallels_read() to do reads in coarser granularity than just a single sector: if requested, read up to the cluster end in one go. Signed-off-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1430207220-24458-5-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:31 +01:00
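Reading "up to the cluster end in one go", as the commit above puts it, comes down to clamping the request at the next cluster boundary. The sketch below shows that arithmetic only. `sectors_to_cluster_end` is a hypothetical name and the parameter types are assumptions; it is not the code from parallels_read().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many of the nb_sectors requested at
 * sector_num can be read without crossing into the next cluster. */
static int sectors_to_cluster_end(int64_t sector_num, int nb_sectors,
                                  int cluster_sectors)
{
    assert(sector_num >= 0 && cluster_sectors > 0);

    /* Sectors remaining in the cluster that contains sector_num. */
    int left = cluster_sectors - (int)(sector_num % cluster_sectors);
    return nb_sectors < left ? nb_sectors : left;
}
```

For example, with 128-sector clusters a 500-sector request starting at sector 130 is clamped to 126 sectors, so the caller loops and issues the rest cluster by cluster.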
Roman Kagan	2944256997	block/parallels: switch to bdrv_read Switch the .bdrv_read method implementation from using bdrv_pread() to bdrv_read() on the underlying file, since the latter is subject to i/o throttling while the former is not. Besides, since bdrv_read() operates in sectors rather than bytes, adjust the helper functions to do so too. Signed-off-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org> Message-id: 1430207220-24458-4-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:31 +01:00
Denis V. Lunev	0789890467	block/parallels: rename parallels_header to ParallelsHeader this follows QEMU coding convention Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1430207220-24458-3-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:31 +01:00
Peter Maydell	704eb1c099	QMP pull request -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVUKtnAAoJEDu+7JDiTtWnOAMP/27vMOKfD4Z+kIwHKRfmZjyb 4ACjEfhndM3i3oFuOz7AYoe5vuwYbIMw7H2yeXsxXf8+88PHX4yyQ/xG1KaX0Fg/ DyO2ndDL2acRfIn/eY7K+7E4HbNONFNiCsnspdFoq7ytxpVPpanc6nCQ5//YeFPo ZA/McBl8WIfhM5uTn56q14qCiGGcz0tbQ4THpSfALlBwPfxcYzpVEmO5VN9Smbef QJ4Fy9nusydia+1fsuzm3Kgm2m0+Y2+J3o/IJFE9RmQk2UK6xEe5Vjzi10biYxVW vIc2IiDt6nAj+kjsM0GPPkwAJBojbIg9m35/tvftef/5w/UWZoqovGmx5fEAF0h5 LVA3WwuadG67LHxAS2O9qaefwSU1IcZ5ti+1YAhdwwaWs3DyYzNZ5ly0l6yN6uwX Wieyme8WAZKMqwpUmxkIGlJa5x+pW1PQB3vyf9Cyjj2tWnI7HIIoncKZ4ks70YZm MxFUefUgDtztmcknm+u3t+bN/a9w45QHRAXxCNYvaGJNwwnBrM6MPMLB7DqELUSr tdfOgkcnKZsjNKLDyINDkp7Rdepz9yn1nYPRj3ImtDdq/Bceh9CCyGG3YGot2BUR VJj4U9ouyYHKCZO9gfNsvowJDHiw0swcpU0/hZhL71tbc9CSl0y3zGm5eK2BQ9Uc Xsy9M7Oo2ou0OoT/rYUw =MGRA -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/qmp-unstable/tags/for-upstream' into staging QMP pull request # gpg: Signature made Mon May 11 14:15:19 2015 BST using RSA key ID E24ED5A7 # gpg: Good signature from "Luiz Capitulino <lcapitulino@gmail.com>" * remotes/qmp-unstable/tags/for-upstream: scripts: qmp-shell: Add verbose flag scripts: qmp-shell: add transaction subshell scripts: qmp-shell: Expand support for QMP expressions scripts: qmp-shell: refactor helpers MAINTAINERS: New maintainer for QMP and QAPI json-parser: Accept 'null' in QMP qobject: Add a special null QObject qobject: Clean up around qtype_code QJSON: Use OBJECT_CHECK Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-05-12 09:01:51 +01:00
Markus Armbruster	a7c3181628	qobject: Clean up around qtype_code QTYPE_NONE is a sentinel value. No QObject has this type code. Document it properly. Fix dump_qobject() to abort() on QTYPE_NONE, just like for any other invalid type code. Fix to_json() to abort() on all invalid type codes, not just QTYPE_MAX. Clean up Property member qtype's type: it's a qtype_code. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-05-11 08:59:07 -04:00
zhanghailiang	973a8529c5	sheepdog: fix resource leak with sd_snapshot_create Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-05-08 14:11:10 +03:00
Stefan Hajnoczi	61007b316c	block: move I/O request processing to block/io.c The block.c file has grown to over 6000 lines. It is time to split this file so there are fewer conflicts and the code is easier to maintain. Extract I/O request processing code: * Read * Write * Zero writes and making the image empty * Flush * Discard * ioctl * Tracked requests and queuing * Throttling and copy-on-read * Block status and allocated functions * Refreshing block limits * Reading/writing vmstate * qemu_blockalign() and friends The patch simply moves code from block.c into block/io.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:17 +02:00
Fam Zheng	7237aecd7e	vmdk: Widen before shifting 32 bit header field Coverity spotted this. The field is 32 bits, but it's possible to overflow in 32 bit left shift. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:11 +02:00
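The "widen before shifting" pattern from the commit above can be shown in isolation. The sketch assumes a 32-bit sector count converted to bytes with a shift by 9 (512-byte sectors). Both the shift amount and the function names are illustrative assumptions, since the commit message does not say which field or shift is involved.

```c
#include <stdint.h>

/* Narrow version: the shift happens in 32 bits, so high bits are lost
 * before the result is converted to 64 bits. */
static uint64_t sectors_to_bytes_narrow(uint32_t sectors)
{
    return sectors << 9;
}

/* Widened version: convert to 64 bits first, then shift. */
static uint64_t sectors_to_bytes(uint32_t sectors)
{
    return (uint64_t)sectors << 9;
}
```

For 0x00800000 sectors the narrow version wraps to 0, while the widened version gives 0x100000000 bytes (4 GiB).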
Michael Tokarev	5505e8b76f	block/dmg: make it modular dmg can optionally utilize libbz2, make it modular Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:11 +02:00
Max Reitz	001c95b740	block/mirror: Always call block_job_sleep_ns() The mirror block job is trying to take a clever shortcut if delay_ns is 0 and skips block_job_sleep_ns() in that case. But that function must be called in every block job iteration, because otherwise it is for example impossible to pause the job. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:11 +02:00
John Snow	20dca81075	block: Ensure consistent bitmap function prototypes We often don't need the BlockDriverState for functions that operate on bitmaps. Remove it. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-15-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
John Snow	d58d845397	qmp: Add support of "dirty-bitmap" sync mode for drive-backup For "dirty-bitmap" sync mode, the block job will iterate through the given dirty bitmap to decide if a sector needs backup (backup all the dirty clusters and skip clean ones), just as allocation conditions of "top" sync mode. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-11-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
John Snow	341ebc2f81	qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove The new command pair is added to manage a user created dirty bitmap. The dirty bitmap's name is mandatory and must be unique for the same device, but different devices can have bitmaps with the same names. The granularity is an optional field. If it is not specified, we will choose a default granularity based on the cluster size if available, clamped to between 4K and 64K to mirror how the 'mirror' code was already choosing granularity. If we do not have cluster size info available, we choose 64K. This code has been factored out into a helper shared with block/mirror. This patch also introduces the 'block_dirty_bitmap_lookup' helper, which takes a device name and a dirty bitmap name and validates the lookup, returning NULL and setting errp if there is a problem with either field. This helper will be re-used in future patches in this series. The types added to block-core.json will be re-used in future patches in this series, see: 'qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}' Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-5-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
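The default granularity rule described in the commit above (cluster size clamped to the range 4K to 64K, or 64K when no cluster size is known) can be sketched as follows. `default_granularity` is a hypothetical name, and treating a non-positive cluster size as "not available" is an assumption of this sketch rather than something the commit message states.

```c
#include <stdint.h>

/* Hypothetical helper: pick a dirty bitmap granularity in bytes.
 * cluster_size <= 0 is taken to mean "cluster size not available". */
static uint32_t default_granularity(int64_t cluster_size)
{
    const uint32_t min_gran = 4096;    /* 4K lower bound */
    const uint32_t max_gran = 65536;   /* 64K upper bound and fallback */

    if (cluster_size <= 0) {
        return max_gran;
    }
    if (cluster_size < min_gran) {
        return min_gran;
    }
    if (cluster_size > max_gran) {
        return max_gran;
    }
    return (uint32_t)cluster_size;
}
```

A 512-byte cluster size is raised to 4K, a 1 MiB cluster size is lowered to 64K, and values in between pass through unchanged.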
John Snow	5fba6c0e50	qmp: Ensure consistent granularity type We treat this field with a variety of different types everywhere in the code. Now it's just uint32_t. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-4-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Fam Zheng	0db6e54a8a	qapi: Add optional field "name" to block dirty bitmap This field will be set for user created dirty bitmap. Also pass in an error pointer to bdrv_create_dirty_bitmap, so when a name is already taken on this BDS, it can report an error message. This is not global check, two BDSes can have dirty bitmap with a common name. Implemented bdrv_find_dirty_bitmap to find a dirty bitmap by name, will be used later when other QMP commands want to reference dirty bitmap by name. Add bdrv_dirty_bitmap_make_anon. This unsets the name of dirty bitmap. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-3-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	9eac3622a2	block/iscsi: use the allocationmap also if cache.direct=on the allocationmap has only a hint character. The driver always double checks that blocks marked unallocated in the cache are still unallocated before taking the fast path and return zeroes. So using the allocationmap is migration safe and can also be enabled with cache.direct=on. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-10-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	03e40fef46	block/iscsi: bump year in copyright notice Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-9-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	e380aff831	block/iscsi: handle SCSI_STATUS_TASK_SET_FULL a target may issue a SCSI_STATUS_TASK_SET_FULL status if there is more than one "BUSY" command queued already. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-8-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	59dd0a22ca	block/iscsi: increase retry count The idea is that a command is retried in a BUSY condition up to a time of approx. 60 seconds before it is failed. This should be far higher than any command timeout in the guest. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-7-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	73b5394e2e	block/iscsi: optimize WRITE10/16 if cache.writeback is not set SCSI allows to tell the target to not return from a write command if the data is not written to the disk. Use this so called FUA bit if it is supported to optimize WRITE commands if writeback is not allowed. In this case qemu always issues a WRITE followed by a FLUSH. This is 2 round trip times. If we set the FUA bit we can ignore the following FLUSH. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-6-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	752ce45150	block/iscsi: store DPOFUA bit from the modesense command Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-5-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	7191f2080c	block/iscsi: rename iscsi_write_protected and let it return void Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-4-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	0a386e4852	block/iscsi: change all iscsilun properties from uint8_t to bool Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-3-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	20474e9aa0	block/iscsi: do not forget to logout from target We actually were always impolitely dropping the connection and not cleanly logging out. CC: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-2-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:09 +02:00
Alberto Garcia	d5a8ee60a0	qmp: fill in the image field in BlockDeviceInfo The image field in BlockDeviceInfo is supposed to contain an ImageInfo object. However that is being filled in by bdrv_query_info(), not by bdrv_block_device_info(), which is where BlockDeviceInfo is actually created. Anyone calling bdrv_block_device_info() directly will get a null image field. As a consequence of this, the HMP command 'info block -n -v' crashes QEMU. This patch moves the code that fills in that field from bdrv_query_info() to bdrv_block_device_info(). Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 1429271563-3765-1-git-send-email-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:09 +02:00
Alberto Garcia	dc881b441d	block: add 'node-name' field to BLOCK_IMAGE_CORRUPTED Since this event can occur in nodes that cannot have a device name associated, include also a field with the node name. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 147cec5b3594f4bec0cb41c98afe5fcbfb67567c.1428485266.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:09 +02:00
Alberto Garcia	81e5f78a9f	block: use bdrv_get_device_or_node_name() in error messages There are several error messages that identify a BlockDriverState by its device name. However those errors can be produced in nodes that don't have a device name associated. In those cases we should use bdrv_get_device_or_node_name() to fall back to the node name and produce a more meaningful message. The messages are also updated to use the more generic term 'node' instead of 'device'. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 9823a1f0514fdb0692e92868661c38a9e00a12d6.1428485266.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:09 +02:00
Alberto Garcia	9b2aa84f87	block: add bdrv_get_device_or_node_name() This function gets the device name associated with a BlockDriverState, or its node name if the device name is empty. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 4fa30aa8d61d9052ce266fd5429a59a14e941255.1428485266.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:09 +02:00
Fam Zheng	a7282330c0	blockjob: Update function name in comments Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1428069921-2957-5-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:09 +02:00
Fam Zheng	751ebd76e6	blockjob: Allow nested pause This patch changes block_job_pause to increase the pause counter and block_job_resume to decrease it. The counter will allow calling block_job_pause/block_job_resume unconditionally on a job when we need to suspend the IO temporarily. From now on, each block_job_resume must be paired with a block_job_pause to keep the counter balanced. The user pause from QMP or HMP will only trigger block_job_pause once until it's resumed, this is achieved by adding a user_paused flag in BlockJob. One occurrence of block_job_resume in mirror_complete is replaced with block_job_enter which does what is necessary. In block_job_cancel, the cancel flag is good enough to instruct coroutines to quit loop, so use block_job_enter to replace the unpaired block_job_resume. Upon block job IO error, user is notified about the entering to the pause state, so this pause belongs to user pause, set the flag accordingly and expect a matching QMP resume. [Extended doc comments as suggested by Paolo Bonzini <pbonzini@redhat.com>. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1428069921-2957-2-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:09 +02:00
Fam Zheng	1c2b49a172	block/null: Support reopen Reopen is used in block-commit. With this always-succeed operation, it is now possible to test committing to a null drive, by specifying "null-aio://" or "null-co://" as the backing image when creating the qcow2 image. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427852740-24315-3-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:09 +02:00
Fam Zheng	e5e51dd3af	block/null: Latency simulation by adding new option "latency-ns" Aio context switch should just work because the requests will be drained, so the scheduled timer(s) on the old context will be freed. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427852740-24315-2-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:09 +02:00
Kevin Wolf	d1a126c53d	vhdx: Fix zero-fill iov length Fix the length of the zero-fill for the back, which was accidentally using the same value as for the front. This is caught by qemu-iotests 033. For consistency, change the code for the front as well to use the length stored in the iov (it is the same value, copied four lines above). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Jeff Cody <jcody@redhat.com>	2015-04-28 15:36:09 +02:00
Kevin Wolf	8eedfbd4a5	blkdebug: Add bdrv_truncate() This is, amongst others, required for qemu-iotests 033 to run as intended on VHDX, which uses explicit bdrv_truncate() calls to bs->file when allocating new blocks. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2015-04-28 15:36:09 +02:00
Kevin Wolf	0df89e8e6f	block-backend: Expose bdrv_write_zeroes() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2015-04-28 15:36:08 +02:00
Stefan Hajnoczi	786a4ea82e	Convert (ffs(val) - 1) to ctz32(val) This commit was generated mechanically by coccinelle from the following semantic patch: @@ expression val; @@ - (ffs(val) - 1) + ctz32(val) The call sites have been audited to ensure the ffs(0) - 1 == -1 case never occurs (due to input validation, asserts, etc). Therefore we don't need to worry about the fact that ctz32(0) == 32. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427124571-28598-5-git-send-email-stefanha@redhat.com Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:08 +02:00
Yi Wang	407bc15033	savevm: create snapshot failed when id_str already exists The command "virsh create" will fail in the following condition: vm has two disks: vda and vdb. vda has snapshot s1 with id "1", vdb doesn't have s1 but has snapshot s2 with id "1". When we want to run command "virsh create s1", del_existing_snapshots() only deletes s1 in vda, and bdrv_snapshot_create() tries to create vdb's snapshot s1 with id "1", but id "1" already exists in vdb with name "s2"! The simplest fix is to call find_new_snapshot_id() unconditionally. Signed-off-by: Yi Wang <up2wing@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:08 +02:00
Peter Lieven	05b685fbab	block/iscsi: handle zero events from iscsi_which_events newer libiscsi versions may return zero events from iscsi_which_events. In this case iscsi_service will return immediately without any progress. To avoid busy waiting for iscsi_which_events to change we deregister all read and write handlers in this case and schedule a timer to periodically check iscsi_which_events for changed events. Next libiscsi version will introduce async reconnects and zero events are returned while libiscsi is waiting for a reconnect retry. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1428437295-29577-1-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-04-09 10:31:45 +01:00
Kevin Wolf	e4603fe139	qcow2: Fix header update with overridden backing file In recent qemu versions, it is possible to override the backing file name and format that is stored in the image file with values given at runtime. In such cases, the temporary override could end up in the image header if the qcow2 header was updated, while obviously correct behaviour would be to leave the on-disk backing file path/format unchanged. Fix this and add a test case for it. Reported-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1428411796-2852-1-git-send-email-kwolf@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-04-08 10:29:20 +01:00
Peter Maydell	3e5f6234b4	Block patches for 2.3.0-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJVCuU+AAoJEH8JsnLIjy/W1b0P/0RYYqkXvIvr5pymOTkA256q dPWOqkKB0LHcB1+iOdWDb7TsAfJftZ2MFvN+BQ+mBC3knheLzWlErQLZQsEJh2/b x525GKatRzc7LSQ4FPEVJ9MNDILKUQsuIza6/z3NsmmYSFSUNCzyQOS/LAO6ngjs cmQ0aLudKy41vGTE8mS6rZOjHf8uumdRPhG5clr5V80zPMg93jojYJ2ENCl6sPuh Y2OtVu6HLyX2ExKFmt4JNltxjXnUki+sEBUnCcj8tvJNMGy82IpOdO8w3W9cs2zZ cb7XUVKv6IPUycEs4IsGpHUfyIaD5sVY5ueKGGZv35kmFYdItJ9AukHc/5tlVspR kFYEGOjZRqRXboeH8VJDJWHBlIfKeoE6iwCBW62D0Bzbab8DMWbzif32b3K5dnve OleFjFS0mysUfuxoIqF12SUwZj+WzW1CaxOTAGALIrfYfOD5ZWOHiLeXbQ/fcaAW quz+/9B9CzIAbTL31RxcOXfvyuUWlWkZyz6GYxaLhzLF580Uz/qvBqbMrXDQWUzs x6udW7aylqBalSXpc29ORv5cIpA4p+IBzFpfVtwV9+Qa86nTfJJSq4u2oet/vtyb KiZ+AcKUiaChwheeQHSCcqjQ5dzhJvl/6UvwCLORLYuNExrE+1Umn/Cz33eDibxi kUW1xvRYe/pkSGdntlz0 =Hs5n -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block patches for 2.3.0-rc1 # gpg: Signature made Thu Mar 19 15:03:26 2015 GMT using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: block: Fix blockdev-backup not to use funky error class raw-posix: Deprecate aio=threads fallback without O_DIRECT raw-posix: Deprecate host floppy passthrough Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-03-19 17:47:08 +00:00
Peter Maydell	7a9a5e72e8	trivial patches for 2015-03-19 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJVCo+SAAoJEL7lnXSkw9fbdm8H/3id64AYsZ7kSR8QdXfa/kr6 PObw3r3FZNyBwENOe6cf+8kZspFENN9I2iX1yej1MXe3W0AphTCZFrjCSh3QpFxv GL63AGdaEKdO/zQR9H/hhvTBHzi1Uo4UIIR/18pIw/gUrpxKfdNUYi8ekgWSgKvA tlp4iBZT0I6K7rxq1Z1kWiTJ+Bk5qIk1YmGW8FirOGfqKE/zq94ogIclVgiFq+0X pNu3nvRkLc88/h8bafMuSgjyFpAbxaQubx75kUvg7folzWPptlG0RcKCsEjtTfOh LImAO8NCxElh3ZYXaoFTuk0ryfkmxJKl++Qw6Jv6upTWCjL3eDanKPIll94DzHM= =BLfX -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2015-03-19' into staging trivial patches for 2015-03-19 # gpg: Signature made Thu Mar 19 08:57:54 2015 GMT using RSA key ID A4C3D7DB # gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" # gpg: aka "Michael Tokarev <mjt@corpit.ru>" # gpg: aka "Michael Tokarev <mjt@debian.org>" * remotes/mjt/tags/pull-trivial-patches-2015-03-19: (24 commits) qga/commands-posix: Fix resource leak elf-loader: Add missing error handling for call of lseek elf-loader: Fix truncation warning from coverity hmp: Fix texinfo documentation Fix typos in comments qtest/ahci: Fix a bit mask expression vl: fix resource leak with monitor_fdset_add_fd smbios: add max speed comdline option for type-17 (meory device) structure pc-dimm: Add description for device list. configure: enable kvm on x32 error: Replace error_report() & error_free() with error_report_err() arm: fix memory leak qmp: Drop unused .user_print from command definitions hmp: Fix definition of command quit target-moxie: Fix warnings from Sparse (one-bit signed bitfield) block/qapi: Fix Sparse warning Fix remaining warnings from Sparse (void return) qom: Fix warning from Sparse target-mips: Fix warning from Sparse arm/nseries: Fix warnings from Sparse ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-03-19 14:10:20 +00:00
Kevin Wolf	965182549c	raw-posix: Deprecate aio=threads fallback without O_DIRECT Currently, if the user requests aio=native, but forgets to choose a cache mode that sets O_DIRECT, that request is silently ignored and raw falls back to aio=threads. Deprecate that behaviour so we can make it an error in future qemu versions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>	2015-03-19 12:30:56 +01:00
Markus Armbruster	92a539d22e	raw-posix: Deprecate host floppy passthrough Raise your hand if you have a physical floppy drive in a computer you've powered on in 2015. Okay, I see we got a few weirdos in the audience. That's okay, weirdos are welcome here. Kidding aside, media change detection doesn't fully work, isn't going to be fixed, and floppy passthrough just isn't earning its keep anymore. Deprecate block driver host_floppy now, so we can drop it after a grace period. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-19 11:43:02 +01:00
Stefan Weil	2c20fa2cc2	block/qapi: Fix Sparse warning Sparse reports this warning: block/qapi.c:417:47: warning: too long initializer-string for array of char(no space for nul char) Replacing the string by an array of characters fixes this warning. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-03-19 11:11:55 +03:00
Max Reitz	3f4726596d	nbd: Set block size to BDRV_SECTOR_SIZE Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <1424887718-10800-13-git-send-email-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-03-18 12:07:01 +01:00
Max Reitz	2b1f13b996	nbd: Fix nbd_establish_connection()'s return value unix_connect_opts() and inet_connect_opts() do not necessarily set errno (if at all); therefore, nbd_establish_connection() should not literally return -errno on error. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <1424887718-10800-4-git-send-email-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-03-18 12:05:38 +01:00
Peter Lieven	304ee9174f	block/vpc: remove disabled code from get_sector_offset The code to check the bitmap for the allocation status of each sector has been "disabled by reason" ever since the vpc driver existed. The reason might be that we might end up reading sector by sector in vpc_read if we really used it. This would be a performance disaster. The current code would furthermore not work if the disabled parts get reactivated since vpc_read and vpc_write only use get_sector_offset to check the allocation status of the first sector of a read/write operation. This might lead to sectors incorrectly treated as zero in vpc_read and to sectors getting allocated twice in vpc_write. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1425379316-19639-6-git-send-email-pl@kamp.de Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2015-03-16 12:10:30 -04:00
Peter Lieven	03671ded30	block/vpc: rename footer->size -> footer->current_size the field is named current size in the spec. Name it accordingly. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1425379316-19639-5-git-send-email-pl@kamp.de Signed-off-by: Max Reitz <mreitz@redhat.com>	2015-03-16 12:10:30 -04:00
Peter Lieven	690cbb095a	block/vpc: make calculate_geometry spec conform The VHD spec [1] allows for total_sectors of 65535 x 16 x 255 (~127GB) represented by a CHS geometry. If total_sectors is greater than 65535 x 16 x 255 this geometry is set as a maximum. Qemu, Hyper-V and disk2vhd use this special geometry as an indicator to use the image current size from the footer as disk size. This patch changes vpc_create to effectively calculate a CxHxS geometry for the given image size if possible while rounding up if necessary. If the image size is too big to be represented in CHS we set the maximum and write the exact requested image size into the footer. This partly reverts commit `258d2edb`, but leaves support for >127G disks intact. [1] http://download.microsoft.com/download/f/f/e/ffef50a5-07dd-4cf8-aaa3-442c0673a029/Virtual%20Hard%20Disk%20Format%20Spec_10_18_06.doc Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1425379316-19639-4-git-send-email-pl@kamp.de Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2015-03-16 12:10:30 -04:00
Kevin Wolf	0444dceee4	vpc: Ignore geometry for large images The CHS calculation as done per the VHD spec imposes a maximum image size of ~127 GB. Real VHD images exist that are larger than that. Apparently there are two separate non-standard ways to achieve this: You could use more heads than the spec does - this is the option that qemu-img create chooses. However, other images exist where the geometry is set to the maximum (65535/16/255), but the actual image size is larger. Until now, such images are truncated at 127 GB when opening them with qemu. This patch changes the vpc driver to ignore geometry in this case and only trust the size field in the header. Signed-off-by: Kevin Wolf <kwolf@redhat.com> [PL: Fixed maximum geometry in the commit msg] Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1425379316-19639-3-git-send-email-pl@kamp.de Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2015-03-16 12:10:30 -04:00
Peter Lieven	2ec711dcd4	block/vpc: optimize vpc_co_get_block_status *pnum can't be greater than s->block_size / BDRV_SECTOR_SIZE for allocated sectors since there is always a bitmap in between. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1425379316-19639-2-git-send-email-pl@kamp.de Signed-off-by: Max Reitz <mreitz@redhat.com>	2015-03-16 12:10:30 -04:00
Max Reitz	14a58a4e0c	qcow2: Respect new_block in alloc_refcount_block() When choosing a new place for the refcount table, alloc_refcount_block() tries to infer the number of clusters used so far from its argument cluster_index (which comes from the idea that if any cluster with an index greater than cluster_index was in use, the refcount table would have to be big enough already to describe cluster_index). However, there is a cluster that may be at or after cluster_index, and which is not covered by the refcount structures, and that is the new refcount block new_block. Therefore, it should be taken into account for the blocks_used calculation. Also, because new_block already describes (or is intended to describe) cluster_index, we may not put the new refcount structures there. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 1423598552-24301-2-git-send-email-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2015-03-16 12:10:30 -04:00
Markus Armbruster	6ec46ad541	block: Fix block-set-write-threshold not to use funky error class Error classes are a leftover from the days of "rich" error objects. New code should always use ERROR_CLASS_GENERIC_ERROR. Commit `e246211` added a use of ERROR_CLASS_DEVICE_NOT_FOUND. Replace it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-16 17:07:25 +01:00
Wen Congyang	87b86e7ef2	qcow2: fix the macro QCOW_MAX_L1_SIZE's use QCOW_MAX_L1_SIZE's unit is byte, and l1_size's unit is l1 table entry size(8 bytes). Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Message-id: 54FFB0F1.5010307@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-03-12 17:41:23 +00:00
Peter Maydell	23a7a28796	- scsi: improvements to error reporting and conversion to realize, Coverity/sparse fix for iscsi driver - RCU fallout: fix -daemonize and s390x system emulation - KVM: kvm_stat improvements and new man page - x86: SYSRET fix for VxWorks -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJU/sUFAAoJEL/70l94x66D1JwIAJ28Lan2DQwi+xHvNxF8zW6n v7eMc04/fepuon0TYmUZC3qbqc00sccEQZQ+yAAauT9epZ/kdSDudDOzG+3F4MuQ /X3crXw2/jrhtWedGq49vFCONX4MKoaoudqK8kOFMe1ImQgkOYeAzOoqeFXyHsFh jINlKTJZB6oKzrZ+SYryY14cO7pvGaIhyqaCC+6GcVihTjm9Yq13lP1lFj7LsVRV aGfd6xH9RSV/mwzvZwD4i3cUWSUaV/wY0NDhAEzDPCUcxX0/nAj3XF1YeJUF30Qd ETaCLo/Nxq2R6POK3c/Zm/FRLvjzZ2caD+q1LcwB/bCYdc2lJ1JDxE/hr48ANv0= =OWXY -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging - scsi: improvements to error reporting and conversion to realize, Coverity/sparse fix for iscsi driver - RCU fallout: fix -daemonize and s390x system emulation - KVM: kvm_stat improvements and new man page - x86: SYSRET fix for VxWorks # gpg: Signature made Tue Mar 10 10:18:45 2015 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: x86: fix SS selector in SYSRET scsi: Convert remaining PCI HBAs to realize() scsi: Improve error reporting for invalid drive property hw: Propagate errors through qdev_prop_set_drive() scsi: Clean up duplicated error in legacy if=scsi code cpus: initialize cpu->memory_dispatch rcu: handle forks safely qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK kvm_stat: add kvm_stat.1 man page kvm_stat: add column headers to text UI iscsi: Fix check for username Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-03-10 18:03:02 +00:00
Peter Maydell	1976058109	Block patches for 2.3 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJU/uuVAAoJEH8JsnLIjy/WULwP/jeARjYkFuG3ahSWpeY0JnTK QCkLF06iSQQUiirXI4H+Tofl8kNVBd/Iinv+LbkF27iWbTiwalmLz7NiyboX8dl+ NJZtCrqp44q7KFbl3g19/jop/zdZ9N5Gxp8BARVUILHQb1y5cXJwrDhBxTmNRDL+ sSZXfomCgKtMP40nGLa0CcNIYKlm8MePJEM2TsMoWv7tYz4CXgBG39EqK6NJluCY kTTMcbdrLbR0imfKOVPutCgV8rhRXJ0oGVD3Q+D3/LFmPG++hoRnWCcDm6ZZ62Hi Ra7u87TBfAUUtiT+vFQJnd7hTpN+stQidsCDBLEY3qPTKYhzm648PHvcEwOAv6YW sjAELF2Rrsbe4vkL3/qgYDusnaPMElrHVEdbKtHofWtg6KctLnYIhusV+qKq1Fpa cRQEbQIZMVFeWN1G9WuYH8RBYrwJqp+/qq7DcnV62lUAdY4e3iO7E3yMLFDwpxku PLl7eofU/ZpnAOrrU2QAQvgXZRqy1ie/Unv8jFwefQkK5mXHoCtkAeBlOM8t4kJf HjkC/hYO7kwPdaz6xK80wpXqYd3vT6jKi7mlJqC5oQQLGJbRigxlMZ16UIAx+IrL NxhnQChp7IP21KMATFbpvYjcJyGMw3ZuVRaUhQBgqQArIomVHvM5WcN9M6S5dsmj vClFOIqjlSbtsmChceWr =hlbC -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block patches for 2.3 # gpg: Signature made Tue Mar 10 13:03:17 2015 GMT using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (73 commits) MAINTAINERS: Add jcody as blockjobs, block devices maintainer iotests: add O_DIRECT alignment probing test block/raw-posix: fix launching with failed disks MAINTAINERS: Add jsnow as IDE maintainer sheepdog: Fix misleading error messages in sd_snapshot_create() Add testcase for scsi-hd devices without drive property scsi-hd: fix property unset case block/vdi: Add locking for parallel requests iotests: Drop vpc from 004's and 104's format list iotests: Remove 006 iotests: Fix 051's reference output virtio-blk: Remove the stale FIXME comment tests: Check QVIRTIO_F_ANY_LAYOUT flag in virtio-blk test libqos: Solve bug in interrupt checking when using MSIX in virtio-pci.c sheepdog: fix confused return values qtest/ahci: add fragmented dma test qtest/ahci: Add PIO and LBA48 tests qtest/ahci: Add DMA test variants libqos/ahci: add ahci command helpers qtest/ahci: Add a macro bootup routine ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-03-10 14:01:22 +00:00
Stefan Hajnoczi	22d182e82b	block/raw-posix: fix launching with failed disks Since commit `c25f53b06e` ("raw: Probe required direct I/O alignment") QEMU has failed to launch if image files produce I/O errors. Previously, QEMU would launch successfully and the guest would see the errors when attempting I/O. This is a regression and may prevent multipath I/O inside the guest, where QEMU must launch and let the guest figure out by itself which disks are online. Tweak the alignment probing code in raw-posix.c to explicitly look for EINVAL on Linux instead of bailing. The kernel refuses misaligned requests with this error code and other error codes can be ignored. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:24 +01:00
Markus Armbruster	27994d5879	sheepdog: Fix misleading error messages in sd_snapshot_create() If do_sd_create() fails, it first reports the error returned, then reports another one with strerror(errno). errno is meaningless at that point. Report just one error combining the valid information from both messages. Reported-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Liu Yuan <namei.unix@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:24 +01:00
Max Reitz	f0ab6f1096	block/vdi: Add locking for parallel requests When allocating a new cluster, the first write to it must be the one doing the allocation, because that one pads its write request to the cluster size; if another write to that cluster is executed before it, that write will be overwritten due to the padding. See https://bugs.launchpad.net/qemu/+bug/1422307 for what can go wrong without this patch. Cc: qemu-stable <qemu-stable@nongnu.org> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:24 +01:00
Liu Yuan	833a7cc36e	sheepdog: fix confused return values These functions mix up -1 and -errno in return values, which might cause trouble for error handling in the call chain. This patch lets them return -errno and adds some comments. Cc: qemu-devel@nongnu.org Cc: Markus Armbruster <armbru@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Liu Yuan <liuyuan@cmss.chinamobile.com> Message-id: 1424231875-7131-1-git-send-email-namei.unix@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:23 +01:00
Ekaterina Tumanova	f0272c4db2	block-backend: Add wrappers for blocksizes and geometry probing Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424087278-49393-5-git-send-email-tumanova@linux.vnet.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:22 +01:00
Ekaterina Tumanova	1a9335e4a9	block: Add driver methods to probe blocksizes and geometry Introduce driver methods for defining disk blocksizes (physical and logical) and hard drive geometry. Methods are only implemented for "host_device". For "raw" devices the driver calls the child's method. For now geometry detection will only work for DASD devices. To check for that, a local check_for_dasd function was introduced. It calls the BIODASDINFO2 ioctl and returns its rc. The blocksizes detection function will probe sizes for DASD devices. Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424087278-49393-4-git-send-email-tumanova@linux.vnet.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:22 +01:00
Ekaterina Tumanova	8a4ed0d1b1	raw-posix: Factor block size detection out of raw_probe_alignment() Put it in new probe_logical_blocksize(). Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424087278-49393-3-git-send-email-tumanova@linux.vnet.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
John Snow	a069e2f137	blkdebug: fix "once" rule Background: The blkdebug scripts are currently engineered so that when a debug event occurs, a prefilter browses a master list of parsed rules for a certain event and adds them to an "active list" of rules to be used for the forthcoming action, provided the events and state numbers match. Then, once the request is received, the last active rule is used to inject an error if certain parameters match. This active list is cleared every time the prefilter injects a new rule for the first time during a debug event. The "once" rule currently causes the error injection, if it is triggered, to only clear the active list. This is insufficient for preventing future injections of the same rule. Remedy: This patch /deletes/ the rule from the list that the prefilter browses, so it is gone for good. In V2, we remove only the rule of interest from the active list instead of allowing the "once" rule to clear the entire list of active rules. Impact: This affects iotests 026. Several ENOSPC tests that used "once" can be seen to have output that shows multiple failure messages. After this patch, the error messages tend to be smaller and less severe, but the injection can still be seen to be working. I have patched the expected output to expect the smaller error messages. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1423257977-25630-1-git-send-email-jsnow@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	06d05fa738	qcow2: Allow creation with refcount order != 4 Add a creation option to qcow2 for setting the refcount order of images to be created, and respect that option's value. This breaks some test outputs, fix them. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	8a17b83cc3	qcow2: Use symbolic macros in qcow2_amend_options qcow2_amend_options() should not compare options against some inline strings but rather use the symbolic macros available for each of the creation options. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	bd4b167f84	qcow2: refcount_order parameter for qcow2_create2 Add a refcount_order parameter to qcow2_create2(), use that value for the image header and for calculating the size required for preallocation. For now, always pass 4. This addition requires changes to the calculation of the file size for the "full" and "falloc" preallocation modes. That in turn is a nice opportunity to add a comment about that calculation not necessarily being exact (and that being intentional). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	b72faf9f78	qcow2: Open images with refcount order != 4 No longer refuse to open images with a different refcount entry width than 16 bits; only reject images with a refcount width larger than 64 bits (which is prohibited by the specification). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	59c0cb7830	qcow2: More helpers for refcount modification Add helper functions for getting and setting refcounts in a refcount array for any possible refcount order, and choose the correct one during refcount initialization. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	7453c96b78	qcow2: Helper function for refcount modification Since refcounts do not always have to be a uint16_t, all refcount blocks and arrays in memory should not have a specific type (thus they become pointers to void) and for accessing them, two helper functions are used (a getter and a setter). Those functions are called indirectly through function pointers in the BDRVQcowState so they may later be exchanged for different refcount orders. With the check and repair functions using this function, the refcount array they are creating will be in big endian byte order; additionally, using realloc_refcount_array() makes the size of this refcount array always cluster-aligned. Both combined allow rebuild_refcount_structure() to drop the bounce buffer which was used to convert parts of the refcount array to big endian byte order and store them on disk. Instead, those parts can now be written directly. [ kwolf: Fixed a build failure on 32 bit and another with old glib ] Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	5fee192efd	qcow2: Helper for refcount array reallocation Add a helper function for reallocating a refcount array, independent of the refcount order. The newly allocated space is zeroed and the function handles failed reallocations gracefully. The helper function will always align the buffer size to a cluster boundary; if storing the refcounts in such an array in big endian byte order, this makes it possible to write parts of the array directly as refcount blocks into the image file. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	0e06528e98	qcow2: Use 64 bits for refcount values Refcounts may have a width of up to 64 bits, so qemu should use the same width to represent refcount values internally. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	2aabe7c7a1	qcow2: Use unsigned addend for update_refcount() update_refcount() and qcow2_update_cluster_refcount() currently take a signed addend. At least one caller passes a value directly derived from an absolute refcount that should be reached ("l2_refcount - 1" in expand_zero_clusters_in_l1()). Therefore, the addend should be unsigned as well; this will be especially important for 64 bit refcounts. Because update_refcount() then no longer knows whether the refcount should be increased or decreased, it now requires an additional flag which specifies exactly that. The same applies to qcow2_update_cluster_refcount(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	7324c10f96	qcow2: Only return status from qcow2_get_refcount Refcounts can theoretically be of type uint64_t; in order to be able to represent the full range, qcow2_get_refcount() cannot use a single variable to represent both all refcount values and also keep some values reserved for errors. One solution would be to add an Error pointer parameter to qcow2_get_refcount(); however, no caller could (currently) pass that error message, so it would have to be emitted immediately and be passed to the next caller by returning -EIO or something similar. Therefore, an Error parameter does not offer any advantages here. The solution applied by this patch is simpler to use. Because no caller would be able to pass the error message, they would have to print it and free it, whereas with this patch the caller only needs to pass the returned integer (which is often a no-op from the code perspective, because that integer will be stored in a variable "ret" which will be returned by the fail path of many callers). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	c6e9d8ae66	qcow2: Do not return new value after refcount update qcow2_update_cluster_refcount() does not have any quick access to the new refcount value, it has to call qcow2_get_refcount(). Some callers do not need that new value at all, others call qcow2_get_refcount() themselves anyway (albeit in a different code path, which can however be easily changed), therefore there is no advantage in making qcow2_update_cluster_refcount() return the new value. Drop it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	0709c5a153	qcow2: Add refcount_bits to format-specific info Add the bit width of every refcount entry to the format-specific information. In contrast to lazy_refcounts and the corrupt flag, this should be always emitted, even for compat=0.10 although it does not support any refcount width other than 16 bits. This is because if a boolean is optional, one normally assumes it to be false when omitted; but if an integer is not specified, it is rather difficult to guess its value. This new field breaks some test outputs, fix them. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:20 +01:00
Max Reitz	346a53df38	qcow2: Add two new fields to BDRVQcowState Add two new fields regarding refcount information (the bit width of every entry and the maximum refcount value) to the BDRVQcowState. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:20 +01:00
Stefan Weil	532cee4184	iscsi: Fix check for username The variable user in struct iscsi_url is a character array, not a pointer. Therefore its address will never be NULL. clang reports this error: block/iscsi.c:1329:20: warning: comparison of array 'iscsi_url->user' not equal to a null pointer is always true [-Wtautological-pointer-compare] Reviewed-by: Peter Lieven <pl@kamp.de> Acked-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <1425719670-5486-1-git-send-email-sw@weilnetz.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-03-10 10:49:25 +01:00
Gonglei	9d0b65e6e8	nbd: fix resource leak Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-03-10 08:15:34 +03:00
Gonglei	eec5eb42f5	block: remove superfluous '\n' around error_report/error_setg Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-03-10 08:15:33 +03:00
Kevin Wolf	20a1f9d071	qcow2: Remove unused struct QCowCreateState The only user went away five years ago with commit `a9420734` ('qcow2: Simplify image creation'). It's about time to remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-03-09 11:12:00 +01:00
Denis V. Lunev	a6dcf097fa	block/raw-posix: fix compilation warning on OSX block/raw-posix.c:947:19: warning: unused variable 's' [-Wunused-variable] BDRVRawState *s = aiocb->bs->opaque; This variable is used only when one of the following macros is defined: CONFIG_XFS, CONFIG_FALLOCATE, CONFIG_FALLOCATE_PUNCH_HOLE or CONFIG_FALLOCATE_ZERO_RANGE. Fortunately, CONFIG_FALLOCATE_PUNCH_HOLE and CONFIG_FALLOCATE_ZERO_RANGE can be defined only along with CONFIG_FALLOCATE. Therefore checking for CONFIG_XFS or CONFIG_FALLOCATE is enough. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Peter Maydell <peter.maydell@linaro.org> CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-09 11:11:59 +01:00
Teruaki Ishizaki	876eb1b0cc	sheepdog: selectable object size support Previously, the qemu block driver for sheepdog used a hard-coded VDI object size. This patch lets users choose the VDI object size. No additional command-line option is needed when starting qemu. However, to create a VDI with a non-default object size using the qemu-img command, specify the object_size option. For example, to create a VDI with an 8 MB object size, use the following command: # qemu-img create -o object_size=8M sheepdog:test1 100M If the option is omitted, the default value of the sheepdog cluster is used for creating the VDI: # qemu-img create sheepdog:test2 100M Signed-off-by: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp> Acked-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-09 11:11:59 +01:00
Kevin Wolf	0cc8488706	vpc: Implement bdrv_co_get_block_status() This implements bdrv_co_get_block_status() for VHD images. This can significantly speed up qemu-img convert operation because only with this function implemented sparseness can be considered. (Before, converting a 1 TB empty image took several minutes for me, now it's instantaneous.) Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2015-03-09 11:11:59 +01:00
Kevin Wolf	3f3f20dcd3	vpc: Fix size in fixed image creation If total_sectors is rounded to match the geometry, total_size needs to be changed as well. Otherwise we end up with an image whose geometry describes a disk larger than the image file, which doesn't end well. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2015-03-09 11:11:59 +01:00
Peter Maydell	3180aadb1f	- more config options - bootdevice, iscsi, virtio-scsi fixes - build system patches for MinGW and config-devices.mak - qemu_mutex_lock_iothread deadlock fixes - another tiny patch from the record/replay series -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJU9DRyAAoJEL/70l94x66D5ZkH/2SPp4rrLIgotyzHTIaMvi+2 0gB7Bks9cDisFyiSgr6dqLp9CV1XMlv/NZl+z+H/7og96qhBWjAKVpG1J/En55bS vanFeWGYjINuQLnhC3pqBi2kmEkzBQSIMJZt9WnDydfQj/6Wgcr6iabOpd8eTjTz rqE/UcV2L1baFPLy/Wky2vg/a5Ug2rj+fqvjRdFB/Zx8yDYLcKYJlI8utSQexamE tUcxr/AqxNOoe6WZD7CCVNmHMHvajoOhWnVY4EgHDg8L3nNSgvDF3AjYfntU6A2y HjkS0ktvQK666oNo+ORRBzLe3s9nCfB1dMK2ZiKKyFfyuYD50d2N3oHKSAIsEJo= =AQjO -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging - more config options - bootdevice, iscsi, virtio-scsi fixes - build system patches for MinGW and config-devices.mak - qemu_mutex_lock_iothread deadlock fixes - another tiny patch from the record/replay series # gpg: Signature made Mon Mar 2 09:59:14 2015 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: cpus: be more paranoid in avoiding deadlocks cpus: fix deadlock and segfault in qemu_mutex_lock_iothread virtio-scsi: Allocate op blocker reason before blocking Makefile.target: binary depends on config-devices Makefile: don't silence mak file test with V=1 Makefile: fix up parallel building under MSYS+MinGW iscsi: Handle write protected case in reopen Give ivshmem its own config option Create specific config option for "platform-bus" Add specific config options for PCI-E bridges bootdevice: fix segment fault when booting guest with '-kernel' and '-initrd' timer: replace time() with QEMU_CLOCK_HOST virtio-scsi-dataplane: Call blk_set_aio_context within BQL block: Forbid bdrv_set_aio_context outside BQL scsi: give device a parent before setting properties Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-03-03 12:07:47 +00:00
Fam Zheng	43ae8fb10c	iscsi: Handle write protected case in reopen Save the write protected flag and check before reopen. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <1424839208-5195-1-git-send-email-famz@redhat.com> [Fixed typo in the name of the new field. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-02-27 18:26:31 +01:00
Markus Armbruster	f43e47dbf6	QemuOpts: Drop qemu_opt_set(), rename qemu_opt_set_err(), fix use qemu_opt_set() is a wrapper around qemu_opt_set_err() that reports the error with qerror_report_err(). Most of its users assume the function can't fail. Make them use qemu_opt_set_err() with &error_abort, so that should the assumption ever break, it'll break noisily. Just two users remain, in util/qemu-config.c. Switch them to qemu_opt_set_err() as well, then rename qemu_opt_set_err() to qemu_opt_set(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2015-02-26 14:49:31 +01:00
Markus Armbruster	39101f2511	QemuOpts: Convert qemu_opt_set_number() to Error, fix its use Return the Error object instead of reporting it with qerror_report_err(). Change callers that assume the function can't fail to pass &error_abort, so that should the assumption ever break, it'll break noisily. Turns out all callers outside its unit test assume that. We could drop the Error ** argument, but that would make the interface less regular, so don't. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2015-02-26 14:47:32 +01:00
Peter Maydell	c5c6d7f81a	Clean up around error_get_pretty(), qerror_report_err() -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJU5GT/AAoJEDhwtADrkYZT6H8QAJSdCnymglYhsJ0L8Pn+mFbw ukAxSBjZ+XJXwSBCjSLB9e2Tb6PJZAbAdQJjmI1Ijb+3cXqjRURErTsp+Caz1pjj Zw4v4whxNedXl+WeZEwX7sU6WlDhMEk51E1NHssd9dyZ/noEqHiw/XzoqimaYlPK nrSTBZ94N+F+Daw1d/cjbRMHHGVSjpVraDEPvZIkC6Mv43dGhSdCT529FXthMpUd OhoaQvEdy/75RqFwd4gbjHzA2qHVVsKdq8EfDdHAlcg2LSGB8zM4LlRmYxMKmy2g ylZLXtm6v7Pm+tYFVdLc7xWnRIh4vFXBHFJ8O9jFXziV4Nkj7s7qXdLJXxYWfRXU KC4/vw9IEkHWWUtn1A69ktyPFjEcnW0ieiEOA7/2FXiH7RARnWTl/YChlQrSgSAM zh+/01UhHvKBkxmkJIWpHzR+70A/GyubvlrcSd0g6L+g1hXEw78aryivCoFTKocl MNTlI7AcaGW2qpSUn5kr99aBdKD1sSdGPbNqqZMOzUekGQHeUuNNrFlvsTibMo5G OikdrgygmoLHBcMCgVykYoHen5lMcz+PS5aGFoGwvMV3DQZAsAwltXGeJSNck143 WuEatwA0PhuA0S/dZMELC27kUdsbvpBUhboHuShz4pvytihWu0HmVAWDeShd9uPB r/WSqvETUcdSOqExGEP2 =g7dZ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2015-02-18' into staging Clean up around error_get_pretty(), qerror_report_err() # gpg: Signature made Wed Feb 18 10:10:07 2015 GMT using RSA key ID EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" * remotes/armbru/tags/pull-error-2015-02-18: qemu-char: Avoid qerror_report_err() outside QMP command handlers qemu-img: Avoid qerror_report_err() outside QMP command handlers vl: Avoid qerror_report_err() outside QMP command handlers tpm: Avoid qerror_report_err() outside QMP command handlers numa: Avoid qerror_report_err() outside QMP command handlers net: Avoid qerror_report_err() outside QMP command handlers monitor: Avoid qerror_report_err() outside QMP command handlers monitor: Clean up around monitor_handle_fd_param() error: Use error_report_err() where appropriate error: New convenience function error_report_err() vhost-scsi: Improve error reporting for invalid vhostfd Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-02-26 07:01:08 +00:00
Peter Maydell	68b459eaa6	hmp: Normalize HMP command handler names -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJU5HCgAAoJEDhwtADrkYZT0gcP/ijfMUqLlOPMagm5ggDCx/HK IFFgcrynQNS6FwTNSIEW04So4q2EMbqwuTEpZ5pe330brGy0U/UgkVmz76BkyoXT 9LcgKwtVytfc/niF4k5nIKXrasNG1DHPrhd+zx/oTvwmC/8r+NqHZoPOjNaOPLCX 18SWJMy57l47XAzVOUoFHEW3mEO5YjF8qo3eRcbUWEWXkRp6wg/d2f9nkiHIAfcB 0XVso0PUJ3jID/WkNqb9JoexTnH5rQSkbeJVZWed8iSAt2cCi+pnE/RjL75M9VF8 3mPh2Zhi1lEV4qsYQH1OY7909RtKIj7EBDd7kuUWBi1oSIEaIn5GjNWBGCmBbPbY 0ZVhGFXFvvtI+tPEK3aqRSlyENReT29oKfEv0LAKoUQFBl+jb7qqBns4cfOF+i26 Tb4cnzqN1rdnlCNemTQATOrr01JAZEkdp3NHq+Bx967ocP3zxfL+pX2Q/3S8aFDs j9Ynq+3FvweeDKeYbHKKscELII1DZcNs1CYJOtJIl+XgzowfgpoTRP7P/e2qFM+z ey5qF8nc3mW8tVSkotMeeseFe9tj1xxIV+CslTRiYqnxHnmq4HgsN3DoDtnyy9De g3U0d9rgBKFPEkAWXg939GXbH2HVUqLkOSy50WGRruP4dzco7BhLyhQimqPchBFj b7P40f6NyWCYDhzJu6+N =Kleh -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2015-02-18' into staging hmp: Normalize HMP command handler names # gpg: Signature made Wed Feb 18 10:59:44 2015 GMT using RSA key ID EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" * remotes/armbru/tags/pull-monitor-2015-02-18: hmp: Name HMP info handler functions hmp_info_SUBCOMMAND() hmp: Name HMP command handler functions hmp_COMMAND() hmp: Clean up declarations for long-gone info handlers Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-02-25 13:14:37 +00:00
Markus Armbruster	3e5a50d64c	hmp: Name HMP command handler functions hmp_COMMAND() Some are called do_COMMAND() (old ones, usually), some hmp_COMMAND(), and sometimes COMMAND pointlessly differs in spelling. Normalize to hmp_COMMAND(), where COMMAND is exactly the command name with '-' replaced by '_'. Exceptions: * do_device_add() and client_migrate_info() not renamed to hmp_device_add(), hmp_client_migrate_info(), because they're also QMP handlers. They still need to be converted to QAPI. * do_memory_dump(), do_physical_memory_dump(), do_ioport_read(), do_ioport_write() renamed to hmp_* instead of hmp_x(), hmp_xp(), hmp_i(), hmp_o(), because those names are too cryptic for my taste. * do_info_help() renamed to hmp_info_help() instead of hmp_info(), because it only covers help. Signed-off-by: Markus Armbruster <armbru@redhat.com>	2015-02-18 11:58:30 +01:00
Markus Armbruster	565f65d271	error: Use error_report_err() where appropriate Coccinelle semantic patch: @@ expression E; @@ - error_report("%s", error_get_pretty(E)); - error_free(E); + error_report_err(E); @@ expression E, S; @@ - error_report("%s", error_get_pretty(E)); + error_report_err(E); ( exit(S); \| abort(); ) Trivial manual touch-ups in block/sheepdog.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2015-02-18 10:51:09 +01:00
Max Reitz	c0191e763b	block: Remove "growable" from BDS Now that request clamping is done in the BlockBackend, the "growable" field can be removed from the BlockDriverState. All BDSs are now treated as being "growable" (that is, they are allowed to grow; they are not necessarily actually able to). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1423162705-32065-16-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-02-16 15:07:19 +00:00
Max Reitz	e7f7d676c1	block: Clamp BlockBackend requests BlockBackend is used as the interface between the block layer and guest devices. It should therefore assure that all requests are clamped to the image size. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1423162705-32065-15-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-02-16 15:07:19 +00:00
Max Reitz	b65a5e12a4	block: Add Error parameter to bdrv_find_protocol() The argument given to bdrv_find_protocol() is just a file name, which makes it difficult for the caller to reconstruct what protocol bdrv_find_protocol() was hoping to find. This patch adds an Error parameter to that function to solve this issue. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1423162705-32065-4-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-02-16 15:07:18 +00:00
Max Reitz	ca49a4fdb3	block: Add blk_new_open() blk_new_with_bs() creates a BlockBackend with an empty BlockDriverState attached to it. Empty BDSs are not nice, therefore add an alternative function which combines blk_new_with_bs() with bdrv_open(). Note: In contrast to bdrv_open() which takes a BlockDriver parameter, blk_new_open() does not take such a parameter. This is because bdrv_open() opens a BlockDriverState, therefore it is natural to be able to set the BlockDriver for that BDS. The fact that bdrv_open() can open more than a single BDS is merely some form of a byproduct. blk_new_open() on the other hand is intended to be used to create a whole tree of BlockDriverStates. Therefore, setting a single BlockDriver does not make much sense. Instead, the drivers to be used for each of the nodes must be configured through the "options" QDict; including the driver of the root BDS. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1423162705-32065-3-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-02-16 15:07:18 +00:00
Max Reitz	1ef01253eb	block: Lift some BDS functions to the BlockBackend Create the blk_* counterparts for the following bdrv_* functions (which make sense to call on the BlockBackend level): - bdrv_co_write_zeroes() - bdrv_write_compressed() - bdrv_truncate() - bdrv_nb_sectors() - bdrv_discard() - bdrv_load_vmstate() - bdrv_save_vmstate() Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1423162705-32065-2-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-02-16 15:07:18 +00:00
Jeff Cody	a7be17bee8	block: vmdk - fixed sizeof() error The size compared should be PATH_MAX, rather than sizeof(char *). Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 46d873261433f4527e88885582f96942d61758d6.1423592487.git.jcody@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-02-16 15:07:17 +00:00
Bin Wu	141cabe6f1	nbd: fix the co_queue multi-adding bug When we tested VM migration between different hosts with NBD devices, we found that if we sent a cancel command just after drive_mirror had started, a coroutine re-enter error would occur. The stack was as follows: (gdb) bt 00) 0x00007fdfc744d885 in raise () from /lib64/libc.so.6 01) 0x00007fdfc744ee61 in abort () from /lib64/libc.so.6 02) 0x00007fdfca467cc5 in qemu_coroutine_enter (co=0x7fdfcaedb400, opaque=0x0) at qemu-coroutine.c:118 03) 0x00007fdfca467f6c in qemu_co_queue_run_restart (co=0x7fdfcaedb400) at qemu-coroutine-lock.c:59 04) 0x00007fdfca467be5 in coroutine_swap (from=0x7fdfcaf3c4e8, to=0x7fdfcaedb400) at qemu-coroutine.c:96 05) 0x00007fdfca467cea in qemu_coroutine_enter (co=0x7fdfcaedb400, opaque=0x0) at qemu-coroutine.c:123 06) 0x00007fdfca467f6c in qemu_co_queue_run_restart (co=0x7fdfcaedbdc0) at qemu-coroutine-lock.c:59 07) 0x00007fdfca467be5 in coroutine_swap (from=0x7fdfcaf3c4e8, to=0x7fdfcaedbdc0) at qemu-coroutine.c:96 08) 0x00007fdfca467cea in qemu_coroutine_enter (co=0x7fdfcaedbdc0, opaque=0x0) at qemu-coroutine.c:123 09) 0x00007fdfca4a1fa4 in nbd_recv_coroutines_enter_all (s=0x7fdfcaef7dd0) at block/nbd-client.c:41 10) 0x00007fdfca4a1ff9 in nbd_teardown_connection (client=0x7fdfcaef7dd0) at block/nbd-client.c:50 11) 0x00007fdfca4a20f0 in nbd_reply_ready (opaque=0x7fdfcaef7dd0) at block/nbd-client.c:92 12) 0x00007fdfca45ed80 in aio_dispatch (ctx=0x7fdfcae15e90) at aio-posix.c:144 13) 0x00007fdfca45ef1b in aio_poll (ctx=0x7fdfcae15e90, blocking=false) at aio-posix.c:222 14) 0x00007fdfca448c34 in aio_ctx_dispatch (source=0x7fdfcae15e90, callback=0x0, user_data=0x0) at async.c:212 15) 0x00007fdfc8f2f69a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 16) 0x00007fdfca45c391 in glib_pollfds_poll () at main-loop.c:190 17) 0x00007fdfca45c489 in os_host_main_loop_wait (timeout=1483677098) at main-loop.c:235 18) 0x00007fdfca45c57b in main_loop_wait (nonblocking=0) at main-loop.c:484 19) 0x00007fdfca25f403 in main_loop () at vl.c:2249 20) 0x00007fdfca266fc2 in main (argc=42, argv=0x7ffff517d638, envp=0x7ffff517d790) at vl.c:4814 We find that the nbd_recv_coroutines_enter_all function (triggered by a cancel command or a network connection breaking down) will enter a coroutine which is waiting for the sending lock. If the lock is still held by another coroutine, the entering coroutine will be added into the co_queue again. Later, when the lock is released, a coroutine re-enter error will occur. This bug can be fixed simply by delaying the setting of recv_coroutine, as suggested by Paolo. After applying this patch, we tested the cancel operation in the mirror phase in a loop for more than 5 hours and everything was fine. Without this patch, a coroutine re-enter error occurs within 5 minutes. Signed-off-by: Bin Wu <wu.wubin@huawei.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1423552846-3896-1-git-send-email-wu.wubin@huawei.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-02-16 15:07:17 +00:00
Max Reitz	f53a829bb9	nbd: Drop BDS backpointer Before this patch, the "opaque" pointer in an NBD BDS points to a BDRVNBDState, which contains an NbdClientSession object, which in turn contains a pointer to the BDS. This pointer may become invalid due to bdrv_swap(), so drop it, and instead pass the BDS directly to the nbd-client.c functions which then retrieve the NbdClientSession object from there. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1423256778-3340-2-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-02-16 14:36:03 +00:00
Programmingkid	728dacbda8	block/raw-posix.c: Fix raw_getlength() on Mac OS X block devices This patch replaces the dummy code in raw_getlength() for block devices on OS X, which always returned LLONG_MAX, with a real implementation that returns the actual block device size. Signed-off-by: John Arbuckle <programmingkidx@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 18:00:53 +01:00
Max Reitz	8c44dfbc62	qcow2: Rewrite qcow2_alloc_bytes() qcow2_alloc_bytes() is a function with insufficient error handling and an unnecessary goto. This patch rewrites it. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:22 +01:00
Alberto Garcia	8e8cb375e0	block: Give always priority to unused entries in the qcow2 L2 cache The current algorithm to replace entries from the L2 cache gives priority to newer hits by dividing the hit count of all existing entries by two every time there is a cache miss. However, if there are several cache misses the hit count of the existing entries can easily go down to 0. This will result in those entries being replaced even when there are others that have never been used. This problem is more noticeable with larger disk images and cache sizes, since the chances of having several misses before the cache is full are higher. If we make sure that the hit count can never go down to 0 again, unused entries will always have priority. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:22 +01:00
Denis V. Lunev	fa21e6faa6	nbd: fix max_discard/max_transfer_length nbd_co_discard calls nbd_client_session_co_discard which uses uint32_t as the length in bytes of the data to discard due to the following definition: struct nbd_request { uint32_t magic; uint32_t type; uint64_t handle; uint64_t from; uint32_t len; <-- the length of data to be discarded, in bytes } QEMU_PACKED; Thus we should limit bl_max_discard to UINT32_MAX >> BDRV_SECTOR_BITS to avoid overflow. NBD read/write code uses the same structure for transfers. Fix max_transfer_length accordingly. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Peter Lieven <pl@kamp.de> CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:22 +01:00
Max Reitz	1ce52846d3	nbd: Improve error messages This patch makes use of the Error object for nbd_receive_negotiate() so that errors during negotiation look nicer. Furthermore, this patch adds an additional error message if the received magic was wrong, but would be correct for the other protocol version, respectively: So if an export name was specified, but the NBD server magic corresponds to an old handshake, this condition is explicitly signaled to the user, and vice versa. As these messages are now part of the "Could not open image" error message, additional filtering has to be employed in iotest 083, which this patch does as well. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:22 +01:00
Jeff Cody	e729fa6afe	block: fix off-by-one error in qcow and qcow2 This fixes an off-by-one error introduced in `9a29e18`. Both qcow and qcow2 need to make sure to leave room for string terminator '\0' for the backing file, so the max length of the non-terminated string is either 1023 or PATH_MAX - 1. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Stefan Hajnoczi	0adfa1ed65	qed: check for header size overflow Header size is denoted in clusters. The maximum cluster size is 64 MB but there is no limit on header size. Check for uint32_t overflow in case the header size field has a whacky value. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1421065893-18875-2-git-send-email-stefanha@redhat.com Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
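One common way to write the overflow check described in the qed commit above is to divide rather than multiply. The sketch below assumes the header size is a cluster count and the cluster size is in bytes; the function name `header_size_fits` is hypothetical and is not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if header_size (in clusters) multiplied by cluster_size
 * (in bytes) fits in a uint32_t. Dividing avoids performing the
 * multiplication that could overflow. */
static bool header_size_fits(uint32_t header_size, uint32_t cluster_size)
{
    if (cluster_size == 0) {
        return false; /* reject a zero cluster size instead of dividing by it */
    }
    return header_size <= UINT32_MAX / cluster_size;
}
```

With the 64 MB maximum cluster size mentioned in the commit message (2^26 bytes), this accepts at most 63 header clusters.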
Peter Wu	177b75104d	block/dmg: improve zeroes handling Disk images may contain large all-zeroes gaps (1.66M sectors or 812 MiB is seen in the real world). These blocks (type 2) do not need to be extracted into a temporary buffer, there is no need to allocate memory for these blocks nor to check their length. (For the test image, the maximum uncompressed size is 1054371 bytes, probably for a bzip2-compressed block.) Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1420566495-13284-13-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	6b383c08c4	block/dmg: support bzip2 block entry types This patch adds support for bzip2-compressed block entries as introduced with OS X 10.4 (source: https://en.wikipedia.org/wiki/Apple_Disk_Image). It was tested against a 5.2G "OS X Yosemite" installation image which stores the BLXX block in the XML property list (instead of resource forks) and has over 5k chunks. New configure entries are added (--enable-bzip2 / --disable-bzip2) to control inclusion of bzip2 functionality (which requires linking against libbz2). The help message suggests that this option is needed for DMG files, but the tests are generic enough that other parts of QEMU can use bzip2 if needed. The identifiers are based on http://newosxbook.com/DMG.html. The decompression routines are based on the zlib case, but as there is no way to reset the decompression state (unlike zlib), memory is allocated and deallocated for every decompression. This should not be problematic as the decompression takes most of the time and as blocks are typically about/over 1 MiB in size, only one allocation is done every 2000 sectors. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1420566495-13284-12-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	a8b10c6ead	block/dmg: factor out block type check In preparation for adding bzip2 support, split the type check into a separate function. Make all offsets relative to the begin of a chunk such that it is easier to recognize the position without having to add up all offsets. Some comments are added to describe the fields. There is no functional change. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1420566495-13284-11-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	66ec3bba97	block/dmg: use SectorNumber from BLKX header Previously the sector table parsing relied on the previous offset of the DMG file. Now it uses the sector number from the BLKX header (see http://newosxbook.com/DMG.html). The implementation of dmg2img (from vu1tur) does not base the output sector on the location of the terminator (0xffffffff) either so it should be safe to drop this dependency on the previous state. (It somehow makes sense, a terminator should halt further processing of a block and is perhaps used to preallocate some space.) Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1420566495-13284-10-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	c6d34865fa	block/dmg: fix sector data offset calculation This patch addresses two issues: - The data fork offset was not taken into account, resulting in failure to read an InstallESD.dmg file (5164763151 bytes) which had a non-zero DataForkOffset field. - The offset of the previous block ("partition") was unconditionally added to the current block because older files would start the input offset of a new block at zero. Newer files (including vlc-2.1.5.dmg, tuxpaint-0.9.15-macosx.dmg and OS X Yosemite [MAS].dmg) failed in reads because these files have chunk offsets, relative to the begin of a data fork. Now the data offset of the mish is taken into account. While we could check that the data_offset is within the data fork, let's not do that here as it would only result in parse failures on invalid files (rather than gracefully handling such bad files). dmg_read will error out if the offset is incorrect. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1420566495-13284-9-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	8daf425794	block/dmg: set virtual size to a non-zero value Right now the virtual size is always reported as zero which makes it impossible to convert between formats. After this patch, the number of sectors will be read from the trailer ("koly" block). To verify the behavior, the output of `dmg2img foo.dmg foo.img` was compared against `qemu-img convert -f dmg -O raw foo.dmg foo.raw`. The tests showed that the file contents are exactly the same, except that QEMU creates a slightly larger file (it matches the total sectors count). Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1420566495-13284-8-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	0599e56ed4	block/dmg: process XML plists The format is simple enough to avoid using a full-blown XML parser. It assumes that all BLKX items begin with the "mish" magic word, therefore it is not a problem if other values get matched which are not a BLKX block. The offsets are based on the description at http://newosxbook.com/DMG.html For compatibility with glib 2.12, use g_base64_decode (which additionally requires an extra buffer allocation) instead of g_base64_decode_inplace (which is only available since glib 2.20). Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1420566495-13284-7-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	f6e6652d7c	block/dmg: validate chunk size to avoid overflow Previously the chunk size was not checked, allowing for a large memory allocation. This patch checks whether the chunks size is within the resource fork length, and whether the resource fork is below the trailer of the dmg file. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1420566495-13284-6-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	7aee37b93a	block/dmg: process a buffer instead of reading ints As the decoded plist XML is not a pointer in the file, dmg_read_mish_block must be able to process a buffer instead of a file pointer. Since the full buffer must be processed, let's change the return value again to just a success flag. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1420566495-13284-5-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	b0e8dc5d54	block/dmg: extract processing of resource forks Besides the offset, also read the resource length. This length is now used in the extracted function to verify the end of the resource fork against "count" from the resource fork. Instead of relying on the value of offset to conclude whether the resource fork is available or not (info_begin==0), check the rsrc_fork_length instead. This would allow a dmg file to begin with a resource fork. This seemingly unnecessary restriction was found while trying to craft a DMG file by hand. Other changes: - Do not require resource data offset to be 0x100 (but check that it is within bounds though). - Further improve boundary checking (resource data must be within the resource fork). - Use correct value for resource data length (spotted by John Snow) - Consider the resource data offset when determining info_end. This fixes an EINVAL on the tuxpaint dmg example. The resource fork format is documented at https://developer.apple.com/legacy/library/documentation/mac/pdf/MoreMacintoshToolbox.pdf#page=151 Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 1420566495-13284-4-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	65a1c7c96a	block/dmg: extract mish block decoding functionality Extract the mish block decoder such that this can be used for other formats in the future. A new DmgHeaderState struct is introduced to share state while decoding. The code is kept unchanged as much as possible, a "fail" label is added for example where a simple return would probably do. In dmg_open, the variable "tmp" is renamed to "rsrc_data_offset" for clarity and comments have been added explaining various data. Note that this patch has one subtle difference with the previous version which should not affect functionality. In the previous code, the end of a resource was inferred from the mish block (the offsets would be increased by the fields). In this patch, the resource length is used instead to avoid the need to rely on the previous offsets. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1420566495-13284-3-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Wu	fa8354bd22	block/dmg: properly detect the UDIF trailer DMG files have a variable length with a UDIF trailer at the end of a file. This UDIF trailer is essential as it describes the contents of the image. At the moment however, the start of this trailer is almost always incorrect as bdrv_getlength() returns a multiple of the block size (rounded up). This results in a failure to recognize DMG files, resulting in Invalid argument (EINVAL) errors. As there is no API to retrieve the real file size, look for the magic header in the last two sectors to find the start of this 512-byte UDIF trailer (the "koly" block). The resource fork offset ("info_begin") has its offset adjusted as the initial value of offset does not mean "end of file" anymore, but "begin of UDIF trailer". [Replaced error_set(errp, ERROR_CLASS_GENERIC_ERROR, ...) with error_setg(errp, ...) as discussed with Peter. --Stefan] Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1420566495-13284-2-git-send-email-peter@lekensteyn.nl Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
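The trailer search described in the UDIF commit above can be sketched as a backward scan for the "koly" magic over a buffer holding the last two sectors of the file. This is only an illustration under stated assumptions: the function name `find_koly` and the constant `UDIF_TRAILER_SIZE` are hypothetical, and the real patch reads the buffer through QEMU's block layer rather than taking it as an argument.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size of the UDIF trailer (the "koly" block), per the commit message. */
#define UDIF_TRAILER_SIZE 512

/* Scan buf (for example the last two sectors of the image) from the end
 * towards the start for the "koly" magic. Only positions that leave room
 * for a full 512-byte trailer are considered. Returns the offset of the
 * trailer within buf, or -1 if no candidate is found. */
static long find_koly(const uint8_t *buf, size_t len)
{
    if (len < UDIF_TRAILER_SIZE) {
        return -1;
    }
    for (size_t i = len - UDIF_TRAILER_SIZE + 1; i-- > 0; ) {
        if (memcmp(buf + i, "koly", 4) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

Scanning backwards prefers the match closest to the end of the file, which is where the trailer is expected to sit.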
Francesco Romani	e2462113b2	block: add event when disk usage exceeds threshold Managing applications, like oVirt (http://www.ovirt.org), make extensive use of thin-provisioned disk images. To let the guest run smoothly and not be unnecessarily paused, oVirt sets a disk usage threshold (so called 'high water mark') based on the occupation of the device, and automatically extends the image once the threshold is reached or exceeded. In order to detect the crossing of the threshold, oVirt has no choice but to aggressively poll the QEMU monitor using the query-blockstats command. This leads to unnecessary system load, and is made even worse at scale: deployments with hundreds of VMs are no longer rare. To fix this, this patch adds: * A new monitor command `block-set-write-threshold', to set a mark for a given block device. * A new event `BLOCK_WRITE_THRESHOLD', to report if a block device usage exceeds the threshold. * A new `write_threshold' field into the `BlockDeviceInfo' structure, to report the configured threshold. This will allow the managing application to use smarter and more efficient monitoring, greatly reducing the need for polling. [Updated qemu-iotests 067 output to add the new 'write_threshold' property. --Stefan] [Changed g_assert_false() to !g_assert() to fix the build on older glib versions. --Kevin] Signed-off-by: Francesco Romani <fromani@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1421068273-692-1-git-send-email-fromani@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Lieven	454057b7d9	block-backend: expose bs->bl.max_transfer_length Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Peter Lieven	f4564d53c6	block: add accounting for merged requests Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Fam Zheng	35f5a49374	qed: Really remove unused field QEDAIOCB.finished The commit `533ffb17a` that removed qed_aiocb_info.cancel said to remove this but didn't do it. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Denis V. Lunev	1cdc3239f1	block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes This sequence works efficiently if FALLOC_FL_ZERO_RANGE is not supported. Unfortunately, FALLOC_FL_ZERO_RANGE is supported on really modern systems and only for a couple of filesystems. FALLOC_FL_PUNCH_HOLE is much more mature. The sequence of 2 operations FALLOC_FL_PUNCH_HOLE and 0 is necessary due to the following reasons: - FALLOC_FL_PUNCH_HOLE creates a hole in the file, the file becomes sparse. In order to retain original functionality we must allocate disk space afterwards. This is done using fallocate(0) call - fallocate(0) without preceding FALLOC_FL_PUNCH_HOLE will do nothing if called above already allocated areas of the file, i.e. the content will not be zeroed This should increase the performance a bit for not-so-modern kernels. CC: Max Reitz <mreitz@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Peter Lieven <pl@kamp.de> CC: Fam Zheng <famz@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:20 +01:00

2163 Commits