mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Prasanna Kumar Kalever	6c7189bb29	block/gluster: add support for multiple gluster servers This patch adds a way to specify multiple volfile servers to the gluster block backend of QEMU with tcp\|rdma transport types and their port numbers. Problem: Currently VM Image on gluster volume is specified like this: file=gluster[+tcp]://host[:port]/testvol/a.img Say we have three hosts in a trusted pool with replica 3 volume in action. When the host mentioned in the command above goes down for some reason, the other two hosts are still available. But there's currently no way to tell QEMU about them. Solution: New way of specifying VM Image on gluster volume with volfile servers: (We still support old syntax to maintain backward compatibility) Basic command line syntax looks like: Pattern I: -drive driver=gluster, volume=testvol,path=/path/a.raw,[debug=N,] server.0.type=tcp, server.0.host=1.2.3.4, server.0.port=24007, server.1.type=unix, server.1.socket=/path/socketfile Pattern II: 'json:{"driver":"qcow2","file":{"driver":"gluster", "volume":"testvol","path":"/path/a.qcow2",["debug":N,] "server":[{hostinfo_1}, ...{hostinfo_N}]}}' driver => 'gluster' (protocol name) volume => name of gluster volume where our VM image resides path => absolute path of image in gluster volume [debug] => libgfapi loglevel [(0 - 9) default 4 -> Error] {hostinfo} => {{type:"tcp",host:"1.2.3.4"[,port=24007]}, {type:"unix",socket:"/path/sockfile"}} type => transport type used to connect to gluster management daemon, it can be tcp\|unix host => host address (hostname/ipv4/ipv6 addresses/socket path) port => port number on which glusterd is listening. socket => path to socket file Examples: 1. -drive driver=qcow2,file.driver=gluster, file.volume=testvol,file.path=/path/a.qcow2,file.debug=9, file.server.0.type=tcp, file.server.0.host=1.2.3.4, file.server.0.port=24007, file.server.1.type=unix, file.server.1.socket=/var/run/glusterd.socket 2. 'json:{"driver":"qcow2","file":{"driver":"gluster","volume":"testvol", "path":"/path/a.qcow2","debug":9,"server": [{"type":"tcp","host":"1.2.3.4","port":"24007"}, {"type":"unix","socket":"/var/run/glusterd.socket"} ]}}' This patch gives a mechanism to provide all the server addresses, which are in replica set, so in case host1 is down VM can still boot from any of the active hosts. This is equivalent to the backup-volfile-servers option supported by mount.glusterfs (FUSE way of mounting gluster volume) credits: sincere thanks to all the supporters Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-id: 1468947453-5433-6-git-send-email-prasanna.kalever@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 17:38:50 -04:00
Prasanna Kumar Kalever	7edac2ddeb	block/gluster: using new qapi schema this patch adds 'GlusterServer' related schema in qapi/block-core.json [Jeff: minor fix-ups of comments and formatting, per patch reviews] Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1468947453-5433-5-git-send-email-prasanna.kalever@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 17:36:11 -04:00
Prasanna Kumar Kalever	0552ff2465	block/gluster: deprecate rdma support gluster volfile server fetch happens through unix and/or tcp, it doesn't support volfile fetch over rdma. The rdma code may actually mislead, so to make sure things do not break, for now we fallback to tcp when requested for rdma, with a warning. If you are wondering how this worked all these days, its the gluster libgfapi code which handles anything other than unix transport as socket/tcp, sad but true. Also gluster doesn't support ipv6 addresses, removing the ipv6 related comments/docs section [Jeff: Minor grammatical fixes in comments and commit message, per review comments] Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1468947453-5433-4-git-send-email-prasanna.kalever@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 17:20:44 -04:00
Prasanna Kumar Kalever	f70c50c817	block/gluster: code cleanup unified coding styles of multiline function arguments and other error functions moved random declarations of structures and other list variables Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1468947453-5433-3-git-send-email-prasanna.kalever@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 17:08:12 -04:00
Prasanna Kumar Kalever	d5cf4079ca	block/gluster: rename [server, volname, image] -> [host, volume, path] A future patch will add support for multiple gluster servers. Existing terminology is a bit unusual in relation to what names are used by other networked devices, and doesn't map very well to the terminology we expect to use for multiple servers. Therefore, rename the following options: 'server' -> 'host' 'image' -> 'path' 'volname' -> 'volume' Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1468947453-5433-2-git-send-email-prasanna.kalever@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 17:08:12 -04:00
Denis V. Lunev	cf56a3c632	mirror: fix request throttling in drive-mirror There are 2 deficiencies here: - mirror_iteration could start several requests inside. Thus we could simply have more in_flight requests than MAX_IN_FLIGHT. - keeping this in mind throttling in mirror_run which is checking s->in_flight == MAX_IN_FLIGHT is wrong. The patch adds the check and throttling into mirror_iteration and fixes the check in mirror_run() to be sure. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1466598927-5990-1-git-send-email-den@openvz.org CC: Jeff Cody <jcody@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> (cherry picked from commit e648dc95c28fbca12e67be26a1fc4b9a0676c3fe)	2016-07-19 17:03:44 -04:00
Denis V. Lunev	4b5004d9fc	mirror: improve performance of mirroring of empty disk We should not take into account zero blocks for delay calculations. They are not read and thus IO throttling is not required. In the other case VM migration with 16 Tb QCOW2 disk with 4 Gb of data takes days. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-9-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 17:02:49 -04:00
Denis V. Lunev	c7c2769c0e	mirror: efficiently zero out target With a bdrv_co_write_zeroes method on a target BDS and when this method is working as indicated by the bdrv_can_write_zeroes_with_unmap(), zeroes will not be placed into the wire. Thus the target could be very efficiently zeroed out. This should be done with the largest chunk possible. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-8-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 16:54:46 -04:00
Denis V. Lunev	b7d5062c9c	mirror: optimize dirty bitmap filling in mirror_run a bit There is no need to scan allocation tables if we have mark_all_dirty flag set. Just mark it all dirty. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-7-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 16:54:46 -04:00
Denis V. Lunev	c0b363ad43	mirror: create mirror_dirty_init helper for mirror_run The code inside the helper will be extended in the next patch. mirror_run itself is overbloated at the moment. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-5-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 16:54:46 -04:00
Denis V. Lunev	49efb1f5b0	mirror: create mirror_throttle helper The patch also places last_pause_ns from stack in mirror_run into MirrorBlockJob structure. This helper will be useful in next patches. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1468503209-19498-4-git-send-email-den@openvz.org CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> CC: Eric Blake <eblake@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 16:54:46 -04:00
Denis V. Lunev	531509ba28	mirror: make sectors_in_flight int64_t We keep here the sum of int fields. Thus this could easily overflow, especially when we will start sending big requests in next patches. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-3-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 16:54:46 -04:00
Denis V. Lunev	6d07859926	dirty-bitmap: operate with int64_t amount Underlying HBitmap operates even with uint64_t. Thus this change is safe. This would be useful f.e. to mark entire bitmap dirty in one call. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-2-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-07-19 16:54:46 -04:00
Peter Maydell	a3b3437721	* two old patches from prospective GSoC students * i386 -kernel device tree support * Coverity fix * memory usage improvement from Peter * checkpatch fix * g_path_get_dirname cleanup * caching of block status for iSCSI -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJXjcwdAAoJEL/70l94x66D0s8IAKyEBxtATSXZG90jJR+uoCbv oK6ea9RDWUZxa2hKjcXh9Be+g4pTv99BqTKJJt+uwkFHAAJl0gvVty+EHE/2sfyo Nlt9FlWibxBdSoHxeCq4jg9APWORxcSx3rspg0I8TdxbweKdm9onvXEjfvmhucqG FfPSIHg5vsoutCPEDTXfaJDFiLw+rV7Em53kxD/y4VhHZlAWhahpCHSL/lWGRoLp B0mKvoHhqHR/EqJr7y7fuga+Aoimyh8R6dUfpxuXQely3309V7znhq0erPQWvSwX JKRITmGbWW3HOjplqBT971eH5iH0bDEryx91Oas9VNpm9eGr6qygePhc1eMszxE= =pBxf -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * two old patches from prospective GSoC students * i386 -kernel device tree support * Coverity fix * memory usage improvement from Peter * checkpatch fix * g_path_get_dirname cleanup * caching of block status for iSCSI # gpg: Signature made Tue 19 Jul 2016 07:43:41 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: target-i386: Remove redundant HF_SOFTMMU_MASK block/iscsi: allow caching of the allocation map block/iscsi: fix rounding in iscsi_allocationmap_set Move README to markdown cpu-exec: Move down some declarations in cpu_exec() exec: avoid realloc in phys_map_node_reserve checkpatch: consider git extended headers valid patches megasas: remove useless check for cmd->frame compiler: never omit assertions if using a static analysis tool hw/i386: add device tree support Changed malloc to g_malloc, free to g_free in bsd-user/qemu.h use g_path_get_dirname instead of dirname Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-07-19 15:08:05 +01:00
Peter Maydell	ad31cd4c69	-----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXjV3bAAoJEH3vgQaq/DkOArAQAKc03QhwukqT2aZ5zs7kZ1Hv PCHrISL0dKGK3YfyiSitppoxr6eBoBR9UIEjaNlW0XwujqwWdWfJ7kIbVSGAyWqR YWmV+bA+TUWXz+tFbeDI0maMH9GNVCAuvQiqldmJxhZ303pf5cksZ9CqiALSylTY t5XUB6nhV7MPto63C2X2xjLkKxlsT9KOTsYxGVgVXwUzgW1lAuu8Lo6eNULXCgUa j+azgSFAiUOKwfKxcKD25kPOxgWlrxkGRc2LdFlopEzShENROhR2r9ut3okgAoM4 KPVIE3jSsLMhNr9bRQRUJw53vRSL/bxFvlCdzKiBFSo7wNMKWNA9RWF6+If1Jvoi Am+BzINCfNfoFmqlXppqWGlapk9ZtmDGbPwaUyT6NJ9axAASTQxcj8QOjGEX07UE ubvzIXx7D1Amo59/4RWRXVDpMb9+p3npqkuCL+DWZzq7EVB42ig8+fKhijhS4jUK 2DA7uL4orUjUIoTbJZsKciw7MfaWuP2/SnP1VRNRXSsiNg5N4qJDUB6Wo5AKksQB LWP4Ou4irPj/ZGvMhJBMMOQ5kl2maqj8beP9pVjltVjGVzAihAyuXHUqjPdqe1R9 PLQidHKCb1OaJWi3c53vayuzlINH9iY+adjgSuYxi1QOZ+uqEsjSs+3+m1gyTOeh sd8VU/TtJbiGBCh9VTTf =9N8+ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging # gpg: Signature made Mon 18 Jul 2016 23:53:15 BST # gpg: using RSA key 0x7DEF8106AAFC390E # gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" # Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB # Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E * remotes/jnsnow/tags/ide-pull-request: block: ignore flush requests when storage is clean tests: in IDE and AHCI tests perform DMA write before flushing ide: set retry_unit for PIO and FLUSH requests ide: refactor retry_unit set and clear into separate function Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-07-19 11:47:07 +01:00
Peter Lieven	e1123a3b40	block/iscsi: allow caching of the allocation map until now the allocation map was used only as a hint if a cluster is allocated or not. If a block was not allocated (or Qemu had no info about the allocation status) a get_block_status call was issued to check the allocation status and possibly avoid a subsequent read of unallocated sectors. If a block known to be allocated the get_block_status call was omitted. In the other case a get_block_status call was issued before every read to avoid the necessity for a consistent allocation map. To avoid the potential overhead of calling get_block_status for each and every read request this took only place for the bigger requests. This patch enhances this mechanism to cache the allocation status and avoid calling get_block_status for blocks where the allocation status has been queried before. This allows for bypassing the read request even for smaller requests and additionally omits calling get_block_status for known to be unallocated blocks. Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1468831940-15556-3-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-19 08:34:53 +02:00
Peter Lieven	eb36b953e0	block/iscsi: fix rounding in iscsi_allocationmap_set when setting clusters as alloacted the boundaries have to be expanded. As Paolo pointed out the calculation of the number of clusters is wrong: Suppose cluster_sectors is 2, sector_num = 1, nb_sectors = 6: In the "mark allocated" case, you want to set 0..8, i.e. cluster_num=0, nb_clusters=4. 0--.--2--.--4--.--6--.--8 <--\|_________________\|--> (<--> = expanded) Instead you are setting nb_clusters=3, so that 6..8 is not marked. 0--.--2--.--4--.--6--.--8 <--\|______________\|!!! (! = wrong) Cc: qemu-stable@nongnu.org Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1468831940-15556-2-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-19 08:34:53 +02:00
Evgeny Yakovlev	3ff2f67a7c	block: ignore flush requests when storage is clean Some guests (win2008 server for example) do a lot of unnecessary flushing when underlying media has not changed. This adds additional overhead on host when calling fsync/fdatasync. This change introduces a write generation scheme in BlockDriverState. Current write generation is checked against last flushed generation to avoid unnessesary flushes. The problem with excessive flushing was found by a performance test which does parallel directory tree creation (from 2 processes). Results improved from 0.424 loops/sec to 0.432 loops/sec. Each loop creates 10^3 directories with 10 files in each. This affected some blkdebug testcases that were expecting error logs from failure-injected flushes which are now skipped entirely (tests 026 071 089). This also affects the performance of block jobs and thus BLOCK_JOB_READY events for driver-mirror and active block-commit commands now arrives faster, before QMP send successfully returns to caller (tests 141 144). Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468870792-7411-5-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: John Snow <jsnow@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>	2016-07-18 18:19:01 -04:00
Roman Pen	5e1b34a3fa	linux-aio: prevent submitting more than MAX_EVENTS Invoking io_setup(MAX_EVENTS) we ask kernel to create ring buffer for us with specified number of events. But kernel ring buffer allocation logic is a bit tricky (ring buffer is page size aligned + some percpu allocation are required) so eventually more than requested events number is allocated. From a userspace side we have to follow the convention and should not try to io_submit() more or logic, which consumes completed events, should be changed accordingly. The pitfall is in the following sequence: MAX_EVENTS = 128 io_setup(MAX_EVENTS) io_submit(MAX_EVENTS) io_submit(MAX_EVENTS) /* now 256 events are in-flight / io_getevents(MAX_EVENTS) = 128 / we can handle only 128 events at once, to be sure * that nothing is pended the io_getevents(MAX_EVENTS) * call must be invoked once more or hang will happen. */ To prevent the hang or reiteration of io_getevents() call this patch restricts the number of in-flights, which is now limited to MAX_EVENTS. Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468415004-31755-1-git-send-email-roman.penyaev@profitbricks.com Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: qemu-devel@nongnu.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-18 15:10:52 +01:00
Paolo Bonzini	0187f5c9cb	linux-aio: share one LinuxAioState within an AioContext This has better performance because it executes fewer system calls and does not use a bottom half per disk. Originally proposed by Ming Lei. [Changed #include "raw-aio.h" to "block/raw-aio.h" in win32-aio.c to fix build error as reported by Peter Maydell <peter.maydell@linaro.org>. --Stefan] Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1467650000-51385-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> squash! linux-aio: share one LinuxAioState within an AioContext	2016-07-18 15:09:31 +01:00
Max Reitz	c4b48bfdc5	vvfat: Fix qcow write target driver specification First, bdrv_open_child() expects all options for the child to be prefixed by the child's name (and a separating dot). Second, bdrv_open_child() does not take ownership of the QDict passed to it but only extracts all options for the child, so if a QDict is created for the sole purpose of passing it to bdrv_open_child(), it needs to be freed afterwards. This patch makes vvfat adhere to both of these rules. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20160711135452.11304-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:39 +02:00
Reda Sallahi	524089bce4	vmdk: fix metadata write regression Commit "cdeaf1f vmdk: add bdrv_co_write_zeroes" causes a regression on writes. It writes metadata after every write instead of doing it only once for each cluster. vmdk_pwritev() writes metadata whenever m_data is set as valid so this patch sets m_data as valid only when we have a new cluster which hasn't been allocated before or a zero grain. Signed-off-by: Reda Sallahi <fullmanet@gmail.com> Message-id: 20160707084249.29084-1-fullmanet@gmail.com Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:39 +02:00
Sascha Silbe	f14a39ccb9	Improve block job rate limiting for small bandwidth values ratelimit_calculate_delay() previously reset the accounting every time slice, no matter how much data had been processed before. This had (at least) two consequences: 1. The minimum speed is rather large, e.g. 5 MiB/s for commit and stream. Not sure if there are real-world use cases where this would be a problem. Mirroring and backup over a slow link (e.g. DSL) would come to mind, though. 2. Tests for block job operations (e.g. cancel) were rather racy All block jobs currently use a time slice of 100ms. That's a reasonable value to get smooth output during regular operation. However this also meant that the state of block jobs changed every 100ms, no matter how low the configured limit was. On busy hosts, qemu often transferred additional chunks until the test case had a chance to cancel the job. Fix the block job rate limit code to delay for more than one time slice to address the above issues. To make it easier to handle oversized chunks we switch the semantics from returning a delay _before_ the current request to a delay _after_ the current request. If necessary, this delay consists of multiple time slice units. Since the mirror job sends multiple chunks in one go even if the rate limit was exceeded in between, we need to keep track of the start of the current time slice so we can correctly re-compute the delay for the updated amount of data. The minimum bandwidth now is 1 data unit per time slice. The block jobs are currently passing the amount of data transferred in sectors and using 100ms time slices, so this translates to 5120 bytes/second. With chunk sizes usually being O(512KiB), tests have plenty of time (O(100s)) to operate on block jobs. The chance of a race condition now is fairly remote, except possibly on insanely loaded systems. Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Message-id: 1467127721-9564-2-git-send-email-silbe@linux.vnet.ibm.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:38 +02:00
Max Reitz	c834cba905	qcow2: Fix qcow2_get_cluster_offset() Recently, qcow2_get_cluster_offset() has been changed to work with bytes instead of sectors. This invalidated some assertions and introduced a possible integer multiplication overflow. This could be reproduced using e.g. $ qemu-img create -f qcow2 -o cluster_size=1M blub.qcow2 8G Formatting 'foo.qcow2', fmt=qcow2 size=8589934592 encryption=off cluster_size=1048576 lazy_refcounts=off refcount_bits=16 $ qemu-io -c map blub.qcow2 qemu-io: qemu/block/qcow2-cluster.c:504: qcow2_get_cluster_offset: Assertion `bytes_needed <= INT_MAX' failed. [1] 20775 abort (core dumped) qemu-io -c map foo.qcow2 This patch removes the now wrong assertion, adding comments and more assertions to prove its correctness (and fixing the overflow which would become apparent with the original assertion removed). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20160620142623.24471-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:38 +02:00
Max Reitz	84c26520d3	qcow2: Avoid making the L1 table too big We refuse to open images whose L1 table we deem "too big". Consequently, we should not produce such images ourselves. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20160615153630.2116-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> [mreitz: Added QEMU_BUILD_BUG_ON()] Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:38 +02:00
Kevin Wolf	8c39825218	block/qdev: Allow configuring rerror/werror with qdev properties The rerror/werror policies are implemented in the devices, so that's where they should be configured. In comparison to the old options in -drive, the qdev properties are only added to those devices that actually support them. If the option isn't given (or "auto" is specified), the setting of the BlockBackend is used for compatibility with the old options. For block jobs, "auto" is the same as "enospc". Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:32:27 +02:00
Kevin Wolf	1e8fb7f1ee	commit: Fix use of error handling policy Commit implemented the 'enospc' policy as 'ignore' if the error was not ENOSPC. The QAPI documentation promises that it's treated as 'stop'. Using the common block job error handling function fixes this and also adds the missing QMP event. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:32:27 +02:00
Paolo Bonzini	0b8b8753e4	coroutine: move entry argument to qemu_coroutine_create In practice the entry argument is always known at creation time, and it is confusing that sometimes qemu_coroutine_enter is used with a non-NULL argument to re-enter a coroutine (this happens in block/sheepdog.c and tests/test-coroutine.c). So pass the opaque value at creation time, for consistency with e.g. aio_bh_new. Mostly done with the following semantic patch: @ entry1 @ expression entry, arg, co; @@ - co = qemu_coroutine_create(entry); + co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry2 @ expression entry, arg; identifier co; @@ - Coroutine co = qemu_coroutine_create(entry); + Coroutine co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry3 @ expression entry, arg; @@ - qemu_coroutine_enter(qemu_coroutine_create(entry), arg); + qemu_coroutine_enter(qemu_coroutine_create(entry, arg)); @ reentry @ expression co; @@ - qemu_coroutine_enter(co, NULL); + qemu_coroutine_enter(co); except for the aforementioned few places where the semantic patch stumbled (as expected) and for test_co_queue, which would otherwise produce an uninitialized variable warning. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Fam Zheng	5af7045bd0	raw-posix: Use qemu_dup Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	fd62c609ed	commit: Add 'job-id' parameter to 'block-commit' This patch adds a new optional 'job-id' parameter to 'block-commit', allowing the user to specify the ID of the block job to be created. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	2323322ed0	stream: Add 'job-id' parameter to 'block-stream' This patch adds a new optional 'job-id' parameter to 'block-stream', allowing the user to specify the ID of the block job to be created. The HMP 'block_stream' command remains unchanged. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	70559d499c	backup: Add 'job-id' parameter to 'blockdev-backup' and 'drive-backup' This patch adds a new optional 'job-id' parameter to 'blockdev-backup' and 'drive-backup', allowing the user to specify the ID of the block job to be created. The HMP 'drive_backup' command remains unchanged. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	71aa98678c	mirror: Add 'job-id' parameter to 'blockdev-mirror' and 'drive-mirror' This patch adds a new optional 'job-id' parameter to 'blockdev-mirror' and 'drive-mirror', allowing the user to specify the ID of the block job to be created. The HMP 'drive_mirror' command remains unchanged. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	7f0317cfc8	blockjob: Add 'job_id' parameter to block_job_create() When a new job is created, the job ID is taken from the device name of the BDS. This patch adds a new 'job_id' parameter to let the caller provide one instead. This patch also verifies that the ID is always unique and well-formed. This causes problems in a couple of places where no ID is being set, because the BDS does not have a device name. In the case of test_block_job_start() (from test-blockjob-txn.c) we can simply use this new 'job_id' parameter to set the missing ID. In the case of img_commit() (from qemu-img.c) we still don't have the API to make commit_active_start() set the job ID, so we solve it by setting a default value. We'll get rid of this as soon as we extend the API. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	9df229c3ca	blockjob: Update description of the 'id' field The 'id' field of the BlockJob structure will be able to hold any ID, not only a device name. This patch updates the description of that field and the error messages where it is being used. Soon we'll add the ability to set an arbitrary ID when creating a block job. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Markus Armbruster	a9c94277f0	Use #include "..." for our own headers, <...> for others Tracked down with an ugly, brittle and probably buggy Perl script. Also move includes converted to <...> up so they get included before ours where that's obviously okay. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2016-07-12 16:19:16 +02:00
Peter Maydell	975b1c3ac6	QAPI patches for 2016-07-06 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXfMjDAAoJEDhwtADrkYZTyKwP/i5Lva3qhDk9AJhvPNSbnps/ sxVGVkUTeUM0UoXdKmDChkk2zScv2AOVxL3oibb22PmU1MvDU+XoKYokae/VucVS WL16t/Z77QxSJ1yKhXT3i0P7+9sR9Gq2eDHchSSITLreVO8/jNb7u1JjnV4nPyQP Scg6sCCnfgSw8W4EOz0wDhwXHbRv7obSY0lGUD1R6FifETHQ98rTpII0BS1AyvbF HIJMcGpirAe62BGNQeJv933mrD1EgRfBytQAkEL/l6DDlz/xk9AE26bsjdP5Xa9A MwEBje9CPm0ICleqq1ZVKNQQqng2PVfXXXdJYKQKVnIHL+REzRDKbTGTlS4njdVi v2i4E7MwDPIf86zRUXJ6eUkWDHgld4of2WZmoxNPZmWWmHMb31iqGcb0//mYmP06 QOPd/uxpxg9pze4JeQMhzHCbFum2/s43cyqaH24lrwhTw3EsI45t4sgJQTG1NIFt ZdPXVwNBfa6ur50dsRWHYonQimpofW94YYupWLggiVGKTzvfSWVQJGc9Ho65P23y ZJrkbN3bWNnwVFzIbWFI5TKnE/TsnWvTPwk+9cGSP4MKf1sy6t5nIycoTiP3vC9x PesENYYKy5ZkOkKQpC6lGwjkMl24ii3lVl5g4vWXXjtZHwGNmq/7/UJ7Dm4htqIk B5gm5gQ/2Tul4Dihlteb =BgpV -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-07-06' into staging QAPI patches for 2016-07-06 # gpg: Signature made Wed 06 Jul 2016 10:00:51 BST # gpg: using RSA key 0x3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-qapi-2016-07-06: replay: Use new QAPI cloning sockets: Use new QAPI cloning qapi: Add new clone visitor qapi: Add new visit_complete() function tests: Factor out common code in qapi output tests tests: Clean up test-string-output-visitor qmp-output-visitor: Favor new visit_free() function string-output-visitor: Favor new visit_free() function qmp-input-visitor: Favor new visit_free() function string-input-visitor: Favor new visit_free() function opts-visitor: Favor new visit_free() function qapi: Add new visit_free() function qapi: Add parameter to visit_end_* qemu-img: Don't leak errors when outputting JSON qapi: Improve use of qmp/types.h Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-07-06 11:38:09 +01:00
Eric Blake	3b098d5697	qapi: Add new visit_complete() function Making each output visitor provide its own output collection function was the only remaining reason for exposing visitor sub-types to the rest of the code base. Add a polymorphic visit_complete() function which is a no-op for input visitors, and which populates an opaque pointer for output visitors. For maximum type-safety, also add a parameter to the output visitor constructors with a type-correct version of the output pointer, and assert that the two uses match. This approach was considered superior to either passing the output parameter only during construction (action at a distance during visit_free() feels awkward) or only during visit_complete() (defeating type safety makes it easier to use incorrectly). Most callers were function-local, and therefore a mechanical conversion; the testsuite was a bit trickier, but the previous cleanup patch minimized the churn here. The visit_complete() function may be called at most once; doing so lets us use transfer semantics rather than duplication or ref-count semantics to get the just-built output back to the caller, even though it means our behavior is not idempotent. Generated code is simplified as follows for events: \|@@ -26,7 +26,7 @@ void qapi_event_send_acpi_device_ost(ACP \| QDict qmp; \| Error err = NULL; \| QMPEventFuncEmit emit; \|- QmpOutputVisitor qov; \|+ QObject obj; \| Visitor v; \| q_obj_ACPI_DEVICE_OST_arg param = { \| info \|@@ -39,8 +39,7 @@ void qapi_event_send_acpi_device_ost(ACP \| \| qmp = qmp_event_build_dict("ACPI_DEVICE_OST"); \| \|- qov = qmp_output_visitor_new(); \|- v = qmp_output_get_visitor(qov); \|+ v = qmp_output_visitor_new(&obj); \| \| visit_start_struct(v, "ACPI_DEVICE_OST", NULL, 0, &err); \| if (err) { \|@@ -55,7 +54,8 @@ void qapi_event_send_acpi_device_ost(ACP \| goto out; \| } \| \|- qdict_put_obj(qmp, "data", qmp_output_get_qobject(qov)); \|+ visit_complete(v, &obj); \|+ qdict_put_obj(qmp, "data", obj); \| emit(QAPI_EVENT_ACPI_DEVICE_OST, qmp, &err); and for commands: \| { \| Error err = NULL; \|- QmpOutputVisitor qov = qmp_output_visitor_new(); \| Visitor v; \| \|- v = qmp_output_get_visitor(qov); \|+ v = qmp_output_visitor_new(ret_out); \| visit_type_AddfdInfo(v, "unused", &ret_in, &err); \|- if (err) { \|- goto out; \|+ if (!err) { \|+ visit_complete(v, ret_out); \| } \|- *ret_out = qmp_output_get_qobject(qov); \|- \|-out: \| error_propagate(errp, err); Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-13-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-07-06 10:52:04 +02:00
Eric Blake	1830f22a67	qmp-output-visitor: Favor new visit_free() function Now that we have a polymorphic visit_free(), we no longer need qmp_output_visitor_cleanup(); however, we still need to expose the subtype for qmp_output_get_qobject(). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-10-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-07-06 10:52:04 +02:00
Eric Blake	09204eac9b	opts-visitor: Favor new visit_free() function Now that we have a polymorphic visit_free(), we no longer need opts_visitor_cleanup(); which in turn means we no longer need to return a subtype from opts_visitor_new() nor a public upcast function. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-6-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-07-06 10:52:04 +02:00
Eric Blake	1158bb2a05	qapi: Add parameter to visit_end_* Rather than making the dealloc visitor track of stack of pointers remembered during visit_start_* in order to free them during visit_end_, it's a lot easier to just make all callers pass the same pointer to visit_end_. The generated code has access to the same pointer, while all other users are doing virtual walks and can pass NULL. The dealloc visitor is then greatly simplified. All three visit_end_() functions intentionally take a void, even though the visit_start_() functions differ between void, GenericList, and GenericAlternate*. This is done for several reasons: when doing a virtual walk, passing NULL doesn't care what the type is, but when doing a generated walk, we already have to cast the caller's specific FOO to call visit_start, while using void** lets us use visit_end without a cast. Also, an upcoming patch will add a clone visitor that wants to use the same implementation for all three visit_end callbacks, which is made easier if all three share the same signature. For visitors with already track per-object state (the QMP visitors via a stack, and the string visitors which do not allow nesting), add an assertion that the caller is indeed passing the same pointer to paired calls. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-4-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-07-06 10:52:04 +02:00
Peter Maydell	f1f7a1ddf3	block/qcow2: Don't use cpu_to_w() Don't use the cpu_to_w() functions, which we are trying to deprecate. Instead either just use cpu_to_() to do the byteswap, or use st_be_p() if we need to do the store somewhere other than to a variable that's already the correct type. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1466093177-17890-1-git-send-email-peter.maydell@linaro.org Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-05 16:54:04 +02:00
Kevin Wolf	a03ef88f77	block: Convert bdrv_co_preadv/pwritev to BdrvChild This is the final patch for converting the common I/O path to take a BdrvChild parameter instead of BlockDriverState. The completion of this conversion means that all users that perform I/O on an image need to actually hold a reference (in the form of BdrvChild, possible as part of a BlockBackend) to that image. This also protects against inconsistent use of BlockBackend vs. BlockDriverState functions because direct use of a BlockDriverState isn't possible any more and blk->root is private for block-backends.c. In addition, we can now distinguish different users in the I/O path, and the future op blockers work is going to add assertions based on permissions stored in BdrvChild. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	e293b7a3df	block: Convert bdrv_prwv_co() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	720ff280e7	block: Convert bdrv_pwrite_zeroes() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	d9ca2ea2e2	block: Convert bdrv_pwrite(v/_sync) to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	cf2ab8fc34	block: Convert bdrv_pread(v) to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	18d51c4bac	block: Convert bdrv_write() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	fbcbbf4e80	block: Convert bdrv_read() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	f8e2bd538d	block: Use BlockBackend for I/O in bdrv_commit() Just like block jobs, the HMP commit command should use its own BlockBackend for doing I/O on BlockDriverStates. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00

1 2 3 4 5 ...

2609 Commits