mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Alberto Garcia	184581fa4d	qcow2: Fix removal of list members from BDRVQcow2State.cluster_allocs When a write request needs to allocate new clusters (or change the L2 bitmap of existing ones) a QCowL2Meta structure is created so the L2 metadata can be later updated and any copy-on-write can be performed if necessary. A write request can span a region consisting of an arbitrary combination of previously unallocated and allocated clusters, and if the unallocated ones can be put contiguous to the existing ones then QEMU will do so in order to minimize the number of write operations. In practice this means that a write request has not just one but a number of QCowL2Meta structures. All of them are added to the cluster_allocs list that is stored in BDRVQcow2State and is used to detect overlapping requests. After the write request finishes all its associated QCowL2Meta are removed from that list. calculate_l2_meta() takes care of creating and putting those structures in the list, and qcow2_handle_l2meta() takes care of removing them. The problem is that the error path in handle_alloc() also tries to remove an item in that list, a remnant from the time when this was handled there (that code would not even be correct anymore because it only removes one struct and not all the ones from the same write request). This can trigger a double removal of the same item from the list, causing a crash. This is not easy to reproduce in practice because it requires that do_alloc_cluster_offset() fails after a successful previous allocation during the same write request, but it can be reproduced with the included test case. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <3440a1c4d53c4fe48312b478c96accb338cbef7c.1599150873.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-15 11:05:13 +02:00
Alberto Garcia	02b1ecfa10	qcow2: Use macros for the L1, refcount and bitmap table entry sizes This patch replaces instances of sizeof(uint64_t) in the qcow2 driver with macros that indicate what those sizes are actually referring to. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200828110828.13833-1-berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-15 11:05:12 +02:00
Lukas Straub	5eb9a3c7b0	block/quorum.c: stable children names If we remove the child with the highest index from the quorum, decrement s->next_child_index. This way we get stable children names as long as we only remove the last child. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Fixes: https://bugs.launchpad.net/bugs/1881231 Reviewed-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <5d5f930424c1c770754041aa8ad6421dc4e2b58e.1596536719.git.lukasstraub2@web.de> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-15 11:05:12 +02:00
Peter Maydell	2499453eb1	Block layer patches: - qemu-img create: Fail gracefully when backing file is an empty string - Fixes related to filter block nodes ("Deal with filters" series) - block/nvme: Various cleanups required to use multiple queues - block/nvme: Use NvmeBar structure from "block/nvme.h" - file-win32: Fix "locking" option - iotests: Allow running from different directory -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAl9Z7bcRHGt3b2xmQHJl ZGhhdC5jb20ACgkQfwmycsiPL9ZS6w//bos+A0RfRRF0YFWkIBLQWxqzKcGvMJ8W XWv3mFzd47UaDgRYwVnCC3CR6bLYEINISngZ3geA4jI1+w7AtYKDOO0HN32dUg+D ZrNMn02701CA6qkmpxJ+yjsrl9ltR3jYe0me4Wr39Pvdexa2pl/e+M4Vas6FhkYL ghAwNThypscGCrFjAlz3ru2Sc/K+sPWrGoqkzr+SWvsm9wy4vb8aLxr8Yy50x/zc CqALS9SQ/YA93BCVi9CzPkVyV3ioA0kg/y38WvLtAQ9GZ3m/ekMro3WvdYsRsFCN LGXsuwFig+U7Kd7lJrCS9TLnlTJstNGqPq9jEoV5cThPvGknFfMvVOzRmmP7tzqT YRcPRy39z44OoLKa3kyg3aF38BTxt+9gPqBnivKMr9j9EecMvPsXXHRvF+lP+LsP j753Ih561hX6FurcjX8pc9GOM2cQA0GjlyL77UTTAmLZyFXP/8e55oQbBuYTylc/ Xlvmc/T+yEGiEGTnK+FxgDAiUaxbCCM9cDVStJjTvsIq43dwXb48g1onDsGZ5eDf j9lmAD6TJxHNOB5ErNsDPODf4/D1wJ9t9WVF8UZp9ArfPHRdxMzT7Q4LvetaDmVl +hQC9cgTq8Qd8LwSqbKEYua4L6iGbmLAT7/N6htq5L1eVLg76/tLg/tKSwh/vKAY yzPmyHaVK84= =gaaW -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches: - qemu-img create: Fail gracefully when backing file is an empty string - Fixes related to filter block nodes ("Deal with filters" series) - block/nvme: Various cleanups required to use multiple queues - block/nvme: Use NvmeBar structure from "block/nvme.h" - file-win32: Fix "locking" option - iotests: Allow running from different directory # gpg: Signature made Thu 10 Sep 2020 10:11:19 BST # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (65 commits) block/qcow2-cluster: Add missing "fallthrough" annotation block/nvme: Pair doorbell registers block/nvme: Use generic NvmeBar structure block/nvme: Group controller registers in NVMeRegs structure file-win32: Fix "locking" option iotests: Allow running from different directory iotests: Test committing to overridden backing iotests: Add test for commit in sub directory iotests: Add filter mirror test cases iotests: Add filter commit test cases iotests: Let complete_and_wait() work with commit iotests: Test that qcow2's data-file is flushed block: Leave BDS.backing_{file,format} constant block: Inline bdrv_co_block_status_from_*() blockdev: Fix active commit choice block: Drop backing_bs() qemu-img: Use child access functions nbd: Use CAF when looking for dirty bitmap commit: Deal with filters backup: Deal with filters ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-09-11 14:47:49 +01:00
Thomas Huth	b9be6faed1	block/qcow2-cluster: Add missing "fallthrough" annotation When compiling with -Werror=implicit-fallthrough, the compiler currently complains: ../../devel/qemu/block/qcow2-cluster.c: In function ‘cluster_needs_new_alloc’: ../../devel/qemu/block/qcow2-cluster.c:1320:12: error: this statement may fall through [-Werror=implicit-fallthrough=] if (l2_entry & QCOW_OFLAG_COPIED) { ^ ../../devel/qemu/block/qcow2-cluster.c:1323:5: note: here case QCOW2_CLUSTER_UNALLOCATED: ^~~~ It's quite obvious that the fallthrough is intended here, so let's add a comment to silence the compiler warning. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20200908070028.193298-1-thuth@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-10 11:11:13 +02:00
Philippe Mathieu-Daudé	e5ff22ba9f	block/nvme: Pair doorbell registers For each queue doorbell registers are paired as: - Submission Queue Tail Doorbell - Completion Queue Head Doorbell Reflect that in the NVMeRegs structure, and adapt nvme_create_queue_pair() accordingly. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200904124130.583838-4-philmd@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Fam Zheng <fam@euphon.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-10 11:11:13 +02:00
Philippe Mathieu-Daudé	c7100f0a0b	block/nvme: Use generic NvmeBar structure Commit `f3c507adcd` ("NVMe: Initial commit for new storage interface") introduced the NvmeBar structure. Unfortunately in commit `bdd6a90a9e` ("block: Add VFIO based NVMe driver") we duplicated it. Apparently in commit `a3d9a352d4` ("block: Move NVMe constants to a separate header") we tried to unify headers but forgot to remove the structure declared in the block/nvme.c source file. Do it now, and remove the structure size check which is redundant with the header check added in commit `74e18435c0` ("hw/block/nvme: Align I/O BAR to 4 KiB"). Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200904124130.583838-3-philmd@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Fam Zheng <fam@euphon.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-10 11:11:13 +02:00
Philippe Mathieu-Daudé	0ea32f34ce	block/nvme: Group controller registers in NVMeRegs structure We want to use the NvmeBar structure from "block/nvme.h" in the next commit. As a preliminary step, group all the NVMe controller registers in the 'ctrl' field, keeping the doorbells registers out of it. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200904124130.583838-2-philmd@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Fam Zheng <fam@euphon.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-10 11:11:12 +02:00
Kevin Wolf	3b079ac0ff	file-win32: Fix "locking" option The intended behaviour was that locking=off/auto work and have no effect (to remain compatible with file-posix), whereas locking=on would return an error. Unfortunately, the code forgot to remove "locking" from the options QDict, so any attempt to use the option would fail. Replace the option parsing code for "locking" with something that is part of the raw_runtime_opts QemuOptsList (so it is properly removed from the QDict) and looks more like file-posix. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200907092739.9988-1-kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-10 11:11:12 +02:00
Markus Armbruster	b15e402fc8	trace-events: Fix attribution of trace points to source Some trace points are attributed to the wrong source file. Happens when we neglect to update trace-events for code motion, or add events in the wrong place, or misspell the file name. Clean up with help of scripts/cleanup-trace-events.pl. Funnies requiring manual post-processing: * accel/tcg/cputlb.c trace points are in trace-events. * block.c and blockdev.c trace points are in block/trace-events. * hw/block/nvme.c uses the preprocessor to hide its trace point use from cleanup-trace-events.pl. * hw/tpm/tpm_spapr.c uses pseudo trace point tpm_spapr_show_buffer to guard debug code. * include/hw/xen/xen_common.h trace points are in hw/xen/trace-events. * linux-user/trace-events abbreviates a tedious list of filenames to /signal.c. net/colo-compare and net/filter-rewriter.c use pseudo trace points colo_compare_miscompare and colo_filter_rewriter_debug to guard debug code. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20200806141334.3646302-5-armbru@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-09-09 17:17:58 +01:00
Markus Armbruster	6ec9379870	trace-events: Delete unused trace points Tracked down with the help of scripts/cleanup-trace-events.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 20200806141334.3646302-4-armbru@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-09-09 17:17:02 +01:00
Max Reitz	0b877d09df	block: Leave BDS.backing_{file,format} constant Parts of the block layer treat BDS.backing_file as if it were whatever the image header says (i.e., if it is a relative path, it is relative to the overlay), other parts treat it like a cache for bs->backing->bs->filename (relative paths are relative to the CWD). Considering bs->backing->bs->filename exists, let us make it mean the former. Among other things, this now allows the user to specify a base when using qemu-img to commit an image file in a directory that is not the CWD (assuming, everything uses relative filenames). Before this patch: $ ./qemu-img create -f qcow2 foo/bot.qcow2 1M $ ./qemu-img create -f qcow2 -b bot.qcow2 foo/mid.qcow2 $ ./qemu-img create -f qcow2 -b mid.qcow2 foo/top.qcow2 $ ./qemu-img commit -b mid.qcow2 foo/top.qcow2 qemu-img: Did not find 'mid.qcow2' in the backing chain of 'foo/top.qcow2' $ ./qemu-img commit -b foo/mid.qcow2 foo/top.qcow2 qemu-img: Did not find 'foo/mid.qcow2' in the backing chain of 'foo/top.qcow2' $ ./qemu-img commit -b $PWD/foo/mid.qcow2 foo/top.qcow2 qemu-img: Did not find '[...]/foo/mid.qcow2' in the backing chain of 'foo/top.qcow2' After this patch: $ ./qemu-img commit -b mid.qcow2 foo/top.qcow2 Image committed. $ ./qemu-img commit -b foo/mid.qcow2 foo/top.qcow2 qemu-img: Did not find 'foo/mid.qcow2' in the backing chain of 'foo/top.qcow2' $ ./qemu-img commit -b $PWD/foo/mid.qcow2 foo/top.qcow2 Image committed. With this change, bdrv_find_backing_image() must look at whether the user has overridden a BDS's backing file. If so, it can no longer use bs->backing_file, but must instead compare the given filename against the backing node's filename directly. Note that this changes the QAPI output for a node's backing_file. We had very inconsistent output there (sometimes what the image header said, sometimes the actual filename of the backing image). This inconsistent output was effectively useless, so we have to decide one way or the other. Considering that bs->backing_file usually at runtime contained the path to the image relative to qemu's CWD (or absolute), this patch changes QAPI's backing_file to always report the bs->backing->bs->filename from now on. If you want to receive the image header information, you have to refer to full-backing-filename. This necessitates a change to iotest 228. The interesting information it really wanted is the image header, and it can get that now, but it has to use full-backing-filename instead of backing_file. Because of this patch's changes to bs->backing_file's behavior, we also need some reference output changes. Along with the changes to bs->backing_file, stop updating BDS.backing_format in bdrv_backing_attach() as well. This way, ImageInfo's backing-filename and backing-filename-format fields will represent what the image header says and nothing else. iotest 245 changes in behavior: With the backing node no longer overriding the parent node's backing_file string, you can now omit the @backing option when reopening a node with neither a default nor a current backing file even if it used to have a backing node at some point. 273 also changes: The base image is opened without a format layer, so ImageInfo.backing-filename-format used to report "file" for the base image's overlay after blockdev-snapshot. However, the image header never says "file" anywhere, so it now reports $IMGFMT. Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	549ec0d978	block: Inline bdrv_co_block_status_from_*() With bdrv_filter_bs(), we can easily handle this default filter behavior in bdrv_co_block_status(). blkdebug wants to have an additional assertion, so it keeps its own implementation, except bdrv_co_block_status_from_file() needs to be inlined there. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	9a71b9de3f	commit: Deal with filters This includes some permission limiting (for example, we only need to take the RESIZE permission if the base is smaller than the top). Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	2b088c60bb	backup: Deal with filters Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	3f072a7fb7	mirror: Deal with filters This includes some permission limiting (for example, we only need to take the RESIZE permission for active commits where the base is smaller than the top). base_overlay is introduced so we can query bdrv_is_allocated_above() on it - we cannot do that with base itself, because a filter's block_status is the same as its child node, so if there are filters on base, bdrv_is_allocated_above() on base would return information including base. Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to "target_backing_bs", because that is what it really refers to. Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	c6f6d8462c	block-copy: Use CAF to find sync=top base Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	0a7585dbba	block: Use child access functions for QAPI queries query-block, query-named-block-nodes, and query-blockstats now return any filtered child under "backing", not just bs->backing or COW children. This is so that filters do not interrupt the reported backing chain. This changes the output for iotest 184, as the throttled node now appears as a backing child. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	3f26191c73	block: Report data child for query-blockstats It makes no sense to report the block stats of a purely metadata-storing child in query-blockstats. So if the primary child does not have any data, try to find a unique data-storing child. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	07cd7b659a	block/null: Implement bdrv_get_allocated_file_size It is trivial, so we might as well do it. Remove _filter_actual_image_size from iotest 184, so we get to see the result in its reference output. Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	c8af87573f	block/snapshot: Fix fallback If the top node's driver does not provide snapshot functionality and we want to fall back to a node down the chain, we need to snapshot all non-COW children. For simplicity's sake, just do not fall back if there is more than one such child. Furthermore, we really only can fall back to bs->file and bs->backing, because bdrv_snapshot_goto() has to modify the child link (notably, set it to NULL). Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	c4db2e25df	block: Use CAF in bdrv_co_rw_vmstate() If a node whose driver does not provide VM state functions has a metadata child, the VM state should probably go there; if it is a filter, the VM state should probably go there. It follows that we should generally go down to the primary child. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	66b129ac5e	block: Iterate over children in refresh_limits Instead of looking at just bs->file and bs->backing, we should look at all children that could end up receiving forwarded requests. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	fb787f02a6	vmdk: Drop vmdk_co_flush() Before HEAD^, we needed this because bdrv_co_flush() by itself would only flush bs->file. With HEAD^, bdrv_co_flush() will flush all children on which a WRITE or WRITE_UNCHANGED permission has been taken. Thus, vmdk no longer needs to do it itself. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	883833e29c	block: Flush all children in generic code If the driver does not support .bdrv_co_flush() so bdrv_co_flush() itself has to flush the children of the given node, it should not flush just bs->file->bs, but in fact all children that might have been written to (judging from the permissions taken on them). This is a bug fix for qcow2 images with an external data file, as they so far did not flush that data_file node. In any case, the BLKDBG_EVENT() should be emitted on the primary child, because that is where a blkdebug node would be if there is any. Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	23b93525a2	block: Use bdrv_cow_child() in bdrv_co_truncate() The condition modified here is not about potentially filtered children, but only about COW sources (i.e. traditional backing files). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	67acfd2188	stream: Deal with filters Because of the (not so recent anymore) changes that make the stream job independent of the base node and instead track the node above it, we have to split that "bottom" node into two cases: The bottom COW node, and the node directly above the base node (which may be an R/W filter or the bottom COW node). Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	cb8503159a	block: Use CAFs in block status functions Use the child access functions in the block status inquiry functions as appropriate. Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	93393e698c	block: Use bdrv_filter_(bs\|child) where obvious Places that use patterns like if (bs->drv->is_filter && bs->file) { ... something about bs->file->bs ... } should be BlockDriverState *filtered = bdrv_filter_bs(bs); if (filtered) { ... something about @filtered ... } instead. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	4935e8be22	copy-on-read: Support compressed writes Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	e7e754aec3	throttle: Support compressed writes Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	8b8277cdb0	block: Drop bdrv_is_encrypted() The original purpose of bdrv_is_encrypted() was to inquire whether a BDS can be used without the user entering a password or not. It has not been used for that purpose for quite some time. Actually, it is not even fit for that purpose, because to answer that question, it would have recursively query all of the given node's children. So now we have to decide in which direction we want to fix bdrv_is_encrypted(): Recursively query all children, or drop it and just use bs->encrypted to get the current node's status? Nowadays, its only purpose is to report through bdrv_query_image_info() whether the given image is encrypted or not. For this purpose, it is probably more interesting to see whether a given node itself is encrypted or not (otherwise, a management application cannot discern for certain which nodes are really encrypted and which just have encrypted children). Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Philippe Mathieu-Daudé	b111b3fcde	block/nvme: Use an array of EventNotifier In preparation of using multiple IRQ (thus multiple eventfds) make BDRVNVMeState::irq_notifier an array (for now of a single element, the admin queue notifier). Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-16-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Philippe Mathieu-Daudé	7a1fb2ef40	block/nvme: Extract nvme_poll_queue() As we want to do per-queue polling, extract the nvme_poll_queue() method which operates on a single queue. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-15-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Philippe Mathieu-Daudé	0a28b02ef9	block/nvme: Simplify nvme_create_queue_pair() arguments nvme_create_queue_pair() doesn't require BlockDriverState anymore. Replace it by BDRVNVMeState and AioContext to simplify. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-14-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Philippe Mathieu-Daudé	073a06978c	block/nvme: Replace BDRV_POLL_WHILE by AIO_WAIT_WHILE BDRV_POLL_WHILE() is defined as: #define BDRV_POLL_WHILE(bs, cond) ({ \ BlockDriverState *bs_ = (bs); \ AIO_WAIT_WHILE(bdrv_get_aio_context(bs_), \ cond); }) As we will remove the BlockDriverState use in the next commit, start by using the exploded version of BDRV_POLL_WHILE(). Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-13-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Philippe Mathieu-Daudé	3a6d34d066	block/nvme: Simplify nvme_init_queue() arguments nvme_init_queue() doesn't require BlockDriverState anymore. Replace it by BDRVNVMeState to simplify. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-12-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Philippe Mathieu-Daudé	38e1f8186f	block/nvme: Replace qemu_try_blockalign(bs) by qemu_try_memalign(pg_sz) qemu_try_blockalign() is a generic API that call back to the block driver to return its page alignment. As we call from within the very same driver, we already know to page alignment stored in our state. Remove indirections and use the value from BDRVNVMeState. This change is required to later remove the BlockDriverState argument, to make nvme_init_queue() per hardware, and not per block driver. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-11-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Philippe Mathieu-Daudé	2ed846930d	block/nvme: Replace qemu_try_blockalign0 by qemu_try_blockalign/memset In the next commit we'll get rid of qemu_try_blockalign(). To ease review, first replace qemu_try_blockalign0() by explicit calls to qemu_try_blockalign() and memset(). Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-10-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Philippe Mathieu-Daudé	7d3b214ae4	block/nvme: Use union of NvmeIdCtrl / NvmeIdNs structures We allocate an unique chunk of memory then use it for two different structures. By using an union, we make it clear the data is overlapping (and we can remove the casts). Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-9-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:24:53 +02:00
Philippe Mathieu-Daudé	4d98093937	block/nvme: Rename local variable We are going to modify the code in the next commit. Renaming the 'resp' variable to 'id' first makes the next commit easier to review. No logical changes. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-8-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:23:55 +02:00
Philippe Mathieu-Daudé	c8edbfb2cc	block/nvme: Use common error path in nvme_add_io_queue() Rearrange nvme_add_io_queue() by using a common error path. This will be proven useful in few commits where we add IRQ notification to the IO queues. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-7-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:23:55 +02:00
Philippe Mathieu-Daudé	bf6ce5ec6d	block/nvme: Improve error message when IO queue creation failed Do not use the same error message for different failures. Display a different error whether it is the CQ or the SQ. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-6-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:23:55 +02:00
Philippe Mathieu-Daudé	73159e52e6	block/nvme: Define INDEX macros to ease code review Use definitions instead of '0' or '1' indexes. Also this will be useful when using multi-queues later. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-5-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:23:55 +02:00
Philippe Mathieu-Daudé	0ea45f76eb	block/nvme: Let nvme_create_queue_pair() fail gracefully As nvme_create_queue_pair() is allowed to fail, replace the alloc() calls by try_alloc() to avoid aborting QEMU. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-4-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:23:55 +02:00
Philippe Mathieu-Daudé	e266f52cfb	block/nvme: Avoid further processing if trace event not enabled Avoid further processing if TRACE_NVME_SUBMIT_COMMAND_RAW is not enabled. This is an untested intend of performance optimization. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-3-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:23:55 +02:00
Philippe Mathieu-Daudé	e4f310fe7f	block/nvme: Replace magic value by SCALE_MS definition Use self-explicit SCALE_MS definition instead of magic value. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200821195359.1285345-2-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:23:55 +02:00
Peter Maydell	df8176274a	nbd patches for 2020-09-02 - fix a few iotests affected by earlier nbd changes - avoid blocking qemu by nbd client in connect() - build qemu-nbd for mingw -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAl9QFB8ACgkQp6FrSiUn Q2pTwggArPN7dRwm1KD9jL8X6PV01uhLuLRFzrrofFX22Blroj5XPbR24BdKeN8V uDSFGzzoz2Vx3vKPHyZdx7bbeEL0pF9dzjZX6JwzT0McbVuge5aG2zC/ARdfc4PN 1Yf2FB/nY8Xt5G12usu3FIz7JBoQNm4mlPnqVqf7t0LQxgUFvO7F2LernyqEOYKS uSpXHNFqddZcax7etXeldJOlSGJBQTaCmplrJbw2ilVhLZJD+0OglY4SAsrrenN+ gb/KD4REjhtQsmoTaqthGCnGXipEoJYDfEOMhbkl0UK+9Mx0t+3cj3tflG0eGldM ERk1d4d7VSlyqIy7w43v3+IB8M99ag== =7Cn5 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2020-09-02' into staging nbd patches for 2020-09-02 - fix a few iotests affected by earlier nbd changes - avoid blocking qemu by nbd client in connect() - build qemu-nbd for mingw # gpg: Signature made Wed 02 Sep 2020 22:52:31 BST # gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2020-09-02: nbd: disable signals and forking on Windows builds nbd: skip SIGTERM handler if NBD device support is not built block: add missing socket_init() calls to tools block/nbd: use non-blocking connect: fix vm hang on connect() iotests/259: Fix reference output iotests/059: Fix reference output Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-09-03 21:35:01 +01:00
Vladimir Sementsov-Ogievskiy	1dc4718d84	block/nbd: use non-blocking connect: fix vm hang on connect() This makes nbd's connection_co yield during reconnects, so that reconnect doesn't block the main thread. This is very important in case of an unavailable nbd server host: connect() call may take a long time, blocking the main thread (and due to reconnect, it will hang again and again with small gaps of working time during pauses between connection attempts). Realization notes: - We don't want to implement non-blocking connect() over non-blocking socket, because getaddrinfo() doesn't have portable non-blocking realization anyway, so let's just use a thread for both getaddrinfo() and connect(). - We can't use qio_channel_socket_connect_async (which behaves similarly and starts a thread to execute connect() call), as it's relying on someone iterating main loop (g_main_loop_run() or something like this), which is not always the case. - We can't use thread_pool_submit_co API, as thread pool waits for all threads to finish (but we don't want to wait for blocking reconnect attempt on shutdown. So, we just create the thread by hand. Some additional difficulties are: - We want our connect to avoid blocking drained sections and aio context switches. To achieve this, we make it possible to "cancel" synchronous wait for the connect (which is a coroutine yield actually), still, the thread continues in background, and if successful, its result may be reused on next reconnect attempt. - We don't want to wait for reconnect on shutdown, so there is CONNECT_THREAD_RUNNING_DETACHED thread state, which means that the block layer is no longer interested in a result, and thread should close new connected socket on finish and free the state. How to reproduce the bug, fixed with this commit: 1. Create an image on node1: qemu-img create -f qcow2 xx 100M 2. Start NBD server on node1: qemu-nbd xx 3. Start vm with second nbd disk on node2, like this: ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \ file=/work/images/cent7.qcow2 -drive file=nbd+tcp://192.168.100.2 \ -vnc :0 -qmp stdio -m 2G -enable-kvm -vga std 4. Access the vm through vnc (or some other way?), and check that NBD drive works: dd if=/dev/sdb of=/dev/null bs=1M count=10 - the command should succeed. 5. Now, let's trigger nbd-reconnect loop in Qemu process. For this: 5.1 Kill NBD server on node1 5.2 run "dd if=/dev/sdb of=/dev/null bs=1M count=10" in the guest again. The command should fail and a lot of error messages about failing disk may appear as well. Now NBD client driver in Qemu tries to reconnect. Still, VM works well. 6. Make node1 unavailable on NBD port, so connect() from node2 will last for a long time: On node1 (Note, that 10809 is just a default NBD port): sudo iptables -A INPUT -p tcp --dport 10809 -j DROP After some time the guest hangs, and you may check in gdb that Qemu hangs in connect() call, issued from the main thread. This is the BUG. 7. Don't forget to drop iptables rule from your node1: sudo iptables -D INPUT -p tcp --dport 10809 -j DROP Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200812145237.4396-1-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: minor wording and formatting tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2020-09-02 16:47:23 -05:00
Peter Maydell	e4d8b7c1a9	qemu-nvme -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE28EdLTc7SjdV9QLsYlFWYQpPbMAFAl9Pro4ACgkQYlFWYQpP bMC1CRAAkawTI4mMcOfI3smFoMeiY8kZJWXJUBXfHbMJ4asaoIjTkH/lXRXBw7KQ sH5tB9CuOums3VjagkZ0Sw6R/kP1LbyJTAwq/pwOXwYRDc/E3zQpMblkIHH1boIM Bxl814hw3hBqV+D0wgeKpl83lbiOpd10Cbpdb/xNKat6qVquLGurSGKgA7jNuF4s oPTPtfZpyH9LUr4DV+sL+fGX6vaCdSFZPZUhJqwFfx79+r3+YiHGLAE6fgsdGDJt 2RSSKMqBe2gg0BY5ToW9L55BsLnwMMrAZnGzEkeZvRKqm0JZBXQsERa61p4VEAJf uYkSEqOwsKjXQNTdDEekyH67AkgXaoqG0hiiOcgoLsla7C0zROtoKcfVM/+WC0LT T0/bfgubmoDV8kLzPuOV8xOGxjfbu4Qlxy1JsIC6BU4zBQvpDwOeTx3MUWaCUfvk YmDMEhZWGcZ3RBLrgQmzm4ZKMtGdYXnGQz5dwVkRRfghQs2fl5ZmUjGR7MqKe18n 4K0nzhPiXbOTlqvLVvzVlrBzdc8ECAs1kVoJF7C3LwRmXbT2N/fUhZP/nYpeM2Hj DQNmA8KpXMKae2+2iDnQNWbvdpz3SiHD6dK7A1bEsdoG0L60xfyeAF+JuPiESUnd OAhf+muxKiInv2k5GNh7mDZPWM6nDepf/PZP6ohc7dKxVam7N2M= =Y23H -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20200902' into staging qemu-nvme # gpg: Signature made Wed 02 Sep 2020 15:39:10 BST # gpg: using RSA key DBC11D2D373B4A3755F502EC625156610A4F6CC0 # gpg: Good signature from "Keith Busch <kbusch@kernel.org>" [unknown] # gpg: aka "Keith Busch <keith.busch@gmail.com>" [unknown] # gpg: aka "Keith Busch <keith.busch@intel.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: DBC1 1D2D 373B 4A37 55F5 02EC 6251 5661 0A4F 6CC0 * remotes/nvme/tags/pull-nvme-20200902: (39 commits) hw/block/nvme: remove explicit qsg/iov parameters hw/block/nvme: use preallocated qsg/iov in nvme_dma_prp hw/block/nvme: consolidate qsg/iov clearing hw/block/nvme: add ns/cmd references in NvmeRequest hw/block/nvme: be consistent about zeros vs zeroes hw/block/nvme: add check for mdts hw/block/nvme: refactor request bounds checking hw/block/nvme: verify validity of prp lists in the cmb hw/block/nvme: add request mapping helper hw/block/nvme: add tracing to nvme_map_prp hw/block/nvme: refactor dma read/write hw/block/nvme: destroy request iov before reuse hw/block/nvme: remove redundant has_sg member hw/block/nvme: replace dma_acct with blk_acct equivalent hw/block/nvme: add mapping helpers hw/block/nvme: memset preallocated requests structures hw/block/nvme: bump supported version to v1.3 hw/block/nvme: provide the mandatory subnqn field hw/block/nvme: enforce valid queue creation sequence hw/block/nvme: reject invalid nsid values in active namespace id list ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-09-02 21:20:20 +01:00

1 2 3 4 5 ...

4921 Commits