mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Max Reitz	32b6444d23	qcow2-cluster: Expand zero clusters Add functionality for expanding zero clusters. This is necessary for downgrading the image version to one without zero cluster support. For non-backed images, this function may also just discard zero clusters instead of truly expanding them. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-09-12 10:12:46 +02:00
Kevin Wolf	670df5e3b4	qcow2: Pass discard type to qcow2_discard_clusters() The function will be used internally instead of only being called for guest discard requests. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2013-09-12 10:12:46 +02:00
Max Reitz	e23e400ec6	qcow2-refcount: Repair OFLAG_COPIED errors Since the OFLAG_COPIED checks are now executed after the refcounts have been repaired (if repairing), it is safe to assume that they are correct but the OFLAG_COPIED flag may not be. Therefore, if its value differs from what it should be (considering the corresponding refcount), that discrepancy can be repaired by correctly setting (or clearing) that flag. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-08-30 15:48:44 +02:00
Max Reitz	cf93980e77	qcow2: Employ metadata overlap checks The pre-write overlap check function is now called before most of the qcow2 writes (aborting it on collision or other error). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-08-30 15:48:43 +02:00
Kevin Wolf	0b919fae31	qcow2: Batch discards This optimises the discard operation for freed clusters by batching discard requests (both snapshot deletion and bdrv_discard end up updating the refcounts cluster by cluster). Note that we don't discard asynchronously, but keep s->lock held. This is to avoid that a freed cluster is reallocated and written to while the discard is still in flight. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-06-24 10:25:17 +02:00
Kevin Wolf	6cfcb9b8b9	qcow2: Add refcount update reason to all callers This adds a refcount update reason to all callers of update_refcounts(), so that a follow-up patch can use this information to decide whether clusters that reach a refcount of 0 should be discarded in the image file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-06-24 10:25:17 +02:00
Kevin Wolf	2cf7cfa1cd	qcow2: Catch some L1 table index overflows This catches the situation that is described in the bug report at https://bugs.launchpad.net/qemu/+bug/865518 and goes like this: $ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off $ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512" Segmentation fault With this patch applied the segfault will be avoided, however the case will still fail, though gracefully: $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off qemu-img: The image size is too large for file format 'qcow2' Note that even long before these overflow checks kick in, you get insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for the L1 table), so with somewhat smaller image sizes you'll probably see qemu aborting for a failed g_malloc(). If you need huge image sizes, you should increase the cluster size to the maximum of 2 MB in order to get higher limits. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-05-14 16:44:33 +02:00
Kevin Wolf	ecdd5333ab	qcow2: Gather clusters in a looping loop Instead of just checking once in exactly this order if there are dependencies, non-COW clusters and new allocation, this starts looping around these. This way we can, for example, gather non-COW clusters after new allocations as long as the host cluster offsets stay contiguous. Once handle_dependencies() is extended so that COW areas of in-flight allocations can be overwritten, this allows to continue with gathering other clusters (we wouldn't be able to do that without this change because we would have missed a possible second dependency in one of the next clusters). This means that in the typical sequential write case, we can combine the COW overwrite of one cluster with the allocation of the next cluster as soon as something like Delayed COW gets actually implemented. It is only by avoiding splitting requests this way that Delayed COW actually starts improving performance noticeably. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:44 +01:00
Kevin Wolf	2c3b32d256	qcow2: Move cluster gathering to a non-looping loop This patch is mainly to separate the indentation change from the semantic changes. All that really changes here is that everything moves into a while loop, all 'goto done' become 'break' and at the end of the loop a new 'break' is inserted. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:44 +01:00
Kevin Wolf	88c6588c51	qcow2: Allow requests with multiple l2metas Instead of expecting a single l2meta, have a list of them. This allows to still have a single I/O request for the guest data, even though multiple l2meta may be needed in order to describe both a COW overwrite and a new cluster allocation (typical sequential write case). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:44 +01:00
Kevin Wolf	710c2496d8	qcow2: Use byte granularity in qcow2_alloc_cluster_offset() This gets rid of the nb_clusters and keep_clusters and the associated complicated calculations. Just advance the number of bytes that have been processed and everything is fine. This patch advances the variables even after the last operation even though they aren't used any more afterwards to make things look more uniform. A later patch will turn the whole thing into a loop and then it actually starts making sense. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:44 +01:00
Kevin Wolf	411d62b04b	qcow2: Prepare handle_alloc/copied() for byte granularity This makes handle_alloc() and handle_copied() return byte-granularity host offsets instead of returning always the cluster start. This is required so that qcow2_alloc_cluster_offset() can stop aligning everything to cluster boundaries. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	e62daaf679	qcow2: handle_copied(): Implement non-zero host_offset Look only for clusters that start at a given physical offset. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	c53ede9f6d	qcow2: handle_copied(): Get rid of keep_clusters parameter Now *bytes is used to return the length of the area that can be written to without performing an allocation or COW. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	acb0467f8d	qcow2: handle_copied(): Get rid of nb_clusters parameter handle_copied() uses its bytes parameter now to determine how many clusters it should try to find. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	0af729ec00	qcow2: Factor out handle_copied() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	83baa9a471	qcow2: Clean up handle_alloc() Things can be simplified a bit now. No semantic changes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	c37f4cd71d	qcow2: Finalise interface of handle_alloc() The interface works completely on a byte granularity now and duplicated parameters are removed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	3b8e2e260c	qcow2: handle_alloc(): Get rid of keep_clusters parameter handle_alloc() is now called with the offset at which the actual new allocation starts instead of the offset at which the whole write request starts, part of which may already be processed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	f5bc635094	qcow2: handle_alloc(): Get rid of nb_clusters parameter We already communicate the same information in *bytes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	10f0ed8b2f	qcow2: Factor out handle_alloc() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	037689d896	qcow2: Decouple cluster allocation from cluster reuse code This moves some code that prepares the allocation of new clusters to where the actual allocation happens. This is the minimum required to be able to move it to a separate function in the next patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	65eb2e35c0	qcow2: Change handle_dependency to byte granularity This is a more precise description of what really constitutes a dependency. The behaviour doesn't change at this point because the COW area of the old request is still aligned to cluster boundaries and therefore an overlap is detected whenever the requests touch any part of the same cluster. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	d9d74f4177	qcow2: Improve check for overlapping allocations The old code detected an overlapping allocation even when the allocations didn't actually overlap, but were only adjacent. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	17a71e5823	qcow2: Handle dependencies earlier Handling overlapping allocations isn't just a detail of cluster allocation. It is rather one of three ways to get the host cluster offset for a write request: 1. If a request overlaps an in-flight allocation, the cluster offset can be taken from there (this is what handle_dependencies will evolve into) or the request must just wait until the allocation has completed. Accessing the L2 is not valid in this case, it has outdated information. 2. Outside overlapping areas, check the clusters that can be written to as they are, with no COW involved. 3. If a COW is required, allocate new clusters. Changing the code to reflect this doesn't change the behaviour because overlaps cannot exist for clusters that are kept in step 2. It does however make it easier for later patches to work on clusters that belong to an allocation that is still in flight. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:42 +01:00
Paolo Bonzini	381b487d54	qcow2: make is_allocated return true for zero clusters Otherwise, live migration of the top layer will miss zero clusters and let the backing file show through. This also matches what is done in qed. QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files. Check this directly in qcow2_get_cluster_offset instead of replicating the test everywhere. Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-15 16:07:50 +01:00
Kevin Wolf	74c4510a3c	qcow2: Allow lazy refcounts to be enabled on the command line qcow2 images now accept a boolean lazy_refcounts option. Use it like this: -drive file=test.qcow2,lazy_refcounts=on If the option is specified on the command line, it overrides the default specified by the qcow2 header flags that were set when creating the image. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-15 16:07:49 +01:00
Paolo Bonzini	737e150e89	block: move include files to include/block/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00
Kevin Wolf	226c3c26b9	qcow2: Factor out handle_dependencies() Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	280d373579	qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2 This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	f50f88b9fe	qcow2: Allocate l2meta only for cluster allocations Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request that writes to clusters that are not yet hooked up in the L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	060bee8943	qcow2: Drop l2meta.cluster_offset There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	593fb83cac	qcow2: Introduce Qcow2COWRegion This makes it easier to address the areas for which a COW must be performed. As a nice side effect, the COW code in qcow2_alloc_cluster_link_l2 becomes really trivial. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	1d3afd649b	qcow2: Round QCowL2Meta.offset down to cluster boundary The offset within the cluster is already present as n_start and this is what the code uses. QCowL2Meta.offset is only needed at a cluster granularity. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Stefan Hajnoczi	bfe8043e92	qcow2: implement lazy refcounts Lazy refcounts is a performance optimization for qcow2 that postpones refcount metadata updates and instead marks the image dirty. In the case of crash or power failure the image will be left in a dirty state and repaired next time it is opened. Reducing metadata I/O is important for cache=writethrough and cache=directsync because these modes guarantee that data is on disk after each write (hence we cannot take advantage of caching updates in RAM). Refcount metadata is not needed for guest->file block address translation and therefore does not need to be on-disk at the time of write completion - this is the motivation behind the lazy refcount optimization. The lazy refcount optimization must be enabled at image creation time: qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-06 22:39:14 +02:00
Kevin Wolf	b7ab0fea37	qcow2: Fix avail_sectors in cluster allocation code avail_sectors should really be the number of sectors from the start of the allocation, not from the start of the write request. We're lucky enough that this mistake didn't cause any real bug. avail_sectors is only used in the initialiser of QCowL2Meta: .nb_available = MIN(requested_sectors, avail_sectors), m->nb_available in turn is only used for COW at the end of the allocation. A COW occurs only if the request wasn't cluster aligned, which in turn would imply that requested_sectors was less than avail_sectors (both in the original and in the fixed version). In this case avail_sectors is ignored and therefore the mistake doesn't cause any misbehaviour. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Kevin Wolf	cdba7fee1d	qcow2: Simplify calculation for COW area at the end copy_sectors() always uses the sum (cluster_offset + n_start) or (start_sect + n_start), so if some value is added to both cluster_offset and start_sect, and subtracted from n_start, it's cancelled out anyway. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Zhi Yong Wu	833e40858c	qcow2: remove a line of unnecessary code Commit `3948d1d4` removed the pointer argument we filled in with l2_offset but forgot to remove the unnecessary l2_offset assignment. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Kevin Wolf	1417d7e40e	qcow2: Silence false warning Some gcc versions seem not to be able to figure out that the switch statement covers all possible values and that c is therefore always initialised. Add a default branch for them. Reported-by: malc <av1474@comtv.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: malc <av1474@comtv.ru>	2012-06-15 15:52:45 +04:00
Kevin Wolf	df02179189	qcow2: Check qcow2_alloc_clusters_at() return value When using qcow2_alloc_clusters_at(), the cluster allocation code checked the wrong variable for an error code. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-25 18:12:54 +02:00
Kevin Wolf	54e6814360	qcow2: Limit COW to where it's needed This fixes a regression introduced in commit `250196f1`. The bug leads to data corruption, found during an Autotest run with a Fedora 8 guest. Consider a write request whose first part is covered by an already allocated cluster, but additional clusters need to be newly allocated. When counting the number of clusters to allocate, the qcow2 code would decide to do COW for all remaining clusters of the write request, even if some of them are already allocated. If during this COW operation another write request is issued that touches the same cluster, it will still refer to the old cluster. When the COW completes, the first request will update the L2 table and the second write request will be lost. Note that the requests need not overlap, it's enough for them to touch the same cluster. This patch ensures that only clusters that really require COW are considered for allocation. In this case any other request writing to the same cluster will be an allocating write and gets serialised. Reported-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-07 19:33:18 +02:00
Kevin Wolf	7242411460	qcow2: Don't hold cache references across yield If cache references are held while the coroutine has yielded, the cache may get used up and abort() when it can't find a free entry. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-02 18:39:39 +02:00
Kevin Wolf	60651f901a	qcow2: Remove unused parameter in do_alloc_cluster_offset Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-02 18:39:39 +02:00
Kevin Wolf	621f058940	qcow2: Zero write support Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:30 +02:00
Kevin Wolf	6377af48b0	qcow2: Support reading zero clusters This adds support for reading zero clusters in version 3 images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:29 +02:00
Kevin Wolf	143550a83e	qcow2: Simplify count_cow_clusters count_cow_clusters() tries to reuse existing functions, and all it achieves is to make things much more complicated than they really are: Everything needs COW, unless it's a normal cluster with refcount 1. This patch implements the obvious way of doing this, and by using qcow2_get_cluster_type() it gets rid of all flag magic. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:28 +02:00
Kevin Wolf	8e37f681d5	qcow2: Ignore reserved bits in L1/L2 entries This changes the still existing places that assume that the only flags are QCOW_OFLAG_COPIED and QCOW_OFLAG_COMPRESSED to properly mask out reserved bits. It does not convert bdrv_check yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:28 +02:00
Kevin Wolf	b0b6862e5e	qcow2: Fail write_compressed when overwriting data qcow2_alloc_compressed_cluster_offset() already fails if the copied flag is set, because qcow2_write_compressed() doesn't perform COW as it would have to do to allow this. However, what we really want to check here is whether the cluster is allocated or not. With internal snapshots the copied flag may not be set on allocated clusters. Check the cluster offset instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:27 +02:00
Kevin Wolf	2bfcc4a0a0	qcow2: Ignore reserved bits in count_contiguous_clusters() Until now, count_contiguous_clusters() has an argument that allowed to specify flags that should be ignored in the comparison, i.e. that are allowed to change between contiguous clusters. This patch changes the function so that it ignores all flags by default now and you need to pass the flags on which it should stop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:27 +02:00
Kevin Wolf	68d000a390	qcow2: Ignore reserved bits in get_cluster_offset With this change, reading from a qcow2 image ignores all reserved bits that are set in an L1 or L2 table entry. Now get_cluster_offset() assigns *cluster_offset only the offset without any other flags. The cluster type is no longer encoded in the offset, but is returned as a positive return value in case of success. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:27 +02:00
Kevin Wolf	8dc0a5e7a0	qcow2: Fix error handling in qcow2_alloc_cluster_offset If do_alloc_cluster_offset() fails, the error handling code tried to remove the request from the in-flight queue, to which it wasn't added yet, resulting in a NULL pointer dereference. m->nb_clusters really only becomes != 0 when the request is in the list. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-19 16:03:27 +02:00
Kevin Wolf	3948d1d487	qcow2: Remove unused parameter in get_cluster_table() Since everything goes through the cache, callers don't use the L2 table offset any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2012-04-05 14:54:39 +02:00
Kevin Wolf	250196f19c	qcow2: Reduce number of I/O requests If the first part of a write request is allocated, but the second isn't and it can be allocated so that the resulting area is contiguous, handle it at once. This is a common case for sequential writes. After this patch, alloc_cluster_offset() only checks if the clusters are already allocated or how many new clusters can be allocated contiguously. The actual cluster allocation is split off into a new function do_alloc_cluster_offset(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2012-03-12 15:14:07 +01:00
Kevin Wolf	bf319ece56	qcow2: Factor out count_cow_clusters Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2012-03-12 15:14:07 +01:00
Kevin Wolf	3cce16f44d	qcow2: Add some tracing Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2012-03-12 15:14:06 +01:00
Stefan Hajnoczi	aef4acb661	qcow2: avoid reentrant bdrv_read() in copy_sectors() A BlockDriverState should not issue requests on itself through the public block layer interface. Nested, or reentrant, requests are problematic because they do I/O throttling and request tracking twice. Features like block layer copy-on-read use request tracking to avoid race conditions between concurrent requests. The reentrant request will have to "wait" for its parent request to complete. But the parent is waiting for the reentrant request to make progress so we have reached deadlock. The solution is for block drivers to avoid the public block layer interfaces for reentrant requests. Instead they should call their own internal functions if they wish to perform reentrant requests. This is also a good opportunity to make copy_sectors() a true coroutine_fn. That means calling bdrv_co_writev() instead of bdrv_write(). Behavior is unchanged but we're being explicit that this executes in coroutine context. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:49:47 +01:00
Kevin Wolf	1b9f1491f8	qcow2: Unlock during COW Unlocking during COW allows for more parallelism. One change it requires is that buffers are dynamically allocated instead of just using a per-image buffer. While touching the code, drop the synchronous qcow2_read() function and replace it by a bdrv_read() call. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:49:40 +01:00
Kevin Wolf	8f1efd00c4	qcow2: Fix bdrv_write_compressed error handling If during allocation of compressed clusters the cluster was already allocated uncompressed, fail and properly release the l2_table (the latter avoids a failed assertion). While at it, make it return some real error numbers instead of -1. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>	2011-10-21 17:34:13 +02:00
Frediano Ziglio	ee18e73023	qcow2: fix range check QCowL2Meta::offset is not cluster aligned but only sector aligned, whereas nb_clusters counts clusters from the cluster start. This fixes the range check. Note that the old code has no corruption issues related to this check, because it only causes an intersection to be detected when it shouldn't be. Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:22 +02:00
Frediano Ziglio	05140499d3	qcow2: initialize metadata before inserting in cluster_allocs The QCow2Meta structure was inserted into the list before many of its fields were initialized. Currently this is not a problem because everything happens under a lock, but if qcow2_alloc_clusters were to release this lock in the future, some issues could arise. Initializing the fields before inserting fixes the problem. Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:22 +02:00
Frediano Ziglio	a791236992	qcow2: removed unused depends_on field Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:17 +02:00
Frediano Ziglio	35ee5e39c5	qcow2: use always stderr for debugging Let all DEBUG_ALLOC2 printf calls go to stderr. Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-25 15:22:25 +02:00
Devin Nakamura	d57237f291	qcow2: fix typo in documentation for qcow2_get_cluster_offset() Documentation states that num is measured in clusters, but it's actually measured in sectors. Signed-off-by: Devin Nakamura <devin122@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-23 14:15:17 +02:00
Anthony Liguori	7267c0947d	Use glib memory allocation and free functions qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2011-08-20 23:01:08 -05:00
Kevin Wolf	68d100e905	qcow2: Use coroutines Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-02 15:53:41 +02:00
Kevin Wolf	9e2a3701a1	qcow2: Fix in-flight list after qcow2_cache_put failure If qcow2_cache_put returns an error during cluster allocation and the allocation fails, it must be removed from the list of in-flight allocations. Otherwise we'd get a loop in the list when the ACB is used for the next allocation. Luckily, this qcow2_cache_put shouldn't fail anyway because the L2 table is only read, so that qcow2_cache_put doesn't even involve I/O. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2011-06-15 14:36:15 +02:00
Kevin Wolf	80fa3341a7	qcow2: Fix memory leaks in error cases This fixes memory leaks that may be caused by I/O errors during L1 table growth (can happen during save_vm) and in qemu-img check. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-06-08 11:56:40 +02:00
Kevin Wolf	16fde5f2c2	qcow2: Fix order in L2 table COW When copying L2 tables (this happens only with internal snapshots), the order wasn't completely safe, so that after a crash you could end up with a L2 table that has too low refcount, possibly leading to corruption in the long run. This patch puts the operations in the right order: First allocate the new L2 table and replace the reference, and only then decrease the refcount of the old table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-02-10 13:24:29 +01:00
Kevin Wolf	8af3648843	qcow2: Fix error handling for reading compressed clusters When reading a compressed cluster failed, qcow2 falsely returned success. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>	2011-02-10 13:23:44 +01:00
Kevin Wolf	5ea929e3d1	qcow2: Add bdrv_discard support This adds a bdrv_discard function to qcow2 that frees the discarded clusters. It does not yet pass the discard on to the underlying file system driver, but the space can be reused by future writes to the image. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2011-01-31 10:03:00 +01:00
Kevin Wolf	3de0a2944b	qcow2: Batch flushes for COW qcow2 calls bdrv_flush() after performing COW in order to ensure that the L2 table change is never written before the copy is safe on disk. Now that the L2 table is cached, we can wait with flushing until we write out the next L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-01-24 16:41:49 +01:00
Kevin Wolf	29c1a7301a	qcow2: Use QcowCache Use the new functions of qcow2-cache.c for everything that works on refcount block and L2 tables. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-01-24 16:41:49 +01:00
Aurelien Jarno	653df36bbe	qcow2: fix unaligned access cpu_to_be64w() is called with an obviously non-aligned pointer. Use cpu_to_be64wu() instead. It fixes unaligned accesses errors on IA64 hosts. Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-01-24 11:08:50 +01:00
Jes Sorensen	7c80ab3f21	block/qcow2.c: rename qcow_ functions to qcow2_ It doesn't really make sense for functions in qcow2.c to be named qcow_ so convert the names to match correctly. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-12-17 16:15:01 +01:00
Kevin Wolf	1c02e2a171	qcow2: Invalidate cache after failed read The cache content may be destroyed after a failed read, better not use it any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2010-11-04 13:54:37 +01:00
Stefan Hajnoczi	72893756e0	qcow2: Support exact L1 table growth The L1 table grow operation includes a size calculation that bumps up the new L1 table size in order to anticipate the size needs of vmstate data. This helps reduce the number of times that the L1 table has to be grown when vmstate data is appended. This size overhead is not necessary during image creation, bdrv_truncate(), or snapshot goto operations. In fact, existing qemu-iotests that exercise table growth are no longer able to trigger it because image creation preallocates an L1 table that is too large after changes to qcow_create2(). This patch keeps the size calculation but also adds exact growth for callers that do not want to inflate the L1 table size unnecessarily. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-10-22 14:49:35 +02:00
Kevin Wolf	bd28f83565	qcow2: Avoid bounce buffers for AIO read requests qcow2 used to use bounce buffers for any AIO requests. This does not only imply unnecessary copying, but also unbounded allocations which should be avoided. This patch removes bounce buffers from the normal AIO read path, and constrains them to a constant size for encrypted images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-09-21 15:39:42 +02:00
Kevin Wolf	9f8e668eb1	qcow2: Get rid of additional sync on COW We always have a sync for the refcount update when a new cluster is allocated. If we move this past the COW, we can save an additional sync. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-09-21 15:39:42 +02:00
Kevin Wolf	29216ed14f	qcow2: Move sync out of qcow2_alloc_clusters Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-09-21 15:39:42 +02:00
Kevin Wolf	7ec5e6a4ca	qcow2: Remove unnecessary flush after L2 write When a new cluster was allocated, we only need a flush after the write to the L2 table if it was a COW and we need to decrease the refcounts of the old clusters. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-09-08 12:39:24 +02:00
Kevin Wolf	8b3b720620	qcow2: Use bdrv_(p)write_sync for metadata writes Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-06-22 14:38:02 +02:00
Kevin Wolf	68dba0bf45	qcow2: Restore L1 entry on l2_allocate failure If writing the L1 table to disk failed, we need to restore its old content in memory to avoid inconsistencies. Reported-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-06-15 09:41:58 +02:00
Kevin Wolf	55c17e9821	qcow2: Change l2_load to return 0/-errno Provide the error code to the caller instead of just indicating success/error. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-05-28 13:29:12 +02:00
Kevin Wolf	1c46efaa0a	qcow2: Allow qcow2_get_cluster_offset to return errors qcow2_get_cluster_offset() looks up a given virtual disk offset and returns the offset of the corresponding cluster in the image file. Errors (e.g. L2 table can't be read) are currently indicated by a return value of 0, which is unfortunately the same as for any unallocated cluster. So in effect we can't check for errors. This makes the old return value a by-reference parameter and returns the usual 0/-errno error code. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-05-28 13:29:11 +02:00
Kevin Wolf	175e11526e	qcow2: Fix error handling in l2_allocate l2_allocate has some intermediate states in which the image is inconsistent. Change the order to write to the L1 table only after the new L2 table has successfully been initialized. Also reset the L2 cache in failure case, it's very likely wrong. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-05-28 13:14:25 +02:00
Kevin Wolf	1b7c801b40	qcow2: Clear L2 table cache after write error If the L2 table was already updated in cache, but writing it to disk has failed, we must not continue using the changed version in the cache to stay consistent with what's on the disk. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-05-28 13:14:25 +02:00
Kevin Wolf	66f82ceed6	block: Open the underlying image file in generic code Format drivers shouldn't need to bother with things like file names, but rather just get an open BlockDriverState for the underlying protocol. This patch introduces this behaviour for bdrv_open implementation. For protocols which need to access the filename to open their file/device/connection/... a new callback bdrv_file_open is introduced which doesn't get an underlying file opened. For now, also some of the more obscure formats use bdrv_file_open because they open() the file themselves instead of using the block.c functions. They need to be fixed in later patches. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-05-03 10:07:30 +02:00
Kevin Wolf	c46e116723	qcow2: Return 0/-errno in l2_allocate Returning NULL on error doesn't allow distinguishing between different errors. Change the interface to return an integer for -errno. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-04-23 16:08:46 +02:00
Kevin Wolf	f7defcb627	qcow2: Return 0/-errno in write_l1_entry Change write_l1_entry to return the real error code instead of -1. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-04-23 16:08:46 +02:00
Kevin Wolf	c835d00fc8	qcow2: Fix error return code in qcow2_alloc_cluster_link_l2 Fix qcow2_alloc_cluster_link_l2 to return the real error code like it does in all other error cases. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-04-23 16:08:46 +02:00
Kevin Wolf	79a31189d4	qcow2: Return 0/-errno in write_l2_entries Change write_l2_entries to return the real error code instead of -1. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-04-23 16:08:46 +02:00
Kevin Wolf	8252278afb	qcow2: Trigger blkdebug events This adds blkdebug events to qcow2 to allow injecting I/O errors in specific places. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-04-23 16:08:46 +02:00
Kevin Wolf	c644db3d53	qcow2: Remove request from in-flight list after error If we complete a request with a failure we need to remove it from the list of requests that are in flight. If we don't do it, the next time the same AIOCB is used for a cluster allocation it will create a loop in the list and qemu will hang in an endless loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2010-04-10 01:25:30 +02:00
Kevin Wolf	4805bb6696	qcow2: Fix access after end of array If a write request crosses an L2 table boundary and all clusters until the end of the L2 table are usable for the request, we must not look at the next L2 entry because we already have arrived at the end of the array. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-02-19 15:53:54 -06:00
Kevin Wolf	f4f0d391b2	qcow2: Fix signedness bugs Checking for return codes < 0 isn't really going to work with unsigned types. Use signed types instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-02-10 11:56:57 -06:00
Kevin Wolf	5d757b563d	qcow2: Don't ignore qcow2_alloc_clusters return value Now that qcow2_alloc_clusters can return error codes, we must handle them in the callers of qcow2_alloc_clusters. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-01-26 14:59:19 -06:00
Kevin Wolf	148da7ea9d	qcow2: Return 0/-errno in qcow2_alloc_cluster_offset Returning 0/-errno allows it to distinguish different error classes. The cluster offset of newly allocated clusters is now returned in the QCowL2Meta struct. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-01-26 14:59:19 -06:00
Kevin Wolf	1e3e8f1a43	qcow2: Return 0/-errno in get_cluster_table Switching to 0/-errno allows it to distinguish different error cases. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-01-26 14:59:19 -06:00
Kevin Wolf	fb8fa77ce1	qcow2: Fix error handling in qcow2_grow_l1_table Return the appropriate error value instead of always using EIO. Don't free the L1 table on errors, we still need it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-01-26 14:59:19 -06:00
Stefan Weil	d191d12d5f	qcow2: Allow qcow2 disk images with size zero Images with disk size 0 may be used for VM snapshots, but not to save normal block data. It is possible to create such images using qemu-img, but opening them later fails. So even "qemu-img info image.qcow2" is not possible for an image created with "qemu-img create -f qcow2 image.qcow2 0". This is fixed here. Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-11-09 08:43:01 -06:00
Kevin Wolf	72ecf02d7d	Revert "qcow2: Bring synchronous read/write back to life" It was merely a workaround and the real fix is done now. This reverts commit `ef845c3bf4`. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-10-27 12:28:59 -05:00
Kevin Wolf	ef845c3bf4	qcow2: Bring synchronous read/write back to life When the synchronous read and write functions were dropped, they were replaced by generic emulation functions. Unfortunately, these emulation functions don't provide the same semantics as the original functions did. The original bdrv_read would mean that we read some data synchronously and that we won't be interrupted during this read. The latter assumption is no longer true with the emulation function which needs to use qemu_aio_poll and therefore allows the callback of any other concurrent AIO request to be run during the read. Which in turn means that (meta)data read earlier could have changed and be invalid now. qcow2 is not prepared to work in this way and it's just scary how many places there are where other requests could run. I'm not sure yet where exactly it breaks, but you'll see breakage with virtio on qcow2 with a backing file. Providing synchronous functions again fixes the problem for me. Patchworks-ID: 35437 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-10-15 09:32:04 -05:00
Kevin Wolf	80ee15a6b2	qcow2: Increase maximum cluster size to 2 MB This patch increases the maximum qcow2 cluster size to 2 MB. Starting with 128k clusters, L2 tables span 2 GB or more of virtual disk space, causing 32 bit truncation and wraparound of signed integers. Therefore some variables need to use a larger data type. While being at reviewing data types, change some integers that are used for array indices to unsigned. In some places they were checked against some upper limit but not for negative values. This could avoid potential segfaults with corrupted qcow2 images. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-10-05 09:32:52 -05:00
Blue Swirl	72cf2d4f0e	Fix sys-queue.h conflict for good Problem: Our file sys-queue.h is a copy of the BSD file, but there are some additions and it's not entirely compatible. Because of that, there have been conflicts with system headers on BSD systems. Some hacks have been introduced in the commits `15cc923584`, `f40d753718`, `96555a96d7` and `3990d09adf` but the fixes were fragile. Solution: Avoid the conflict entirely by renaming the functions and the file. Revert the previous hacks. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2009-09-12 07:36:22 +00:00
Kevin Wolf	f214978a42	qcow2: Order concurrent AIO requests on the same unallocated cluster When two AIO requests write to the same cluster, and this cluster is unallocated, currently both requests allocate a new cluster and the second one merges the first one when it is completed. This means a cluster allocation, a read and a cluster deallocation which cause some overhead. If we simply let the second request wait until the first one is done, we improve overall performance with AIO requests (specifically, qcow2/virtio combinations). This patch maintains a list of in-flight requests that have allocated new clusters. A second request touching the same cluster is limited so that it either doesn't touch the allocation of the first request (so it can have a non-overlapping allocation) or it waits for the first request to complete. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-09-09 17:31:26 -05:00
Kevin Wolf	3f6a3ee51e	qcow2: Fix L1 table memory allocation Contrary to what one could expect, the size of L1 tables is not cluster aligned. So as we're writing whole sectors now instead of single entries, we need to ensure that the L1 table in memory is large enough; otherwise write would access memory after the end of the L1 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-07-10 13:44:29 -05:00
Kevin Wolf	4c1612d954	alloc_cluster_link_l2: Write complete sectors When updating the L2 tables in alloc_cluster_link_l2(), write complete sectors instead of updating single entries. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-06-16 15:18:36 -05:00
Kevin Wolf	6583e3c7e8	l2_allocate: Write complete sectors When modifying the L1 table, l2_allocate() needs to write complete sectors instead of single entries. The L1 table is already in memory, reading it from disk in the block layer to align the request is wasted performance. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-06-16 15:18:36 -05:00
Kevin Wolf	ed6ccf0f51	qcow2: Rename global functions The qcow2 source is now split into several more manageable files. During the conversion quite some functions that were static before needed to be changed to be global to make the source compile again. We were lucky enough not to get name conflicts with these additional global names, but they are not nice. This patch adds a qcow2_ prefix to all of the global functions in qcow2. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-06-16 15:18:36 -05:00
Kevin Wolf	45aba42fba	qcow2: Split out guest cluster functions qcow2-cluster.c contains all functions related to the management of guest clusters, i.e. what the guest sees on its virtual disk. This code is about mapping these guest clusters to host clusters in the image file using the two-level lookup tables. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-06-16 15:18:36 -05:00


210 Commits