The patch adds some preparation parts for incompatible compression type
feature to qcow2 allowing the use different compression methods for
image clusters (de)compressing.
It is implied that the compression type is set on the image creation and
can be changed only later by image conversion, thus compression type
defines the only compression algorithm used for the image, and thus,
for all image clusters.
The goal of the feature is to add support of other compression methods
to qcow2. For example, ZSTD which is more effective on compression than ZLIB.
The default compression is ZLIB. Images created with ZLIB compression type
are backward compatible with older qemu versions.
Adding of the compression type breaks a number of tests because now the
compression type is reported on image creation and there are some changes
in the qcow2 header in size and offsets.
The tests are fixed in the following ways:
* filter out compression_type for many tests
* fix header size, feature table size and backing file offset
affected tests: 031, 036, 061, 080
header_size +=8: 1 byte compression type
7 bytes padding
feature_table += 48: incompatible feature compression type
backing_file_offset += 56 (8 + 48 -> header_change + feature_table_change)
* add "compression type" for test output matching when it isn't filtered
affected tests: 049, 060, 061, 065, 082, 085, 144, 182, 185, 198, 206,
242, 255, 274, 280
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
QAPI part:
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200507082521.29210-2-dplotnikov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
These fields were already removed in commit c3c10f72, but then commit
b58deb34 revived them probably due to bad merge conflict resolution.
They are still unused, so remove them again.
Fixes: b58deb344d
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200326170757.12344-1-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_check_read_snapshot_table() can perform consistency checks, but it
cannot fix everything. Specifically, it cannot allocate new clusters,
because that should wait until the refcount structures are known to be
consistent (i.e., after qcow2_check_refcounts()). Thus, it cannot call
qcow2_write_snapshots().
Do that in qcow2_check_fix_snapshot_table(), which is called after
qcow2_check_refcounts().
Currently, there is nothing that would set result->corruptions, so this
is a no-op. A follow-up patch will change that.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reading the snapshot table can fail. That is a problem when we want to
repair the image.
Therefore, stop reading the snapshot table in qcow2_do_open() in check
mode. Instead, add a new function qcow2_check_read_snapshot_table()
that reads the snapshot table at a later point. In the future, we want
to handle errors here and fix them.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-9-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Updating the snapshot list will be useful when upgrading a v2 image to
v3, so we will need to call this function in qcow2.c.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-6-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The qcow2 specification says to ignore unknown extra data fields in
snapshot table entries. Currently, we discard it whenever we update the
image, which is a bit different from "ignore".
This patch makes the qcow2 driver keep all unknown extra data fields
when updating an image's snapshot table.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-5-mreitz@redhat.com
[mreitz: Adjusted comments as proposed by Eric]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
qcow2_reopen_bitmaps_ro wants to store bitmaps and then mark them all
readonly. But the latter don't work, as
qcow2_store_persistent_dirty_bitmaps removes bitmaps after storing.
It's OK for inactivation but bad idea for reopen-ro. And this leads to
the following bug:
Assume we have persistent bitmap 'bitmap0'.
Create external snapshot
bitmap0 is stored and therefore removed
Commit snapshot
now we have no bitmaps
Do some writes from guest (*)
they are not marked in bitmap
Shutdown
Start
bitmap0 is loaded as valid, but it is actually broken! It misses
writes (*)
Incremental backup
it will be inconsistent
So, let's stop removing bitmaps on reopen-ro. But don't rejoice:
reopening bitmaps to rw is broken too, so the whole scenario will not
work after this patch and we can't enable corresponding test cases in
260 iotests still. Reopening bitmaps rw will be fixed in the following
patches.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190927122355.7344-7-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
The function is unused, drop it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190927122355.7344-6-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
qmp_block_dirty_bitmap_add and do_block_dirty_bitmap_remove do acquire
aio context since 0a6c86d024. But this is not enough: we also must
lock qcow2 mutex when access in-image metadata. Especially it concerns
freeing qcow2 clusters.
To achieve this, move qcow2_can_store_new_dirty_bitmap and
qcow2_remove_persistent_dirty_bitmap to coroutine context.
Since we work in coroutines in correct aio context, we don't need
context acquiring in blockdev.c anymore, drop it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190920082543.23444-4-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
It's more comfortable to not deal with local_err.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190920082543.23444-3-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
It improves performance for fragmented qcow2 images.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190916175324.18478-6-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
* Change the qcow2_co_{encrypt|decrypt} to just receive full host and
guest offsets and use this function directly instead of calling
do_perform_cow_encrypt (which is removed by that patch).
* Adjust qcow2_co_encdec to take full host and guest offsets as well.
* Document the qcow2_co_{encrypt|decrypt} arguments
to prevent the bug fixed in former commit from hopefully
happening again.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-id: 20190915203655.21638-3-mlevitsk@redhat.com
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[mreitz: Let perform_cow() return the error value returned by
qcow2_co_encrypt(), as proposed by Vladimir]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Implement and use new interface to get rid of hd_qiov.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20190604161514.262241-13-vsementsov@virtuozzo.com
Message-Id: <20190604161514.262241-13-vsementsov@virtuozzo.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
drv_co_block_status digs bs->file for additional, more accurate search
for hole inside region, reported as DATA by bs since 5daa74a6eb.
This accuracy is not free: assume we have qcow2 disk. Actually, qcow2
knows, where are holes and where is data. But every block_status
request calls lseek additionally. Assume a big disk, full of
data, in any iterative copying block job (or img convert) we'll call
lseek(HOLE) on every iteration, and each of these lseeks will have to
iterate through all metadata up to the end of file. It's obviously
ineffective behavior. And for many scenarios we don't need this lseek
at all.
However, lseek is needed when we have metadata-preallocated image.
So, let's detect metadata-preallocation case and don't dig qcow2's
protocol file in other cases.
The idea is to compare allocation size in POV of filesystem with
allocations size in POV of Qcow2 (by refcounts). If allocation in fs is
significantly lower, consider it as metadata-preallocation case.
102 iotest changed, as our detector can't detect shrinked file as
metadata-preallocation, which don't seem to be wrong, as with metadata
preallocation we always have valid file length.
Two other iotests have a slight change in their QMP output sequence:
Active 'block-commit' returns earlier because the job coroutine yields
earlier on a blocking operation. This operation is loading the refcount
blocks in qcow2_detect_metadata_preallocation().
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If COW areas of the newly allocated clusters are zeroes on the backing
image, efficient bdrv_write_zeroes(flags=BDRV_REQ_NO_FALLBACK) can be
used on the whole cluster instead of writing explicit zero buffers later
in perform_cow().
iotest 060:
write to the discarded cluster does not trigger COW anymore.
Use a backing image instead.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Message-id: 20190516142749.81019-2-anton.nefedov@virtuozzo.com
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Do encryption/decryption in threads, like it is already done for
compression. This improves asynchronous encrypted io.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190506142741.41731-9-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Move generic part out of qcow2_co_do_compress, to reuse it for
encryption and rename things that would be shared with encryption path.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190506142741.41731-6-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Move compression-on-threads to separate file. Encryption will be in it
too.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190506142741.41731-3-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
qcow2.h depends on block_int.h. Compilation isn't broken currently only
due to block_int.h always included before qcow2.h. Though, it seems
better to directly include block_int.h in qcow2.h.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190506142741.41731-2-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
When an L2 table entry points to a compressed cluster the space used
by the data is specified in 512-byte sectors. This size is independent
from BDRV_SECTOR_SIZE and is specific to the qcow2 file format.
The QCOW2_COMPRESSED_SECTOR_SIZE constant defined in this patch makes
this explicit.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The last user of this field disappeared when we replace the
sector-based bdrv_write() with the byte-based bdrv_pwrite().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since we now load all bitmaps into memory anyway, we can just truncate
them in-memory and then flush them back to disk. Just in case, we will
still check and enforce that this shortcut is valid -- i.e. that any
bitmap described on-disk is indeed in-memory and can be modified.
If there are any inconsistent bitmaps, we refuse to allow the truncate
as we do not actually load these bitmaps into memory, and it isn't safe
or reasonable to attempt to truncate corrupted data.
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190311185147.52309-4-vsementsov@virtuozzo.com
[vsementsov: drop bitmap flushing, fix block comments style]
Signed-off-by: John Snow <jsnow@redhat.com>
Provide an option to force QEMU to always keep the external data file
consistent as a standalone read-only raw image.
At the moment, this means making sure that write_zeroes requests are
forwarded to the data file instead of just updating the metadata, and
checking that no backing file is used.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Rather than requiring that the external data file node is passed
explicitly when creating the qcow2 node, store the filename in the
designated header extension during .bdrv_create and read it from there
as a default during .bdrv_open.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This changes the qcow2 implementation to direct all guest data I/O to
s->data_file rather than bs->file, while metadata I/O still uses
bs->file. At the moment, this is still always the same, but soon we'll
add options to set s->data_file to an external data file.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_alloc_compressed_cluster_offset() used to return the cluster
offset for success and 0 for error. This doesn't only conflict with 0 as
a valid host offset, but also loses the error code.
Similar to the change made to qcow2_alloc_cluster_offset() for
uncompressed clusters in commit 148da7ea9d, make the function return
0/-errno and return the allocated cluster offset in a by-reference
parameter.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The cluster allocation code uses 0 as an invalid offset that is used in
case of errors or as "offset not yet determined". With external data
files, a host cluster offset of 0 becomes valid, though.
Define a constant INV_OFFSET (which is not cluster aligned and will
therefore never be a valid offset) that can be used for such purposes.
This removes the additional host_offset == 0 check that commit
ff52aab2df introduced; the confusion between an invalid offset and
(erroneous) allocation at offset 0 is removed with this change.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds basic constants, struct fields and helper function for
external data file support to the implementation.
QCOW2_INCOMPAT_MASK and QCOW2_AUTOCLEAR_MASK are not updated yet so that
opening images with an external data file still fails (we don't handle
them correctly yet).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the 'Format specific information' section of the 'qemu-img info'
command output, the supplemental information about existing QCOW2
bitmaps will be shown, such as a bitmap name, flags and granularity:
image: /vz/vmprivate/VM1/harddisk.hdd
file format: qcow2
virtual size: 64G (68719476736 bytes)
disk size: 3.0M
cluster_size: 1048576
Format specific information:
compat: 1.1
lazy refcounts: true
bitmaps:
[0]:
flags:
[0]: in-use
[1]: auto
name: back-up1
granularity: 65536
[1]:
flags:
[0]: in-use
[1]: auto
name: back-up2
granularity: 65536
refcount bits: 16
corrupt: false
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <1549638368-530182-3-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We define 54 macros for the powers of two >= 1024. We use six, in six
macro definitions. Four of them could just as well use the common MiB
macro, so do that. The remaining two can't, because they get passed
to stringify. Replace the macro by the literal number there.
Slightly harder to read in one instance (1048576 vs. S_1MiB), so add a
comment there. The other instance is a wash: 65536 vs S_64KiB. 65536
has been good enough for more than seven years there.
This effectively reverts commit 540b849261 and 1240ac558d.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Most list head structs need not be given a name. In most cases the
name is given just in case one is going to use QTAILQ_LAST, QTAILQ_PREV
or reverse iteration, but this does not apply to lists of other kinds,
and even for QTAILQ in practice this is only rarely needed. In addition,
we will soon reimplement those macros completely so that they do not
need a name for the head struct. So clean up everything, not giving a
name except in the rare case where it is necessary.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allocate buffers locally and release qcow2 lock. Than, reads inside
qcow2_co_preadv_compressed may be done in parallel, however all
decompression is still done synchronously. Let's improve it in the
following commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Our code was already checking that we did not attempt to
allocate more clusters than what would fit in an INT64 (the
physical maximimum if we can access a full off_t's worth of
data). But this does not catch smaller limits enforced by
various spots in the qcow2 image description: L1 and normal
clusters of L2 are documented as having bits 63-56 reserved
for other purposes, capping our maximum offset at 64PB (bit
55 is the maximum bit set). And for compressed images with
2M clusters, the cap drops the maximum offset to bit 48, or
a maximum offset of 512TB. If we overflow that offset, we
would write compressed data into one place, but try to
decompress from another, which won't work.
It's actually possible to prove that overflow can cause image
corruption without this patch; I'll add the iotests separately
in the next commit.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This variable doesn't work as it should, because it is actually cleared
in qcow2_co_invalidate_cache() by memset(). Drop it, as the following
patch will introduce new behavior.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (before the default was 0, i.e.
disabled).
* For non-Linux platforms the default is kept at 0, because
cache-clean-interval is not supported there yet.
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The upper limit on the L2 cache size is increased from 1 MB to 32 MB
on Linux platforms, and to 8 MB on other platforms (this difference is
caused by the ability to set intervals for cache cleaning on Linux
platforms only).
This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.
Previously, unless 'cache-size' was specified and was large enough, the
L2 cache was set to a certain size without taking the virtual image size
into account.
Now, the L2 cache assignment is aware of the virtual size of the image,
and will cover the entire image, unless the cache size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Do data compression in separate threads. This significantly improve
performance for qemu-img convert with -W (allow async writes) and -c
(compressed) options.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we managed to allocate the clusters, but then failed to write the
data, there's a good chance that we'll still be able to free the
clusters again in order to avoid cluster leaks (the refcounts are
cached, so even if we can't write them out right now, we may be able to
do so when the VM is resumed after a werror=stop/enospc pause).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
The L2 and refcount caches have default sizes that can be overridden
using the l2-cache-size and refcount-cache-size (an additional
parameter named cache-size sets the combined size of both caches).
Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.
This is based on the premise that the refcount metadata needs to be
only a fourth of the L2 metadata to cover the same amount of disk
space. This is incorrect for two reasons:
a) The amount of disk covered by an L2 table depends solely on the
cluster size, but in the case of a refcount block it depends on
the cluster size *and* the width of each refcount entry.
The 4/1 ratio is only valid with 16-bit entries (the default).
b) When we talk about disk space and L2 tables we are talking about
guest space (L2 tables map guest clusters to host clusters),
whereas refcount blocks are used for host clusters (including
L1/L2 tables and the refcount blocks themselves). On a fully
populated (and uncompressed) qcow2 file, image size > virtual size
so there are more refcount entries than L2 entries.
Problem (a) could be fixed by adjusting the algorithm to take into
account the refcount entry width. Problem (b) could be fixed by
increasing a bit the refcount cache size to account for the clusters
used for qcow2 metadata.
However this patch takes a completely different approach and instead
of keeping a ratio between both cache sizes it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.
The reason is that L2 tables are used for every single I/O request
from the guest and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks are however only used for
cluster allocation and internal snapshots and in practice are accessed
sequentially in most cases, so the effect of increasing the cache is
negligible (even when doing random writes from the guest).
So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 9695182c2eb11b77cb319689a1ebaa4e7c9d6591.1523968389.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Checking reopen by existence of some bitmaps is wrong, as it may be
some other bitmaps, or on the other hand, user may remove bitmaps. This
criteria is bad. To simplify things and make behavior more predictable
let's just add a flag to remember, that we've already tried to load
bitmaps on open and do not want do it again.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180411122606.367301-2-vsementsov@virtuozzo.com
[mreitz: Changed comment wording according to Eric Blake's suggestion]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Add version of qcow2_reopen_bitmaps_rw, which do the same work but
also return a hint about was header updated or not. This will be
used in the following fix for bitmaps reloading after migration.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180320170521.32152-2-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This function checks that the offset and size of a table are valid.
While the offset checks are fine, the size check is too generic, since
it only verifies that the total size in bytes fits in a 64-bit
integer. In practice all tables used in qcow2 have much smaller size
limits, so the size needs to be checked again for each table using its
actual limit.
This patch generalizes this function by allowing the caller to specify
the maximum size for that table. In addition to that it allows passing
an Error variable.
The function is also renamed and made public since we're going to use
it in other parts of the code.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>