The added parameter can be used to return the BDS pointer which the
valid offset is referring to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID in ret is set.
Until block drivers fill in the right value, let's clear it explicitly
right before calling .bdrv_get_block_status.
The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
case 102 passing, and will be fixed once all drivers return the right file
pointer.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-2-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
If qcow2_invalidate_cache() fails, we are in a state where qcow2_close()
has already been completed, but the image hasn't been reopened yet.
Calling into any qcow2 function for an image in this state will cause
crashes.
The real solution would be to get rid of the close/open pair and instead
do an atomic reset of the involved data structures, but this isn't
trivial, so let's just make the image inaccessible for now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
What qcow2_invalidate_cache() should do is close the image with
BDRV_O_INACTIVE set and reopen it with the flag cleared. In fact, it
used to do exactly the opposite: qcow2_close() relied on bs->open_flags,
which is already updated to have cleared BDRV_O_INACTIVE at this point,
whereas qcow2_open() was called with s->flags, which has the flag still
set. Fix this.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The callback has to ensure that closing or flushing the image afterwards
wouldn't cause a write access to the image files. This means that just
the caches have to be written out, which is part of the existing
.bdrv_close implementation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Instead of covering only the state of images on the migration
destination before the migration is completed, the flag will also cover
the state of images on the migration source after completion. This
common state implies that the image is technically still open, but no
writes will happen and any cached contents will be reloaded from disk if
and when the image leaves this state.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When creating a qcow2 image, we didn't necessarily call
qcow2_update_header(), but could end up with the basic header that
qcow2_create2() created manually. One thing that this basic header
lacks is the feature table. Let's make sure that it's always present.
This requires a few updates to test cases as well.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Version 2 images don't have feature bits, so writing a feature table to
those images is kind of pointless.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Done with this Coccinelle semantic patch
@@
expression FMT, E1, E2;
expression list ARGS;
@@
- error_setg(E1, FMT, ARGS, error_get_pretty(E2));
+ error_propagate(E1, E2);/*###*/
+ error_prepend(E1, FMT/*@@@*/, ARGS);
followed by manual cleanup, first because I can't figure out how to
make Coccinelle transform strings, and second to get rid of now
superfluous error_propagate().
We now use or propagate the original error whole instead of just its
message obtained with error_get_pretty(). This avoids suppressing its
hint (see commit 50b7b00), but I can't see how the errors touched in
this commit could come with hints. It also improves the message
printed with &error_abort when we screw up (see commit 1e9b65b).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
s->qcow_version is always set to 2 or 3. Let's assert if this is wrong.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make use of qcow2_change_refcount_order() to support changing the
refcount order with qemu-img amend.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If there is more than one time-consuming operation to be performed for
qcow2_amend_options(), we need an intermediate CB which coordinates the
progress of the individual operations and passes the result to the
original status callback.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the image version should be upgraded, that is the first we should do;
if it should be downgraded, that is the last we should do. So split the
version change block into an upgrade part at the start and a downgrade
part at the end.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an opaque value which is to be passed to the bdrv_amend_options()
status callback.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2 accepts a few driver-specific options that overlap semantically
(e.g. "overlap-check" is an alias of "overlap-check.template", and any
missing cache size option is derived from the given ones).
When bdrv_reopen() merges the set of updated options with left out
options that should be kept at their old value, we need to consider this
and filter out any duplicates (which would generally cause errors
because new and old value would contradict each other).
This patch adds a .bdrv_join_options callback to BlockDriver and
implements it for qcow2.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Now that we guarantee the user doesn't have any enum values
beginning with a single underscore, we can use that for our
own purposes. Renaming ENUM_MAX to ENUM__MAX makes it obvious
that the sentinel is generated.
This patch was mostly generated by applying a temporary patch:
|diff --git a/scripts/qapi.py b/scripts/qapi.py
|index e6d014b..b862ec9 100644
|--- a/scripts/qapi.py
|+++ b/scripts/qapi.py
|@@ -1570,6 +1570,7 @@ const char *const %(c_name)s_lookup[] = {
| max_index = c_enum_const(name, 'MAX', prefix)
| ret += mcgen('''
| [%(max_index)s] = NULL,
|+// %(max_index)s
| };
| ''',
| max_index=max_index)
then running:
$ cat qapi-{types,event}.c tests/test-qapi-types.c |
sed -n 's,^// \(.*\)MAX,s|\1MAX|\1_MAX|g,p' > list
$ git grep -l _MAX | xargs sed -i -f list
The only things not generated are the changes in scripts/qapi.py.
Rejecting enum members named 'MAX' is now useless, and will be dropped
in the next patch.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447836791-369-23-git-send-email-eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Rebased to current master, commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
We have two issues with our qapi union layout:
1) Even though the QMP wire format spells the tag 'type', the
C code spells it 'kind', requiring some hacks in the generator.
2) The C struct uses an anonymous union, which places all tag
values in the same namespace as all non-variant members. This
leads to spurious collisions if a tag value matches a non-variant
member's name.
Make the conversion to the new layout for block-related code.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1445898903-12082-16-git-send-email-eblake@redhat.com>
[Commit message tweaked slightly]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This is the final step in converting all of the BlockDriverState
pointers that block drivers use to BdrvChild.
After this patch, bs->children contains the full list of child nodes
that are referenced by a given BDS, and these children are only
referenced through BdrvChild, so that updating the pointer in there is
enough for changing edges in the graph.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch removes the temporary duplication between bs->file and
bs->file_child by converting everything to BdrvChild.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
For updating the cache sizes, disabling lazy refcounts and updating the
clean_cache_timer there is a bit more to do than just changing the
variables, but otherwise we're all set for changing options during
bdrv_reopen().
Just implement the missing pieces and hook the functions up in
bdrv_reopen().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Before we can allow updating options at runtime with bdrv_reopen(), we
need to split the function into prepare/commit/abort parts.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
On return, either all new options should be applied to BDRVQcowState (on
success), or all of the old settings should be preserved (on failure).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
With this commit, the handling of driver-specific options in
qcow2_open() is completely separated out into qcow2_update_options().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
qcow2_update_options() only updates some variables in BDRVQcowState and
doesn't really depend on other parts of it being initialised yet, so it
can be moved so that it immediately follows the other half of option
handling code in qcow2_open().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Eventually we want to be able to change options at runtime. As a first
step towards that goal, separate some option handling code from the
general initialisation code in qcow2_open().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Eric says that "any" sounds better than "either", and my non-native
feeling says the same, so let's change it.
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
BDRVQcowState is already used by qcow1, and gdb is always confused which
one to use. Rename the qcow2 one so they can be distinguished.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Now that this parameter is effectively unused, we can drop it and just
pass NULL on to bdrv_open_inherit().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Change all callers of bdrv_open() to pass the driver name in the options
QDict instead of passing its BlockDriver pointer.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds a new 'cache-clean-interval' option that cleans all qcow2
cache entries that haven't been used in a certain interval, given in
seconds.
This allows setting a large L2 cache size so it can handle scenarios
with lots of I/O and at the same time use little memory during periods
of inactivity.
This feature currently relies on MADV_DONTNEED to free that memory, so
it is not useful in systems that don't follow that behavior.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: a70d12da60433df9360ada648b3f34b8f6f354ce.1438690126.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Switch the qcow/qcow2 block driver over to use the generic cipher
API, this allows it to use the pluggable AES implementations,
instead of being hardcoded to use QEMU's built-in impl.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1435770638-25715-10-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To prepare for a generic internal cipher API, move the
built-in AES implementation into the crypto/ directory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1435770638-25715-3-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These macros expand into error class enumeration constant, comma,
string. Unclean. Has been that way since commit 13f59ae.
The error class is always ERROR_CLASS_GENERIC_ERROR since the previous
commit.
Clean up as follows:
* Prepend every use of a QERR_ macro by ERROR_CLASS_GENERIC_ERROR, and
delete it from the QERR_ macro. No change after preprocessing.
* Rewrite error_set(ERROR_CLASS_GENERIC_ERROR, ...) into
error_setg(...). Again, no change after preprocessing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
If a relatively large cluster size is chosen, the default of 1 MB L2
cache is not really appropriate. In this case, unless overridden by the
user, the default cache size should not be determined by its size in
bytes but by the number of L2 tables (clusters) it is supposed to
contain.
Note that without this patch, MIN_L2_CACHE_SIZE will effectively take
over the same role. However, providing space for just two L2 tables is
not enough to be the default.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When a qcow[2] file is opened, if the header reports an
encryption method, this is used to set the 'crypt_method_header'
field on the BDRVQcow[2]State struct, and the 'encrypted' flag
in the BDRVState struct.
When doing I/O operations, the 'crypt_method' field on the
BDRVQcow[2]State struct is checked to determine if encryption
needs to be applied.
The crypt_method_header value is copied into crypt_method when
the bdrv_set_key() method is called.
The QEMU code which opens a block device is expected to always
do a check
if (bdrv_is_encrypted(bs)) {
bdrv_set_key(bs, ....key...);
}
If code forgets to do this, then 'crypt_method' is never set
and so when I/O is performed, QEMU writes plain text data
into a sector which is expected to contain cipher text, or
when reading, will return cipher text instead of plain
text.
Change the qcow[2] code to consult bs->encrypted when deciding
whether encryption is required, and assert(s->crypt_method)
to protect against cases where the caller forgets to set the
encryption key.
Also put an assert in the set_key methods to protect against
the case where the caller sets an encryption key on a block
device that does not have encryption
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since this event can occur in nodes that cannot have a device name
associated, include also a field with the node name.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 147cec5b3594f4bec0cb41c98afe5fcbfb67567c.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are several error messages that identify a BlockDriverState by
its device name. However those errors can be produced in nodes that
don't have a device name associated.
In those cases we should use bdrv_get_device_or_node_name() to fall
back to the node name and produce a more meaningful message. The
messages are also updated to use the more generic term 'node' instead
of 'device'.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 9823a1f0514fdb0692e92868661c38a9e00a12d6.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This commit was generated mechanically by coccinelle from the following
semantic patch:
@@
expression val;
@@
- (ffs(val) - 1)
+ ctz32(val)
The call sites have been audited to ensure the ffs(0) - 1 == -1 case
never occurs (due to input validation, asserts, etc). Therefore we
don't need to worry about the fact that ctz32(0) == 32.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-5-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In recent qemu versions, it is possible to override the backing file
name and format that is stored in the image file with values given at
runtime. In such cases, the temporary override could end up in the
image header if the qcow2 header was updated, while obviously correct
behaviour would be to leave the on-disk backing file path/format
unchanged.
Fix this and add a test case for it.
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1428411796-2852-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QCOW_MAX_L1_SIZE's unit is byte, and l1_size's unit
is l1 table entry size(8 bytes).
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Message-id: 54FFB0F1.5010307@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a creation option to qcow2 for setting the refcount order of images
to be created, and respect that option's value.
This breaks some test outputs, fix them.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_amend_options() should not compare options against some inline
strings but rather use the symbolic macros available for each of the
creation options.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a refcount_order parameter to qcow2_create2(), use that value for
the image header and for calculating the size required for
preallocation.
For now, always pass 4.
This addition requires changes to the calculation of the file size for
the "full" and "falloc" preallocation modes. That in turn is a nice
opportunity to add a comment about that calculation not necessarily
being exact (and that being intentional).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
No longer refuse to open images with a different refcount entry width
than 16 bits; only reject images with a refcount width larger than 64
bits (which is prohibited by the specification).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add the bit width of every refcount entry to the format-specific
information.
In contrast to lazy_refcounts and the corrupt flag, this should be
always emitted, even for compat=0.10 although it does not support any
refcount width other than 16 bits. This is because if a boolean is
optional, one normally assumes it to be false when omitted; but if an
integer is not specified, it is rather difficult to guess its value.
This new field breaks some test outputs, fix them.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add two new fields regarding refcount information (the bit width of
every entry and the maximum refcount value) to the BDRVQcowState.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu_opt_set() is a wrapper around qemu_opt_set() that reports the
error with qerror_report_err().
Most of its users assume the function can't fail. Make them use
qemu_opt_set_err() with &error_abort, so that should the assumption
ever break, it'll break noisily.
Just two users remain, in util/qemu-config.c. Switch them to
qemu_opt_set_err() as well, then rename qemu_opt_set_err() to
qemu_opt_set().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Return the Error object instead of reporting it with
qerror_report_err().
Change callers that assume the function can't fail to pass
&error_abort, so that should the assumption ever break, it'll break
noisily.
Turns out all callers outside its unit test assume that. We could
drop the Error ** argument, but that would make the interface less
regular, so don't.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Now that request clamping is done in the BlockBackend, the "growable"
field can be removed from the BlockDriverState. All BDSs are now treated
as being "growable" (that is, they are allowed to grow; they are not
necessarily actually able to).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1423162705-32065-16-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This fixes an off-by-one error introduced in 9a29e18. Both qcow and
qcow2 need to make sure to leave room for string terminator '\0' for
the backing file, so the max length of the non-terminated string is
either 1023 or PATH_MAX - 1.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The string field entries 'filename', 'backing_file', and
'exact_filename' in the BlockDriverState struct are defined as 1024
bytes.
However, many places that use these values accept a maximum of PATH_MAX
bytes, so we have a mixture of 1024 byte and PATH_MAX byte allocations.
This patch makes the BlockDriverStruct field string sizes match usage.
This patch also does a few fixes related to the size that needs to
happen now:
* the block qapi driver is updated to use PATH_MAX bytes
* the qcow and qcow2 drivers have an additional safety check
* the block vvfat driver is updated to use PATH_MAX bytes
for the size of backing_file, for systems where PATH_MAX is < 1024
bytes.
* qemu-img uses PATH_MAX rather than 1024. These instances were not
changed to be dynamically allocated, however, as the extra
temporary 3K in stack usage for qemu-img does not seem worrisome.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_truncate() may fail and qcow2_write_compressed() should return the
error code in that case.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_cache_flush() may fail; if one of the caches failed to be flushed
successfully to disk in qcow2_close() the image should not be marked
clean, and we should emit a warning.
This breaks the (qcow2-specific) iotests 026, 071 and 089; change their
output accordingly.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We can always assume raw, file and qcow2 being available; so do not use
bdrv_find_format() to locate their BlockDriver objects but statically
reference the respective objects.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are some block drivers which are essential to QEMU and may not be
removed: These are raw, file and qcow2 (as the default non-raw format).
Make their BlockDriver objects public so they can be directly referenced
throughout the block layer without needing to call bdrv_find_format()
and having to deal with an error at runtime, while the real problem
occurred during linking (where raw, file or qcow2 were not linked into
qemu).
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After reading the extension header, offset is incremented, but not
checked against end_offset any more. This way an integer overflow could
happen when checking whether the extension end is within the allowed
range, effectively disabling the check.
This patch adds the missing check and a test case for it.
Cc: qemu-stable@nongnu.org
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416935562-7760-2-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The only really time-consuming operation potentially performed by
qcow2_amend_options() is zero cluster expansion when downgrading qcow2
images from compat=1.1 to compat=0.10, so report status of that
operation and that operation only through the status CB.
For this, approximate the progress as the number of L1 entries visited
during the operation.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1414404776-4919-5-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Depending on the changed options and the image format,
bdrv_amend_options() may take a significant amount of time. In these
cases, a way to be informed about the operation's status is desirable.
Since the operation is rather complex and may fundamentally change the
image, implementing it as AIO or a coroutine does not seem feasible. On
the other hand, implementing it as a block job would be significantly
more difficult than a simple callback and would not add benefits other
than progress report to the amending operation, because it should not
actually be run as a block job at all.
A callback may not be very pretty, but it's very easy to implement and
perfectly fits its purpose here.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414404776-4919-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_make_empty() is currently only called if the current image
represents an external snapshot that has been committed to its base
image; it is therefore unlikely to have internal snapshots. In this
case, bdrv_make_empty() can be greatly sped up by emptying the L1 and
refcount table (while having the dirty flag set, which only works for
compat=1.1) and creating a trivial refcount structure.
If there are snapshots or for compat=0.10, fall back to the simple
implementation (discard all clusters).
[Applied s/clusters/cluster/ typo fix suggested by Eric Blake
--Stefan]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414159063-25977-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Implement this function by making all clusters in the image file fall
through to the backing file (by using the recently extended discard).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Normally, discarded sectors should read back as zero. However, there are
cases in which a sector (or rather cluster) should be discarded as if
they were never written in the first place, that is, reading them should
fall through to the backing file again.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
With BDRVQcowState.refcount_block_bits, we don't need REFCOUNT_SHIFT
anymore.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When opening dirty images, qcow2's repair function should not only
repair errors but leaks as well.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The size of a refblock entry is (in theory) variable; calculate
therefore the number of entries per refblock and the according bit shift
(1 << x == entry count) when opening an image.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
device_name[] can become non-empty only in bdrv_new_root() and
bdrv_move_feature_fields(). The latter is used only to undo damage
done by bdrv_swap(). The former is called only by blk_new_with_bs().
Therefore, when a BlockDriverState's device_name[] is non-empty, then
it's been created with a BlockBackend, and vice versa. Furthermore,
blk_new_with_bs() keeps the two names equal.
Therefore, device_name[] is redundant. Eliminate it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Just like lazy-refcounts, this field will be present iff the qcow2
compat level is 1.1 (or probably any future revision).
As expected, this breaks some tests due to the new field present in
qemu-img info output; so fix their output accordingly.
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412105489-7681-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Being able to set the overlap-check option to a string and then refine
it via the overlap-check.* options is a nice idea for the command line
but does not work so well for non-flattened dicts. In that case, one can
only specify either but not both, so add a field to overlap-check.*
which does the same as directly specifying overlap-check but can be used
in conjunction with the other fields in non-flattened dicts.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1408557576-14574-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently, the QemuOpts object opts is leaked if anything fails from its
creation up to and including the image repair block. Fix this by freeing
that object in the fail path.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Message-id: 1408557576-14574-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a helper function for easily marking an image corrupt (on fatal
corruptions) while outputting an informative message to stderr and via
QAPI.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Message-id: 1409926039-29044-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
preallocation=falloc allocates disk space by posix_fallocate(),
preallocation=full allocates disk space by writing zeros to disk.
Both modes imply preallocation=metadata.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch prepares for the subsequent patches.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
and avoid converting it back later.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently the file size requested by user is rounded down to nearest
sector, causing the actual file size could be a bit less than the size
user requested. Since some formats (like qcow2) record virtual disk
size in bytes, this can make the last few bytes cannot be accessed.
This patch fixes it by rounding up file size to nearest sector so that
the actual file size is no less than the requested file size.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add options for specifying the size of the metadata caches. This can
either be done directly for each cache (if only one is given, the other
will be derived according to a default ratio) or combined for both.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Specifying the metadata cache sizes in clusters results in less clusters
(and much less bytes) covered for small cluster sizes and vice versa.
Using a constant byte size reduces this difference, and makes it
possible to manually specify the cache size in an easily comprehensible
unit.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.
This patch addresses the allocations in the qcow2 block driver.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
It returns a multiple of the sector size.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Instead of bdrv_getlength().
Aside: a few of these callers don't handle errors. I didn't
investigate whether they should.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qcow2's report_unsupported_feature() had two bugs: A 32 bit truncation
would prevent feature table entries for bits 32-63 from being used, and
it could assign errp multiple times if there was more than one unknown
feature, resulting in an error_set() assertion failure.
Fix the truncation, make sure to set the error exactly once and add a
qemu-iotests case for it.
This fixes https://bugs.launchpad.net/qemu/+bug/1342704/
Reported-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If a qcow2 image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.
This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Since we parse backing.* options to add a backing file from the command
line when the driver didn't assign one, it has been possible to have a
backing file for e.g. raw images (it just was never accessed).
This is obvious nonsense and should be rejected.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Now that all backend drivers are using QemuOpts, remove all
QEMUOptionParameter related codes.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Change block layer to support both QemuOpts and QEMUOptionParameter.
After this patch, it will change backend drivers one by one. At the end,
QEMUOptionParameter will be removed and only QemuOpts is kept.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduced in commit 5a8a30d. Spotted by Coverity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of blindly relying on a normal integer having a width of 32 bits
(which is a pretty good assumption, but we should not rely on it if
there is no need), use the correct format string macros.
This does not touch DEBUG output.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If lazy refcounts are enabled for a backing file, committing to this
backing file may leave it in a dirty state even if the commit succeeds.
The reason is that the bdrv_flush() call in bdrv_commit() doesn't flush
refcount updates with lazy refcounts enabled, and qcow2_reopen_prepare()
doesn't take care to flush metadata.
In order to fix this, this patch also fixes qcow2_mark_clean(), which
contains another ineffective bdrv_flush() call beause lazy refcounts are
disabled only afterwards. All existing callers of qcow2_mark_clean()
either don't modify refcounts or already flush manually, so that this
fixes only a latent, but not yet actually triggerable bug.
Another instance of the same problem is live snapshots. Again, a real
corruption is prevented by an explicit flush for non-read-only images in
external_snapshot_prepare(), but images using lazy refcounts stay dirty.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
preallocate() only links the first QCowL2Meta's data clusters into the
L2 table and ignores any chained QCowL2Metas in the linked list.
Chains of QCowL2Meta structs are built up when contiguous clusters span
L2 tables. Each QCowL2Meta describes one L2 table update. This is a
rare case in preallocate() but can happen.
This patch fixes preallocate() by iterating over the whole list of
QCowL2Metas. Compare with the qcow2_co_writev() function's
implementation, which is similar but also also handles request
dependencies. preallocate() only performs one allocation at a time so
there can be no dependencies.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This avoids an unbounded allocation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The qcow2 code assumes that s->snapshots is non-NULL if s->nb_snapshots
!= 0. By having the initialisation of both fields separated in
qcow2_open(), any error occuring in between would cause the error path
to dereference NULL in qcow2_free_snapshots() if the image had any
snapshots.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the size becomes larger than what qcow2_open() would accept, fail the
growing operation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
free_cluster_index is only correct if update_refcount() was called from
an allocation function, and even there it's brittle because it's used to
protect unfinished allocations which still have a refcount of 0 - if it
moves in the wrong place, the unfinished allocation can be corrupted.
So not using it any more seems to be a good idea. Instead, use the
first requested cluster to do the calculations. Return -EAGAIN if
unfinished allocations could become invalid and let the caller restart
its search for some free clusters.
The context of creating a snapsnot is one situation where
update_refcount() is called outside of a cluster allocation. For this
case, the change fixes a buffer overflow if a cluster is referenced in
an L2 table that cannot be represented by an existing refcount block.
(new_table[refcount_table_index] was out of bounds)
[Bump the qemu-iotests 026 refblock_alloc.write leak count from 10 to
11.
--Stefan]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
len could become negative and would pass the check then. Nothing bad
happened because bdrv_pread() happens to return an error for negative
length values, but make variables for sizes unsigned anyway.
This patch also changes the behaviour to error out on invalid lengths
instead of silently truncating it to 1023.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This avoids an unbounded allocation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>