On read operations when this parameter is set and some replicas are corrupted
while quorum can be reached quorum will proceed to rewrite the correct version
of the data to fix the corrupted replicas.
This will shine with SSD where the FTL will remap the same block at another
place on rewrite.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If 'base' is smaller than the overlay image being committed into it,
then the base image will be grown in commit_run via bdrv_truncate().
This tests to make sure that this works, and the bdrv_truncate() is
not blocked when it shouldn't be.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since we parse backing.* options to add a backing file from the command
line when the driver didn't assign one, it has been possible to have a
backing file for e.g. raw images (it just was never accessed).
This is obvious nonsense and should be rejected.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The "driver" entry in the options QDict is now only missing if we're
opening an image with format probing.
We also catch cases now where both the drv argument and a "driver"
option is specified, e.g. by specifying -drive format=qcow2,driver=raw
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
All behavior and invariant should hold for images with 0 length, so
add a class to repeat all the tests in TestSingleDrive.
Hide two unapplicable test methods that would fail with 0 image length
because it's also used as cluster size.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There should be a BLOCK_JOB_READY event with active commit, regardless
of image length. Let's test the 0 length image case, and make sure it
goes through the ready->complete process.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Change block layer to support both QemuOpts and QEMUOptionParameter.
After this patch, it will change backend drivers one by one. At the end,
QEMUOptionParameter will be removed and only QemuOpts is kept.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Those options were not enabled by default, even when the build
environment would have supported them, so the corresponding
code was not compiled in normal test builds like on build bots.
[Building quorum by default "broke" qemu-iotests ./check 081. It turns
out the 081.out master output was just bitrotted. Fix this by updating
the error message.
--Stefan]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently, test 089 uses $QEMU_IMG info manually in order to obtain the
according output. However, the iotests should generally use _img_info as
this filters out more irrelevant information such as the host image size
or format specific information. Therefore, test 089 should use _img_info
as well.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We should allow testing this on tmpfs. Any cache setting in iotests
should try to obey $CACHEMODE.
The cache mode is still "none" by default but overridable
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
this patch tries to optimize zero write requests
by automatically using bdrv_write_zeroes if it is
supported by the format.
This significantly speeds up file system initialization and
should speed zero write test used to test backend storage
performance.
I ran the following 2 tests on my internal SSD with a
50G QCOW2 container and on an attached iSCSI storage.
a) mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdX
QCOW2 [off] [on] [unmap]
-----
runtime: 14secs 1.1secs 1.1secs
filesize: 937M 18M 18M
iSCSI [off] [on] [unmap]
----
runtime: 9.3s 0.9s 0.9s
b) dd if=/dev/zero of=/dev/vdX bs=1M oflag=direct
QCOW2 [off] [on] [unmap]
-----
runtime: 246secs 18secs 18secs
filesize: 51G 192K 192K
throughput: 203M/s 2.3G/s 2.3G/s
iSCSI* [off] [on] [unmap]
----
runtime: 8mins 45secs 33secs
throughput: 106M/s 1.2G/s 1.6G/s
allocated: 100% 100% 0%
* The storage was connected via an 1Gbit interface.
It seems to internally handle writing zeroes
via WRITESAME16 very fast.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Like qcow2 since commit 6d33e8e7, error out on invalid lengths instead
of silently truncating them to 1023.
Also don't rely on bdrv_pread() catching integer overflows that make len
negative, but use unsigned variables in the first place.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
A huge image size could cause s->l1_size to overflow. Make sure that
images never require a L1 table larger than what fits in s->l1_size.
This cannot only cause unbounded allocations, but also the allocation of
a too small L1 table, resulting in out-of-bounds array accesses (both
reads and writes).
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Too large L2 table sizes cause unbounded allocations. Images actually
created by qemu-img only have 512 byte or 4k L2 tables.
To keep things consistent with cluster sizes, allow ranges between 512
bytes and 64k (in fact, down to 1 entry = 8 bytes is technically
working, but L2 table sizes smaller than a cluster don't make a lot of
sense).
This also means that the number of bytes on the virtual disk that are
described by the same L2 table is limited to at most 8k * 64k or 2^29,
preventively avoiding any integer overflows.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Huge values for header.cluster_bits cause unbounded allocations (e.g.
for s->cluster_cache) and crash qemu this way. Less huge values may
survive those allocations, but can cause integer overflows later on.
The only cluster sizes that qemu can create are 4k (for standalone
images) and 512 (for images with backing files), so we can limit it
to 64k.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
The test test_stream_pause in this class uses vm.pause_drive, which
requires a blkdebug driver on top of image, otherwise it's no-op and the
test running is undeterministic.
So add it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The shell script attempts to suppress core dumps like this:
old_ulimit=$(ulimit -c)
ulimit -c 0
$QEMU_IO arg...
ulimit -c "$old_ulimit"
This breaks the test hard unless the limit was zero to begin with!
ulimit sets both hard and soft limit by default, and (re-)raising the
hard limit requires privileges. Broken since it was added in commit
dc68afe.
Could be fixed by adding -S to set only the soft limit, but I'm not
sure how portable that is in practice. Simply do it in a subshell
instead, like this:
(ulimit -c 0; exec $QEMU_IO arg...)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a test for the JSON protocol driver.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds a test for VHDX images created by Microsoft's tool, Disk2VHD.
VHDX images created by this tool have 2 identical header sections, with
identical sequence numbers. This makes sure we detect VHDX images with
identical headers, and do not flag them as corrupt.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This reverts commit f915db07ef.
This commit is broken because it does not account for the
build tree and the source tree being different, and can cause
build failures for out-of-tree builds. Revert it until we can
identify a better solution to the problem.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1400153676-30180-1-git-send-email-peter.maydell@linaro.org
Acked-by: Kevin Wolf <kwolf@redhat.com>
This is an initial, simple live migration test from one
running VM to another, using monitor commands.
This is also an example of using the new common.qemu functions
for controlling multiple running qemu instances, for tests that
need a live qemu vm.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The new functionality of common.qemu implements the QEMU control
and communication functionality that was originally in test 085.
This removes that now-duplicate functionality, and uses the
common.qemu functions.
The QEMU commandline changes slightly due to this; in addition to
monitor and qmp i/o options, the new QEMU commandline from inside
common.qemu now introduces -machine accel=qtest.
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This creates some common functions for bash language qemu-iotests
to control, and communicate with, a running QEMU process.
4 functions are introduced:
1. _launch_qemu()
This launches the QEMU process(es), and sets up the file
descriptors and fifos for communication. You can choose to
launch each QEMU process listening for either QMP or HMP
monitor. You can call this function multiple times, and
save the handle returned from each. The returned handle is
in $QEMU_HANDLE. You must copy this value.
Commands 2 and 3 use the handle received from _launch_qemu(), to talk
to the appropriate process.
2. _send_qemu_cmd()
Sends a command string, specified by $2, to QEMU. If $3 is
non-NULL, _send_qemu_cmd() will wait to receive $3 as a
required result string from QEMU. Failure to receive $3 will
cause the test to fail. The command can optionally be retried
$qemu_cmd_repeat number of times. Set $qemu_error_no_exit
to not force the test the fail on exit; in this case,
$QEMU_STATUS[$1] will be set to -1 on failure.
3. _timed_wait_for()
Waits for a response, for up to a default of 10 seconds. If
$2 is not seen in that time (anywhere in the response), then
the test fails. Primarily used by _send_qemu_cmd, but could
be useful standalone, as well. To prevent automatic exit
(and therefore test failure), set $qemu_error_no_exit to a
non-NULL value. If $silent is a non-NULL value, then output
to stdout will be suppressed.
4. _cleanup_qemu()
Kills the running QEMU processes, and removes the fifos.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The immediately visible effect of this patch is that it fixes committing
a temporary snapshot to its backing file. Previously, it would fail with
a "permission denied" error because bdrv_inherited_flags() forced the
backing file to be read-only, ignoring the r/w reopen of bdrv_commit().
The bigger problem this revealed is that the original open flags must
actually only be applied to the temporary snapshot, and the original
image file must be treated as a backing file of the temporary snapshot
and get the right flags for that.
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will return cluster_size and needs_compressed_writes to caller, if all the
extents have the same value (or there's only one extent). Otherwise return
-ENOTSUP.
cluster_size is only reported for sparse formats.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently, QEMU's iotests rely on /usr/bin/env to start the correct
Python (that is, at least Python 2.4, but not 3). On systems where
Python 3 is the default, the user has no clean way of making the iotests
use the correct binary.
This commit makes the iotests use the Python selected by configure.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a test which discards a compressed cluster on qcow2. This should
work without any problems.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Both tests 019 and 086 need proper quotations to work with pathnames
that contain spaces.
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The _rm_test_img() function in common.rc did not quote the image
file, which left droppings in the scratch directory (and performed
a potentially unsafe rm -f).
This adds the necessary quotes.
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A name that is taken by an ID can't be taken by a node-name at the same
time. Check that conflicts are correctly detected.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Since commit f298d071, block devices added with blockdev-add don't have
a QemuOpts around in dinfo->opts. Consequently, we can't rely any more
on QemuOpts catching duplicate IDs for block devices.
This patch adds a new check for duplicate IDs to bdrv_new(), and moves
the existing check that the ID isn't already taken for a node-name there
as well.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Insanely large requests could cause an integer overflow in
bdrv_rw_co() while converting sectors to bytes. This patch catches the
problem and returns an error (if we hadn't overflown the integer here,
bdrv_check_byte_request() would have rejected the request, so we're not
breaking anything that was supposed to work before).
We actually do have a test case that triggers behaviour where we
accidentally let such a request pass, so that it would return success,
but read 0 bytes instead of the requested 4 GB. It fails now like it
should.
If the vdi block driver wants to be able to deal with huge images, it
can't read the whole block bitmap at once into memory like it does
today, but needs to use a metadata cache like qcow2 does.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The old check was off by a factor of 512 and didn't consider cases where
we don't get an exact division. This could lead to an out-of-bounds
array access in seek_to_sector().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Since commit 9fd3171a, BDRV_O_SNAPSHOT uses an option QDict to specify
the originally requested image as the backing file of the newly created
temporary snapshot. This means that the filename is stored in
"file.filename", which is an option that is not parsed for protocol
names. Therefore things like -drive file=nbd:localhost:10809 were
broken because it looked for a local file with the literal name
'nbd:localhost:10809'.
This patch changes the way BDRV_O_SNAPSHOT works once again. We now open
the originally requested image as normal, and then do a similar
operation as for live snapshots to put the temporary snapshot on top.
This way, both driver specific options and parsed filenames work.
As a nice side effect, this results in code movement to factor
bdrv_append_temp_snapshot() out. This is a good preparation for moving
its call to drive_init() and friends eventually.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
qemu doesn't print these CRs any more. The test still didn't fail
because the output comparison ignores line endings, but the change turns
up each time when you want to update the output.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
When using the QDict option 'filename', it is supposed to be interpreted
literally. The code did correctly avoid guessing the protocol from any
string before the first colon, but it still called bdrv_parse_filename()
which would, for example, incorrectly remove a 'file:' prefix in the
raw-posix driver.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
If lazy refcounts are enabled for a backing file, committing to this
backing file may leave it in a dirty state even if the commit succeeds.
The reason is that the bdrv_flush() call in bdrv_commit() doesn't flush
refcount updates with lazy refcounts enabled, and qcow2_reopen_prepare()
doesn't take care to flush metadata.
In order to fix this, this patch also fixes qcow2_mark_clean(), which
contains another ineffective bdrv_flush() call beause lazy refcounts are
disabled only afterwards. All existing callers of qcow2_mark_clean()
either don't modify refcounts or already flush manually, so that this
fixes only a latent, but not yet actually triggerable bug.
Another instance of the same problem is live snapshots. Again, a real
corruption is prevented by an explicit flush for non-read-only images in
external_snapshot_prepare(), but images using lazy refcounts stay dirty.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
This avoids a possible division by zero.
Convert s->tracks to unsigned as well because it feels better than
surviving just because the results of calculations with s->tracks are
converted to unsigned anyway.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The first test case would cause a huge memory allocation, leading to a
qemu abort; the second one to a too small malloc() for the catalog
(smaller than s->catalog_size), which causes a read-only out-of-bounds
array access and on big endian hosts an endianess conversion for an
undefined memory area.
The sample image used here is not an original Parallels image. It was
created using an hexeditor on the basis of the struct that qemu uses.
Good enough for trying to crash the driver, but not for ensuring
compatibility.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This avoids an unbounded allocation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
For the L1 table to loaded for an internal snapshot, the code allocated
only enough memory to hold the currently active L1 table. If the
snapshot's L1 table is actually larger than the current one, this leads
to a buffer overflow.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The qcow2 code assumes that s->snapshots is non-NULL if s->nb_snapshots
!= 0. By having the initialisation of both fields separated in
qcow2_open(), any error occuring in between would cause the error path
to dereference NULL in qcow2_free_snapshots() if the image had any
snapshots.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bs->total_sectors is not the highest possible sector number that could
be involved in a copy on write operation: VM state is after the end of
the virtual disk. This resulted in wrong values for the number of
sectors to be copied (n).
The code that checks for the end of the image isn't required any more
because the code hasn't been calling the block layer's bdrv_read() for a
long time; instead, it directly calls qcow2_readv(), which doesn't error
out on VM state sector numbers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This test checks for proper bounds checking of some VDI input
headers. The following is checked:
1. Max image size (1024TB) with the appropriate Blocks In Image
value (0x3fffffff) is detected as valid.
2. Image size exceeding max (1024TB) is seen as invalid
3. Valid image size but with Blocks In Image value that is too
small fails
4. Blocks In Image size exceeding max (0x3fffffff) is seen as invalid
5. 64MB image, with 64 Blocks In Image, and 1MB Block Size is seen
as valid
6. Block Size < 1MB not supported
7. Block Size > 1MB not supported
[Max Reitz <mreitz@redhat.com> pointed out that "1MB + 1" in the test
case is wrong. Change to "1MB + 64KB" to match the 0x110000 value.
--Stefan]
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
free_cluster_index is only correct if update_refcount() was called from
an allocation function, and even there it's brittle because it's used to
protect unfinished allocations which still have a refcount of 0 - if it
moves in the wrong place, the unfinished allocation can be corrupted.
So not using it any more seems to be a good idea. Instead, use the
first requested cluster to do the calculations. Return -EAGAIN if
unfinished allocations could become invalid and let the caller restart
its search for some free clusters.
The context of creating a snapsnot is one situation where
update_refcount() is called outside of a cluster allocation. For this
case, the change fixes a buffer overflow if a cluster is referenced in
an L2 table that cannot be represented by an existing refcount block.
(new_table[refcount_table_index] was out of bounds)
[Bump the qemu-iotests 026 refblock_alloc.write leak count from 10 to
11.
--Stefan]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
len could become negative and would pass the check then. Nothing bad
happened because bdrv_pread() happens to return an error for negative
length values, but make variables for sizes unsigned anyway.
This patch also changes the behaviour to error out on invalid lengths
instead of silently truncating it to 1023.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This avoids an unbounded allocation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>