patch keeping us from creating too large extents.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJauVVBAAoJEPQH2wBh1c9AGtYIAJ1ojl6+guKQHjtzv9W8ch50
2tzH9/lFkhq/Tyay8MCcq1dWQr23tKqusxi6fDHVdc8XIx6fDuPFzWxUQwwLvS81
9CA6Qs2NngkiAs89ZZ0Sc9aj+LzFbKvXDXPYd4FFeLVVzD0CA4qghH5THrrH6LRm
4DoRUK1QejrOC0v2zCRQvEN6RlI6WFGCf9YiYKkdrvswnjtLgOobCt7TvC9Nade2
smGyxtDENkV6bAZgQgxMAlf88GCyKslb4Fu6U+sfKMDejWmlYEREbBsQc+/Gp6Af
R6ZeiXEJ+KUF34RktPTmhJlUbzRc4Mw0Ij7Abrfhtpre9hifIp62XSrM8sasgLo=
=k+ly
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2018-03-26' into staging
A fix for dirty bitmap migration through shared storage, and a VMDK
patch keeping us from creating too large extents.
# gpg: Signature made Mon 26 Mar 2018 21:17:05 BST
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2018-03-26:
vmdk: return ERROR when cluster sector is larger than vmdk limitation
iotests: enable shared migration cases in 169
qcow2: fix bitmaps loading when bitmaps already exist
qcow2-bitmap: add qcow2_reopen_bitmaps_rw_hint()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The include/block/aio-wait.h header file was added by commit
7719f3c968 ("block: extract
AIO_WAIT_WHILE() from BlockDriverState") without updating MAINTAINERS.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180312132204.23683-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Check that two coroutines can queue each other repeatedly without
hitting stack exhaustion.
Switch to qemu_init_main_loop() in main() because coroutines use
qemu_get_aio_context() - they don't know about test-aio's ctx variable.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180322152834.12656-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qemu_aio_coroutine_enter() is (indirectly) called recursively when
processing co_queue_wakeup. This can lead to stack exhaustion.
This patch rewrites co_queue_wakeup in an iterative fashion (instead of
recursive) with bounded memory usage to prevent stack exhaustion.
qemu_co_queue_run_restart() is inlined into qemu_aio_coroutine_enter()
and the qemu_coroutine_enter() call is turned into a loop to avoid
recursion.
There is one change that is worth mentioning: Previously, when
coroutine A queued coroutine B, qemu_co_queue_run_restart() entered
coroutine B from coroutine A. If A was terminating then it would still
stay alive until B yielded. After this patch B is entered by A's parent
so that a A can be deleted immediately if it is terminating.
It is safe to make this change since B could never interact with A if it
was terminating anyway.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180322152834.12656-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QSIMPLEQ_CONCAT(a, b) joins a = a + b. The new QSIMPLEQ_PREPEND(a, b)
API joins a = b + a.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180322152834.12656-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit ef0e64a983 "ide: pass IDEState to trim AIO callback" changed the
IDE trim callback from using a BlockBackend to an IDEState but forgot to update
the dma_blk_io() call in hw/ide/macio.c accordingly.
Without this fix qemu-system-ppc segfaults when issuing an IDE trim command on
any of the PPC Mac machines (easily triggered by running the Debian installer).
Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Message-id: 20180223184700.28854-1-mark.cave-ayland@ilande.co.uk
Signed-off-by: John Snow <jsnow@redhat.com>
commit 947858b0 "ide: abort TRIM operation for invalid range"
is incorrect for macio; just ide_dma_error() without doing a callback
is not enough for that errorpath.
Instead, pass -EINVAL to the callback and handle it there
(see related motivation for read/write in 58ac32113).
It will however catch possible EINVAL from the block layer too.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 1520010495-58172-1-git-send-email-anton.nefedov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
The value of CCOUNT special register is calculated as time elapsed
since CCOUNT == 0 multiplied by the core frequency. In icount mode time
increment between consecutive instructions that don't involve time
warps is constant, but unless the result of multiplication of this
constant by the core frequency is a whole number the CCOUNT increment
between these instructions may not be constant. E.g. with icount=7 each
instruction takes 128ns, with core clock of 10MHz CCOUNT values for
consecutive instructions are:
502: (128 * 502 * 10000000) / 1000000000 = 642.56
503: (128 * 503 * 10000000) / 1000000000 = 643.84
504: (128 * 504 * 10000000) / 1000000000 = 645.12
I.e.the CCOUNT increments depend on the absolute time. This results in
varying CCOUNT differences for consecutive instructions in tests that
involve time warps and don't set CCOUNT explicitly.
Change frequency of the core used in tests so that clock cycle takes
exactly 64ns. Change icount power used in tests to 6, so that each
instruction takes exactly 1 clock cycle. With these changes CCOUNT
increments only depend on the number of executed instructions and that's
what timer tests expect, so they work correctly.
Longer story:
http://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg04326.html
Cc: Pavel Dovgaluk <Pavel.Dovgaluk@ispras.ru>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Change #include <xtensa-isa.h> to #include "xtensa-isa.h" in imported
files to make references to local files consistent.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Fix definitions of existing cores and core importing script to follow
the rule of naming non-top level source files.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
VMDK has a hard limitation of extent size, which is due to the size of grain
table entry is 32 bits. It means it can only point to a grain located at
offset = 2^32. To avoid writing the user data beyond limitation and record a useless offset
in grain table. We should return ERROR here.
Signed-off-by: yuchenlin <yuchenlin@synology.com>
Message-id: 20180322133337.28024-1-yuchenlin@synology.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Shared migration for dirty bitmaps is fixed by previous patches,
so we can enable the test.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180320170521.32152-5-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
On reopen with existing bitmaps, instead of loading bitmaps, lets
reopen them if needed. This also fixes bitmaps migration through
shared storage.
Consider the case. Persistent bitmaps are stored on bdrv_inactivate.
Then, on destination process_incoming_migration_bh() calls
bdrv_invalidate_cache_all() which leads to
qcow2_load_autoloading_dirty_bitmaps() which fails if bitmaps are
already loaded on destination start.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180320170521.32152-3-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Add version of qcow2_reopen_bitmaps_rw, which do the same work but
also return a hint about was header updated or not. This will be
used in the following fix for bitmaps reloading after migration.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180320170521.32152-2-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
A recent glibc change relies on the fact that the iaoq must be 3,
and computes an address based on that. QEMU had been ignoring the
priv level for user-only, which produced an incorrect address.
Reported-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
if '-w 16' was given as a cmdline args a local copy of insnmask
is set and not the global one.
Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20180319115846.9662-1-kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thomas.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJauOmSAAoJEL/70l94x66DwA4IAIfXUyWSDzAMTc19N/gY4eKB
cptfJas1CmfrMU+EBIVZoiVdYF1H5qvctxVSaCXL3y7XNfwrjfDoiplfbi9rTSKb
pW59bqIf7Y+ViOYDYHdbxKMcvWxIaiWKfpzWkncy+aeqObs620VSCbVmqVsQsKQu
1OHWrTlgNAP4aqPy9gZ6O1YXBDxTCIKW9N+QIdho5RqB1uPFkjBJcxlF04ydF9S7
kIgblBsosljTOk03I2hf6KKtfXfRXctgE/RYyE8SW3dy+CQWfiGjkE/z17ABBjK2
g7Rex6S9NA/+fDXO+2MAYnx6iBA9Dkxt2CcWWDjGwg+nXS4+B/OoF4MhRwV6N2g=
=5hGp
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Miscellaenous bugfixes, including crash fixes from Alexey, Peter M. and
Thomas.
# gpg: Signature made Mon 26 Mar 2018 13:37:38 BST
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
qemu-pr-helper: Actually allow users to specify pidfile
chardev/char-fe: Allow NULL chardev in qemu_chr_fe_init()
iothread: fix breakage on windows
scsi: turn "is this a SCSI device?" into a conditional hint
chardev-socket: remove useless if
tcg: Really fix cpu_io_recompile
vhost-user-test: add back memfd check
vhost-user-test: do not hang if chardev creation failed
scripts/device-crash-test: Remove fixed isapc-with-iommu entry
hw/audio: Fix crashes when devices are used on ISA bus without DMA
fdc: Exit if ISA controller does not support DMA
hw/net/can: Fix segfaults when using the devices without bus
WHPX improve vcpu_post_run perf
WHPX fix WHvSetPartitionProperty in PropertyCode
WHPX fix WHvGetCapability out WrittenSizeInBytes
scripts/get_maintainer.pl: Print proper error message for missing $file
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Due to wrong specification of arguments to getopt_long() any
attempt to set pidfile resulted in:
1) the default to be leaked
2) the @pidfile variable to be set to NULL (because optarg is
NULL without this patch).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <6f10cd53d361a395aa0e85a9311ec4e9a8fc11e5.1521868451.git.mprivozn@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All the functions in char-fe.c handle the CharBackend
having a NULL Chardev pointer, which means that the
backend exists but is not connected to anything. The
exception is qemu_chr_fe_init(), which will crash if
passed a NULL Chardev pointer argument. This can happen
for various boards if they're started with 'nodefaults':
arm-softmmu/qemu-system-arm -S -nodefaults -M cubieboard
riscv32-softmmu/qemu-system-riscv32 -nodefaults -M sifive_e
Make qemu_chr_fe_init() accept a NULL chardev. This allows
UART models to handle NULL chardev properties without
generally needing to special case them or to manually
create a NullChardev.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180323152948.27048-1-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
OOB can enable iothread for parsing even on Windows. We need some tunes
to enable that on Windows otherwise it'll break Windows users. This
patch fixes the breakage on Windows with qemu-system-ppc.exe.
Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Tested-by: Howard Spoelstra <hsp.cat7@gmail.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180322085630.23654-1-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the user does not have permissions to send ioctls to the device (due to
SELinux or cgroups, for example), the output can look like
qemu-kvm: -device scsi-block,drive=disk: cannot get SG_IO version number:
Operation not permitted. Is this a SCSI device?
but this is confusing because the ioctl was blocked _before_ the device
even received the SG_GET_VERSION_NUM ioctl. Therefore, for EPERM errors
the suggestion should be eliminated. To make that simpler, change the
code to use error_append_hint.
Reported-by: Ala Hino <ahino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This trips Coverity, which believes the subsequent qio_channel_create_watch
can dereference a NULL pointer. In reality, tcp_chr_connect's callers
all have s->ioc properly initialized, since they are all rooted at
tcp_chr_new_client.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have confused the number of instructions that have been
executed in the TB with the number of instructions needed
to repeat the I/O instruction.
We have used cpu_restore_state_from_tb, which means that
the guest pc is pointing to the I/O instruction. The only
time the answer to the later question is not 1 is when
MIPS or SH4 need to re-execute the branch for the delay
slot as well.
We must rely on cpu->cflags_next_tb to generate the next TB,
as otherwise we have a race condition with other guest cpus
within the TB cache.
Fixes: 0790f86861
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180319031545.29359-1-richard.henderson@linaro.org>
Tested-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This revert commit fb68096da3, and
modify test_read_guest_mem() to use different chardev names, when
using memfd (_test_server_free(), where the chardev is removed, runs
in idle).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180215212552.26997-4-marcandre.lureau@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before the chardev name fix, the following error may happen: "attempt
to add duplicate property 'chr-test' to object (type 'container')",
due to races.
Sadly, error_vprintf() uses g_test_message(), so you have to use
read the cryptic --debug-log to see it. Later, it would make sense to
use g_critical() instead, and catch errors with
g_test_expect_message() (in glib 2.34).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180215212552.26997-5-marcandre.lureau@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fixed in a0c167a184 ("x86_iommu: check
if machine has PCI bus").
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1521193892-15552-5-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The cs4231a, gus and sb16 sound cards crash QEMU when the user tries
to instantiate them on a machine with DMA-less ISA bus (for example
with "qemu-system-mips64el -M mips -device sb16"). Add proper checks
to the realize functions to avoid the crashes.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1521193892-15552-4-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A "powernv" machine type defines an ISA bus but it does not add any DMA
controller to it so it is possible to hit assert(fdctrl->dma) by
adding "-machine powernv -device isa-fdc".
This replaces assert() with an error message.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[thuth: Slightly adjusted error message and updated scripts/device-crash-test]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1521193892-15552-3-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CAN devices can currently be used to crash QEMU, e.g.:
$ x86_64-softmmu/qemu-system-x86_64 -device kvaser_pci
Segmentation fault (core dumped)
So we've got to add a proper check here that the corresponding
bus is available.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1521193892-15552-2-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This removes the additional call to WHvGetVirtualProcessorRegisters in
whpx_vcpu_post_run now that the WHV_VP_EXIT_CONTEXT is returned in all
WHV_RUN_VP_EXIT_CONTEXT structures.
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1521039163-138-4-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes a breaking change to WHvSetPartitionProperty to pass the 'in'
PropertyCode on function invocation introduced in Windows Insider SDK 17110.
Usage of this indicates the PropertyCode of the opaque PropertyBuffer passed in
on function invocation.
Also fixes the removal of the PropertyCode parameter from the
WHV_PARTITION_PROPERTY struct as it is now passed to the function directly.
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1521039163-138-3-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes a breaking change to WHvGetCapability to include the 'out'
WrittenSizeInBytes introduced in Windows Insider SDK 17110.
This specifies on return the safe length to read into the WHV_CAPABILITY
structure passed to the call.
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1521039163-138-2-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If you pass scripts/get_maintainer.pl the name of a FIFO or other
exciting object (/dev/stdin, for example), it would falsely print
"file not found". Instead: stat the object rather than using -f so
that we do not mind if the object is not a file; and print the errno
value in the error message.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Thomas Huth <thuth@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <1520535787-6223-13-git-send-email-ian.jackson@eu.citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
It's unclear what the real maximum is, but we use an uint32_t to store
the log size in vhdx_co_create(), so we should check that the given
value fits in 32 bits.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
error_setg_errno() is meant for cases where we got an errno from the OS
that can add useful extra information to an error message. It's
pointless if we pass a constant errno, these cases should use plain
error_setg().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Images with a non-power-of-two block size are invalid and cannot be
opened. Reject such block sizes when creating an image.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
It's unclear what the real maximum cluster size is for the Parallels
format, but let's at least make sure that we don't get integer
overflows in our .bdrv_co_create implementation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This tests that the .bdrv_truncate implementation for luks doesn't crash
for invalid image sizes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Commit e39e959e fixed an invalid assertion in the .bdrv_length
implementation, but left a similar assertion in place for
.bdrv_truncate. Instead of crashing when the user requests a too large
image size, fail gracefully.
A file size of exactly INT64_MAX caused failure before, but is actually
legal.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We want to test resizing even for luks. The only change that is needed
is to explicitly zero out new space for luks because it's undefined.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Use qemu_uuid_unparse() instead of uuid_unparse() to make vdi.c compile
again when CONFIG_VDI_DEBUG is set. In order to prevent future bitrot,
replace '#ifdef CONFIG_VDI_DEBUG' by 'if (VDI_DEBUG)' so that the
compiler always sees the code.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
What static=on really does is what we call metadata preallocation for
other block drivers. While we can still change the QMP interface, make
it more consistent by using 'preallocation' for VDI, too.
This doesn't implement any new functionality, so the only supported
preallocation modes are 'off' and 'metadata' for now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When we try to allocate new clusters we first look for available ones
starting from s->free_cluster_index and once we find them we increase
their reference counts. Before we get to call update_refcount() to do
this last step s->free_cluster_index is already pointing to the next
cluster after the ones we are trying to allocate.
During update_refcount() it may happen however that we also need to
allocate a new refcount block in order to store the refcounts of these
new clusters (and to complicate things further that may also require
us to grow the refcount table). After all this we don't know if the
clusters that we originally tried to allocate are still available, so
we return -EAGAIN to ask the caller to restart the search for free
clusters.
This is what can happen in a common scenario:
1) We want to allocate a new cluster and we see that cluster N is
free.
2) We try to increase N's refcount but all refcount blocks are full,
so we allocate a new one at N+1 (where s->free_cluster_index was
pointing at).
3) Once we're done we return -EAGAIN to look again for a free
cluster, but now s->free_cluster_index points at N+2, so that's
the one we allocate. Cluster N remains unallocated and we have a
hole in the qcow2 file.
This can be reproduced easily:
qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M
qemu-io -c 'write 0 124k' hd.qcow2
After this the image has 132608 bytes (256 clusters), and the refcount
block is full. If we write 512 more bytes it should allocate two new
clusters: the data cluster itself and a new refcount block.
qemu-io -c 'write 124k 512' hd.qcow2
However the image has now three new clusters (259 in total), and the
first one of them is empty (and unallocated):
dd if=hd.qcow2 bs=512c skip=256 count=1 | hexdump -C
If we write larger amounts of data in the last step instead of the 512
bytes used in this example we can create larger holes in the qcow2
file.
What this patch does is reset s->free_cluster_index to its previous
value when alloc_refcount_block() returns -EAGAIN. This way the caller
will try to allocate again the original clusters if they are still
free.
The output of iotest 026 also needs to be updated because now that
images have no holes some tests fail at a different point and the
number of leaked clusters is different.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>