-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJScnrnAAoJEH8JsnLIjy/WhhsP/2No1yEGNzfhw0WLDsEGBJI7
zjG+QkRMO4q2t256SxNr84KBFJlYKBvGrx+W8xC66AdvR1feL5hmWdXAMTJovx6Z
3Qt59RI9iISZ2OEtc9FhdsC+dSdM/3qie17XuuSCqifsi4xLjIZK/s18+RnLa0t/
nRObYP4prRl0c3o1gKaUvNz2wkIqctQAIe8UQkn6R1vPC6D60m/H9dDj4Kj68HO0
ICsF4AXBR/V2a8gU36/PGexBVyfgC4HOeuN0qNSTgYOKxLuNR+SrlzzhHE+jZTs5
GASm3vg/vUgBOO1759X5T8hveO6yu8XL82l+/d5nIK4gYGORIQZT74dyV5JgQIlF
Y47d0cF28+C/fuL1jh7c+2HY5WmmJQosMi9CaCBj0lvH0k5caEjqwPeHtRBmEyu3
1wAcLQJowZrWB5ez9MjezsaL4sPCymvB/4F443xdz5V19mE41bLZGW2EIT7MXHY7
IcwLU/opx76GMOFfWVMA7jeQkjiPaqGeaQHJzdnGUzIthqyiTigQMfi5P3nXGDic
uQi+KrqP9lNpJlZk4xGQnFovHNmKZrnLhUvqOIPk7/wKMvlU6ewdzp5Fnwzqw4MW
uJ/6eBJYolMyY+q37AH3Q6ZUkwTJi9O1drCPA0Ogr/dJiCyAiOoKuL0N74VabpcD
AahXw+yYV0qh6H4YjOzW
=wGCx
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging
Block patches for 1.7.0-rc0 (v2)
# gpg: Signature made Thu 31 Oct 2013 04:44:39 PM CET using RSA key ID C88F2FD6
# gpg: Can't check signature: public key not found
* kwolf/tags/for-anthony: (30 commits)
vmdk: Implment bdrv_get_specific_info
qapi: Add optional field 'compressed' to ImageInfo
qemu-iotests: prefill some data to test image
sheepdog: check simultaneous create in resend_aioreq
sheepdog: cancel aio requests if possible
sheepdog: make add_aio_request and send_aioreq void functions
sheepdog: try to reconnect to sheepdog after network error
coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
sheepdog: reload inode outside of resend_aioreq
sheepdog: handle vdi objects in resend_aio_req
sheepdog: check return values of qemu_co_recv/send correctly
qemu-iotests: Test case for backing file deletion
qemu-iotests: drop duplicated "create_image"
qemu-iotests: Fix 051 reference output
block: Avoid unecessary drv->bdrv_getlength() calls
block: Disable BDRV_O_COPY_ON_READ for the backing file
ahci: fix win7 hang on boot
sheepdog: pass copy_policy in the request
sheepdog: explicitly set copies as type uint8_t
block: Don't copy backing file name on error
...
Message-id: 1383064269-27720-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Implement .bdrv_get_specific_info to return the extent information.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After reconnection happens, all the inflight requests are moved to the
failed request list. As a result, sd_co_rw_vector() can send another
create request before resend_aioreq() resends a create request from
the failed list.
This patch adds a helper function check_simultaneous_create() and
checks simultaneous create requests more strictly in resend_aioreq().
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch tries to cancel aio requests in pending queue and failed
queue. When the sheepdog driver cannot cancel the requests, it waits
for them to be completed.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These functions no longer return errors. We can make them void
functions and simplify the codes.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This introduces a failed request queue and links all the inflight
requests to the list after network error happens. After QEMU
reconnects to the sheepdog server successfully, the sheepdog block
driver will retry all the requests in the failed queue.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This prepares for using resend_aioreq() after reconnecting to the
sheepdog server.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The current resend_aio_req() doesn't work when the request is against
vdi objects. This fixes the problem.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If qemu_co_recv/send doesn't return the specified length, it means
that an error happened.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The block layer generally keeps the size of an image cached in
bs->total_sectors so that it doesn't have to perform expensive
operations to get the size whenever it needs it.
This doesn't work however when using a backend that can change its size
without qemu being aware of it, i.e. passthrough of removable media like
CD-ROMs or floppy disks. For this reason, the caching is disabled when a
removable device is used.
It is obvious that checking whether the _guest_ device has removable
media isn't the right thing to do when we want to know whether the size
of the host backend can change. To make things worse, non-top-level
BlockDriverStates never have any device attached, which makes qemu
assume they are removable, so drv->bdrv_getlength() is always called on
the protocol layer. In the case of raw-posix, this causes unnecessary
lseek() system calls, which turned out to be rather expensive.
This patch completely changes the logic and disables bs->total_sectors
caching only for certain block driver types, for which a size change is
expected: host_cdrom and host_floppy on POSIX, host_device on win32; also
the raw format in case it sits on top of one of these protocols, but in
the common case the nested bdrv_getlength() call on the protocol driver
will use the cache again and avoid an expensive drv->bdrv_getlength()
call.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Currently copy_policy isn't used. Recent sheepdog supports erasure coding, which
make use of copy_policy internally, but require client explicitly passing
copy_policy from base inode to newly creately inode for snapshot related
operations.
If connected sheep daemon doesn't utilize copy_policy, passing it to sheep
daemon is just one extra null effect operation. So no compatibility problem.
With this patch, sheepdog can provide erasure coded volume for QEMU VM.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'copies' is actually uint8_t since day one, but request headers and some helper
functions parameterize it as uint32_t for unknown reasons and effectively
reserve 24 bytes for possible future use. This patch explicitly set the correct
for copies and reserve the left bytes.
This is a preparation patch that allow passing copy_policy in request header.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Opening the qcow2 image with BDRV_O_NO_FLUSH prevents any flushes during
the image creation. This means that the image has not yet been flushed
to disk when qemu-img create exits. This flush is delayed until the next
operation on the image involving opening it without BDRV_O_NO_FLUSH and
closing (or directly flushing) it. For large images and/or images with a
small cluster size and preallocated metadata, this flush may take a
significant amount of time and may occur unexpectedly.
Reopening the image without BDRV_O_NO_FLUSH right before the end of
qcow2_create2() results in hoisting the potentially costly flush into
the image creation, which is expected to take some time (whereas
successive image operations may be not).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
compatiblity -> compatibility
continously -> continuously
existance -> existence
usefull -> useful
shoudl -> should
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
this adds a check that a dynamic VHD file has not been
accidently truncated (e.g. during transfer or upload).
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Saving the VM state is done using bdrv_pwrite. This function may perform
a read-modify-write, which in this case results in data being read from
beyond the end of the virtual disk. Since we are actually trying to
access an area which is not a part of the virtual disk, zero_beyond_eof
has to be set to false before performing the partial write, otherwise
the VM state may become corrupted.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since df2a6f29a5, bdrv_co_do_writev increases the total_sectors value of
a growable block devices on writes after the current end. This leads to
the virtual disk apparently growing in qcow2_save_vmstate, which in turn
affects the disk size captured by the internal snapshot taken directly
afterwards through e.g. the HMP savevm command. Such a "grown" snapshot
cannot be loaded after reopening the qcow2 image, since its disk size
differs from the actual virtual disk size (writing a VM state does not
actually increase the virtual disk size).
Fix this by restoring total_sectors at the end of qcow2_save_vmstate.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The VMFS extent line in description file doesn't have start offset as
FLAT lines does, and it should be defaulted to 0. The flat_offset
variable is initialized to -1, so we need to set it in this case.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Previously cid of parent is parsed from image file for every IO request.
We already have L1/L2 cache and don't have assumption that parent image
can be updated behind us, so remove this to get more efficiency.
The parent CID is checked only for once after opening.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On one occasion, hdev_open() returned -1 in case of an unknown error
instead of a proper -errno value. Adjust this to match the behavior of
raw_open() (in raw-win32), which is to return -EINVAL in this case.
Also, change the call to error_setg*() to match the one in raw_open() as
well.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
An extra 'p++' after while loop when *p == '\n' will move p to unknown
data position, risking parsing junk data or memory access violation.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Convert "fprintf(stderr,..." and standardize error messages:
Remove a few local_error's and use errp.
Remove "VMDK:" or "Vmdk:" prefixes in error message and fix to upper
case.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make use of the error parameter in the opening and creating functions in
block/raw-posix.c.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make use of the error parameter in the opening and creating functions in
block/raw-win32.c.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Propagate errors in raw_create rather than directly reporting and
afterwards discarding them.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Evaluate the runtime overlap check options and set
BDRVQcowState.overlap_check appropriately.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Introduces the macros QCOW2_OL_CONSTANT and QCOW2_OL_ALL in addition to
the already existing QCOW2_OL_CACHED, signifying all metadata overlap
checks that can be performed in constant time (regardless of image size
etc.) and truly all available overlap checks, respectively.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an array which assigns the option string to its corresponding
overlap check bit.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add runtime options to tune the overlap checks to be performed before
write accesses.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Replace the QCOW2_OL_DEFAULT macro by a variable overlap_check in
BDRVQcowState.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In qcow2_check_metadata_overlap and qcow2_pre_write_overlap_check,
change the parameter signifying the checks to perform from its current
positive form to a negative one, i.e., it will no longer explicitly
specify every check to perform but rather a mask of checks not to
perform.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When trying to find a new snapshot ID, the existing ones are converted
to integers using strtoul. This function returns an unsigned long,
therefore its result should be saved in an unsigned long as well.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the new snapshot table could not be written in qcow2_snapshot_create,
the old snapshot table has to be restored in memory and the new one
released. This should include restoration of the old snapshot count as
well, which is added by this patch.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In qcow2_write_compressed, if the compression fails, a normal cluster is
written to disk. This is done through bdrv_write on the qcow2 BDS
itself (using the guest offset), thus it is wrong to do a metadata
overlap check before.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The error message in qcow2_downgrade about an unsupported refcount
order is missing a space. This patch adds it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This field is used by blkverify to disable external snapshots creation.
It will also be used by block filters like quorum to disable external
snapshot creation.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
if a raw device like an iscsi target or host device is used
the current implementation makes a second call out to get
the block status of bs->file.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_write_snapshots relies on the length of every snapshot ID and name
fitting into an unsigned 16 bit integer. This is currently ensured by
QEMU through generally only allowing 128 byte IDs and 256 byte names.
However, if this should change in the future, the length written to the
image file should not be silently truncated (though the name itself
would be written completely).
Since this is currently not an issue but might require attention due to
internal QEMU changes in the future, an assert ensuring sanity is enough
for now.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If an error occurs during qcow2_write_snapshots, the newly allocated
snapshot table clusters are leaked and should thus be freed.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_write_snapshots does contain a fail label and there is no reason
not to use it on some errors; therefore, we should always jump there on
error.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In qcow2_free_any_clusters, preallocated zero clusters should be freed
just as normal clusters are.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, qcow2_check_metadata_overlap uses bdrv_read to read inactive
L1 tables from disk. The number of sectors to read is calculated through
a truncating integer division, therefore, if the L1 table size is not a
multiple of the sector size, the final entries will not be read and
their entries in memory remain undefined (from the g_malloc).
Using bdrv_pread fixes this.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a new ImageInfoSpecificQCow2 type as a subtype of ImageInfoSpecific.
This contains the compatibility level as a string and an optional
lazy_refcounts boolean (optional means mandatory for compat >= 1.1 and
not available for compat == 0.10).
Also, add qcow2_get_specific_info, which returns this information.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a function for generically dumping the ImageInfoSpecific information
in a human-readable format to block/qapi.c.
Use this function in bdrv_image_info_dump and qemu-io-cmds.c:info_f to
allow qemu-img info resp. qemu-io -c info to print that format specific
information.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a function for retrieving an ImageInfoSpecific object from a block
driver.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Switch the string to enum type BlockJobType in BlockJobDriver.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We will use BlockJobType as the enum type name of block jobs in QAPI,
rename current BlockJobType to BlockJobDriver, which will eventually
become a set of operations, similar to block drivers.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>