vpc_open() was checking for bdrv_getlength() failure in one, but
not the other, location.
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since commit cfc87e00 "block/vpc.c: Handle write failures in
get_image_offset()" older versions of gcc (in this case 4.7) incorrectly
warn that "ret" can be used uninitialised in vpc_co_pwritev().
Setting ret to 0 at the start of vpc_co_pwritev() prevents the warning
in gcc 4.7 and enables compilation with -Werror to succeed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1500625265-23844-1-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity (CID 1355236) points out that get_image_offset() doesn't check that
it actually succeeded in writing the updated block bitmap to the file.
Check the error return from bdrv_pwrite_sync() and propagate an error
response back up to the function which calls get_image_offset() for
a write so that it can return the error to its caller.
get_sector_offset() is only used for reads, but we move it to the
same API for consistency.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_truncate() itself will pass that value to bdrv_truncate(), and all
callers of blk_truncate() just set the parameter to PREALLOC_MODE_OFF
for now.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The lone caller that cares about a return of BDRV_BLOCK_RAW
(namely, io.c:bdrv_co_get_block_status) completely replaces the
return value, so there is no point in passing BDRV_BLOCK_DATA.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This allows us to remove lots of includes of migration/migration.h
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
For one thing, this allows us to drop the error message generation from
qemu-img.c and blockdev.c and instead have it unified in
bdrv_truncate().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-3-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
blk_new_open() is a convenience function that processes flags rather
than QDict options as a simple way to just open an image file.
In order to keep it convenient in the future, it must automatically
request the necessary permissions. This can easily be inferred from the
flags for read and write, but we need another flag that tells us whether
to get the resize permission.
We can't just always request it because that means that no block jobs
can run on the resulting BlockBackend (which is something that e.g.
qemu-img commit wants to do), but we also can't request it never because
most of the .bdrv_create() implementations call blk_truncate().
The solution is to introduce another flag that is passed by all users
that want to resize the image.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This makes use of the .bdrv_child_perm() implementation for formats that
we just added. All format drivers expose the permissions they actually
need nows, so that they can be set accordingly and updated when parents
are attached or detached.
The only format not included here is raw, which was already converted
with the other filter drivers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
The way that attaching bs->file worked was a bit unusual in that it was
the only child that would be attached to a node which is not opened yet.
Because of this, the block layer couldn't know yet which permissions the
driver would eventually need.
This patch moves the point where bs->file is attached to the beginning
of the individual .bdrv_open() implementations, so drivers already know
what they are going to do with the child. This is also more consistent
with how driver-specific children work.
For a moment, bdrv_open() gets its own BdrvChild to perform image
probing, but instead of directly assigning this BdrvChild to the BDS, it
becomes a temporary one and the node name is passed as an option to the
drivers, so that they can simply use bdrv_open_child() to create another
reference for their own use.
This duplicated child for (the not opened yet) bs is not the final
state, a follow-up patch will change the image probing code to use a
BlockBackend, which is completely independent of bs.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
If a migration is already in progress and somebody attempts
to add a migration blocker, this should rightly fail.
Add an errp parameter and a retcode return value to migrate_add_blocker.
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1484566314-3987-5-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Merged with recent 'Allow invtsc migration' change
Previously we conditionally generated footer->uuid, when libuuid was
available. Now that we have a built-in implementation, we can switch to
it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-6-git-send-email-famz@redhat.com>
This is the final patch for converting the common I/O path to take
a BdrvChild parameter instead of BlockDriverState.
The completion of this conversion means that all users that perform I/O
on an image need to actually hold a reference (in the form of BdrvChild,
possible as part of a BlockBackend) to that image. This also protects
against inconsistent use of BlockBackend vs. BlockDriverState functions
because direct use of a BlockDriverState isn't possible any more and
blk->root is private for block-backends.c.
In addition, we can now distinguish different users in the I/O path,
and the future op blockers work is going to add assertions based on
permissions stored in BdrvChild.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Move it to the actual users. There are still a few includes of
qemu/bswap.h in headers; removing them is left for future work.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have several block drivers that understand BDRV_REQ_FUA,
and emulate it in the block layer for the rest by a full flush.
But without a way to actually request BDRV_REQ_FUA during a
pass-through blk_pwrite(), FUA-aware block drivers like NBD are
forced to repeat the emulation logic of a full flush regardless
of whether the backend they are writing to could do it more
efficiently.
This patch just wires up a flags argument; followup patches
will actually make use of it in the NBD driver and in qemu-io.
Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add more useful error information to failure paths in vpc_open
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The check on the max_table_size field not being larger than required is
valid, and in accordance with the VHD spec. However, there have been
VHD images encountered in the wild that have an out-of-spec max table
size that is technically too large.
There is no issue in allowing this larger table size, as we also
later verify that the computed size (used for the pagetable) is
large enough to fit all sectors. In addition, max_table_entries
is bounds checked against SIZE_MAX and INT_MAX.
Remove the strict check, so that we can accomodate these sorts of
images that are benignly out of spec.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Reported-by: Grant Wu <grantwwu@gmail.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The old VHD_MAX_SECTORS value is incorrect, and is a throwback
to the CHS calculations. The VHD specification allows images up to 2040
GiB, which (using 512 byte sectors) corresponds to a maximum number of
sectors of 0xff000000, rather than the old value of 0xfe0001ff.
Update VHD_MAX_SECTORS to reflect the correct value.
Also, update comment references to the actual size limit, and correct
one compare so that we can have sizes up to the limit.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
XenConverter VHD images are another VHD image where current_size is
different from the CHS values in the the format header. Use
current_size as the default, by looking at the creator_app signature
field.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The vpc driver has two methods of determining virtual disk size. The
correct one to use depends on the software that generated the image
file. Add the XenServer creator_app signature so that image size is
correctly detected for those images.
Reported-by: Grant Wu <grantwwu@gmail.com>
Reported-by: Spencer Baugh <sbaugh@catern.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add more useful error information to failure paths in vpc_create().
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_pwrite_sync used to return zero or negative error, while blk_pwrite returns
the number of written bytes when successful. This caused VPC image creation
to fail spectacularly: it wrote the first 512 bytes, and then exited immediately
because of the non-zero answer from blk_pwrite. But the truly spectacular part
is that it returns a positive value (the 512 that blk_pwrite returned) causing
everyone to believe that it succeeded.
This fixes qemu-iotests with vpc format.
Fixes: b8f45cdf78
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers of blk_new_open() either don't rely on the WCE bit set after
blk_new_open() because they explicitly set it anyway, or they pass
BDRV_O_CACHE_WB unconditionally.
This patch changes blk_new_open() so that it always enables writeback
mode and asserts that BDRV_O_CACHE_WB is clear. For those callers that
used to pass BDRV_O_CACHE_WB unconditionally, the flag is removed now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this patch, blk_new() automatically assigned a name to the new
BlockBackend and considered it referenced by the monitor. This patch
removes the implicit monitor_add_blk() call from blk_new() (and
consequently the monitor_remove_blk() call from blk_delete(), too) and
thus blk_new() (and related functions) no longer take a BB name
argument.
In fact, there is only a single point where blk_new()/blk_new_open() is
called and the new BB is monitor-owned, and that is in blockdev_init().
Besides thus relieving us from having to invent names for all of the BBs
we use in qemu-img, this fixes a bug where qemu cannot create a new
image if there already is a monitor-owned BB named "image".
If a BB and its BDS tree are created in a single operation, as of this
patch the BDS tree will be created before the BB is given a name
(whereas it was the other way around before). This results in minor
change to the output of iotest 087, whose reference output is amended
accordingly.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When QEMU creates a VHD image, it goes by the original spec,
calculating the current_size based on the nearest CHS geometry (with an
exception for disks > 127GB).
Apparently, Azure will only allow images that are sized to the nearest
MB, and the current_size as calculated from CHS cannot guarantee that.
Allow QEMU to create images similar to how Hyper-V creates images, by
setting current_size to the specified virtual disk size. This
introduces an option, force_size, to be passed to the vpc format during
image creation, e.g.:
qemu-img convert -f raw -o force_size -O vpc test.img test.vhd
When using the "force_size" option, the creator app field used by
QEMU will be "qem2" instead of "qemu", to indicate the difference.
In light of this, we also add parsing of the "qem2" field during
vpc_open.
Bug reference: https://bugs.launchpad.net/qemu/+bug/1490611
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The VHD file format is used by both Virtual PC, and Hyper-V. However,
how the virtual disk size is calculated varies between the two.
Virtual PC uses the CHS drive parameters to determine the drive size.
Hyper-V, on the other hand, uses the current_size field in the footer
when determining image size.
This is problematic for a few reasons:
* VHD images from Hyper-V, using CHS calculations, will likely be
trunctated.
* If we just rely always on current_size, then QEMU may have data
compatibility issues with Virtual PC (we may write too much data
into a VHD file to be used by Virtual PC, for instance).
* Existing VHD images created by QEMU have used the CHS calculations,
except for images exceeding the 127GB limit. We want to remain
compatible with our own generated images.
Luckily, the VHD specification defines a 'Creator App' field, that is
used to indicate what software created the VHD file.
This patch does two things:
1. Uses the 'Creator App' field to help determine how to calculate
size, and
2. Adds a VPC format option 'force_size_calc', so that the user can
override the 'Creator App' auto-detection, in case there exist
VHD images with unknown or contradictory 'Creator App' entries.
N.B.: We currently use the maximum CHS value as an indication to use the
current_size field. This patch does not change that, even with the
'force_size_calc' option.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The added parameter can be used to return the BDS pointer which the
valid offset is referring to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID in ret is set.
Until block drivers fill in the right value, let's clear it explicitly
right before calling .bdrv_get_block_status.
The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
case 102 passing, and will be fixed once all drivers return the right file
pointer.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-2-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch removes the temporary duplication between bs->file and
bs->file_child by converting everything to BdrvChild.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that this parameter is effectively unused, we can drop it and just
pass NULL on to bdrv_open_inherit().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When we allocate the pagetable based on max_table_entries, we multiply
the max table entry value by 4 to accomodate a table of 32-bit integers.
However, max_table_entries is a uint32_t, and the VPC driver accepts
ranges for that entry over 0x40000000. So during this allocation:
s->pagetable = qemu_try_blockalign(bs->file, s->max_table_entries * 4);
The size arg overflows, allocating significantly less memory than
expected.
Since qemu_try_blockalign() size argument is size_t, cast the
multiplication correctly to prevent overflow.
The value of "max_table_entries * 4" is used elsewhere in the code as
well, so store the correct value for use in all those cases.
We also check the Max Tables Entries value, to make sure that it is <
SIZE_MAX / 4, so we know the pagetable size will fit in size_t.
Cc: qemu-stable@nongnu.org
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are several error messages that identify a BlockDriverState by
its device name. However those errors can be produced in nodes that
don't have a device name associated.
In those cases we should use bdrv_get_device_or_node_name() to fall
back to the node name and produce a more meaningful message. The
messages are also updated to use the more generic term 'node' instead
of 'device'.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 9823a1f0514fdb0692e92868661c38a9e00a12d6.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The code to check the bitmap for the allocation status of each sector
has been "disabled by reason" ever since the vpc driver existed.
The reason might be that we might end up reading sector by sector
in vpc_read if we really used it. This would be a performance desaster.
The current code would furthermore not work if the disabled parts get
reactivated since vpc_read and vpc_write only use get_sector_offset to
check the allocation status of the first sector of a read/write operation.
This might lead to sectors incorrectly treated as zero in vpc_read and
to sectors getting allocated twice in vpc_write.
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1425379316-19639-6-git-send-email-pl@kamp.de
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
the field is named current size in the spec. Name it accordingly.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1425379316-19639-5-git-send-email-pl@kamp.de
Signed-off-by: Max Reitz <mreitz@redhat.com>
The VHD spec [1] allows for total_sectors of 65535 x 16 x 255 (~127GB)
represented by a CHS geometry. If total_sectors is greater
than 65535 x 16 x 255 this geometry is set as a maximum.
Qemu, Hyper-V and disk2vhd use this special geometry as an indicator
to use the image current size from the footer as disk size.
This patch changes vpc_create to effectively calculate a CxHxS geometry
for the given image size if possible while rounding up if necessary.
If the image size is too big to be represented in CHS we set the maximum
and write the exact requested image size into the footer.
This partly reverts commit 258d2edb, but leaves support for >127G disks
intact.
[1] http://download.microsoft.com/download/f/f/e/ffef50a5-07dd-4cf8-aaa3-442c0673a029/Virtual%20Hard%20Disk%20Format%20Spec_10_18_06.doc
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1425379316-19639-4-git-send-email-pl@kamp.de
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The CHS calculation as done per the VHD spec imposes a maximum image
size of ~127 GB. Real VHD images exist that are larger than that.
Apparently there are two separate non-standard ways to achieve this:
You could use more heads than the spec does - this is the option that
qemu-img create chooses.
However, other images exist where the geometry is set to the maximum
(65535/16/255), but the actual image size is larger. Until now, such
images are truncated at 127 GB when opening them with qemu.
This patch changes the vpc driver to ignore geometry in this case and
only trust the size field in the header.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[PL: Fixed maximum geometry in the commit msg]
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1425379316-19639-3-git-send-email-pl@kamp.de
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
*pnum can't be greater than s->block_size / BDRV_SECTOR_SIZE for allocated
sectors since there is always a bitmap in between.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1425379316-19639-2-git-send-email-pl@kamp.de
Signed-off-by: Max Reitz <mreitz@redhat.com>
This implements bdrv_co_get_block_status() for VHD images. This can
significantly speed up qemu-img convert operation because only with this
function implemented sparseness can be considered. (Before, converting a
1 TB empty image took several minutes for me, now it's instantaneous.)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
If total_sectors is rounded to match the geometry, total_size needs to
be changed as well. Otherwise we end up with an image whose geometry
describes a disk larger than the image file, which doesn't end well.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
In commit fef6070, the need for NOCOW was removed from the vpc driver,
as we removed the the posix calls. However, the BLOCK_OPT_NOCOW was not
removed from vpc_create_opts. This was a mistake - remove the opt from
there as well.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 8ba076fa725fed681cde7d8afc4fb239ae06a9c6.1417620301.git.jcody@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
device_name[] can become non-empty only in bdrv_new_root() and
bdrv_move_feature_fields(). The latter is used only to undo damage
done by bdrv_swap(). The former is called only by blk_new_with_bs().
Therefore, when a BlockDriverState's device_name[] is non-empty, then
it's been created with a BlockBackend, and vice versa. Furthermore,
blk_new_with_bs() keeps the two names equal.
Therefore, device_name[] is redundant. Eliminate it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The beX_to_cpu() and cpu_to_beX() functions perform the same operation -
they do a byteswap if the host CPU endianness is little-endian or a
nothing otherwise.
The point of two names for the same operation is that it documents which
direction the data is being converted. This makes it clear whether the
data is suitable for CPU processing or in its external representation.
This patch fixes incorrect beX_to_cpu()/cpu_to_beX() usage.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently the file size requested by user is rounded down to nearest
sector, causing the actual file size could be a bit less than the size
user requested. Since some formats (like qcow2) record virtual disk
size in bytes, this can make the last few bytes cannot be accessed.
This patch fixes it by rounding up file size to nearest sector so that
the actual file size is no less than the requested file size.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
cpu_to_be32() is wrong since vhd_type is an enum constant
(just a regular CPU-endian integer).
Signed-off-by: Xiaodong Gong <gordongong0350@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.
This patch addresses the allocations in the vpc block driver.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Use the block layer to create, and write to, the image file in the VPC
.bdrv_create() operation.
This has a couple of benefits: Images can now be created over protocols,
and hacks such as NOCOW are not needed in the image format driver, and
the underlying file protocol appropriate for the host OS can be relied
upon.
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add 'nocow' option so that users could have a chance to set NOCOW flag to
newly created files. It's useful on btrfs file system to enhance performance.
Btrfs has low performance when hosting VM images, even more when the guest
in those VM are also using btrfs as file system. One way to mitigate this bad
performance is to turn off COW attributes on VM files. Generally, there are
two ways to turn off NOCOW on btrfs: a) by mounting fs with nodatacow, then
all newly created files will be NOCOW. b) per file. Add the NOCOW file
attribute. It could only be done to empty or new files.
This patch tries the second way, according to the option, it could add NOCOW
per file.
For most block drivers, since the create file step is in raw-posix.c, so we
can do setting NOCOW flag ioctl in raw-posix.c only.
But there are some exceptions, like block/vpc.c and block/vdi.c, they are
creating file by calling qemu_open directly. For them, do the same setting
NOCOW flag ioctl work in them separately.
[Fixed up 082.out due to the new 'nocow' creation option
--Stefan]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that all backend drivers are using QemuOpts, remove all
QEMUOptionParameter related codes.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This fixes some cases of division by zero crashes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This adds checks to make sure that max_table_entries and block_size
are in sane ranges. Memory is allocated based on max_table_entries,
and block_size is used to calculate indices into that allocated
memory, so if these values are incorrect that can lead to potential
unbounded memory allocation, or invalid memory accesses.
Also, the allocation of the pagetable is changed from g_malloc0()
to qemu_blockalign().
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Returning "Wrong medium type" for an image that does not have a valid
header is a bit weird. Improve the error by mentioning what format
was trying to open it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
this adds a check that a dynamic VHD file has not been
accidently truncated (e.g. during transfer or upload).
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The VHD footer and header structs (vhd_footer and vhd_dyndisk_header)
are on-disk structures for the image format, and as such should be
packed.
Go ahead and make these typedefs as well, with the preferred QEMU
naming convention, so that the packed attribute is used consistently
with the struct.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an Error ** parameter to BlockDriver.bdrv_open and
BlockDriver.bdrv_file_open to allow more specific error messages.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Depending on the subformat, has_zero_init on VHD must behave like raw
and query the underlying storage (fixed) or like other sparse formats
that can always return 1 (dynamic, differencing).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This reverts commit f880defbb0.
Jeff Cody's testing revealed that the interpretation of size differs
even between VirtualPC and HyperV. Revert this so there is time to
consider the impact of any backwards incompatible behavior this change
creates.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The size calculated from the CHS values is not the real image (disk) size,
but usually a smaller value. This is caused by rounding effects.
Only older operating systems use CHS. Such guests won't be able to use
the whole disk. All modern operating systems use the real size.
This patch fixes https://bugs.launchpad.net/qemu/+bug/1105670/.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1360265212-22037-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Return -errno instead of -1 on errors. While touching the
code, fix a memory leak.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The VHD specification allows for up to a 2 TB disk size. The current
implementation in qemu emulates EIDE and ATA-2 hardware which only allows
for up to 127 GB. This disk size limitation can be overridden by allowing
up to 255 heads instead of the normal 4 bit limitation of 16. Doing so
allows disk images to be created of up to nearly 2 TB. This change does
not violate the VHD format specification nor does it change how smaller
disks (ie, <=127GB) are defined.
[Charles Arnold also writes: "In analyzing a 160 GB VHD fixed disk image
created on Windows 2008 R2, it appears that MS is also ignoring the CHS
values in the footer geometry field in whatever driver they use for
accessing the image. The CHS values are set at 65535,16,255 which
obviously doesn't represent an image size of 160 GB." -- Stefan]
Signed-off-by: Charles Arnold <carnold@suse.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Initialize the uuid field in the footer with a generated uuid.
Signed-off-by: Charles Arnold <carnold@suse.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There is currently nothing that needs to be done for VPC image
file reopen.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch converts all block layer close calls, that correspond
to qemu_open calls, to qemu_close.
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch converts all block layer open calls to qemu_open.
Note that this adds the O_CLOEXEC flag to the changed open paths
when the O_CLOEXEC macro is defined.
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After validation check, the 'checksum' is not written back
to footer, which leave it with zero.
This results in errors while loadding it under Microsoft's
Hyper-V environment, and also errors from utilities like
Citrix's vhd-util.
Signed-off-by: Zhang Shengju <sean_zhang@trendmicro.com.cn>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The geometry calculation algorithm from the VHD spec rounds the image
size down if it doesn't exactly match a geometry. During image
conversion, this causes the image to be truncated. For dynamic images,
we already have code in place to round up instead, let's do the same for
fixed images.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The Virtual Hard Disk Image Format Specification allows for three
types of hard disk formats, Fixed, Dynamic, and Differencing. Qemu
currently only supports Dynamic disks. This patch adds support for
the Fixed Disk format.
Usage:
Example 1: qemu-img create -f vpc -o type=fixed <filename> [size]
Example 2: qemu-img convert -O vpc -o type=fixed <input filename> <output filename>
While it is also allowed to specify '-o type=dynamic', the default disk type
remains Dynamic and is what is used when the type is left unspecified.
Signed-off-by: Charles Arnold <carnold@suse.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are two different types of flush that you can do: Flushing one level up
to the OS (i.e. writing data to the host page cache) or flushing it all the way
down to the disk. The existing functions flush to the disk, reflect this in the
function name.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The Data Offset field in the Dynamic Disk Header is an 8 byte field.
Although the specification (2006-10-11) gives an example of initializing
only the first 4 bytes, images generated by Microsoft on Windows initialize
all 8 bytes.
Failure to initialize all 8 bytes results in errors from utilities
like Citrix's vhd-util which checks specifically for the proper Data
Offset field initialization.
Signed-off-by: Charles Arnold <carnold@suse.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since coroutine operation is now mandatory, convert all bdrv_flush
implementations to coroutines. For qcow2, this means taking the lock.
Other implementations are simpler and just forward bdrv_flush to the
underlying protocol, so they can avoid the lock.
The bdrv_flush callback is then unused and can be eliminated.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This does the first part of the conversion to coroutines, by
wrapping bdrv_write implementations to take the mutex.
Drivers that implement bdrv_write rather than bdrv_co_writev can
then benefit from asynchronous operation (at least if the underlying
protocol supports it, which is not the case for raw-win32), even
though they still operate with a bounce buffer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This does the first part of the conversion to coroutines, by
wrapping bdrv_read implementations to take the mutex.
Drivers that implement bdrv_read rather than bdrv_co_readv can
then benefit from asynchronous operation (at least if the underlying
protocol supports it, which is not the case for raw-win32), even
though they still operate with a bounce buffer.
raw-win32 does not need the lock, because it cannot yield.
nbd also doesn't probably, but better be safe.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The big conversion of bdrv_read/write to coroutines caused the two
homonymous callbacks in BlockDriver to become reentrant. It goes
like this:
1) bdrv_read is now called in a coroutine, and calls bdrv_read or
bdrv_pread.
2) the nested bdrv_read goes through the fast path in bdrv_rw_co_entry;
3) in the common case when the protocol is file, bdrv_co_do_readv calls
bdrv_co_readv_em (and from here goes to bdrv_co_io_em), which yields
until the AIO operation is complete;
4) if bdrv_read had been called from a bottom half, the main loop
is free to iterate again: a device model or another bottom half
can then come and call bdrv_read again.
This applies to all four of read/write/flush/discard. It would also
apply to is_allocated, but it is not used from within coroutines:
besides qemu-img.c and qemu-io.c, which operate synchronously, the
only user is the monitor. Copy-on-read will introduce a use in the
block layer, and will require converting it.
The solution is "simply" to convert all drivers to coroutines! We
just need to add a CoMutex that is taken around affected operations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VHD files technically can be up to 2Tb, but virtual pc is limited
to 127G. Currently qemu-img refused to create vpc files > 127G,
but it is failing to return error when converting from a non-vpc
VHD file which is >127G. It returns success, but creates a truncated
converted image. Also, qemu-img info claims the vpc file is 127G
(and clean).
This patch detects a too-large vpc file and returns -EFBIG. Without
this patch,
=============================================================
root@ip-10-38-123-242:~/qemu-fixed# qemu-img info /mnt/140g-dynamic.vhd
image: /mnt/140g-dynamic.vhd
file format: vpc
virtual size: 127G (136899993600 bytes)
disk size: 284K
root@ip-10-38-123-242:~/qemu-fixed# qemu-img convert -f vpc -O raw /mnt/140g-dynamic.vhd /mnt/y
root@ip-10-38-123-242:~/qemu-fixed# echo $?
0
root@ip-10-38-123-242:~/qemu-fixed# qemu-img info /mnt/y
image: /mnt/y
file format: raw
virtual size: 127G (136899993600 bytes)
disk size: 0
=============================================================
(The 140G image was truncated with no warning or error.)
With the patch, I get:
=============================================================
root@ip-10-38-123-242:~/qemu-fixed# ./qemu-img info /mnt/140g-dynamic.vhd
qemu-img: Could not open '/mnt/140g-dynamic.vhd': File too large
root@ip-10-38-123-242:~/qemu-fixed# ./qemu-img convert -f vpc -O raw /mnt/140g-dynamic.vhd /mnt/y
qemu-img: Could not open '/mnt/140g-dynamic.vhd': File too large
qemu-img: Could not open '/mnt/140g-dynamic.vhd'
=============================================================
See https://bugs.launchpad.net/qemu/+bug/814222 for details.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use get_option_parameter() to instead of duplicating the loop, and
use BDRV_SECTOR_SIZE to instead of 512
Signed-off-by: Mitnick Lyu <mitnick.lyu@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All drivers use bs->file instead of s->hd for quite a while now, so it's time
to remove s->hd.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>