This is derived from Supriya Kannery's reopen patches.
This contains the raw-posix driver changes for the bdrv_reopen_*
functions. All changes are staged into a temporary scratch buffer
during the prepare() stage, and copied over to the live structure
during commit(). Upon abort(), all changes are abandoned, and the
live structures are unmodified.
The _prepare() will create an extra fd - either by means of a dup,
if possible, or opening a new fd if not (for instance, access
control changes). Upon _commit(), the original fd is closed and
the new fd is used. Upon _abort(), the duplicate/new fd is closed.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
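The prepare/commit/abort flow for the reopen descriptor can be sketched roughly as follows. This is a minimal illustration, not the raw-posix code: the ExampleRawState and ExampleReopenState types and the example_* functions are hypothetical, and only the dup() path is shown (the case where a fresh open() is needed is left out).

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical, simplified stand-ins for the driver state; not QEMU's types. */
typedef struct ExampleRawState {
    int fd;                 /* live file descriptor */
} ExampleRawState;

typedef struct ExampleReopenState {
    int fd;                 /* descriptor staged by prepare, -1 if none */
} ExampleReopenState;

/* Stage a new descriptor without touching the live state. */
static int example_reopen_prepare(const ExampleRawState *s,
                                  ExampleReopenState *rs)
{
    assert(s != NULL && rs != NULL);
    rs->fd = dup(s->fd);    /* a real driver may have to open() anew */
    return rs->fd < 0 ? -errno : 0;
}

/* Make the staged descriptor live and close the original one. */
static void example_reopen_commit(ExampleRawState *s, ExampleReopenState *rs)
{
    close(s->fd);
    s->fd = rs->fd;
    rs->fd = -1;
}

/* Drop the staged descriptor; the live state stays unmodified. */
static void example_reopen_abort(ExampleReopenState *rs)
{
    if (rs->fd >= 0) {
        close(rs->fd);
        rs->fd = -1;
    }
}
```

Keeping the staged descriptor in a separate structure is what lets abort leave the live state untouched.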
The aligned_buf pointer and aligned_buf_size are no longer used in
raw-posix.c, so remove all references to them.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Rather than check for a non-NULL aligned_buf to determine if
raw_aio_submit needs to check for alignment, check for the presence
of BDRV_O_NOCACHE in the bs->open_flags.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
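A minimal sketch of deciding on the alignment check from the open flags rather than from a buffer pointer. The EXAMPLE_O_NOCACHE value and the helper name are assumptions made for illustration; the real constant is BDRV_O_NOCACHE from QEMU's block headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag value only; stands in for BDRV_O_NOCACHE. */
#define EXAMPLE_O_NOCACHE 0x0020

/* Alignment only needs checking when the host page cache is bypassed. */
static bool example_needs_alignment_check(int open_flags)
{
    assert(open_flags >= 0);
    return (open_flags & EXAMPLE_O_NOCACHE) != 0;
}
```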
Block drivers should ignore BDRV_O_CACHE_WB in .bdrv_open flags,
and in the bs->open_flags.
This patch removes the code, leaving the behaviour behind as if
BDRV_O_CACHE_WB was set.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Code motion, to move parsing of open flags into a helper function.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Move AIO initialization for raw-posix block driver into a helper function.
In addition to just code motion, the aio_ctx pointer is checked for NULL,
prior to calling laio_init(), to make sure laio_init() is only run once.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
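The run-once guard on the AIO context might look something like this. It is a sketch under assumptions: ExampleState, example_aio_init() and example_set_aio() are hypothetical names, and example_aio_init() merely stands in for laio_init().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical driver state; aio_ctx is NULL until AIO has been set up. */
typedef struct ExampleState {
    void *aio_ctx;
    int use_aio;
} ExampleState;

static int example_init_calls;   /* counts how often the init ran */

/* Stand-in for laio_init(); allocates a dummy context. */
static void *example_aio_init(void)
{
    example_init_calls++;
    return malloc(1);
}

/* Helper that may be called repeatedly; the init runs only once. */
static int example_set_aio(ExampleState *s)
{
    assert(s != NULL);
    if (s->aio_ctx == NULL) {
        s->aio_ctx = example_aio_init();
        if (s->aio_ctx == NULL) {
            return -1;
        }
    }
    s->use_aio = 1;
    return 0;
}
```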
This patch converts all block layer close calls that correspond
to qemu_open calls to qemu_close.
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch converts all block layer open calls to qemu_open.
Note that this adds the O_CLOEXEC flag to the changed open paths
when the O_CLOEXEC macro is defined.
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Move the declaration of s into the #ifdef sections that actually make
use of it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Either FIEMAP or SEEK_DATA+SEEK_HOLE can be used to implement the
is_allocated callback for raw files. On Linux, ext4, btrfs and XFS
all support it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
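As a rough sketch of the SEEK_DATA approach to an is_allocated query, assuming a hypothetical helper example_is_allocated() that only answers for a single offset (the FIEMAP variant and the length of the allocated run are not shown). Where the host cannot answer, the sketch conservatively reports the offset as allocated.

```c
#define _GNU_SOURCE             /* SEEK_DATA / SEEK_HOLE on glibc */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if the byte at 'start' holds data, 0 if it lies in a hole
 * or past the end of the file, or a negative errno value on failure.
 */
static int example_is_allocated(int fd, off_t start)
{
    assert(start >= 0);
#if defined(SEEK_DATA) && defined(SEEK_HOLE)
    off_t data = lseek(fd, start, SEEK_DATA);
    if (data < 0) {
        if (errno == ENXIO) {
            return 0;           /* no data at or after 'start' */
        }
        if (errno == EINVAL || errno == ENOTSUP) {
            return 1;           /* not supported here: assume allocated */
        }
        return -errno;
    }
    return data == start;       /* data begins exactly at 'start'? */
#else
    (void)fd;
    return 1;                   /* host cannot tell: assume allocated */
#endif
}
```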
Use __APPLE__ and __MACH__ macros instead of CONFIG_COCOA to detect Mac
OS X host. The patch is based on Ben Leslie's patch:
http://patchwork.ozlabs.org/patch/97859/
Signed-off-by: Ben Leslie <benno@benno.id.au>
Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Since coroutine operation is now mandatory, convert both bdrv_discard
implementations to coroutines. For qcow2, this means taking the lock
around the operation. raw-posix remains synchronous.
The bdrv_discard callback is then unused and can be eliminated.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Block drivers now only need to provide one of .bdrv_co_flush(),
.bdrv_aio_flush() or, for legacy drivers, .bdrv_flush(). Remove
the redundant .bdrv_flush() implementations.
[Paolo Bonzini: change raw driver to bdrv_co_flush]
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Block drivers only need to provide one of sync, aio, or coroutine
interfaces. Since raw-posix.c provides aio interfaces, simply drop the
synchronous interfaces since they can be emulated using aio and
coroutines.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Allow resizing images that reside on host devices up to the available
space. This makes it possible to grow images after resizing the device
manually, or vice versa.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
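The bound on resizing a host-device-backed image can be expressed as a simple check. This is a sketch only: example_check_device_resize() is a hypothetical helper, and the device size is assumed to have been queried by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Returns 0 if the image may be resized to 'requested' bytes, else -EINVAL. */
static int example_check_device_resize(int64_t requested, int64_t device_size)
{
    assert(device_size >= 0);
    if (requested < 0 || requested > device_size) {
        return -EINVAL;     /* cannot exceed the space the device offers */
    }
    return 0;
}
```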
Add some notes about Linux AIO explaining why we don't use AIO in
some situations.
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callees always return 0, except for FreeBSD's cdrom_eject(), which
returns -ENOTSUP when the device is in a terminally wedged state.
The only caller is bdrv_eject(), and it maps -ENOTSUP to 0 since
commit 4be9762a.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The only caller is bdrv_set_locked(), and it ignores the value.
Callees always return 0, except for FreeBSD's cdrom_set_locked(),
which returns -ENOTSUP when the device is in a terminally wedged
state.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu-img.c needs to report the allocated file size of an image.
Previously it measured a single bs->file using stat() or the Windows
API. Now that VMDK supports multiple files, the operation is both
format specific and platform specific.
The functions are moved to block/raw-{posix,win32}.c, and qemu-img.c
calls bdrv_get_allocated_file_size to measure the bs. VMDK code is also
added to count its own extents.
Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On Linux x86_64 host with 32bit userspace, running
qemu or even just "qemu-img create -f qcow2 some.img 1G"
causes a kernel warning:
ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(00005326){t:'S';sz:0} arg(7fffffff) on some.img
ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(801c0204){t:02;sz:28} arg(fff77350) on some.img
ioctl 00005326 is CDROM_DRIVE_STATUS,
ioctl 801c0204 is FDGETPRM.
The warning appears because the Linux compat-ioctl handler for these
ioctls only applies to block devices, while qemu also uses the ioctls on
plain files. Work around by calling fstat() to ensure the ioctls are
only used on block devices.
Signed-off-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use the correct way to get the size of a disk device or partition.
From: Adam Hamsik <haad@netbsd.org>
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On NetBSD, a userland process is better served by the character device
interface. In addition, a block device can't be opened twice; if a Xen
backend opens it, qemu can't, and vice versa.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Change BDRV_O_NOCACHE to only imply bypassing the host OS file cache,
but no writeback semantics. All existing callers are changed to also
specify BDRV_O_CACHE_WB to give them writeback semantics.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
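One way to picture the split between cache bypass and writeback semantics is a mapping from block-layer flags to host open flags. The sketch below is an assumption-laden illustration: the EXAMPLE_* flag values are made up, the helper name is hypothetical, and O_DIRECT is applied only where the host defines it.

```c
#define _GNU_SOURCE             /* O_DIRECT on glibc */
#include <assert.h>
#include <fcntl.h>

/* Illustrative flag values; not taken from QEMU's headers. */
#define EXAMPLE_O_NOCACHE  0x1
#define EXAMPLE_O_CACHE_WB 0x2

/* Translate block-layer cache flags into host open(2) flags. */
static int example_host_open_flags(int bdrv_flags)
{
    int flags = 0;

    assert(bdrv_flags >= 0);
#ifdef O_DIRECT
    if (bdrv_flags & EXAMPLE_O_NOCACHE) {
        flags |= O_DIRECT;      /* bypass the host page cache only */
    }
#endif
    if (!(bdrv_flags & EXAMPLE_O_CACHE_WB)) {
        flags |= O_DSYNC;       /* no writeback requested: write through */
    }
    return flags;
}
```

With this mapping, a caller that wants the old behaviour has to pass both flags, which matches the change made to the existing callers.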
This patch removes all references to signal.h when qemu-common.h is included
as they become redundant.
Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Add support to discard blocks in a raw image residing on an XFS filesystem
by calling the XFS_IOC_UNRESVSP64 ioctl to punch holes. Support for other
hole punching mechanisms can be added when they become available.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This changes bdrv_flush to return 0 on success and -errno in case of failure.
It's a requirement for implementing proper error handling in users of bdrv_flush.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
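The 0 / -errno convention for a flush can be sketched as below. example_flush() is a hypothetical helper and uses plain fsync() as a stand-in for whatever the driver actually calls.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Flush 'fd' to stable storage; returns 0 on success, -errno on failure. */
static int example_flush(int fd)
{
    int ret = fsync(fd) < 0 ? -errno : 0;

    assert(ret <= 0);           /* contract: zero or a negative errno */
    return ret;
}
```

A caller can then pass the negated value to strerror() to report why the flush failed.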
Move timer init functions to a new file, qemu-timer-common.c. Make other
critical timer functions inlined to preserve performance in
qemu-timer.c, also move muldiv64() (used by the inline functions)
to qemu-timer.h.
Adjust block/raw-posix.c and simpletrace.c to use get_clock() directly.
Remove a similar/duplicate definition in qemu-tool.c.
Adjust hw/omap_clk.c to include qemu-timer.h because muldiv64() is used
there.
After this change, tracing can also be used for user code, and
simpletrace can be used on Win32.
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Replace the hardcoded handling of 512 byte alignment with bs->buffer_alignment
to handle larger sector size devices correctly.
Note that we cannot rely on it being initialized in bdrv_open, so deal
with the worst case there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
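An alignment check parameterised on the device's alignment, instead of a hardcoded 512, could look like this. example_buf_is_aligned() is a hypothetical helper; the alignment argument plays the role of bs->buffer_alignment and is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if both the buffer address and its length honour 'alignment'. */
static bool example_buf_is_aligned(const void *buf, size_t len,
                                   size_t alignment)
{
    /* alignment must be a non-zero power of two */
    assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)buf % alignment) == 0 && (len % alignment) == 0;
}
```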
Allow symbolic links which point to /dev/sgX devices.
Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On Linux, we have code to detect CD-ROMs using an ioctl. We shouldn't lose
anything but false positives by removing the check for a /dev/cd* path.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Assuming that any image on a block device is not properly zero-initialized is
actually wrong: Only raw images have this problem. Any other image format
shouldn't care about it; they initialize everything properly themselves.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no need to have a second set of integral types.
Replace them by the standard types from stdint.h.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
raw_pread_aligned() retries up to two times if the block device backs
a virtual CD-ROM (a drive with media=cdrom and if=ide, scsi, xen or
none). This makes no sense. Whether retrying reads can correct read
errors can only depend on what we're reading, not on how the result
gets used. We need to check whether we're reading from a
physical CD-ROM or floppy here.
I doubt retrying is useful even then. Left for another day.
Impact:
* Virtual CD-ROM backed by host_cdrom behaves the same.
* Virtual CD-ROM backed by file or host_device no longer retries.
* A drive backed by host_cdrom now retries even if it's not a virtual
CD-ROM.
* Any drive backed by host_floppy now retries.
While there, clean up gratuitous use of goto.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
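The retry policy in the impact list above, keyed on what is being read rather than on how the guest uses it, can be summarised in a few lines. The ExampleBackend enum and example_max_read_retries() are hypothetical and only mirror the list; they are not the driver's own types.

```c
#include <assert.h>

/* What the data is read from, independent of the guest-visible drive. */
typedef enum ExampleBackend {
    EXAMPLE_BACKEND_FILE,       /* file or host_device */
    EXAMPLE_BACKEND_CDROM,      /* host_cdrom */
    EXAMPLE_BACKEND_FLOPPY      /* host_floppy */
} ExampleBackend;

/* Physical CD-ROM and floppy reads get up to two retries; others none. */
static int example_max_read_retries(ExampleBackend backend)
{
    assert(backend <= EXAMPLE_BACKEND_FLOPPY);
    return backend == EXAMPLE_BACKEND_FILE ? 0 : 2;
}
```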
Clean up raw-posix.c to be more consistent using BDRV_SECTOR_SIZE
instead of hard coded 512 values.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch combines the lseek+read/write calls to use pread/pwrite
instead. This will result in fewer system calls and is already used by
AIO.
Thanks to Jan Kiszka <jan.kiszka@siemens.com> for identifying excessive
lseek and Christoph Hellwig <hch@lst.de> for confirming that this
approach should work.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
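Replacing an lseek() followed by read() with a single positioned read can be sketched as follows. example_read_at() is a hypothetical wrapper; the write side would use pwrite() in the same way.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* One system call instead of lseek() + read(); returns bytes read or -errno. */
static ssize_t example_read_at(int fd, void *buf, size_t count, off_t offset)
{
    ssize_t ret;

    assert(buf != NULL);
    ret = pread(fd, buf, count, offset);    /* file position is left alone */
    return ret < 0 ? -errno : ret;
}
```

Because pread() does not move the file position, no state is shared between concurrent readers of the same descriptor.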
Format drivers shouldn't need to bother with things like file names, but rather
just get an open BlockDriverState for the underlying protocol. This patch
introduces this behaviour for bdrv_open implementation. For protocols which
need to access the filename to open their file/device/connection/... a new
callback bdrv_file_open is introduced which doesn't get an underlying file
opened.
For now, also some of the more obscure formats use bdrv_file_open because they
open() the file themselves instead of using the block.c functions. They need to
be fixed in later patches.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We're running into various problems because the "raw" file access, which
is used internally by the various image formats, is entangled with the
"raw" image format, which maps the VM view 1:1 to a file system.
This patch renames the raw file backends to the file protocol which
is treated like other protocols (e.g. nbd and http) and adds a new
"raw" image format which is just a wrapper around calls to the underlying
protocol.
The patch is surprisingly simple: besides changing the probing logic
in block.c to only look for image formats when using bdrv_open and
renaming the old raw protocols to file, there's almost nothing in there.
For creating images, a new bdrv_create_file is introduced which guesses the
protocol to use. This allows using qemu-img create -f raw (or just using the
default) for both files and host devices. Converting the other format drivers
to use this function to create their images is left for later patches.
The only issues still open are in the handling of the host devices.
Firstly, in current qemu we can specify the host* format names
on various command lines accepting images, but the new code can't
do that without adding some translation. Secondly, the layering breaks
the no_zero_init flag in the BlockDriver used by qemu-img. I'm not
happy with how this is done per-driver instead of per-state, so I'll
prepare a separate patch to clean this up.
There's some more cleanup opportunity after this patch, e.g. using
separate lists and registration functions for image formats vs
protocols and maybe even host drivers, but this can be done at a
later stage.
Also there's a check for protocol in bdrv_open for the BDRV_O_SNAPSHOT
case that I don't quite understand, but which I fear won't work as
expected - possibly even before this patch.
Note that this patch requires various recent block patches from Kevin
and me, which should all be in his block queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Split up the raw_getlength into separate generic, solaris and BSD
versions to reduce the ifdef maze a bit. The BSD variant still
is a complete maze, but to clean it up properly we'd need some
people using the BSD variants to figure out what code is used
for what variant.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_open already takes care of this for us.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now that we output an error message according to the returned error code in
qemu-img, let's return the real error codes. "Input/output error" for
everything isn't helpful.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This shouldn't happen under any normal circumstances. However, it looks like
it's possible to achieve this with corrupted images. Without this patch
raw_pread is hanging in an endless loop in such cases.
The patch does not affect growable files, for which such reads happen in
normal use cases. raw_pread_aligned already handles these cases and won't
return zero in the first place.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
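The endless-loop hazard with a read that returns zero can be illustrated with a read loop that stops when no progress is possible. This is a sketch, not the patch's code: example_read_full() and example_short_read_demo() are hypothetical, and the demo uses a temporary file under /tmp.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to 'count' bytes at 'offset'; stops at EOF instead of spinning. */
static ssize_t example_read_full(int fd, void *buf, size_t count, off_t offset)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL);
    while (done < count) {
        ssize_t ret = pread(fd, p + done, count - done, offset + (off_t)done);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted: try again */
            }
            return -errno;
        }
        if (ret == 0) {
            break;              /* EOF: no progress possible, leave the loop */
        }
        done += (size_t)ret;
    }
    return (ssize_t)done;
}

/* Usage: request 16 bytes from a 4-byte temporary file. */
static ssize_t example_short_read_demo(void)
{
    char path[] = "/tmp/example_XXXXXX";
    char buf[16];
    ssize_t ret;
    int fd = mkstemp(path);

    if (fd < 0) {
        return -errno;
    }
    unlink(path);               /* file disappears once fd is closed */
    ret = write(fd, "abcd", 4) == 4
        ? example_read_full(fd, buf, sizeof(buf), 0)
        : -EIO;
    close(fd);
    return ret;
}
```

Without the zero check, the loop condition would never change once the read hit the end of the file.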