Commit Graph

510 Commits

Author SHA1 Message Date
Paolo Bonzini
848c66e8f5 block: add a CoMutex to synchronous read drivers
The big conversion of bdrv_read/write to coroutines caused the two
homonymous callbacks in BlockDriver to become reentrant.  It goes
like this:

1) bdrv_read is now called in a coroutine, and calls bdrv_read or
bdrv_pread.

2) the nested bdrv_read goes through the fast path in bdrv_rw_co_entry;

3) in the common case when the protocol is file, bdrv_co_do_readv calls
bdrv_co_readv_em (and from here goes to bdrv_co_io_em), which yields
until the AIO operation is complete;

4) if bdrv_read had been called from a bottom half, the main loop
is free to iterate again: a device model or another bottom half
can then come and call bdrv_read again.

This applies to all four of read/write/flush/discard.  It would also
apply to is_allocated, but it is not used from within coroutines:
besides qemu-img.c and qemu-io.c, which operate synchronously, the
only user is the monitor.  Copy-on-read will introduce a use in the
block layer, and will require converting it.

The solution is "simply" to convert all drivers to coroutines!  We
just need to add a CoMutex that is taken around affected operations.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21 17:34:13 +02:00
Paolo Bonzini
bae0a0cc38 vmdk: clean up open
Move vmdk_parent_open to vmdk_open.  There's another path how
vmdk_parent_open can be reached:

  vmdk_parse_extents() ->  vmdk_open_sparse() ->  vmdk_open_vmdk4() ->
  vmdk_open_desc_file().

If that can happen, however, the code is bogus.  vmdk_parent_open
reads from bs->file:

    if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {

but it is always called with s->desc_offset == 0 and with the same
bs->file.  So the data that vmdk_parent_open reads comes always from the
same place, and anyway there is only one place where it can write it,
namely bs->backing_file.

So, if it cannot happen, the patched code is okay.

It is also possible that the recursive call can happen, but only once.  In
that case there would still be a bug in vmdk_open_desc_file setting
s->desc_offset = 0, but the patched code is okay.

Finally, in the case where multiple recursive calls can happen the code
would need to be rewritten anyway.  It is likely that this would anyway
involve adding several parameters to vmdk_parent_open, and calling it from
vmdk_open_vmdk4.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21 17:34:13 +02:00
Paolo Bonzini
588b65a37a vmdk: fix return values of vmdk_parent_open
While vmdk_open_desc_file (touched by the patch) correctly changed -1
to -EINVAL, vmdk_open did not.  Fix it directly in vmdk_parent_open.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21 17:34:13 +02:00
Kevin Wolf
8f1efd00c4 qcow2: Fix bdrv_write_compressed error handling
If during allocation of compressed clusters the cluster was already allocated
uncompressed, fail and properly release the l2_table (the latter avoids a
failed assertion).

While at it, make it return some real error numbers instead of -1.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
2011-10-21 17:34:13 +02:00
Paolo Bonzini
4265d620c5 block: add bdrv_co_discard and bdrv_aio_discard support
This similarly adds support for coroutine and asynchronous discard.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21 17:34:13 +02:00
Stefan Hajnoczi
6f6dc6565e block: drop redundant bdrv_flush implementation
Block drivers now only need to provide either of .bdrv_co_flush,
.bdrv_aio_flush() or for legacy drivers .bdrv_flush().  Remove
the redundant .bdrv_flush() implementations.

[Paolo Bonzini: change raw driver to bdrv_co_flush]

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21 17:34:13 +02:00
Paolo Bonzini
d8716b41b9 sheepdog: add coroutine_fn markers
This makes the following patch easier to review.

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21 17:34:12 +02:00
Anthony Liguori
2a22e6eb1b Merge remote-tracking branch 'kwolf/for-anthony' into staging 2011-10-14 12:36:50 -05:00
Stefan Hajnoczi
d8b7e0adf5 block: use coroutine interface for raw format
The raw format delegates all operations to bs->file (the protocol).
Previously this block driver exposed both sync and aio interfaces.
Since the block layer now works in terms of coroutines, expose the
coroutine interfaces and drop the others.  This avoids unnecessary
emulation of sync and aio interfaces.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-14 17:31:22 +02:00
Stefan Hajnoczi
1b1e8c6ec7 raw-posix: remove bdrv_read()/bdrv_write()
Block drivers only need to provide one of sync, aio, or coroutine
interfaces.  Since raw-posix.c provides aio interfaces, simply drop the
synchronous interfaces since they can be emulated using aio and
coroutines.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-14 17:31:22 +02:00
Stefan Weil
add8d26203 block/qcow: Fix use of free() instead of g_free()
cppcheck reported this error:

qemu/block/qcow.c:599: error: Mismatching allocation and deallocation: cluster_data

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-10-14 14:57:46 +01:00
Dong Xu Wang
7acae208ca sheepdog: correct spelling
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-10-14 14:46:59 +01:00
Kevin Wolf
0d460d6f41 vvfat: Fix potential buffer overflow
path2[PATH_MAX] can be used for the null termination, so make the array big
enough to allow this.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-11 09:42:46 +02:00
Stefan Weil
7ef6d3dc39 block/vvfat: Remove unused code
The unused code was detected using cppcheck.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-11 09:42:45 +02:00
Stefan Weil
ce137829e7 block/vvfat: Fix potential memory leaks and other memory errors
cppcheck reported memory leaks and mismatched g_malloc() with free()
instead of g_free().

Fix these errors.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-11 09:42:45 +02:00
Christoph Hellwig
55b949c847 block: allow resizing of images residing on host devices
Allow to resize images that reside on host devices up to the available
space.  This allows to grow images after resizing the device manually or
vice versa.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-11 09:41:46 +02:00
Stefan Hajnoczi
e4fc8781db qed: fix use-after-free during l2 cache commit
QED's metadata caching strategy allows two parallel requests to race for
metadata lookup.  The first one to complete will populate the metadata
cache and the second one will drop the data it just read in favor of the
cached data.

There is a use-after-free in qed_read_l2_table_cb() and
qed_commit_l2_update() where l2_table->offset was used after the
l2_table may have been freed due to a metadata lookup race.  Fix this by
keeping the l2_offset in a local variable and not reaching into the
possibly freed l2_table.

Reported-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-10-05 10:52:31 -05:00
Nick Thomas
363c3c8535 block/curl: Don't finish AIOCBs too early
The previous behaviour was to finish AIOCBs inside curl_aio_readv()
if the data was cached. This caused the following failed assertion
at hw/ide/pci.c:314: bmdma_cmd_writeb

"Assertion `bm->bus->dma->aiocb == ((void *)0)' failed."

By scheduling a QEMUBH and performing the completion inside the
callback, we avoid this problem.

Signed-off-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-21 15:16:04 +02:00
Nick Thomas
c84dcdc1d6 block/curl: Implement a flush function on the fd handlers
Signed-off-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-21 15:16:02 +02:00
Sage Weil
16a06b2430 rbd: allow escaping in config string
The config string is variously delimited by =, @, and /, depending on the
field.  Allow these characters to be escaped by preceeding them with \.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-20 14:34:17 +02:00
Sage Weil
7a3f5fe9af rbd: call flush, if available
librbd recently added async writeback and flush support.  If the new
rbd_flush() call is available, call it.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-20 12:27:44 +02:00
Sage Weil
9e1fbcde57 rbd: update comment heading
Properly document the configuration string syntax and semantics.  Remove
(out of date) details about the librbd implementation.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-20 12:27:44 +02:00
Sage Weil
f9fe18ec77 rbd: ignore failures when reading from default conf location
If we are reading from the default config location, ignore any failures.
It is perfectly legal for the user to specify exactly the options they need
and to not rely on any config file.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-20 12:27:44 +02:00
Fam Zheng
b3c0bfb6f9 VMDK: fix leak of extent_file
Release extent_file on error in vmdk_parse_extents. Added closing files
in freeing extents.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-20 12:27:43 +02:00
Kevin Wolf
074abad326 raw-posix: Fix bdrv_flush error return values
bdrv_flush is supposed to use 0/-errno return values

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-19 11:34:33 +02:00
Paolo Bonzini
b90fb4b8f5 nbd: support feature negotiation
nbd supports writing flags in bytes 24...27 of the header,
and uses that for the read-only flag.  Add support for it
in qemu-nbd.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-19 11:34:33 +02:00
Stefan Weil
b2bedb2144 Remove blanks before \n in output strings
Those blanks violate the coding conventions, see
scripts/checkpatch.pl.

Blanks missing after colons in the changed lines were added.

This patch does not try to fix tabs, long lines and other
problems in the changed lines, therefore checkpatch.pl reports
many violations.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-09-16 08:25:56 -05:00
Frediano Ziglio
ee18e73023 qcow2: fix range check
QCowL2Meta::offset is not cluster aligned but only sector aligned
however nb_clusters count cluster from cluster start.
This fix range check. Note that old code have no corruption issues
related to this check cause it only cause intersection to occur
when shouldn't.

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:22 +02:00
Frediano Ziglio
dea43a65d6 qcow2: align cluster_data to block to improve performance using O_DIRECT
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:22 +02:00
Frediano Ziglio
05140499d3 qcow2: initialize metadata before inserting in cluster_allocs
QCow2Meta structure was inserted into list before many fields are
initialized. Currently is not a problem cause all occur in a lock
but if qcow2_alloc_clusters would in a future unlock this lock
some issues could arise.
Initializing fields before inserting fix the problem.

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:22 +02:00
Markus Armbruster
ab359cd17e nbd: Clean up use of block_int.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:22 +02:00
Sage Weil
eb93d5d990 rbd: fix leak in qemu_rbd_open failure paths
Fix leak of s->snap in failure path.  Simplify error paths for the whole
function.

Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:21 +02:00
Sage Weil
dfe80b071b rbd: clean up, fix style
No assignment in condition.  Remove duplicate ret > 0 check.

Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:21 +02:00
Sage Weil
7c7e9df023 rbd: allow client id to be specified in config string
Allow the client id to be specified in the config string via 'id=' so that
users can control who they authenticate as.  Currently they are stuck with
the default ('admin').  This is necessary for anyone using authentication
in their environment.

Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:20 +02:00
Markus Armbruster
025e849a50 block: Rename bdrv_set_locked() to bdrv_lock_medium()
While there, make the locked parameter bool.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:20 +02:00
Frediano Ziglio
a791236992 qcow2: removed unused depends_on field
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:17 +02:00
Fam Zheng
bb45ded931 VMDK: bugfix, opening vSphere 4 exported image
The vSphere 4 exported image is streamOptimized extent, which is not
quite correctly handled. Ignore rdgOffset when RGD flag bit not set.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 12:33:29 +02:00
Fam Zheng
f16f509d17 VMDK: bugfix, open Haiku vmdk image
Haiku provides a specially formed vmdk image, which let qemu abort. It a
combination of sparse header and flat data (i.e. with not l1/l2 table at
all). The fix is turn to descriptor when sparse header is zero in field
'capacity'.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 12:33:03 +02:00
Fam Zheng
6c031aac4d VMDK: creating streamOptimized subformat
Creating streamOptimized subformat. Added subformat option
'streamOptimized', to create a image with compression enabled and each
cluster with a GrainMarker.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 12:31:33 +02:00
Fam Zheng
2b2c8c5dec VMDK: read/write compressed extent
Add support for reading/writing compressed extent.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 12:30:28 +02:00
Fam Zheng
432bb170af VMDK: Opening compressed extent.
Added flags field for compressed/streamOptimized extents, open and save
image configuration.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 12:27:42 +02:00
Fam Zheng
dd3f6ee211 VMDK: separate vmdk_read_extent/vmdk_write_extent
Factor out read/write extent code, since there will be more things to
take care of once reading/writing compressed clusters is introduced.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 12:25:32 +02:00
Fam Zheng
86c6b429bf VMDK: add twoGbMaxExtentSparse support
Add twoGbMaxExtentSparse support. Introduce vmdk_free_last_extent.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 12:22:02 +02:00
Fam Zheng
6398de5160 VMDK: enable twoGbMaxExtentFlat
Enable the createType 'twoGbMaxExtentFlat'. The supporting code is
already in.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 12:20:43 +02:00
Markus Armbruster
be32f75f4d block/raw: Fix to forward method bdrv_media_changed()
Block driver "raw" forwards most methods to the underlying block
driver.  However, it doesn't implement method bdrv_media_changed().
Makes bdrv_media_changed() always return -ENOTSUP.

I believe -fda /dev/fd0 gives you raw over host_floppy, and disk
change detection (fdc register 7 bit 7) is broken.  Testing my theory
requires a computer museum, though.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 11:24:06 +02:00
Kevin Wolf
0fa9131a44 qcow2: Fix error cases to run depedent requests
Requests depending on a failed request would end up waiting forever. This fixes
the error path to continue dependent requests even when the request has failed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 11:23:51 +02:00
Kevin Wolf
8e217d5384 qcow2: Properly initialise QcowL2Meta
Dependency list pointers filled with random garbage from the stack aren't a
good idea.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 11:23:51 +02:00
Frediano Ziglio
c30e624d5d linux aio: some comments
Add some notes about Linux AIO explaining why we don't use AIO in
some situations.

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 11:23:51 +02:00
Stefan Weil
541dc0d47f Use new macro QEMU_PACKED for packed structures
Most changes were made using these commands:

git grep -la '__attribute__((packed))'|xargs perl -pi -e 's/__attribute__\(\(packed\)\)/QEMU_PACKED/'
git grep -la '__attribute__ ((packed))'|xargs perl -pi -e 's/__attribute__ \(\(packed\)\)/QEMU_PACKED/'
git grep -la '__attribute__((__packed__))'|xargs perl -pi -e 's/__attribute__\(\(__packed__\)\)/QEMU_PACKED/'
git grep -la '__attribute__ ((__packed__))'|xargs perl -pi -e 's/__attribute__ \(\(__packed__\)\)/QEMU_PACKED/'
git grep -la '__attribute((packed))'|xargs perl -pi -e 's/__attribute\(\(packed\)\)/QEMU_PACKED/'

Whitespace in linux-user/syscall_defs.h was fixed manually
to avoid warnings from scripts/checkpatch.pl.

Manual changes were also applied to hw/pc.c.

I did not fix indentation with tabs in block/vvfat.c.
The patch will show 4 errors with scripts/checkpatch.pl.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-09-03 10:45:59 +00:00
Frediano Ziglio
2f4b759367 qcow2: remove unused qcow2_create_refcount_update function
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-25 15:23:10 +02:00
Frediano Ziglio
35ee5e39c5 qcow2: use always stderr for debugging
let all DEBUG_ALLOC2 printf goes to stderr

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-25 15:22:25 +02:00
MORITA Kazutaka
2df4624662 sheepdog: use coroutines
This makes the sheepdog block driver support bdrv_co_readv/writev
instead of bdrv_aio_readv/writev.

With this patch, Sheepdog network I/O becomes fully asynchronous.  The
block driver yields back when send/recv returns EAGAIN, and is resumed
when the sheepdog network connection is ready for the operation.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-24 14:53:51 +02:00
Frediano Ziglio
ab0997e0af qcow2: remove memory leak
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:15 +02:00
Frediano Ziglio
3fc48d0983 qcow2: Removed QCowAIOCB entirely
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
5ebaa27e9a qcow2: reindent and use while before the big jump
prepare to remove read/write callbacks

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
e78c69b89c qcow2: remove common from QCowAIOCB
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
c2bdd9904b qcow2: remove cluster_offset from QCowAIOCB
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
c227140397 qcow2: remove l2meta from QCowAIOCB
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
faf575c136 qcow2: removed cur_nr_sectors field in QCowAIOCB
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
4617310c33 qcow2: Removed unused AIOCB fields
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
122bbd1dd9 qcow: remove old #undefined code
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
27deebe836 qcow: Remove QCowAIOCB
Embed qcow_aio_read_cb into qcow_co_readv and qcow_aio_write_cb into qcow_co_writev

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
43ca85b559 qcow: move some blocks of code to avoid useless variable initialization
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
430bbaaa95 qcow: QCowAIOCB field cleanup
remove unused field from this structure and put some of them in qcow_aio_read_cb and qcow_aio_write_cb

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio
f5cd8173e7 qcow/qcow2: Allocate QCowAIOCB structure using stack
instead of calling qemi_aio_get use stack

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Nicholas Thomas
f785a5ae36 block/curl: Handle failed reads gracefully.
Current behaviour if a read fails is for the acb to not get finished.
This causes an infinite loop in bdrv_read_em (block.c). The read failure
never gets reported to the  guest and if the error condition clears, the
process never recovers.

With this patch, when curl reports a failure we finish the acb as a
failure. This results in the guest receiving an I/O error (rather than
the read hanging indefinitely) and if the error condition subsequently
clears, retries work as expected.

The simplest test is to put an ISO on a web server you have control over
and open it with qemu-io. Then move the ISO out of the way and attempt
to read some data - you should see behaviour matching the above.

Signed-off-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Scott Wood
de33b1f3dd qcow: initialize coroutine mutex
commit 52b8eb6013 added a mutex,
but never initialized it.  This caused a segfault.

Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 14:15:17 +02:00
Devin Nakamura
d57237f291 qcow2: fix typo in documentation for qcow2_get_cluster_offset()
Documentation states the num is measured in clusters, but its
actually measured in sectors

Signed-off-by: Devin Nakamura <devin122@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 14:15:17 +02:00
Philipp Hahn
6cbc3031c8 qcow2: Fix DEBUG_* compilation
By introducing BlockDriverState compiling qcow2 with DEBUG_ALLOC and DEBUG_EXT
defined got broken.
Define a BdrvCheckResult structure locally which is now needed as the second
argument.

Also fix qcow2_read_extensions() needing BDRVQcowState.

Signed-off-by: Philipp Hahn <hahn@univention.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 14:15:17 +02:00
Stefan Weil
2c993ec294 w32: Fix qemu_ftruncate64
SetFilePointer returns INVALID_SET_FILE_POINTER when it fails.
In addition, GetLastError must be checked.

The first call of SetFilePointer did not use INVALID_SET_FILE_POINTER,
the second call used wrong error handling.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-22 10:19:00 -05:00
Anthony Liguori
7267c0947d Use glib memory allocation and free functions
qemu_malloc/qemu_free no longer exist after this commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 23:01:08 -05:00
Kevin Wolf
35d7ace74b qcow2: Fix L1 table size after bdrv_snapshot_goto
When loading an internal snapshot whose L1 table is smaller than the current L1
table, the size of the current L1 would be shrunk to the snapshot's L1 size in
memory, but not on disk. This lead to incorrect refcount updates and eventuelly
to image corruption.

Instead of writing the new L1 size to disk, this simply retains the bigger L1
size that is currently in use and makes sure that the unused part is zeroed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Philipp Hahn <hahn@univention.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-05 07:15:47 -05:00
Kevin Wolf
384acbf46b async: Remove AsyncContext
The purpose of AsyncContexts was to protect qcow and qcow2 against reentrancy
during an emulated bdrv_read/write (which includes a qemu_aio_wait() call and
can run AIO callbacks of different requests if it weren't for AsyncContexts).

Now both qcow and qcow2 are protected by CoMutexes and AsyncContexts can be
removed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-02 15:53:41 +02:00
Kevin Wolf
52b8eb6013 qcow: Use coroutines
The old qcow format is another user of the AsyncContext infrastructure.
Converting it to coroutines (and therefore CoMutexes) allows to remove
AsyncContexts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-02 15:53:41 +02:00
Kevin Wolf
68d100e905 qcow2: Use coroutines
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-02 15:53:41 +02:00
Frediano Ziglio
f6e8ffc22f raw-posix: Always check paio_init result
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-01 12:10:29 +02:00
Frediano Ziglio
c1ee7d56f3 raw-posix: Typo fix
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-01 12:10:29 +02:00
Serge E. Hallyn
efc8243d00 block/vpc.c: Detect too-large vpc file
VHD files technically can be up to 2Tb, but virtual pc is limited
to 127G.  Currently qemu-img refused to create vpc files > 127G,
but it is failing to return error when converting from a non-vpc
VHD file which is >127G.  It returns success, but creates a truncated
converted image.  Also, qemu-img info claims the vpc file is 127G
(and clean).

This patch detects a too-large vpc file and returns -EFBIG.  Without
this patch,

=============================================================
root@ip-10-38-123-242:~/qemu-fixed# qemu-img info /mnt/140g-dynamic.vhd
image: /mnt/140g-dynamic.vhd
file format: vpc
virtual size: 127G (136899993600 bytes)
disk size: 284K
root@ip-10-38-123-242:~/qemu-fixed# qemu-img convert -f vpc -O raw /mnt/140g-dynamic.vhd /mnt/y
root@ip-10-38-123-242:~/qemu-fixed# echo $?
0
root@ip-10-38-123-242:~/qemu-fixed# qemu-img info /mnt/y
image: /mnt/y
file format: raw
virtual size: 127G (136899993600 bytes)
disk size: 0
=============================================================

(The 140G image was truncated with no warning or error.)

With the patch, I get:

=============================================================
root@ip-10-38-123-242:~/qemu-fixed# ./qemu-img info /mnt/140g-dynamic.vhd
qemu-img: Could not open '/mnt/140g-dynamic.vhd': File too large
root@ip-10-38-123-242:~/qemu-fixed# ./qemu-img convert -f vpc -O raw /mnt/140g-dynamic.vhd /mnt/y
qemu-img: Could not open '/mnt/140g-dynamic.vhd': File too large
qemu-img: Could not open '/mnt/140g-dynamic.vhd'
=============================================================

See https://bugs.launchpad.net/qemu/+bug/814222 for details.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-01 12:10:28 +02:00
Markus Armbruster
822e1cd17e block: Make BlockDriver method bdrv_eject() return void
Callees always return 0, except for FreeBSD's cdrom_eject(), which
returns -ENOTSUP when the device is in a terminally wedged state.

The only caller is bdrv_eject(), and it maps -ENOTSUP to 0 since
commit 4be9762a.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-01 12:10:28 +02:00
Markus Armbruster
7bf37feddc block: Make BlockDriver method bdrv_set_locked() return void
The only caller is bdrv_set_locked(), and it ignores the value.

Callees always return 0, except for FreeBSD's cdrom_set_locked(),
which returns -ENOTSUP when the device is in a terminally wedged
state.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-01 12:10:28 +02:00
Markus Armbruster
02266d547a block/raw-win32: Drop disabled code for removable host devices
It's been disabled since the start (commit 19cb3738, Aug 2006), and
has been untouched except for spelling fixes and such.  I don't feel
like dragging it along any further.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-01 12:10:28 +02:00
Blue Swirl
00aa0040e8 Wrap recv to avoid warnings
Avoid warnings like these by wrapping recv():
  CC    slirp/ip_icmp.o
/src/qemu/slirp/ip_icmp.c: In function 'icmp_receive':
/src/qemu/slirp/ip_icmp.c:418:5: error: passing argument 2 of 'recv' from incompatible pointer type [-Werror]
/usr/local/lib/gcc/i686-mingw32msvc/4.6.0/../../../../i686-mingw32msvc/include/winsock2.h:547:32: note: expected 'char *' but argument is of type 'struct icmp *'

Remove also casts used to avoid warnings.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-07-25 14:38:56 +00:00
Kevin Wolf
93913dfd8a qcow2: Use Qcow2Cache in writeback mode during loadvm/savevm
In snapshotting there is no guest involved, so we can safely use a writeback
mode and do the flushes in the right place (i.e. at the very end). This
improves the time that creating/restoring an internal snapshot takes with an
image in writethrough mode.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:22 +02:00
Fam Zheng
4a1d5e1fde block: add bdrv_get_allocated_file_size() operation
qemu-img.c wants to count allocated file size of image. Previously it
counts a single bs->file by 'stat' or Window API. As VMDK introduces
multiple file support, the operation becomes format specific with
platform specific meanwhile.

The functions are moved to block/raw-{posix,win32}.c and qemu-img.c calls
bdrv_get_allocated_file_size to count the bs. And also added VMDK code
to count his own extents.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:08 +02:00
Fam Zheng
ae261c86aa VMDK: fix coding style
Conform coding style in vmdk.c to pass scripts/checkpatch.pl checks.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:07 +02:00
Fam Zheng
f66fd6c383 VMDK: create different subformats
Add create option 'format', with enums:
    monolithicSparse
    monolithicFlat
    twoGbMaxExtentSparse
    twoGbMaxExtentFlat
Each creates a subformat image file. The default is monolithicSparse.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:07 +02:00
Fam Zheng
7fa60fa377 VMDK: open/read/write for monolithicFlat image
Parse vmdk decriptor file and open mono flat image.
Read/write the flat extent.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:07 +02:00
Fam Zheng
91b85bd388 VMDK: change get_cluster_offset return type
The return type of get_cluster_offset was an offset that use 0 to denote
'not allocated', this will be no longer true for flat extents, as we see
flat extent file as a single huge cluster whose offset is 0 and length
is the whole file length.
So now we use int return value, 0 means success and otherwise offset
invalid.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:07 +02:00
Fam Zheng
69b4d86d9f VMDK: move 'static' cid_update flag to bs field
Cid_update is the flag for updating CID on first write after opening the
image. This should be per image open rather than per program life cycle,
so change it from static var of vmdk_write to a field in BDRVVmdkState.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:07 +02:00
Fam Zheng
333c574d05 VMDK: flush multiple extents
Flush all the file that referenced by the image.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:06 +02:00
Fam Zheng
e1da9b2433 VMDK: add field BDRVVmdkState.desc_offset
There are several occurrence of magic number 0x200 as the descriptor
offset within mono sparse image file. This is not the case for images
with separate descriptor file. So a field is added to BDRVVmdkState to
hold the correct value.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:06 +02:00
Fam Zheng
b4b3ab146c VMDK: separate vmdk_open by format version
Separate vmdk_open by subformats to:
* vmdk_open_vmdk3
* vmdk_open_vmdk4

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:06 +02:00
Fam Zheng
01fc99d6a8 VMDK: probe for monolithicFlat images
Probe as the same behavior as VMware does.
Recognize image as monolithicFlat descriptor file when the file is text
and the first effective line (not '#' leaded comment or space line) is
either 'version=1' or 'version=2'. No space or upper case charactors
accepted.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:06 +02:00
Fam Zheng
0e69c54394 VMDK: bugfix, align offset to cluster in get_whole_cluster
In get_whole_cluster, the offset is not aligned to cluster when reading
from backing_hd. When the first write to child is not at the cluster
boundary, wrong address data from parent is copied to child.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:06 +02:00
Fam Zheng
b3976d3c8b VMDK: introduce VmdkExtent
Introduced VmdkExtent array into BDRVVmdkState, enable holding multiple
image extents for multiple file image support.

Signed-off-by: Fam Zheng <famcool@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-19 15:39:05 +02:00
MORITA Kazutaka
a8e0fdd715 sheepdog: add full data preallocation support
This introduces qemu-img create option for sheepdog which allows the
data to be fully preallocated (note that sheepdog always preallocates
metadata).

The option is disabled by default and you need to enable it like the
following:

qemu-img create sheepdog:test -o preallocation=full 1G

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-18 16:06:26 +02:00
Johannes Stezenbach
343f85685c block/raw-posix: Linux compat-ioctl warning workaround
On Linux x86_64 host with 32bit userspace, running
qemu or even just "qemu-img create -f qcow2 some.img 1G"
causes a kernel warning:

ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(00005326){t:'S';sz:0} arg(7fffffff) on some.img
ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(801c0204){t:02;sz:28} arg(fff77350) on some.img

ioctl 00005326 is CDROM_DRIVE_STATUS,
ioctl 801c0204 is FDGETPRM.

The warning appears because the Linux compat-ioctl handler for these
ioctls only applies to block devices, while qemu also uses the ioctls on
plain files.  Work around by calling fstat() the ensure the ioctls are
only used on block devices.

Signed-off-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-07-05 11:23:29 +02:00
Markus Armbruster
6daf194dde Strip trailing '\n' from error_report()'s first argument
error_report() prepends location, and appends a newline.  The message
constructed from the arguments should not contain a newline.  Fix the
obvious offenders.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-06-24 09:13:36 +01:00
Markus Armbruster
db78ef5b0a sheepdog: qemu_bh_new() can't return null pointer, drop check
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-06-24 09:13:36 +01:00
Kevin Wolf
9e2a3701a1 qcow2: Fix in-flight list after qcow2_cache_put failure
If qcow2_cache_put returns an error during cluster allocation and the
allocation fails, it must be removed from the list of in-flight allocations.
Otherwise we'd get a loop in the list when the ACB is used for the next
allocation.

Luckily, this qcow2_cache_put shouldn't fail anyway because the L2 table is
only read, so that qcow2_cache_put doesn't even involve I/O.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-06-15 14:36:15 +02:00
Kevin Wolf
e67a64a869 vdi: Avoid direct AIO callback
bdrv_aio_* must not call the callback before returning to its caller. In vdi,
this could happen in some error cases. This starts the real requests processing
in a BH to avoid this situation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-15 14:35:15 +02:00
Kevin Wolf
b11a24dee6 qcow: Avoid direct AIO callback
bdrv_aio_* must not call the callback before returning to its caller. In qcow,
this could happen in some error cases. This starts the real requests processing
in a BH to avoid this situation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-14 17:03:27 +02:00
Kevin Wolf
42496d6240 qcow2: Avoid direct AIO callback
bdrv_aio_* must not call the callback before returning to its caller. In qcow2,
this could happen in some error cases. This starts the real requests processing
in a BH to avoid this situation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-14 17:03:25 +02:00
Stefan Weil
7bf4162a80 block/rbd: Remove unused local variable
Variable 'snap' is assigned a value that is never used.
Remove snap and the related code.

Cc: Christian Brunner <chb@muc.de>
Cc: Josh Durgin <josh.durgin@dreamhost.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-14 10:01:19 +02:00
Anthony Liguori
13748cf499 Merge remote-tracking branch 'stefanha/trivial-patches' into staging 2011-06-08 12:13:58 -05:00
Kevin Wolf
99cce9fa4e qemu-img create: Fix displayed default cluster size
When not specifying a cluster size on the command line, qemu-img printed
a cluster size of 0:

    Formatting '/tmp/test.qcow2', fmt=qcow2 size=67108864
    encryption=off cluster_size=0

This patch adds the default cluster size to the QEMUOptionParameter list, so
that it displays the default value that is used.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 11:56:40 +02:00
Kevin Wolf
80fa3341a7 qcow2: Fix memory leaks in error cases
This fixes memory leaks that may be caused by I/O errors during L1 table growth
(can happen during save_vm) and in qemu-img check.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 11:56:40 +02:00
Josh Durgin
30cdc48cc8 rbd: Add bdrv_truncate implementation
Reviewed-by: Christian Brunner <chb@muc.de>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 11:56:40 +02:00
Josh Durgin
51a135287a rbd: check return values when scheduling aio
If scheduling fails, the number of outstanding I/Os must be correct,
or there will be a hang when waiting for everything to be flushed.

Reviewed-by: Christian Brunner <chb@muc.de>
Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 11:56:40 +02:00
Josh Durgin
fab5cf592c rbd: allow configuration of rados from the rbd filename
The new format is rbd:pool/image[@snapshot][:option1=value1[:option2=value2...]]
Each option is used to configure rados, and may be any Ceph option, or "conf".
The "conf" option specifies a Ceph configuration file to read.

This allows rbd volumes from more than one Ceph cluster to be used by
specifying different monitor addresses, as well as having different
logging levels or locations for different volumes.

Reviewed-by: Christian Brunner <chb@muc.de>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 11:56:40 +02:00
Josh Durgin
ad32e9c003 rbd: use the higher level librbd instead of just librados
librbd stacks on top of librados to provide access
to rbd images.

Using librbd simplifies the qemu code, and allows
qemu to use new versions of the rbd format
with few (if any) changes.

Reviewed-by: Christian Brunner <chb@muc.de>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 11:56:39 +02:00
Christoph Egger
d1f6fd8d14 block/raw-posix: get right partition size
use the correct way to get the size of a disk device or partition

From: Adam Hamsik <haad@netbsd.org>
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 11:56:39 +02:00
Christoph Egger
1de1ae0a7d block/raw-posix: use a character device if a block device is given
On NetBSD a userland process is better with the character device
interface. In addition, a block device can't be opened twice; if a Xen
backend opens it, qemu can't and vice-versa.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 10:39:32 +02:00
Alexander Graf
16372ff03d vmdk: fix endianness bugs
The vmdk code is sloppy when handling the header descriptor during
creation of an image. Fix all header accesses in the create path to
either store native endianness or convert it when appropriate.

Reported-by: Yury Tsarev <ytsarev@novell.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 10:39:32 +02:00
Christoph Hellwig
a659979328 block: clarify the meaning of BDRV_O_NOCACHE
Change BDRV_O_NOCACHE to only imply bypassing the host OS file cache,
but no writeback semantics.  All existing callers are changed to also
specify BDRV_O_CACHE_WB to give them writeback semantics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 10:39:32 +02:00
Alexandre Raymond
9bf0960a9a Fix compilation warning due to missing header for sigaction (followup)
This patch removes all references to signal.h when qemu-common.h is included
as they become redundant.

Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-06-08 09:04:29 +01:00
Stefan Hajnoczi
77a5a0001b qed: support for growing images
The .bdrv_truncate() operation resizes images and growing is easy to
implement in QED.  Simply check that the new size is valid and then
update the image_size header field to reflect the new size.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-05-18 14:39:15 +02:00
Stefan Hajnoczi
6f321e93ab qed: Periodically flush and clear need check bit
One strategy to limit the startup delay of consistency check when
opening image files is to ensure that the file is marked dirty for as
little time as possible.

QED currently marks the image dirty when the first allocating write
request is issued and clears the dirty bit again when the image is
cleanly closed.  In practice that means the image is marked dirty for
most of a guest's lifetime and prone to being in a dirty state upon
crash or power failure.

It is safe to clear the dirty bit after all allocating write requests
have completed and a flush has been performed.  This patch adds a timer
after the last allocating write request completes.  When the timer fires
it will flush and then clear the dirty bit.  The timer is set to 5
seconds and is cancelled upon arrival of a new allocating write request.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-05-18 14:38:46 +02:00
Stefan Weil
a1c7273b82 Fix typos in comments and code (occured -> occurred and related)
The code changed here is an unused data type name (evt_flush_occurred).

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-05-08 10:02:18 +01:00
Stefan Weil
ebabb67a17 Fix typo in code and comments
Replace writeable -> writable

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-05-06 08:19:25 +01:00
Nick Thomas
d2d979c628 NBD: Avoid leaking a couple of strings when the NBD device is closed
Signed-off-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-05-03 11:29:21 +02:00
Stefan Hajnoczi
19dfc44a94 qed: Fix consistency check on 32-bit hosts
The qed_bytes_to_clusters() function is normally used with size_t
lengths.  Consistency check used it with file size length and therefore
failed on 32-bit hosts when the image file is 4 GB or more.

Make qed_bytes_to_clusters() explicitly 64-bit and update consistency
check to keep 64-bit cluster counts.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-27 16:21:00 +02:00
Mitnick Lyu
2d56a546a7 vpc.c: Use get_option_parameter() does the search
Use get_option_parameter() to instead of duplicating the loop, and
use BDRV_SECTOR_SIZE to instead of 512

Signed-off-by: Mitnick Lyu <mitnick.lyu@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-13 12:31:41 +02:00
Anthony Liguori
21df65b644 qed: Add support for zero clusters
Zero clusters are similar to unallocated clusters except instead of reading
their value from a backing file when one is available, the cluster is always
read as zero.

This implements read support only.  At this stage, QED will never write a
zero cluster.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-13 12:06:41 +02:00
Nick Thomas
33897dc7d6 NBD device: Separate out parsing configuration and opening sockets.
We also change the way the file parameter is parsed so IPv6 IP
addresses can be used, e.g.: "drive=nbd:[::1]:5000"

Signed-off-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-07 13:51:48 +02:00
Stefan Weil
4ff9786c67 Fix trivial "endianness bugs"
Replace endianess -> endianness.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2011-04-03 21:42:57 +02:00
Michael Tokarev
8cffde7329 get rid of private bitmap functions in block/sheepdog.c, use generic ones
qemu now has generic bitmap functions,
so don't redefine them in sheepdog.c,
use common header instead.  A small cleanup.

Here's only one function which is actually
used in sheepdog and gets replaced with
a generic one (simplified):

- static inline int test_bit(int nr, const volatile unsigned long *addr)
+ static inline int test_bit(int nr, const unsigned long *addr)
 {
-  return ((1UL << (nr % BITS_PER_LONG))
            & ((unsigned long*)addr)[nr / BITS_PER_LONG])) != 0;
+  return 1UL & (addr[nr / BITS_PER_LONG] >> (nr & (BITS_PER_LONG-1)));
 }

The body is equivalent, but the argument is not: there's
"volatile" in there.  Why it is used for - I'm not sure.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2011-04-01 22:23:06 +02:00
Stefan Weil
5614c188c6 block/qcow: Don't ignore immediate read/write and other failures
This patch is similar to 171e3d6b99
which fixed qcow2:

Returning -EIO is far from optimal, but at least it's an error code.

In addition to read/write failures, -EIO is also returned when
decompress_cluster failed.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-03-15 13:21:14 +01:00
Stefan Weil
40a892b78c block/vdi: Don't ignore immediate read/write failures
This patch is similar to 171e3d6b99
which fixed qcow2:

Returning -EIO is far from optimal, but at least it's an error code.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-03-15 13:21:14 +01:00
Kevin Wolf
16fde5f2c2 qcow2: Fix order in L2 table COW
When copying L2 tables (this happens only with internal snapshots), the order
wasn't completely safe, so that after a crash you could end up with a L2 table
that has too low refcount, possibly leading to corruption in the long run.

This patch puts the operations in the right order: First allocate the new
L2 table and replace the reference, and only then decrease the refcount of the
old table.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-10 13:24:29 +01:00
Kevin Wolf
10b758e85c qed: Report error for unsupported features
Instead of just returning -ENOTSUP, generate a more detailed error.

Unfortunately we don't have a helpful text for features that we don't know yet,
so just print the feature mask. It might be useful at least if someone asks for
help.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-02-10 13:23:59 +01:00
Kevin Wolf
e8cdcec123 qcow2: Report error for version > 2
The qcow2 driver is now declared responsible for any QCOW image that has
version 2 or greater (before this, version 3 would be detected as raw).

For everything newer than version 2, an error is reported.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
2011-02-10 13:23:56 +01:00
Kevin Wolf
8af3648843 qcow2: Fix error handling for reading compressed clusters
When reading a compressed cluster failed, qcow2 falsely returned success.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2011-02-10 13:23:44 +01:00
Kevin Wolf
3ab4c7e92d qcow2: Fix error handling for immediate backing file read failure
Requests could return success even though they failed when bdrv_aio_readv
returned NULL for a backing file read.

Reported-by: Chunqiang Tang <ctang@us.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-10 13:23:44 +01:00
Chunqiang Tang
e0d9c6f937 QCOW2: bug fix - read base image beyond its size
This patch fixes the following bug in QCOW2. For a QCOW2 image that is larger
than its base image, when handling a read request straddling over the end of the
base image, the QCOW2 driver attempts to read beyond the end of the base image
and the request would fail.

This bug was found by Fast Virtual Disk (FVD)'s fully automated testing tool.
The following test triggered the bug.

dd if=/dev/zero of=/var/ramdisk/truth.raw count=0 bs=1 seek=1098561536
dd if=/dev/zero of=/var/ramdisk/zero-500M.raw count=0 bs=1 seek=593099264
./qemu-img create -f qcow2 -ocluster_size=65536,backing_fmt=blksim -b /var/ramdisk/zero-500M.raw /var/ramdisk/test.qcow2 1098561536
./qemu-io --auto --seed=30477694 --truth=/var/ramdisk/truth.raw --format=qcow2 --test=blksim:/var/ramdisk/test.qcow2 --verify_write=true --compare_before=false --compare_after=true --round=100000 --parallel=100 --io_size=10485760 --fail_prob=0 --cancel_prob=0 --instant_qemubh=true

Signed-off-by: Chunqiang Tang <ctang@us.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-10 13:23:44 +01:00
Stefan Weil
4f3669ea5b block/vdi: Fix wrong size in conditionally used memset, memcmp
Error report from cppcheck:
block/vdi.c:122: error: Using sizeof for array given as function argument returns the size of pointer.
block/vdi.c:128: error: Using sizeof for array given as function argument returns the size of pointer.

Fix both by setting the correct size.

The buggy code is only used when QEMU is build without uuid support.
The bug is not critical, so there is no urgent need to apply it to
old versions of QEMU.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-07 10:07:25 +01:00
Kevin Wolf
e1a7107f2d qcow2: Really use cache=unsafe for image creation
For cache=unsafe we also need to set BDRV_O_CACHE_WB, otherwise we have some
strange unsafe writethrough mode.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-02-07 09:44:22 +01:00
Blue Swirl
1869a65385 qcow2-refcount: remove write-only variables
Variables l2_modified and l2_size are not really used, remove them.
Spotted by GCC 4.6.0:
  CC    block/qcow2-refcount.o
/src/qemu/block/qcow2-refcount.c: In function 'qcow2_update_snapshot_refcount':
/src/qemu/block/qcow2-refcount.c:708:37: error: variable 'l2_modified' set but not used [-Werror=unused-but-set-variable]
/src/qemu/block/qcow2-refcount.c:708:9: error: variable 'l2_size' set but not used [-Werror=unused-but-set-variable]

CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-31 10:05:34 +01:00
Kevin Wolf
1b40bbd13a raw-win32: Fix bdrv_flush return value
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-31 10:03:00 +01:00
Stefan Hajnoczi
0d09c79700 qed: Images with backing file do not require QED_F_NEED_CHECK
The consistency check on open is necessary in order to fix inconsistent
table offsets left as a result of a crash mid-operation.  Images with a
backing file actually flush before updating table offsets and are
therefore guaranteed to be consistent.  Do not mark these images dirty.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-31 10:03:00 +01:00
Kevin Wolf
5ea929e3d1 qcow2: Add bdrv_discard support
This adds a bdrv_discard function to qcow2 that frees the discarded clusters.
It does not yet pass the discard on to the underlying file system driver, but
the space can be reused by future writes to the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-01-31 10:03:00 +01:00
MORITA Kazutaka
b444736346 sheepdog: support creating images on remote hosts
This patch parses the input filename in sd_create(), and enables us
specifying a target server to create sheepdog images.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-31 10:03:00 +01:00
Jes Sorensen
bf595021c7 Reorganize struct Qcow2Cache for better struct packing
Move size after the two pointers in struct Qcow2Cache to get better
packing of struct elements on 64 bit architectures.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-31 10:03:00 +01:00
Stefan Hajnoczi
c743849bee qed: Refuse to create images on block devices
QED relies on the underlying filesystem to extend the file and maintain
its size.  Check that images are not created on a block device.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-24 16:41:50 +01:00
Kevin Wolf
3de0a2944b qcow2: Batch flushes for COW
qcow2 calls bdrv_flush() after performing COW in order to ensure that the
L2 table change is never written before the copy is safe on disk. Now that the
L2 table is cached, we can wait with flushing until we write out the next L2
table.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-24 16:41:49 +01:00
Kevin Wolf
29c1a7301a qcow2: Use QcowCache
Use the new functions of qcow2-cache.c for everything that works on refcount
block and L2 tables.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-24 16:41:49 +01:00
Kevin Wolf
493810940b qcow2: Add QcowCache
This adds some new cache functions to qcow2 which can be used for caching
refcount blocks and L2 tables. When used with cache=writethrough they work
like the old caching code which is spread all over qcow2, so for this case we
have merely a cleanup.

The interesting case is with writeback caching (this includes cache=none) where
data isn't written to disk immediately but only kept in cache initially. This
leads to some form of metadata write batching which avoids the current "write
to refcount block, flush, write to L2 table" pattern for each single request
when a lot of cluster allocations happen. Instead, cache entries are only
written out if its required to maintain the right order. In the pure cluster
allocation case this means that all metadata updates for requests are done in
memory initially and on sync, first the refcount blocks are written to disk,
then fsync, then L2 tables.

This improves performance of scenarios with lots of cluster allocations
noticably (e.g. installation or after taking a snapshot).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-24 11:08:51 +01:00
Aurelien Jarno
653df36bbe qcow2: fix unaligned access
cpu_to_be64w() is called with an obviously non-aligned pointer. Use
cpu_to_be64wu() instead. It fixes unaligned accesses errors on IA64
hosts.

Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-24 11:08:50 +01:00
Blue Swirl
f0ff243a16 vpc: fix a file descriptor leak
Fix a file descriptor leak, reported by cppcheck:
[/src/qemu/block/vpc.c:524]: (error) Resource leak: fd

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-01-12 19:49:00 +00:00
Blue Swirl
08089edcd2 vvfat: fix a file descriptor leak
Fix a file descriptor leak, reported by cppcheck:
[/src/qemu/block/vvfat.c:759]: (error) Resource leak: dir

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-01-12 19:48:58 +00:00