Commit Graph

1760 Commits

Author SHA1 Message Date
Stefan Hajnoczi
370e681629 block/cloop: use PRIu32 format specifier for uint32_t
PRIu32 is the format string specifier for uint32_t, let's use it.
Variables ->block_size, ->n_blocks, and i are all uint32_t.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-23 11:34:10 +02:00
Fam Zheng
9b17031ac4 vmdk: Fix "%x" to PRIx32 in format strings for cid
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 14:14:30 +02:00
Kevin Wolf
98522f63f4 block: Add errp to bdrv_new()
This patch adds an errp parameter to bdrv_new() and updates all its
callers. The next patches will make use of this in order to check for
duplicate IDs. Most of the callers know that their ID is fine, so they
can simply assert that there is no error.

Behaviour doesn't change with this patch yet as bdrv_new() doesn't
actually assign errors to errp.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-04-22 12:00:20 +02:00
Aakriti Gupta
5ff679b47d convert fprintf() calls to error_setg() in block/qed.c:bdrv_qed_create()
This patch converts fprintf() calls to error_setg() in block/qed.c:bdrv_qed_create()
(error_setg() is part of error reporting API in include/qapi/error.h)

Signed-off-by: Aakriti Gupta <aakritty@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 11:57:02 +02:00
Maria Kustova
acd7fdc6d8 curl: Replaced old error handling with error reporting API.
Signed-off-by: Maria Kustova <maria.k@catit.be>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 11:57:02 +02:00
Fam Zheng
b8afb520e4 block: Handle error of bdrv_getlength in bdrv_create_dirty_bitmap
bdrv_getlength could fail, check the return value before using it.
Return NULL and set errno if it fails. Callers are updated to handle
the error case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 11:57:02 +02:00
Fam Zheng
4ab9dab5b9 vmdk: Fix %d and %lld to PRI* in format strings
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 11:57:02 +02:00
Fam Zheng
cd82b6fb4d iscsi: Remember to set ret for iscsi_open in error case
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-11 13:59:49 +02:00
Kevin Wolf
715c3f60ef bochs: Fix catalog size check
The old check was off by a factor of 512 and didn't consider cases where
we don't get an exact division. This could lead to an out-of-bounds
array access in seek_to_sector().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-04-11 13:59:49 +02:00
Kevin Wolf
28ec11bc88 bochs: Fix memory leak in bochs_open() error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-04-11 13:59:49 +02:00
Kevin Wolf
8885eadedd qcow2: Put cache reference in error case
When qcow2_get_cluster_offset() sees a zero cluster in a version 2
image, it (rightfully) returns an error. But in doing so it shouldn't
leak an L2 table cache reference.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-04-04 17:10:08 +02:00
Kevin Wolf
4c2e5f8f46 qcow2: Flush metadata during read-only reopen
If lazy refcounts are enabled for a backing file, committing to this
backing file may leave it in a dirty state even if the commit succeeds.
The reason is that the bdrv_flush() call in bdrv_commit() doesn't flush
refcount updates with lazy refcounts enabled, and qcow2_reopen_prepare()
doesn't take care to flush metadata.

In order to fix this, this patch also fixes qcow2_mark_clean(), which
contains another ineffective bdrv_flush() call beause lazy refcounts are
disabled only afterwards. All existing callers of qcow2_mark_clean()
either don't modify refcounts or already flush manually, so that this
fixes only a latent, but not yet actually triggerable bug.

Another instance of the same problem is live snapshots. Again, a real
corruption is prevented by an explicit flush for non-read-only images in
external_snapshot_prepare(), but images using lazy refcounts stay dirty.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-04 14:12:26 +02:00
Fam Zheng
cbee81f6de iscsi: Don't set error if already set in iscsi_do_inquiry
This eliminates the possible assertion failure in error_setg().

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-04 14:11:34 +02:00
Peter Maydell
784a5592c9 Merge remote-tracking branch 'remotes/bonzini/scsi-next' into staging
* remotes/bonzini/scsi-next:
  iscsi: always query max WRITE SAME length
  iscsi: ignore flushes on scsi-generic devices
  iscsi: recognize "invalid field" ASCQ from WRITE SAME command
  scsi-bus: remove bogus assertion

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-04-03 12:24:35 +01:00
Paolo Bonzini
c97ca29db0 iscsi: always query max WRITE SAME length
Max WRITE SAME length is also used when the UNMAP bit is zero, so it
should be queried even if LBPWS=0.  Same for the optimal transfer
length.

However, the write_zeroes_alignment only matters for UNMAP=1 so we
still restrict it to LBPWS=1.

Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-03 13:10:53 +02:00
Paolo Bonzini
b2f9c08a4f iscsi: ignore flushes on scsi-generic devices
Non-block SCSI devices do not support flushing, but we may still send
them requests via bdrv_flush_all.  Just ignore them.

Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-03 13:10:45 +02:00
Paolo Bonzini
27898a5daa iscsi: recognize "invalid field" ASCQ from WRITE SAME command
Some targets may return "invalid field" as the ASCQ from WRITE SAME
if they support the command only without the UNMAP field.  Recognize
that, and return ENOTSUP just like for "invalid operation code".

Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-03 13:10:32 +02:00
Stefan Hajnoczi
c792707f54 qcow2: link all L2 meta updates in preallocate()
preallocate() only links the first QCowL2Meta's data clusters into the
L2 table and ignores any chained QCowL2Metas in the linked list.

Chains of QCowL2Meta structs are built up when contiguous clusters span
L2 tables.  Each QCowL2Meta describes one L2 table update.  This is a
rare case in preallocate() but can happen.

This patch fixes preallocate() by iterating over the whole list of
QCowL2Metas.  Compare with the qcow2_co_writev() function's
implementation, which is similar but also also handles request
dependencies.  preallocate() only performs one allocation at a time so
there can be no dependencies.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
9302e863aa parallels: Sanity check for s->tracks (CVE-2014-0142)
This avoids a possible division by zero.

Convert s->tracks to unsigned as well because it feels better than
surviving just because the results of calculations with s->tracks are
converted to unsigned anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
afbcc40bee parallels: Fix catalog size integer overflow (CVE-2014-0143)
The first test case would cause a huge memory allocation, leading to a
qemu abort; the second one to a too small malloc() for the catalog
(smaller than s->catalog_size), which causes a read-only out-of-bounds
array access and on big endian hosts an endianess conversion for an
undefined memory area.

The sample image used here is not an original Parallels image. It was
created using an hexeditor on the basis of the struct that qemu uses.
Good enough for trying to crash the driver, but not for ensuring
compatibility.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
5dae6e30c5 qcow2: Limit snapshot table size
Even with a limit of 64k snapshots, each snapshot could have a filename
and an ID with up to 64k, which would still lead to pretty large
allocations, which could potentially lead to qemu aborting. Limit the
total size of the snapshot table to an average of 1k per entry when
the limit of 64k snapshots is fully used. This should be plenty for any
reasonable user.

This also fixes potential integer overflows of s->snapshot_size.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
6a83f8b5be qcow2: Check maximum L1 size in qcow2_snapshot_load_tmp() (CVE-2014-0143)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
c05e4667be qcow2: Fix L1 allocation size in qcow2_snapshot_load_tmp() (CVE-2014-0145)
For the L1 table to loaded for an internal snapshot, the code allocated
only enough memory to hold the currently active L1 table. If the
snapshot's L1 table is actually larger than the current one, this leads
to a buffer overflow.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
11b128f406 qcow2: Fix NULL dereference in qcow2_open() error path (CVE-2014-0146)
The qcow2 code assumes that s->snapshots is non-NULL if s->nb_snapshots
!= 0. By having the initialisation of both fields separated in
qcow2_open(), any error occuring in between would cause the error path
to dereference NULL in qcow2_free_snapshots() if the image had any
snapshots.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
6b7d4c5558 qcow2: Fix copy_sectors() with VM state
bs->total_sectors is not the highest possible sector number that could
be involved in a copy on write operation: VM state is after the end of
the virtual disk. This resulted in wrong values for the number of
sectors to be copied (n).

The code that checks for the end of the image isn't required any more
because the code hasn't been calling the block layer's bdrv_read() for a
long time; instead, it directly calls qcow2_readv(), which doesn't error
out on VM state sector numbers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi
f0dce23475 dmg: prevent chunk buffer overflow (CVE-2014-0145)
Both compressed and uncompressed I/O is buffered.  dmg_open() calculates
the maximum buffer size needed from the metadata in the image file.

There is currently a buffer overflow since ->lengths[] is accounted
against the maximum compressed buffer size but actually uses the
uncompressed buffer:

  switch (s->types[chunk]) {
  case 1: /* copy */
      ret = bdrv_pread(bs->file, s->offsets[chunk],
                       s->uncompressed_chunk, s->lengths[chunk]);

We must account against the maximum uncompressed buffer size for type=1
chunks.

This patch fixes the maximum buffer size calculation to take into
account the chunk type.  It is critical that we update the correct
maximum since there are two buffers ->compressed_chunk and
->uncompressed_chunk.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi
686d7148ec dmg: use uint64_t consistently for sectors and lengths
The DMG metadata is stored as uint64_t, so use the same type for
sector_num.  int was a particularly poor choice since it is only 32-bit
and would truncate large values.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi
c165f77580 dmg: sanitize chunk length and sectorcount (CVE-2014-0145)
Chunk length and sectorcount are used for decompression buffers as well
as the bdrv_pread() count argument.  Ensure that they have reasonable
values so neither memory allocation nor conversion from uint64_t to int
will cause problems.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi
eb71803b04 dmg: use appropriate types when reading chunks
Use the right types instead of signed int:

  size_t new_size;

  This is a byte count for g_realloc() that is calculated from uint32_t
  and size_t values.

  uint32_t chunk_count;

  Use the same type as s->n_chunks, which is used together with
  chunk_count.

This patch is a cleanup and does not fix bugs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi
b404bf8542 dmg: drop broken bdrv_pread() loop
It is not necessary to check errno for EINTR and the block layer does
not produce short reads.  Therefore we can drop the loop that attempts
to read a compressed chunk.

The loop is buggy because it incorrectly adds the transferred bytes
twice:

  do {
      ret = bdrv_pread(...);
      i += ret;
  } while (ret >= 0 && ret + i < s->lengths[chunk]);

Luckily we can drop the loop completely and perform a single
bdrv_pread().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi
73ed27ec28 dmg: prevent out-of-bounds array access on terminator
When a terminator is reached the base for offsets and sectors is stored.
The following records that are processed will use this base value.

If the first record we encounter is a terminator, then calculating the
base values would result in out-of-bounds array accesses.  Don't do
that.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Stefan Hajnoczi
2c1885adcf dmg: coding style and indentation cleanup
Clean up the mix of tabs and spaces, as well as the coding style
violations in block/dmg.c.  There are no semantic changes since this
patch simply reformats the code.

This patch is necessary before we can make meaningful changes to this
file, due to the inconsistent formatting and confusing indentation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
cab60de930 qcow2: Fix new L1 table size check (CVE-2014-0143)
The size in bytes is assigned to an int later, so check that instead of
the number of entries.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
0abe740f1d qcow2: Protect against some integer overflows in bdrv_check
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf
bb572aefbd qcow2: Fix types in qcow2_alloc_clusters and alloc_clusters_noref
In order to avoid integer overflows.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:34 +02:00
Kevin Wolf
2b5d5953ee qcow2: Check new refcount table size on growth
If the size becomes larger than what qcow2_open() would accept, fail the
growing operation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:34 +02:00
Kevin Wolf
db8a31d11d qcow2: Avoid integer overflow in get_refcount (CVE-2014-0143)
This ensures that the checks catch all invalid cluster indexes
instead of returning the refcount of a wrong cluster.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:34 +02:00
Kevin Wolf
b106ad9185 qcow2: Don't rely on free_cluster_index in alloc_refcount_block() (CVE-2014-0147)
free_cluster_index is only correct if update_refcount() was called from
an allocation function, and even there it's brittle because it's used to
protect unfinished allocations which still have a refcount of 0 - if it
moves in the wrong place, the unfinished allocation can be corrupted.

So not using it any more seems to be a good idea. Instead, use the
first requested cluster to do the calculations. Return -EAGAIN if
unfinished allocations could become invalid and let the caller restart
its search for some free clusters.

The context of creating a snapsnot is one situation where
update_refcount() is called outside of a cluster allocation. For this
case, the change fixes a buffer overflow if a cluster is referenced in
an L2 table that cannot be represented by an existing refcount block.
(new_table[refcount_table_index] was out of bounds)

[Bump the qemu-iotests 026 refblock_alloc.write leak count from 10 to
11.
--Stefan]

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:21:03 +02:00
Kevin Wolf
6d33e8e7dc qcow2: Fix backing file name length check
len could become negative and would pass the check then. Nothing bad
happened because bdrv_pread() happens to return an error for negative
length values, but make variables for sizes unsigned anyway.

This patch also changes the behaviour to error out on invalid lengths
instead of silently truncating it to 1023.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf
2d51c32c4b qcow2: Validate active L1 table offset and size (CVE-2014-0144)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf
ce48f2f441 qcow2: Validate snapshot table offset/size (CVE-2014-0144)
This avoid unbounded memory allocation and fixes a potential buffer
overflow on 32 bit hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf
8c7de28305 qcow2: Validate refcount table offset
The end of the refcount table must not exceed INT64_MAX so that integer
overflows are avoided.

Also check for misaligned refcount table. Such images are invalid and
probably the result of data corruption. Error out to avoid further
corruption.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf
5dab2faddc qcow2: Check refcount table size (CVE-2014-0144)
Limit the in-memory reference count table size to 8 MB, it's enough in
practice. This fixes an unbounded allocation as well as a buffer
overflow in qcow2_refcount_init().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf
a1b3955c94 qcow2: Check backing_file_offset (CVE-2014-0144)
Header, header extension and the backing file name must all be stored in
the first cluster. Setting the backing file to a much higher value
allowed header extensions to become much bigger than we want them to be
(unbounded allocation).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf
24342f2cae qcow2: Check header_length (CVE-2014-0144)
This fixes an unbounded allocation for s->unknown_header_fields.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Fam Zheng
6d4b9e55fc curl: check data size before memcpy to local buffer. (CVE-2014-0144)
curl_read_cb is callback function for libcurl when data arrives. The
data size passed in here is not guaranteed to be within the range of
request we submitted, so we may overflow the guest IO buffer. Check the
real size we have before memcpy to buffer to avoid overflow.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Jeff Cody
1d7678dec4 vhdx: Bounds checking for block_size and logical_sector_size (CVE-2014-0148)
Other variables (e.g. sectors_per_block) are calculated using these
variables, and if not range-checked illegal values could be obtained
causing infinite loops and other potential issues when calculating
BAT entries.

The 1.00 VHDX spec requires BlockSize to be min 1MB, max 256MB.
LogicalSectorSize is required to be either 512 or 4096 bytes.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Jeff Cody
63fa06dc97 vdi: add bounds checks for blocks_in_image and disk_size header fields (CVE-2014-0144)
The maximum blocks_in_image is 0xffffffff / 4, which also limits the
maximum disk_size for a VDI image to 1024TB.  Note that this is the maximum
size that QEMU will currently support with this driver, not necessarily the
maximum size allowed by the image format.

This also fixes an incorrect error message, a bug introduced by commit
5b7aa9b56d (Reported by Stefan Weil)

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:06:31 +02:00
Kevin Wolf
5e71dfad76 vpc: Validate block size (CVE-2014-0142)
This fixes some cases of division by zero crashes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Jeff Cody
97f1c45c6f vpc/vhd: add bounds check for max_table_entries and block_size (CVE-2014-0144)
This adds checks to make sure that max_table_entries and block_size
are in sane ranges.  Memory is allocated based on max_table_entries,
and block_size is used to calculate indices into that allocated
memory, so if these values are incorrect that can lead to potential
unbounded memory allocation, or invalid memory accesses.

Also, the allocation of the pagetable is changed from g_malloc0()
to qemu_blockalign().

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf
a9ba36a45d bochs: Fix bitmap offset calculation
32 bit truncation could let us access the wrong offset in the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf
8e53abbc20 bochs: Check extent_size header field (CVE-2014-0142)
This fixes two possible division by zero crashes: In bochs_open() and in
seek_to_sector().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf
e3737b820b bochs: Check catalog_size header field (CVE-2014-0143)
It should neither become negative nor allow unbounded memory
allocations. This fixes aborts in g_malloc() and an s->catalog_bitmap
buffer overflow on big endian hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf
246f65838d bochs: Use unsigned variables for offsets and sizes (CVE-2014-0147)
Gets us rid of integer overflows resulting in negative sizes which
aren't correctly checked.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Kevin Wolf
3dd8a6763b bochs: Unify header structs and make them QEMU_PACKED
This is an on-disk structure, so offsets must be accurate.

Before this patch, sizeof(bochs) != sizeof(header_v1), which makes the
memcpy() between both invalid. We're lucky enough that the destination
buffer happened to be the larger one, and the memcpy size to be taken
from the smaller one, so we didn't get a buffer overflow in practice.

This patch unifies the both structures, eliminating the need to do a
memcpy in the first place. The common fields are extracted to the top
level of the struct and the actually differing part gets a union of the
two versions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi
42d43d35d9 block/cloop: fix offsets[] size off-by-one
cloop stores the number of compressed blocks in the n_blocks header
field.  The file actually contains n_blocks + 1 offsets, where the extra
offset is the end-of-file offset.

The following line in cloop_read_block() results in an out-of-bounds
offsets[] access:

    uint32_t bytes = s->offsets[block_num + 1] - s->offsets[block_num];

This patch allocates and loads the extra offset so that
cloop_read_block() works correctly when the last block is accessed.

Notice that we must free s->offsets[] unconditionally now since there is
always an end-of-file offset.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi
f56b9bc3ae block/cloop: refuse images with bogus offsets (CVE-2014-0144)
The offsets[] array allows efficient seeking and tells us the maximum
compressed data size.  If the offsets are bogus the maximum compressed
data size will be unrealistic.

This could cause g_malloc() to abort and bogus offsets mean the image is
broken anyway.  Therefore we should refuse such images.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi
7b103b36d6 block/cloop: refuse images with huge offsets arrays (CVE-2014-0144)
Limit offsets_size to 512 MB so that:

1. g_malloc() does not abort due to an unreasonable size argument.

2. offsets_size does not overflow the bdrv_pread() int size argument.

This limit imposes a maximum image size of 16 TB at 256 KB block size.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi
509a41bab5 block/cloop: prevent offsets_size integer overflow (CVE-2014-0143)
The following integer overflow in offsets_size can lead to out-of-bounds
memory stores when n_blocks has a huge value:

    uint32_t n_blocks, offsets_size;
    [...]
    ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
    [...]
    s->n_blocks = be32_to_cpu(s->n_blocks);

    /* read offsets */
    offsets_size = s->n_blocks * sizeof(uint64_t);
    s->offsets = g_malloc(offsets_size);

    [...]

    for(i=0;i<s->n_blocks;i++) {
        s->offsets[i] = be64_to_cpu(s->offsets[i]);

offsets_size can be smaller than n_blocks due to integer overflow.
Therefore s->offsets[] is too small when the for loop byteswaps offsets.

This patch refuses to open files if offsets_size would overflow.

Note that changing the type of offsets_size is not a fix since 32-bit
hosts still only have 32-bit size_t.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Stefan Hajnoczi
d65f97a82c block/cloop: validate block_size header field (CVE-2014-0144)
Avoid unbounded s->uncompressed_block memory allocation by checking that
the block_size header field has a reasonable value.  Also enforce the
assumption that the value is a non-zero multiple of 512.

These constraints conform to cloop 2.639's code so we accept existing
image files.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:59:47 +02:00
Prasad Joshi
c5a33ee9ee qcow2: fix two memory leaks in qcow2_open error code path
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:49:53 +02:00
Markus Armbruster
4c7096607d vvfat: Fix :floppy: option to suppress partition table
Regressed in commit 7ad9be6, v1.5.0.

Reported-by: Kiyokazu SUTO <suto@ks-and-ks.ne.jp>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:49:53 +02:00
Stefan Hajnoczi
7b770c720b mirror: fix early wake from sleep due to aio
The mirror blockjob coroutine rate-limits itself by sleeping.  The
coroutine also performs I/O asynchronously so it's important that the
aio callback doesn't wake the coroutine early as that breaks
rate-limiting.

Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-25 14:09:50 +01:00
Paolo Bonzini
cc8c9d6c6f mirror: fix throttling delay calculation
The throttling delay calculation was using an inaccurate sector count to
calculate the time to sleep.  This broke rate-limiting for the block
mirror job.

Move the delay calculation into mirror_iteration() where we know how
many sectors were transferred.  This lets us calculate an accurate delay
time.

Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-25 14:09:50 +01:00
Deepak Kathayat
dc6fb73d21 Fixed various typos
Signed-off-by: Deepak Kathayat <deepak.mk17@gmail.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-25 14:09:50 +01:00
Peter Lieven
20fccb187c block/nfs: report errors from libnfs
if an NFS operation fails we should report what libnfs knows
about the failure. It is likely more than just an error code.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-19 09:39:41 +01:00
Max Reitz
a134d90f50 qcow2: Fix fail path in realloc_refcount_block()
If qcow2_alloc_clusters() fails, new_offset and ret will both be
negative after the fail label, thus passing the first if condition and
subsequently resulting in a call of qcow2_free_clusters() with an
invalid (negative) offset parameter. Fix this by introducing a new label
"fail_free_cluster" which is only invoked if new_offset is indeed
pointing to a newly allocated cluster that should be cleaned up by
freeing it.

While we're at it, clean up the whole fail path. qcow2_cache_put()
should (and actually can) never fail, hence the return value can safely
be ignored (aside from asserting that it indeed did not fail).

Furthermore, there is no reason to give QCOW2_DISCARD_ALWAYS to
qcow2_free_clusters(), a mere QCOW2_DISCARD_OTHER will suffice.

Ultimately, rename the "fail" label to "done", as it is invoked both on
failure and success.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-19 09:39:41 +01:00
Max Reitz
8a15b813e6 qcow2: Correct comment for realloc_refcount_block()
Contrary to the comment describing this function's behavior, it does not
return 0 on success, but rather the offset of the newly allocated
cluster. This patch adjusts the comment accordingly to reflect the
actual behavior.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-19 09:39:41 +01:00
Kevin Wolf
5a8a30db47 block: Add error handling to bdrv_invalidate_cache()
If it returns an error, the migrated VM will not be started, but qemu
exits with an error message.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-03-19 09:39:41 +01:00
Stefan Hajnoczi
4a41a2d68a nbd: close socket if connection breaks
nbd_receive_reply() is called by the event loop whenever data is
available or the socket has been closed by the remote side.

This patch closes the socket when an error occurs to prevent the
nbd_receive_reply() handler from being called indefinitely after the
connection has failed.

Note that we were already correctly returning EIO for pending requests
but leaving the nbd_receive_reply() handler registered resulted in high
CPU consumption and a flood of error messages.

Reuse nbd_teardown_connection() to close the socket.

Reported-by: Zhifeng Cai <bluewindow@h3c.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-14 16:28:28 +01:00
Jeff Cody
62e466e845 block: Explicitly specify 'unsigned long long' for VHDX 64-bit constants
On 32-bit hosts, some compilers will warn on too large integer constants
for constants that are 64-bit in length.  Explicitly put a 'ULL' suffix
on those defines.

Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-14 16:25:24 +01:00
Max Reitz
57ed25b1b0 block/raw-win32: bdrv_parse_filename() for hdev
The "host_device" protocol driver should strip the "host_device:" prefix
from filenames if present.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:25 +01:00
Max Reitz
cc28c6aa46 block/raw-posix: Strip protocol prefix on creation
The hdev_create() implementation in block/raw-posix.c is used by the
"host_device", "host_cdrom" and "host_floppy" protocol block drivers
together. Thus, any of the associated prefixes may occur and exactly one
should should be stripped, if it does (thus,
"host_device:host_cdrom:/dev/cdrom" is not shortened to "/dev/cdrom").

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:25 +01:00
Max Reitz
18fa1c42a3 block/raw-posix: bdrv_parse_filename() for cdrom
The "host_cdrom" protocol drivers should strip the "host_cdrom:" prefix
from filenames if present.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:24 +01:00
Max Reitz
d3f4984583 block/raw-posix: bdrv_parse_filename() for floppy
The "host_floppy" protocol driver should strip the "host_floppy:" prefix
from filenames if present.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:24 +01:00
Max Reitz
7af803d4f8 block/raw-posix: bdrv_parse_filename() for hdev
The "host_device" protocol driver should strip the "host_device:" prefix
from filenames if present.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:24 +01:00
Kevin Wolf
27eb6c097c qcow2: Don't write with BDRV_O_INCOMING
qcow2_open() causes writes when repairing an image with the dirty flag
set and when clearing autoclear flags. It shouldn't do this when another
qemu instance is still actively working on this image file.

One effect of the bug is that images may have a cleared dirty flag while
the migration source host still has it in use with lazy refcounts
enabled, so refcounts are not accurate and the dirty flag must remain
set.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:24 +01:00
Kevin Wolf
d475e5acd2 qcow2: Keep option in qcow2_invalidate_cache()
Instead of manually building a list of all options from BDRVQcowState
values just reuse the options that were used to open the image.
qcow2_open() won't fully use all of the options in the QDict, but that's
okay.

This fixes all of the driver-specific options in qcow2, except for
lazy-refcounts, which was special cased before.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:24 +01:00
Benoît Canet
b5042a3622 block: Rewrite the snapshot authorization mechanism for block filters.
This patch keep the recursive way of doing things but simplify it by giving
two responsabilities to all block filters implementors.

They will need to do two things:

-Set the is_filter field of their block driver to true.

-Implement the bdrv_recurse_is_first_non_filter method of their block driver like
it is done on the Quorum block driver. (block/quorum.c)

[Paolo Bonzini <pbonzini@redhat.com> pointed out that this patch changes
the semantics of blkverify, which now recurses down both bs->file and
s->test_file.
-- Stefan]

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:23:27 +01:00
Max Reitz
dba2855572 qcow2: Check bs->drv in copy_sectors()
Before dereferencing bs->drv for a call to its member bdrv_co_readv(),
copy_sectors() should check whether that pointer is indeed valid, since
it may have been set to NULL by e.g. a concurrent write triggering the
corruption prevention mechanism.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:23:27 +01:00
Kevin Wolf
3456a8d185 block: Update image size in bdrv_invalidate_cache()
After migration has completed, we call bdrv_invalidate_cache() so that
drivers which cache some data drop their stale copy of the data and
reread it from the image file to get a new version of data that the
source modified while the migration was running.

Reloading metadata from the image file is useless, though, if the size
of the image file stays stale (this is a value that is cached for all
image formats in block.c). Reads from (meta)data after the old EOF
return only zeroes, causing image corruption.

We need to update bs->total_sectors in all layers that could potentially
have changed their size (i.e. backing files are not a concern - if they
are changed, we're in bigger trouble)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:23:27 +01:00
Max Reitz
26d49c4675 qcow2-refcount: Sanitize refcount table entry
When reading the refcount table entry in get_refcount(), only bits which
are actually significant for the refcount block offset should be taken
into account.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:23:27 +01:00
Max Reitz
d5546c5e77 block/raw-win32: Strip "file:" prefix on creation
The bdrv_create() implementation of the block/raw-win32 "file" protocol
driver should strip the "file:" prefix from filenames if present.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-06 16:18:12 +01:00
Max Reitz
7dc74db88b block/raw-win32: Implement bdrv_parse_filename()
The "file" protocol driver should strip the "file:" prefix from
filenames if present.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-06 16:18:11 +01:00
Max Reitz
464d9f641d block/raw-posix: Strip "file:" prefix on creation
The bdrv_create() implementation of the block/raw-posix "file" protocol
driver should strip the "file:" prefix from filenames if present.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-06 16:18:09 +01:00
Max Reitz
078896a9ee block/raw-posix: Implement bdrv_parse_filename()
The "file" protocol driver should strip the "file:" prefix from
filenames if present.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-06 16:18:06 +01:00
Jeff Cody
50c75136be block: mirror - remove code cruft that has no function
Originally, this built up the error message with the backing filename,
so that errp was set as follows:
    error_set(errp, QERR_OPEN_FILE_FAILED, backing_filename);

However, we now propagate the local_error from the
bdrv_open_backing_file() call instead, making these 2 lines useless
code.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-06 11:47:40 +01:00
Kevin Wolf
f47c3f5a80 iscsi: Use bs->sg for everything else than disks
The current iscsi block driver code makes the rather arbitrary decision
that TYPE_MEDIUM_CHANGER and TYPE_TAPE devices have bs->sg = 1 and all
other device types are disks.

Instead of this, check for TYPE_DISK to expose the disk interface and
make everything else bs->sg = 1. In particular, this includes devices
with TYPE_STORAGE_ARRAY, which is what LUN 0 of an iscsi target is.
(See https://bugzilla.redhat.com/show_bug.cgi?id=1067784 for the exact
scenario.)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-05 16:58:20 +01:00
Bharata B Rao
b1f7d84fd2 gluster: Remove unused defines and header include
Remove the definitions of GLUSTER_FD_WRITE and GLUSTER_FD_READ which are
no longer used. Also sockets.h isn't needed any more.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-05 15:58:32 +01:00
Bharata B Rao
85c09bc016 gluster: Change licence to GPLv2+
Pipe handling mechanism in gluster driver was based on similar implementation
in RBD driver and hence had GPLv2 and associated copyright information.
After changing gluster driver to coroutine based implementation, the pipe
handling code no longer exists and hence change gluster driver's licence to
GPLv2+ and remove RBD copyrights.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-05 15:58:32 +01:00
Andreas Färber
f55ea6297c block/gluster: Add missing argument to qemu_gluster_init() call
Commit adccfbcd60 (block: gluster - add
reopen support.) did not supply the qemu_gluster_init() Error **
argument, needed since commit a7451cb850
(gluster: correctly propagate errors).

Pass through qemu_gluster_reopen_prepare()'s errp, as done in
qemu_gluster_open().

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-03-04 20:20:57 +00:00
Peter Maydell
4a29420ea1 Block pull request
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTENUMAAoJEJykq7OBq3PIXoQH/R7Q9CqASrBEfbT1GJDSr3E5
 yq0ZZ4nJBhhn0f8tOw9NWGCmTH1UZn0u4PbYEj4jTGJyWY2POkCKLPRvurge4c+Q
 L+OwzspW14O35o7WuBC8uWvSxh8ExaiwwrutCsIsz68UAcnoiwyTMSKghIy9+Osk
 lJKRFz6IFbgxCocqeNQrilj84vYSv9VzXZY8/nIhDMFsbjaKW7sKwu4jnXB7+v5J
 cA0+voafINNICNMeXzkqBbKwWIFtYOkPe287MZQNbpS2vCS2bsZeivDxsBJ26Cmx
 Xho8Xz5utvBMBrNtidHCTsd2jrj0VBdGgksYoHPyIwK/gBr1EVTGKpebraKXBEU=
 =rgw9
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Block pull request

# gpg: Signature made Fri 28 Feb 2014 18:27:24 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  block/vmdk: do not report file offset for compressed extents
  discard rbd error output when not relevant in qemu-iotests
  block: use /var/tmp instead of /tmp for -snapshot
  qemu-io-test: Disable Quorum test when not compiled in.
  qmp: Make Quorum error events more palatable.
  qmp: Fix BlockdevOptionQuorum.
  block: gluster - add reopen support.
  block: gluster - code movements, state storage changes
  qemu-iotests: add more tests to the "quick" group

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-03-04 16:33:25 +00:00
Peter Maydell
739aa555b8 Merge remote-tracking branch 'remotes/bonzini/scsi-next' into staging
* remotes/bonzini/scsi-next:
  block/iscsi: fix segfault if writesame fails
  scsi-disk: Add support for port WWN and index descriptors in VPD page 83h
  block/iscsi: query for supported VPD pages
  block/iscsi: fix deadlock on scsi check condition
  scsi-bus: Fix transfer length for VERIFY with BYTCHK=11b
  scsi: report thin provisioning errors with werror=report
  scsi: Change scsi sense buf size to 252

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-03-04 14:25:34 +00:00
Peter Lieven
3eba13ec25 block/vmdk: do not report file offset for compressed extents
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-28 18:59:07 +01:00
Benoît Canet
0c762736df qmp: Make Quorum error events more palatable.
Insert quorum QMP events documentation alphabetically.

Also change the "ret" errno value by an optional "error" being an strerror(-ret)
in the QUORUM_REPORT_BAD qmp event.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-28 18:59:06 +01:00
Jeff Cody
adccfbcd60 block: gluster - add reopen support.
Gluster does parse open flags in its .bdrv_open() implementation,
and the .bdrv_reopen_* implementations need to do the same.

A new gluster connection to the image file to be created is established
in the .bdrv_reopen_prepare(), and the image file opened with the new
flags.

If this is successful, then the old image file is closed, and the
old connection torn down. The relevant structure pointers in the gluster
state structure are updated to the new connection.

If it is not successful, then the new file handle and connection is
abandoned (if it exists), while the old connection is not modified at
all.

With reopen supported, block-commit (and offline commit) is now also
supported for image files whose base image uses the native gluster
protocol driver.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-28 18:59:06 +01:00
Jeff Cody
1b37b3442f block: gluster - code movements, state storage changes
In preparation for supporting reopen on gluster, move flag
parsing out to a function.  Also, add a NULL check in the
gconf cleanup.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-28 18:59:06 +01:00
Peter Maydell
e7a1d6c52a Block patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTB8hAAAoJEH8JsnLIjy/WQdgP/jEu5baA1/qKanDsS9l+81u1
 /sIYSWpHDEJ0uavqTMBeyMOwkzel7SZRusIwA/d5pMqxbY6/86YJumTTozFWvtqc
 IABqHtRKCxjcLdZRPbkuNAOiw6p76vSZa543o2t8OAhK2DIFy530wWXeoQEYvuJX
 4pOh0lTradOrF1z6uW4ozgQ1efPppwh/iqwfWWNJVTgfnWxJk6qQaATEgkuSdsUN
 Wp78UzOxLGO6JKJB6kP3LfNL0ANTYHpfH2/wkE6cW6TkSUduOm6hIBY+tb9khqYt
 INOKxqFADK6EOgjvJBsZuZUtOnHK5oM921LepN/mOPAs6gKcn2j+FfqJrl3I1/5M
 AXM3M0FPuijEKPGWw7pCLt7j84KJkD9a/rsKO37yRzw17fOma2Rpr4TrX43BF+5t
 CGqQ7PzDJ6Fng4EXjyNDzviwXIK8xmG1tfn92tq/BUd6OuM9MCyzEGvEiGOMBoXv
 w4iOV7UC+1P3TjnTBhMlBVGywSfdOJoHr9k4lXGNp0h8fPhM9rfruI3BFysxaas6
 GmKbd7yvKwXOTptd3I9SB8BzVUL3CcD3FK24+cWKAl8GgyiDIWRlvBYyMp3p8Z8f
 NDzcxYP6aRGsoddvpIWr3Tz89hw5wTW5u3RmNgxJUguz6HYKFbl30dpGT+96q2BN
 YIAANTdPxn7BP6r3glQH
 =ZaDG
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block patches

# gpg: Signature made Fri 21 Feb 2014 21:42:24 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (54 commits)
  iotests: Mixed quorum child device specifications
  quorum: Simplify quorum_open()
  quorum: Add unit test.
  quorum: Add quorum_open() and quorum_close().
  quorum: Implement recursive .bdrv_recurse_is_first_non_filter in quorum.
  quorum: Add quorum_co_flush().
  quorum: Add quorum_invalidate_cache().
  quorum: Add quorum_getlength().
  quorum: Add quorum mechanism.
  quorum: Add quorum_aio_readv.
  blkverify: Extract qemu_iovec_clone() and qemu_iovec_compare() from blkverify.
  quorum: Add quorum_aio_writev and its dependencies.
  quorum: Create BDRVQuorumState and BlkDriver and do init.
  quorum: Create quorum.c, add QuorumChildRequest and QuorumAIOCB.
  check-qdict: Test termination of qdict_array_split()
  check-qdict: Adjust test for qdict_array_split()
  qdict: Extract non-QDicts in qdict_array_split()
  qemu-config: Sections must consist of keys
  qemu-iotests: Check qemu-img command line parsing
  qemu-img: Allow -o help with incomplete argument list
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-25 10:50:11 +00:00
Peter Maydell
6dedf0522c Merge remote-tracking branch 'remotes/bonzini/configure' into staging
* remotes/bonzini/configure:
  build: softmmu targets do not have a "main.o" file
  configure: Disable libtool if -fPIE does not work with it (bug #1257099)
  block: convert block drivers linked with libs to modules
  Makefile: introduce common-obj-m and block-obj-m for DSO
  Makefile: install modules with "make install"
  module: implement module loading
  rules.mak: introduce DSO rules
  darwin: do not use -mdynamic-no-pic
  block: use per-object cflags and libs
  rules.mak: allow per object cflags and libs
  rules.mak: fix $(obj) to a real relative path
  util: Split out exec_dir from os_find_datadir

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-24 15:38:00 +00:00
Peter Lieven
d9738fd246 block/iscsi: fix segfault if writesame fails
commit fa6252b0 introduced a segfault because it tries
to read iTask.task->sense after iTask.task has been
freed.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-22 15:59:24 +01:00
Peter Lieven
24d3bd67ac block/iscsi: query for supported VPD pages
this patch ensures that we only query for block provisioning and
block limits vpd pages if they are advertised. It also cleans
up the inquiry code and eliminates some redundant code.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-22 10:02:23 +01:00
Peter Lieven
837c390137 block/iscsi: fix deadlock on scsi check condition
the retry logic was broken because the complete status
of the task structure was not reset. this resulted in
an infinite loop retrying the command over and over.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-22 10:02:23 +01:00
Max Reitz
8a87f3d722 quorum: Simplify quorum_open()
Although it may not look like it, this patch simplifies quorum_open().
qdict_array_split() is now able to return QLists with different objects
than only QDicts, therefore it will now do all the work and
quorum_open() does not have to handle reference strings by itself.

This allows mixing full option dicts and reference strings for
specifying the child block devices of quorum; furthermore, it improves
handling of malformed specifications.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:32:24 +01:00
Benoît Canet
c88a1de51a quorum: Add quorum_open() and quorum_close().
Example of command line:

-drive if=virtio,driver=quorum,\
children.0.file.filename=1.raw,\
children.0.node-name=1.raw,\
children.0.driver=raw,\
children.1.file.filename=2.raw,\
children.1.node-name=2.raw,\
children.1.driver=raw,\
children.2.file.filename=3.raw,\
children.2.node-name=3.raw,\
children.2.driver=raw,\
vote-threshold=2

blkverify=on with vote-threshold=2 and two files can be passed to
emulate blkverify.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:51 +01:00
Benoît Canet
98a7a38f81 quorum: Implement recursive .bdrv_recurse_is_first_non_filter in quorum.
This is used to activate quorum snapshot.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:51 +01:00
Benoît Canet
1c508d174d quorum: Add quorum_co_flush().
Makes a vote to select error if any.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:51 +01:00
Benoît Canet
a28e4c408b quorum: Add quorum_invalidate_cache().
We really want that live migration works with quorum so implement
quorum_invalidate_cache().

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:50 +01:00
Benoît Canet
d55dee2044 quorum: Add quorum_getlength().
Check that every bs file returns the same length.
Otherwise, return -EIO to disable the quorum and
avoid length discrepancy.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:50 +01:00
Benoît Canet
95c6bff356 quorum: Add quorum mechanism.
This patchset enables the core of the quorum mechanism.
The num_children reads are compared to get the majority version and if this
version exists more than threshold times the guest won't see the error at all.

If a block is corrupted or if an error occurs during an IO or if the quorum
cannot be established QMP events are used to report to the management.

Use gnutls's SHA-256 to compare versions.

--enable-quorum must be used to enable the feature.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:50 +01:00
Benoît Canet
7db6982a19 quorum: Add quorum_aio_readv.
Add code to do num_children reads in parallel and cleanup the structures
afterwards.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:49 +01:00
Benoît Canet
f70d7f7e4d blkverify: Extract qemu_iovec_clone() and qemu_iovec_compare() from blkverify.
qemu_iovec_compare() will be used to compare IOs vectors in quorum blkverify
mode. The patch extracts these functions in order to factorize the code.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:49 +01:00
Benoît Canet
13e7956e31 quorum: Add quorum_aio_writev and its dependencies.
Writes are mirrored num_children times on num_children devices.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:49 +01:00
Benoît Canet
cadebd7a2a quorum: Create BDRVQuorumState and BlkDriver and do init.
Create the structure holding the quorum settings and write the minimal block
driver instanciation boilerplate.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:48 +01:00
Benoît Canet
27cec15e4e quorum: Create quorum.c, add QuorumChildRequest and QuorumAIOCB.
Quorum is a block filter mirroring writes to num_children children.
For reads quorum reads each children and does a vote.
If more than vote_threshold versions are identical the quorum is reached and
this winning version is returned to the guest. So quorum prevents bit corruption.
For high availability purpose minority errors are reported via QMP but the guest
does not see them.

This patch creates the driver C source file and introduces the structures that
will be used in asynchronous reads and writes.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:48 +01:00
Paolo Bonzini
5b7aa9b56d vdi: say why an image is bad
Instead of just putting it in debugging output, we can now put the
value in an Error.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini
76abe4071d block: do not abuse EMEDIUMTYPE
Returning "Wrong medium type" for an image that does not have a valid
header is a bit weird.  Improve the error by mentioning what format
was trying to open it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini
89ac8480a8 vmdk: correctly propagate errors
Now that we can return the "right" errors, use the Error** parameter
to pass them back instead of just printing them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini
37f09e5e3d vmdk: do not try opening a file as both image and descriptor
This prepares for propagating errors from vmdk_open_sparse and
vmdk_open_desc_file up to the caller of vmdk_open.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini
d1833ef52b vmdk: push vmdk_read_desc up to caller
Currently, we just try reading a VMDK file as both image and descriptor.
This makes it hard to choose which of the two attempts gave the best error.
We'll decide in advance if the file looks like an image or a descriptor,
and this patch is the first step to that end.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini
a8842e6d2a vmdk: extract vmdk_read_desc
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini
c0f92b526d vvfat: correctly propagate errors
Before:
    $ ./qemu-io-old
    qemu-io-old> open -r -o driver=vvfat,fat-type=24,dir=i386-softmmu
    Valid FAT types are only 12, 16 and 32
    qemu-io-old: can't open device (null): Could not open image: Invalid argument

After:
    $ ./qemu-io
    qemu-io> open -r -o driver=vvfat,fat-type=24,dir=i386-softmmu
    qemu-io: can't open device (null): Valid FAT types are only 12, 16 and 32

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
6890aad46b vhdx: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
0fea6b7972 qed: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
b6d5066d32 qcow: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
2a94fee3f6 curl: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
f8d924e481 cow: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
a7451cb850 gluster: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
24897a767b gluster: default scheme to gluster:// and host to localhost.
Currently, "gluster:///volname/img" and (using file. options)
"file.driver=gluster,file.filename=foo" will segfault.  Also,
"//host/volname/img" will be rejected, but it is a valid URL
that should be accepted just fine with "file.driver=gluster".
Accept all of these, by inferring missing transport and host
as TCP and localhost respectively.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
f2917853f7 iscsi: correctly propagate errors in iscsi_open
Before:
    $ ./qemu-io-old
    qemu-io-old> open -r -o file.driver=iscsi,file.filename=foo
    Failed to parse URL : foo
    qemu-io-old: can't open device (null): Could not open 'foo': Invalid argument

After:
    $ ./qemu-io
    qemu-io> open -r -o file.driver=iscsi,file.filename=foo
    qemu-io: can't open device (null): Failed to parse URL : foo

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
35cb1748d5 iscsi: fix indentation
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini
77e8b9ca64 nbd: correctly propagate errors
Before:
    $ ./qemu-io-old
    qemu-io-old> open -r -o file.driver=nbd
    one of path and host must be specified.
    qemu-io-old: can't open device (null): Could not open image: Invalid argument
    $ ./qemu-io-old
    qemu-io-old> open -r -o file.driver=nbd,file.host=foo,file.path=bar
    path and host may not be used at the same time.
    qemu-io-old: can't open device (null): Could not open image: Invalid argument

After:
    $ ./qemu-io
    qemu-io> open -r -o file.driver=nbd
    qemu-io: can't open device (null): one of path and host must be specified.
    $ ./qemu-io
    qemu-io> open -r -o file.driver=nbd,file.host=foo,file.path=bar
    qemu-io: can't open device (null): path and host may not be used at the same time.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Paolo Bonzini
a69d9af449 nbd: produce a better error if neither host nor port is passed
Before:
    $ qemu-io-old
    qemu-io-old> open -r -o file.driver=nbd
    qemu-io-old: can't open device (null): Could not open image: Invalid argument
    $ ./qemu-io-old
    qemu-io-old> open -r -o file.driver=nbd,file.host=foo,file.path=bar
    path and host may not be used at the same time.
    qemu-io-old: can't open device (null): Could not open image: Invalid argument

After:
    $ ./qemu-io
    qemu-io> open -r -o file.driver=nbd
    one of path and host must be specified.
    qemu-io: can't open device (null): Could not open image: Invalid argument
    $ ./qemu-io
    qemu-io> open -r -o file.driver=nbd,file.host=foo,file.path=bar
    path and host may not be used at the same time.
    qemu-io: can't open device (null): Could not open image: Invalid argument

Next patch will fix the error propagation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz
f7d9fd8c72 block: Remove bdrv_open_image()'s force_raw option
This option is now unnecessary since specifying BDRV_O_PROTOCOL as flag
will do exactly the same.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz
2e40134bfd block: Make bdrv_file_open() static
Add the bdrv_open() option BDRV_O_PROTOCOL which results in passing the
call to bdrv_file_open(). Additionally, make bdrv_file_open() static and
therefore bdrv_open() the only way to call it.

Consequently, all existing calls to bdrv_file_open() have to be adjusted
to use bdrv_open() with the BDRV_O_PROTOCOL flag instead.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz
ddf5636dc9 block: Add reference parameter to bdrv_open()
Allow bdrv_open() to handle references to existing block devices just as
bdrv_file_open() is already capable of.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz
f67503e5bd block: Change BDS parameter of bdrv_open() to **
Make bdrv_open() take a pointer to a BDS pointer, similarly to
bdrv_file_open(). If a pointer to a NULL pointer is given, bdrv_open()
will create a new BDS with an empty name; if the BDS pointer is not
NULL, that existing BDS will be reused (in the same way as bdrv_open()
already did).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:21 +01:00
Kevin Wolf
a71835a0cc qcow2: Set zero flag for discarded clusters
Instead of making the backing file contents visible again after a discard
request, set the zero flag if possible (i.e. on version >= 3).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-02-21 21:02:21 +01:00
Fam Zheng
6ebc91e5d0 block: use per-object cflags and libs
No longer adds flags and libs for them to global variables, instead
create config-host.mak variables like FOO_CFLAGS and FOO_LIBS, which is
used as per object cflags and libs.

This removes unwanted dependencies from libcacard.

Signed-off-by: Fam Zheng <famz@redhat.com>
[Split from Fam's patch to enable modules. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-20 13:12:54 +01:00
Peter Maydell
4c0c9bbe78 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  monitor: Add object_add class argument completion.
  monitor: Add object_del id argument completion.
  monitor: Add device_add device argument completion.
  monitor: Add device_del id argument completion.
  qmp: expose list of supported character device backends
  Use error_is_set() only when necessary
  QMP: allow JSON dict arguments in qmp-shell
  hmp: migrate command (without -d) now blocks correctly

Conflicts:
	blockdev.c

[PMM: resolved trivial conflict in blockdev.c]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-20 12:10:23 +00:00
Markus Armbruster
84d18f065f Use error_is_set() only when necessary
error_is_set(&var) is the same as var != NULL, but it takes
whole-program analysis to figure that out.  Unnecessarily hard for
optimizers, static checkers, and human readers.  Dumb it down to
obvious.

Gets rid of several dozen Coverity false positives.

Note that the obvious form is already used in many places.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-02-17 11:57:23 -05:00
Jeff Cody
cc67f4d1f9 block: mirror - use local_err to avoid NULL errp
When starting a block job, commit_active_start() relies on whether *errp
is set by mirror_start_job.  This allows it to determine if the mirror
job start failed, so that it can clean up any changes to open flags from
the bdrv_reopen().  If errp is NULL, then it will not be able to
determine if mirror_start_job failed or not.

To avoid this, use a local Error variable, and then propagate the error
(if any) to errp.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-14 18:05:39 +01:00
Jeff Cody
39a611a3e0 block: Don't throw away errno via error_setg
There are a handful of places in the block layer where a failure path
has a valid -errno value, yet error_setg() is used.  Those instances
should instead use error_setg_errno(), to preserve as much error
information as possible.

This patch replaces those instances with error_setg_errno(), so that
errno is passed up the stack in the error message.

Reported-By: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-14 18:05:38 +01:00
Jeff Cody
28f106afb3 block: Add notes to iSCSI's .bdrv_open and .bdrv_reopen_prepare
iSCSI currently does not need to do any actions to support the
current usage of bdrv_reopen().  However, it is important to note
a couple of things: 1.) A connection will not be re-established to
an iSCSI target, and 2.) If iscsi_open() is changed to parse 'flags',
then iscsi_reopen_prepare() may need to be more than a stub.

In light of the above, this commit adds comments above both of the
functions to bring attention to these facts.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-14 13:38:35 +01:00
Kevin Wolf
eaf944a438 blkdebug: Don't leak bs->file on failure
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-02-09 09:12:39 +01:00
Kevin Wolf
ad6aef43d3 raw: Fix BlockLimits passthrough
raw copies over the BlockLimits of bs->file during bdrv_open().
However, since commit d34682cd it is immediately overwritten during
bdrv_refresh_limits(). This caused all fields except for
opt_transfer_length and opt_mem_alignment (which happen to be correctly
inherited in generic code) to be zeroed.

Move the BlockLimit assignment to a .bdrv_refresh_limits() callback to
make it work again for all fields.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-02-09 09:12:39 +01:00
Hu Tao
7c2bbf4aa6 qcow2: check for NULL l2meta
In the case of a metadata preallocation with a large cluster size,
qcow2_alloc_cluster_offset() can allocate nothing and returns a
NULL l2meta. This patch checks for it and link2 l2 with only valid
l2meta.

Replace 9 and 512 with BDRV_SECTOR_BITS, BDRV_SECTOR_SIZE
respectively while at the function.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:39 +01:00
Hu Tao
33304ec9fa qcow2: fix offset overflow in qcow2_alloc_clusters_at()
When cluster size is big enough it can lead to an offset overflow
in qcow2_alloc_clusters_at(). This patch fixes it.

The allocation is stopped each time at L2 table boundary
(see handle_alloc()), so the possible maximum bytes could be

  2^(cluster_bits - 3 + cluster_bits)

cluster_bits - 3 is used to compute the number of entry by L2
and the additional cluster_bits is to take into account each
clusters referenced by the L2 entries.

so int is safe for cluster_bits<=17, unsafe otherwise.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:39 +01:00
Hu Tao
16f0587e0a qcow2: remove n_start and n_end of qcow2_alloc_cluster_offset()
n_start can be actually calculated from offset. The number of
sectors to be allocated(n_end - n_start) can be passed in in
num. By removing n_start and n_end, we can save two parameters.

The side effect is there is a bug in qcow2.c:preallocate() that
passes incorrect n_start to qcow2_alloc_cluster_offset() is
fixed. The bug can be triggerred by a larger cluster size than
the default value(65536), for example:

./qemu-img create -f qcow2 \
  -o 'cluster_size=131072,preallocation=metadata' file.img 4G

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:39 +01:00
Peter Lieven
5d259fc7da block/iscsi: always fill bs->bl.opt_transfer_length
the opt_transfer_length has nothing to do with logical
block provisioning stuff so always copy it from
the block limits VPD page.

Reported-By: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:38 +01:00
Peter Lieven
6542aa9c75 block: add native support for NFS
This patch adds native support for accessing images on NFS
shares without the requirement to actually mount the entire
NFS share on the host.

NFS Images can simply be specified by an url of the form:
nfs://<host>/<export>/<filename>[?param=value[&param2=value2[&...]]]

For example:
qemu-img create -f qcow2 nfs://10.0.0.1/qemu-images/test.qcow2

You need LibNFS from Ronnie Sahlberg available at:
   git://github.com/sahlberg/libnfs.git
for this to work.

During configure it is automatically probed for libnfs and support
is enabled on-the-fly. You can forbid or enforce libnfs support
with --disable-libnfs or --enable-libnfs respectively.

Due to NFS restrictions you might need to execute your binaries
as root, allow them to open priviledged ports (<1024) or specify
insecure option on the NFS server.

For additional information on ROOT vs. non-ROOT operation and URL
format + parameters see:
   https://raw.github.com/sahlberg/libnfs/master/README

Supported by qemu are the uid, gid and tcp-syncnt URL parameters.

LibNFS currently support NFS version 3 only.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:38 +01:00
Markus Armbruster
f50159fa9b block/vhdx: Error checking fixes
Errors are inadvertently ignored in a few places.  Has always been
broken.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-31 22:05:03 +01:00
Peter Lieven
f43aa8e18a block/vmdk: add basic .bdrv_check support
this adds a basic vmdk corruption check. it should detect severe
table corruptions and file truncation.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-31 22:05:03 +01:00
Jeff Cody
14b4a8b9c6 block: remove qcow2 .bdrv_make_empty implementation
The QCOW2 .bdrv_make_empty implementation always returns 0 for success,
but does not actually do anything.

The proper way to not support an optional driver function stub is to
just not implement it, so let's remove the stub.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-31 22:05:03 +01:00
Jeff Cody
55aff7f133 block: remove QED .bdrv_make_empty implementation
The QED .bdrv_make_empty() implementation does nothing but return
-ENOTSUP, which causes problems in bdrv_commit().  Since the function
stub exists for QED, it is called, which then always returns an error.

The proper way to not support an optional driver function stub is to
just not implement it, so let's remove the stub.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-31 22:05:03 +01:00
Anthony Liguori
e9f526ab7b Merge remote-tracking branch 'bonzini/scsi-next' into staging
* bonzini/scsi-next:
  scsi: Support TEST UNIT READY in the dummy LUN0
  block: add .bdrv_reopen_prepare() stub for iscsi
  virtio-scsi: Prevent assertion on missed events
  virtio-scsi: Cleanup of I/Os that never started
  scsi: Assign cancel_io vector for scsi_disk_emulate_ops

Conflicts:
	block/iscsi.c

aliguori: resolve trivial merge conflict in block/iscsi.c

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:50:14 -08:00
Kevin Wolf
9e1cb96d9a qemu-iotests: Test pwritev RMW logic
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:25 +01:00
Kevin Wolf
b35ee7fb23 blkdebug: Make required alignment configurable
The new 'align' option of blkdebug can be used in order to emulate
backends with a required 4k alignment on hosts which only really require
512 byte alignment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 17:40:03 +01:00
Paolo Bonzini
2c9880c45e iscsi: Set bs->request_alignment
The iSCSI backend already gets the block size from the READ CAPACITY
command it sends.  Save it so that the generic block layer gets it
too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:03 +01:00
Kevin Wolf
793ed47a7a block: Switch BdrvTrackedRequest to byte granularity
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Paolo Bonzini
c25f53b06e raw: Probe required direct I/O alignment
Add a bs->request_alignment field that contains the required
offset/length alignment for I/O requests and fill it in the raw block
drivers. Use ioctls if possible, else see what alignment it takes for
O_DIRECT to succeed.

While at it, also expose the memory alignment requirements, which may be
(and in practice are) different from the disk alignment requirements.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:02 +01:00
Kevin Wolf
355ef4ac95 block: Update BlockLimits when they might have changed
When reopening with different flags, or when backing files disappear
from the chain, the limits may change. Make sure they get updated in
these cases.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00
Kevin Wolf
d34682cd4a block: Move initialisation of BlockLimits to bdrv_refresh_limits()
This function separates filling the BlockLimits from bdrv_open(), which
allows it to call it from other operations which may change the limits
(e.g. modifications to the backing file chain or bdrv_reopen)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00
Jeff Cody
4da8358596 block: resize backing image during active layer commit, if needed
If the top image to commit is the active layer, and also larger than
the base image, then an I/O error will likely be returned during
block-commit.

For instance, if we have a base image with a virtual size 10G, and a
active layer image of size 20G, then committing the snapshot via
'block-commit' will likely fail.

This will automatically attempt to resize the base image, if the
active layer image to be committed is larger.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 16:12:49 +01:00
Peter Maydell
031fd1be56 block/curl: Implement the libcurl timer callback interface
libcurl versions 7.16.0 and later have a timer callback interface which
must be implemented in order for libcurl to make forward progress (it
will sometimes rely on being called back on the timeout if there are
no file descriptors registered). Implement the callback, and use a
QEMU AIO timer to ensure we prod libcurl again when it asks us to.

Based on Peter's original patch plus my fix to add curl_multi_timeout_do.
Should compile just fine even on older versions of libcurl.

I also tried copy-on-read and streaming:

    $ ./qemu-img create -f qcow2 -o \
         backing_file=http://download.fedoraproject.org/pub/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso \
         foo.qcow2 1G
    $ x86_64-softmmu/qemu-system-x86_64 \
         -drive if=none,file=foo.qcow2,copy-on-read=on,id=cd \
         -device ide-cd,drive=cd --enable-kvm -m 1024

Direct http usage is probably too slow, but with copy-on-read ultimately
the image does boot!

After some time, streaming gets canceled by an EIO, which needs further
investigation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 16:07:08 +01:00
Benoît Canet
212a5a8f09 block: Create authorizations mechanism for external snapshot and resize.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 16:07:08 +01:00
Benoît Canet
c13163fba1 qmp: Add QMP query-named-block-nodes to list the named BlockDriverState nodes.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 16:07:08 +01:00
Fam Zheng
c8059b97e1 qapi: Add "backing" to BlockStats
Currently there is no way to query BlockStats of the backing chain. This
adds "backing" field into BlockStats to make it possible.

The comment of "parent" is reworded.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:01 +01:00
Fam Zheng
d8a7b061ae vmdk: Fix format specific information (create type) for streamOptimized
Previously the field is wrong:

    $ ./qemu-img create -f vmdk -o subformat=streamOptimized /tmp/a.vmdk 1G

    $ ./qemu-img info /tmp/a.vmdk
    image: /tmp/a.vmdk
    file format: vmdk
    virtual size: 1.0G (1073741824 bytes)
    disk size: 12K
    Format specific information:
        cid: 1390460459
        parent cid: 4294967295
>>>     create type: monolithicSparse
        <snip>

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Zhang Min
6df3bf8eb3 drive mirror:fix memory leak
In the function mirror_iteration() -> qemu_iovec_init(),
it allocates memory for op->qiov.iov, when the write request calls back,
but in the function mirror_iteration_done(), it only frees the op,
not free the op->qiov.iov, so this causes memory leak.

It should use qemu_iovec_destroy() to free op->qiov.

Signed-off-by: Zhang Min <rudy.zhangmin@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Liu Yuan
9cd767376f sheepdog: fix 'qemu-img map'
It was muted in the previous commit 4bc74be9. Let's revive it since nothing
prevents us to do it.

With this patch, following command will work as other formats:

$ qemu-img map sheepdog:image

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Fam Zheng
34ceed81f9 vmdk: Check for overhead when opening
Report an error if file size is even smaller than metadata.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Hu Tao
46bae92713 qcow2: fix wrong value of L1E_OFFSET_MASK, L2E_OFFSET_MASK and REFT_OFFSET_MASK
Accoring to qcow spec, the offset fields in l1e, l2e and ref table entry
start at bit 9. The offset is cluster offset, and the smallest possible
cluster size is 512 bytes.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Max Reitz
22511ad681 blkverify: Don't require protocol filename
If the filename is not prefixed by "blkverify:" in
blkverify_parse_filename(), the blkverify driver was not selected
through that protocol prefix, but by an explicit command line (or QMP)
option (like driver=blkverify).

If blkverify_parse_filename() has been called, a filename has been
given. If it is not prefixed, it is probably really just a plain
filename. This is no problem, since we can use it as the test image
filename and rely on the user to specify the raw image filename through
the new corresponding option.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:18 +01:00
Max Reitz
70b6198acc blkverify: Allow command-line configuration
Introduce the "test" and "raw" options for specifying images.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:18 +01:00
Max Reitz
4373593d51 blkdebug: Allow command-line file configuration
Introduce the "image" option as an alternative to specifying the image
through the filename.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:18 +01:00
Max Reitz
72daa72eee block: Allow reference for bdrv_file_open()
Allow specifying a reference to an existing block device (by name) for
bdrv_file_open() instead of a filename and/or options.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Max Reitz
89f2b21e36 blkdebug: Use command-line in read_config()
Use qemu_config_parse_qdict() to parse the command-line options in
addition to the config file.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Max Reitz
85a040e548 blkdebug: Always call read_config()
Move the check whether there actually is a config file into the
read_config() function.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Max Reitz
d4881b9bcb blkdebug: Don't require sophisticated filename
If the filename is not prefixed by "blkdebug:" in
blkdebug_parse_filename(), the blkdebug driver was not selected through
that protocol prefix, but by an explicit command line option
(file.driver=blkdebug or something similar). Contrary to the current
reaction, this is not a problem at all; we just need to store the
filename (in the x-image option) and can go on; the user just has to
manually specify the config option.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Max Reitz
466b49f276 blkdebug: Use errp for read_config()
Use an Error variable in the read_config() function.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Fam Zheng
585ea0c841 vmdk: Fix big flat extent IO
Local variable "n" as int64_t avoids overflow with large sector number
calculation. See test case change for failure case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Liu Yuan
9f23fce7b2 sheepdog: fix clone operation by 'qemu-img create -b'
We should pass base_inode->vdi_id to base_vdi_id of SheepdogVdiReq so that sheep
can create a clone instead a fresh volume.

This fixes following command:

qemu-create -b sheepdog:base sheepdog:clone

so users can boot sheepdog:clone as a normal volume.

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Bharata B Rao
cf7f616b9d gluster: Add support for creating zero-filled image
GlusterFS supports creation of zero-filled file on GlusterFS volume
by means of an API called glfs_zerofill(). Use this API from QEMU to
create an image that is filled with zeroes by using the preallocation
option of qemu-img.

qemu-img create gluster://server/volume/image -o preallocation=full 10G

The allowed values for preallocation are 'full' and 'off'. By default
preallocation is off and image is not zero-filled.

glfs_zerofill() offloads the writing of zeroes to the server and if
the storage supports SCSI WRITESAME, GlusterFS server can issue
BLKZEROOUT ioctl to achieve the zeroing.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Bharata B Rao
7c815372f3 gluster: Implement .bdrv_co_write_zeroes for gluster
Support .bdrv_co_write_zeroes() from gluster driver by using GlusterFS API
glfs_zerofill() that off-loads the writing of zeroes to GlusterFS server.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Bharata B Rao
15744b0b8f gluster: Convert aio routines into coroutines
Convert the read, write, flush and discard implementations from aio-based
ones to coroutine based ones.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Peter Lieven
92397116a6 block/iscsi: return -ENOMEM if an async call fails immediately
if an async libiscsi call fails directly it can only be due
to an out of memory condition. All other errors are returned
through the callback.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Stefan Hajnoczi
e04fb07fd1 rbd: switch from pipe to QEMUBH completion notification
rbd callbacks are called from non-QEMU threads.  Up until now a pipe was
used to signal completion back to the QEMU iothread.

The pipe writer code handles EAGAIN using select(2).  The select(2) API
is not scalable since fd_set size is static.  FD_SET() can write beyond
the end of fd_set if the file descriptor number is too high.  (QEMU's
main loop uses poll(2) to avoid this issue with select(2).)

Since the pipe itself is quite clumsy to use and QEMUBH is now
thread-safe, just schedule a BH from the rbd callback function.  This
way we can simplify I/O completion in addition to eliminating the
potential FD_SET() crash when file descriptor numbers become too high.

Crash scenario: QEMU already has 1024 file descriptors open.  Hotplug an
rbd drive and get the pipe writer to take the select(2) code path.

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Tested-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Jeff Cody
dc6afb99b3 block: add .bdrv_reopen_prepare() stub for iscsi
To suppport reopen(), the .bdrv_reopen_prepare() stub must exist.
iSCSI does not have anything that needs to be done to support reopen,
so we can just implement the _prepare() stub.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-15 10:44:52 +01:00
Edgar E. Iglesias
133fe77437 Merge remote branch 'luiz/queue/qmp' into qmpq
* luiz/queue/qmp:
  migration: qmp_migrate(): keep working after syntax error
  qerror: Remove assert_no_error()
  qemu-option: Remove qemu_opts_create_nofail
  target-i386: Remove assert_no_error usage
  hw: Remove assert_no_error usages
  qdev: Delete dead code
  error: Add error_abort
  monitor: add object-add (QMP) and object_add (HMP) command
  monitor: add object-del (QMP) and object_del (HMP) command
  qom: catch errors in object_property_add_child
  qom: fix leak for objects created with -object
  rng: initialize file descriptor to -1
  qemu-monitor: HMP cpu-add wrapper
  vl: add missing transition debug->finish_migrate

Message-Id: 1389045795-18706-1-git-send-email-lcapitulino@redhat.com
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-01-14 12:10:08 +10:00
Anthony Liguori
eedc1a5db5 Merge remote-tracking branch 'bonzini/scsi-next' into staging
* bonzini/scsi-next:
  scsi-disk: add UNMAP limits to block limits VPD page
  block/iscsi: use a bh to schedule co reentrance

Message-id: 1387720926-11421-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-10 11:05:17 -08:00
Peter Crosthwaite
87ea75d5e1 qemu-option: Remove qemu_opts_create_nofail
This is a boiler-plate _nofail variant of qemu_opts_create. Remove and
use error_abort in call sites.

null/0 arguments needs to be added for the id and fail_if_exists fields
in affected callsites due to argument inconsistency between the normal and
no_fail variants.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-01-06 15:02:30 -05:00
Fam Zheng
18da7f94cd commit: Remove unused check
We support top == active for commit now, remove the check and add an
assertion here.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng
20a63d2cec commit: Support commit active layer
If active is top, it will be mirrored to base, (with block/mirror.c
code), then the image is switched when user completes the block job.

QMP documentation is updated.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng
03544a6e9e block: Add commit_active_start()
commit_active_start is implemented in block/mirror.c, It will create a
job with "commit" type and designated base in block-commit command. This
will be used for committing active layer of device.

Sync mode is removed from MirrorBlockJob because there's no proper type
for commit. The used information is is_none_mode.

The common part of mirror_start and commit_active_start is moved to
mirror_start_job().

Fix the comment wording for commit_start.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng
5bc361b813 mirror: Move base to MirrorBlockJob
This allows setting the base before entering mirror_run, commit will
make use of it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng
f95c625ce4 mirror: Don't close target
Let reference count manage target and don't call bdrv_close here.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng
917703c179 vmdk: Allow vmdk_create to work with protocol
This improves vmdk_create to use bdrv_* functions to replace qemu_open
and other fd functions. The error handling are improved as well. One
difference is that bdrv_pwrite will round up buffer to sectors, so for
description file, an extra bdrv_truncate is used in the end to drop
inding zeros.

Notes:

 - A bonus bug fix is correct endian is used in initializing GD entries.

 - ROUND_UP and DIV_ROUND_UP are used where possible.

I tested that new code produces exactly the same file as previously.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 13:56:56 +01:00
Fam Zheng
b47053bd03 vmdk: Check VMFS extent line field number
VMFS extent line in description file should be with 4 fields:

    RW <size> VMFS "file-name.vmdk"

Check the number explicitly and report error if offset is appended as
FLAT, which should be invalid format.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 09:11:58 +01:00
Jeff Cody
7e30e6a674 block: vhdx - improve error message, and .bdrv_check implementation
If there is a dirty log file to be replayed in a VHDX image, it is
replayed in .vhdx_open().  However, if the file is opened read-only,
then a somewhat cryptic error message results.

This adds a more helpful error message for the user.  If an image file
contains a log to be replayed, and is opened read-only, the user is
instructed to run 'qemu-img check -r all' on the image file.

Running qemu-img check -r all will cause the image file to be opened
r/w, which will replay the log file.  If a log file replay is detected,
this is flagged, and bdrv_check will increase the corruptions_fixed
count for the image.

[Fixed typo in error message that was pointed out by Eric Blake
<eblake@redhat.com>.
--Stefan]

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 09:11:58 +01:00
Stefan Weil
219c252193 block/iscsi: Fix compilation for libiscsi 1.4.0 (API change)
Function iscsi_read10_task got additional parameters starting with version
libiscsi 1.5.0.

libiscsi 1.4.0 is still widely used (Debian wheezy, jessie and other Linux
distributions currently provide packages for QEMU which use it), so we
still need support for this older API.

Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 09:11:58 +01:00
Liu Yuan
e50d7607f1 sheepdog: fix dynamic grow for running qcow2 format
When running qcow2 over sheepdog, we might meet following problem

  qemu-system-x86_64: shrinking is not supported

And cause IO errors to Guest. This is because we abuse bs->total_sectors, which
is manipulated by generic block layer and race with sheepdog code.

We should directly check if offset > vdi_size to dynamically enlarge the volume
instead of 'offset > bs->total_sectors', which will cause problem when following
case happens:

   vdi_size > offset > bs->total_sectors

   # then trigger sd_truncate() to shrink the volume wrongly.

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reported-by: Hadrien KOHL <hadrien.kohl@gmail.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 09:11:57 +01:00
Anthony Liguori
b91f93243b Collection of little cleanups anf bugfixes.
nbd patches in preparation of spice-nbd.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJSrseRAAoJEEy22O7T6HE4vzIP/AwLTXZ1E73ZIF8t/e+1exYJ
 coQpvFkLgHeXvDPc/Ml5CJWFcUdEHMpprm3hIQgvsUyAujswSgZiO4Wn5vgMId+B
 fgjMX342j7P/lK27r8iOMCN7KZBMwh972DqTVzyzFdJAL9wgpUsN4Fq1vjQXCJiW
 jalLRS/xcqdRsu8ZNIvaLth+NgBOp7N0pgWOzYFPBJzgxGJw/pGTCG+rZaCJHyar
 F7T0K4CjHNGAGe255T4qzA5hOt6x9xDgd9zlDlWMzoqGtc0SkmomWyFWKx4fGv3x
 6WIZQjho17Sb7oe878OJUI6Ct5Bz1NzvE6WnaiQuedyM3CHO1/ynxqu5r68/vCDs
 fMMccnNCM/lUjmGrggi3PiRn1XeOBhH9ltEoehAqv/wTtgT7dUf97YV8I4zZAmUU
 0uKmEiyCHKktpPP8NXl+pAeSZVnI7LLDjIeQyVUjDx29G+GKuAxzYEH+m9ZW5BdQ
 DYDwvO0nm2WfpKlezUngffEPFOixEMJcsS+GeRUNLSjcA2eGErb2L3seGul8uLnM
 O1RBXZzTpFq34gmIUhgN5wYFKQu7jlO4tgNGbbG9j+CuEMVab/7eZEgOpT2iedRx
 sWvBtd2R0SB4/D3VEOsybjU3sgZVoIMlbreEUHxh6v5ntYkd81TlVi/Fuvp0Lrzm
 NGG74+6iyXSWdQfHemuo
 =J7aY
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'spice/tags/pull-spice-1' into staging

Collection of little cleanups anf bugfixes.
nbd patches in preparation of spice-nbd.

# gpg: Signature made Mon 16 Dec 2013 01:27:45 AM PST using RSA key ID D3E87138
# gpg: Can't check signature: public key not found

# By Marc-André Lureau (12) and Gerd Hoffmann (4)
# Via Gerd Hoffmann
* spice/tags/pull-spice-1:
  spice: stop server for qxl hard reset
  spice: move spice_server_vm_{start,stop} calls into qemu_spice_display_*()
  spice: move qemu_spice_display_*() from spice-graphics to spice-core
  nbd: avoid uninitialized warnings
  nbd: finish any pending coroutine
  nbd: make nbd_client_session_close() idempotent
  nbd: pass export name as init argument
  nbd: don't change socket block during negotiate
  Split nbd block client code
  spice-char: implement chardev port event
  char: add qemu_chr_fe_event()
  include: add missing config-host.h include
  qmp_change_blockdev() remove unused has_format
  spice-char: remove unused field
  vscclient: do not add a socket watch if there is not data to send
  spice: flip streaming video mode to off by default
2013-12-16 09:44:13 -08:00
Anthony Liguori
80d6f5eae7 Block patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQIcBAABAgAGBQJSq0gXAAoJEH8JsnLIjy/WRosQAL+l9phOdbeyOkybR6fvt+i1
 ZuR8FQTS01xtnVvqeVEueVFrWvFiovBtS4N5ubie1O6CQGF/4HIrFTgRSeUzpPaG
 qYjMY9pmg2QXP4Kg24z+wMlVbn7uzK0Akrlerr47emLtMLrHdsLrs2m0DAlW8KZN
 6umGk+HQ5wPwmF8b4w3BGymOQk22oDCiTXmuBCzdZ+GpsTAhSr+XupvNxo8b+aFk
 kpphwulOPFntoHlOmcypgjVM8CfUUfjlkQZrZOXj5v63FPhwfeZ3DiVZIWGhOeJl
 qEaWRtoH2nELe67Py/vvGoITLdH2gS0l1JBkk2wLIqAjA0GiAXpZJrskop4NYAqm
 NvHaBHUPob5m3JVuh35hAMEbyFaIxS5teC6q1Pirjskiks0jvxYYlTiZw+RDkzqX
 gqBmBacyuSAJB9UlvlH3si3zSJ4MqN4feVTv8XWXoprq4+gC0xpyCtkZIb4ZrM9/
 oAmmYcEytf4Z2fqjPkmcD5lvLl48j3hYkd5zDwEy5TxWvyfqQSkkVx21AQMsqQvq
 PokiKFJUuyPfEGQgugWDsWM4FdXRhOrK9Q6479lAWP7PE9j9UTI24Z4JJVfLQz+z
 0m5IBuaQlBqWbN48twi4DSMTo5jFkxp/cKTXNGeV2cVDYyUuveAj1atTqUh8rZby
 95kteXfJhVGiaMQTniIl
 =XYe2
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging

Block patches

# gpg: Signature made Fri 13 Dec 2013 09:47:03 AM PST using RSA key ID C88F2FD6
# gpg: Can't check signature: public key not found

# By Peter Lieven (2) and others
# Via Kevin Wolf
* kwolf/tags/for-anthony:
  blkdebug: Use QLIST_FOREACH_SAFE to resume IO
  qemu-img: make progress output more accurate during convert
  block: expect get_block_status errors in bdrv_make_zero
  block/vvfat: Fix compiler warnings for OpenBSD
  qapi-schema.json: Change 1.8 reference to 2.0
  sheepdog: check if '-o redundancy' is passed from user

Message-id: 1386956943-19474-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-12-16 09:43:28 -08:00
Peter Lieven
8b9dfe9098 block/iscsi: use a bh to schedule co reentrance
this fixes a potential segfault and performance regression.

If the coroutine is reentered directly in the iscsi_co_generic_cb
iscsi_process_{read,write} are interrupted and reentered any
time later. One the one hand this could happen after an iscsi_close
where the iscsi context is already gone (segfault). On the
other hand this limits the number of processed callbacks
in each aio_dispatch to one (potential performance regression).

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-16 11:25:51 +01:00
Marc-André Lureau
b1b27b6426 nbd: avoid uninitialized warnings
==15815== Thread 1:
==15815== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==15815==    at 0x65AD5CB: send (send.c:31)
==15815==    by 0x37F84B: nbd_wr_sync (nbd.c:145)
==15815==    by 0x37F94B: write_sync (nbd.c:186)
==15815==    by 0x380FA9: nbd_send_request (nbd.c:681)
==15815==    by 0x1C4A2D: nbd_teardown_connection (nbd-client.c:337)
==15815==    by 0x1C4AD8: nbd_client_session_close (nbd-client.c:354)
==15815==    by 0x1ED2D8: close_socketpair (spicebd.c:132)
==15815==    by 0x1EE265: spice_close (spicebd.c:457)
==15815==    by 0x1ACBF6: bdrv_close (block.c:1519)
==15815==    by 0x1AD804: bdrv_delete (block.c:1772)
==15815==    by 0x1B4136: bdrv_unref (block.c:4476)
==15815==    by 0x1ACCE0: bdrv_close (block.c:1541)
==15815==  Address 0x7feffef98 is on thread 1's stack

Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau
69152c09d3 nbd: finish any pending coroutine
Make sure all pending coroutines are finished when closing the session.

Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau
5ad283ebb8 nbd: make nbd_client_session_close() idempotent
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau
e2bc625f9b nbd: pass export name as init argument
There is no need to keep the export name around, and it seems a better
fit as an argument in the init() call.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau
e53a18e488 nbd: don't change socket block during negotiate
The caller might handle non-blocking using coroutine. Leave the choice
to the caller to use a blocking or non-blocking negotiate.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau
2302c1cafb Split nbd block client code
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Fam Zheng
c547e5640d blkdebug: Use QLIST_FOREACH_SAFE to resume IO
Qemu-iotest 030 was broken.

When the coroutine runs and finishes, it will remove itself from the req
list, so let's use safe version of foreach to avoid use after free.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-12-13 17:11:19 +01:00
Stefan Weil
f671d173c7 block/vvfat: Fix compiler warnings for OpenBSD
The buildbot shows these compiler warnings:

block/vvfat.c: In function 'create_short_and_long_name':
block/vvfat.c:620: warning: array size (8) smaller than bound length (11)
block/vvfat.c:620: warning: array size (8) smaller than bound length (11)
block/vvfat.c:635: warning: array size (8) smaller than bound length (11)
block/vvfat.c:635: warning: array size (8) smaller than bound length (11)

They are caused by tricky code where 8 characters for the name are followed
by 3 characters for the extension, and some operations touch both name and
extension.

Using an 11 character name which includes the extension fixes the compiler
warning, satisfies cppcheck, valgrind and maybe other static and dynamic
code checkers, and even simplifies some parts of the code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-12-13 14:49:50 +01:00
Liu Yuan
a3120deee5 sheepdog: check if '-o redundancy' is passed from user
This fix a segfault (that is caused by b3af018f3) of following command:

$ qemu-img convert some_img sheepdog:some_img

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-12-13 14:49:50 +01:00
Peter Lieven
063c3378a9 block/iscsi: introduce bdrv_co_{readv, writev, flush_to_disk}
this converts read, write and flush functions from aio to coroutines
eliminating almost 200 lines of code.

The requirement for libiscsi is bumped to version 1.4.0 which was
released in may 2012.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-09 11:28:16 +01:00
Hu Tao
ac95acdb8e qcow2: use start_of_cluster() and offset_into_cluster() everywhere
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-06 16:53:50 +01:00
Peter Lieven
7572ddc8db block/iscsi: set bs->bl.opt_transfer_length
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-05 11:45:24 +01:00
Peter Lieven
1c0704a556 block/iscsi: set bdi->cluster_size
this patch aims to set bdi->cluster_size to the internal page size
of the iscsi target so that enabled callers can align requests
properly.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-05 11:45:24 +01:00
Wenchao Xia
8c116b0e41 qemu-nbd: support internal snapshot export
Now it is possible to directly export an internal snapshot, which
can be used to probe the snapshot's contents without qemu-img
convert.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04 15:19:00 +01:00
Wenchao Xia
7b4c4781e3 snapshot: distinguish id and name in load_tmp
Since later this function will be used so improve it. The only caller of it
now is qemu-img, and it is not impacted by introduce function
bdrv_snapshot_load_tmp_by_id_or_name() that call bdrv_snapshot_load_tmp()
twice to keep old search logic. bdrv_snapshot_load_tmp_by_id_or_name() return
int to let caller know the errno, and errno will be used later.
Also fix a typo in comments of bdrv_snapshot_delete().

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04 15:19:00 +01:00
Kevin Wolf
f8413b3c23 qcow2: Zero-initialise first cluster for new images
Strictly speaking, this is only required for has_zero_init() == false,
but it's easy enough to just do a cluster-aligned write that is padded
with zeros after the header.

This fixes that after 'qemu-img create' header extensions are attempted
to be parsed that are really just random leftover data.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04 11:29:37 +01:00
Paolo Bonzini
97a2ae3453 raw-posix: add support for write_zeroes on XFS and block devices
The code is similar to the implementation of discard and write_zeroes
with UNMAP.  However, failure must be propagated up to block.c.

The stale page cache problem can be reproduced as follows:

    # modprobe scsi-debug lbpws=1 lbprz=1
    # ./qemu-io /dev/sdXX
    qemu-io> write -P 0xcc 0 2M
    qemu-io> write -z 0 1M
    qemu-io> read -P 0x00 0 512
    Pattern verification failed at offset 0, 512 bytes
    qemu-io> read -v 0 512
    00000000:  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ................
    ...

    # ./qemu-io --cache=none /dev/sdXX
    qemu-io> write -P 0xcc 0 2M
    qemu-io> write -z 0 1M
    qemu-io> read -P 0x00 0 512
    qemu-io> read -v 0 512
    00000000:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    ...

And similarly with discard instead of "write -z".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini
d0b4503ed2 raw-posix: implement write_zeroes with MAY_UNMAP for block devices
See the next commit for the description of the Linux kernel problem
that is worked around in raw_open_common.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini
260a82e524 raw-posix: implement write_zeroes with MAY_UNMAP for files
Writing zeroes to a file can be done by punching a hole if
MAY_UNMAP is set.

Note that in this case ENOTSUP is not ignored, but makes
the block layer fall back to the generic implementation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini
fa6252b056 block/iscsi: check WRITE SAME support differently depending on MAY_UNMAP
The current check is right for MAY_UNMAP=1.  For MAY_UNMAP=0, just
try and fall back to regular writes as soon as a WRITE SAME command
fails.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Peter Lieven
2af8a1a704 block/iscsi: updated copyright
added myself to reflect recent work on the iscsi block driver.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Peter Lieven
4b52498e62 block/iscsi: remove .bdrv_has_zero_init
since commit 3ac21627 the default value changed to 0.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini
cffb1ec600 block drivers: expose requirement for write same alignment from formats
This will let misaligned but large requests use zero clusters.  This
is important because the cluster size is not guest visible.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini
95de6d7078 block drivers: add discard/write_zeroes properties to bdrv_get_info implementation
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini
97b00e2851 vpc, vhdx: add get_info
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini
7ce21016b6 block: handle ENOTSUP from discard in generic code
Similar to write_zeroes, let the generic code receive a ENOTSUP for
discard operations.  Since bdrv_discard has advisory semantics,
we can just swallow the error.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Fam Zheng
af057fe740 vmdk: Fix creating big description file
The buffer for description file was 4096 which only covers a few
hundred of extents. This changes the buffer to dynamic allocated with
g_strdup_printf in order to support bigger cases.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 09:54:16 +01:00
Fam Zheng
509d39aa22 vmdk: Allow read only open of VMDK version 3
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 17:41:14 +01:00
Kevin Wolf
c9fbb99d41 block: Use BDRV_O_NO_BACKING where appropriate
If you open an image temporarily just because you want to check its size
or get it flushed, there's no real reason to open the whole backing file
chain.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2013-11-29 17:41:09 +01:00
Fam Zheng
4cc70e9337 blkdebug: add "remove_break" command
This adds "remove_break" command which is the reverse of blkdebug
command "break": it removes all breakpoints with given tag and resumes
all the requests.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29 13:40:37 +01:00
Liu Yuan
b3af018f3b sheepdog: support user-defined redundancy option
Sheepdog support two kinds of redundancy, full replication and erasure coding.

# create a fully replicated vdi with x copies
 -o redundancy=x (1 <= x <= SD_MAX_COPIES)

# create a erasure coded vdi with x data strips and y parity strips
 -o redundancy=x:y (x must be one of {2,4,8,16} and 1 <= y < SD_EC_MAX_STRIP)

E.g, to convert a vdi into sheepdog vdi 'test' with 8:3 erasure coding scheme

$ qemu-img convert -o redundancy=8:3 linux-0.2.img sheepdog:test

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29 13:40:37 +01:00
Liu Yuan
c31d482f29 sheepdog: refactor do_sd_create()
We can actually use BDRVSheepdogState *s to pass most of the parameters.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29 13:40:37 +01:00
Charlie Shepherd
091b1108ca COW: Extend checking allocated bits to beyond one sector
cow_co_is_allocated() only checks one sector's worth of allocated bits
before returning. This is allowed but (slightly) inefficient, so extend
it to check all of the file's metadata sectors.

Signed-off-by: Charlie Shepherd <charlie@ctshepherd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

[kwolf: silenced compiler warning (-Wmaybe-uninitialized for changed)]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 13:40:36 +01:00
Charlie Shepherd
14b98fdaf3 COW: Speed up writes
Process a whole sector's worth of COW bits by reading a sector, setting
the bits after skipping any already set bits, then writing it out again.
Make sure we only flush once before writing metadata, and only if we
need to write metadata.

Signed-off-by: Charlie Shepherd <charlie@ctshepherd.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29 13:40:36 +01:00
Fam Zheng
21b5683508 qapi: Change BlockDirtyInfo to list
We have multiple dirty bitmaps in BDS now, switch QAPI to allow query
it (BlockInfo.dirty_bitmaps), and also drop old BlockInfo.dirty.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 13:40:36 +01:00
Fam Zheng
e4654d2d94 block: per caller dirty bitmap
Previously a BlockDriverState has only one dirty bitmap, so only one
caller (e.g. a block job) can keep track of writing. This changes the
dirty bitmap to a list and creates a BdrvDirtyBitmap for each caller, the
lifecycle is managed with these new functions:

    bdrv_create_dirty_bitmap
    bdrv_release_dirty_bitmap

Where BdrvDirtyBitmap is a linked list wrapper structure of HBitmap.

In place of bdrv_set_dirty_tracking, a BdrvDirtyBitmap pointer argument
is added to these functions, since each caller has its own dirty bitmap:

    bdrv_get_dirty
    bdrv_dirty_iter_init
    bdrv_get_dirty_count

bdrv_set_dirty and bdrv_reset_dirty prototypes are unchanged but will
internally walk the list of all dirty bitmaps and set them one by one.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 13:40:33 +01:00
Max Reitz
f4a193e717 block/stream: Don't stream unbacked devices
If a block device is unbacked, a streaming blockjob should immediately
finish instead of beginning to try to stream, then noticing the backing
file does not contain even the first sector (since it does not exist)
and then finishing normally.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-28 11:39:31 +01:00
Liu Yuan
8582972227 sheepdog: implement .bdrv_get_allocated_file_size
With this patch, qemu-img info sheepdog:image will show disk size for sheepdog
images.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-28 10:30:52 +01:00
Peter Lieven
d4cd961507 iscsi: add bdrv_co_write_zeroes
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:52 +01:00
Peter Lieven
01a6a238a3 iscsi: simplify iscsi_co_discard
now that bdrv_co_discard can handle limits we do not need
the request split logic here anymore.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:52 +01:00
Peter Lieven
ba6c59191f iscsi: set limits in BlockDriverState
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:52 +01:00
Peter Lieven
04f19e4d2d block/raw: copy BlockLimits on raw_open
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven
186d4f2b1d block/iscsi: add .bdrv_get_info
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven
d32f35cbc5 block: introduce BDRV_REQ_MAY_UNMAP request flag
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven
aa7bfbfff7 block: add flags to bdrv_*_write_zeroes
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven
78a52ad5ac qcow2: fix possible corruption when reading multiple clusters
if multiple sectors spanning multiple clusters are read the
function count_contiguous_clusters should ensure that the
cluster type should not change between the clusters.

Especially the for-loop should break when we have one
or more normal clusters followed by a compressed cluster.

Unfortunately the wrong macro was used in the mask to
compare the flags.

This was discovered while debugging a data corruption
issue when converting a compressed qcow2 image to raw.
qemu-img reads 2MB chunks which span multiple clusters.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-14 13:09:07 +01:00
Fam Zheng
b04b6b6ec3 block: Print its file name if backing file opening failed
If backing file doesn't exist, the error message is confusing and
misleading:

    $ qemu /tmp/a.qcow2
    qemu: could not open disk image /tmp/a.qcow2: Could not open file: No
    such file or directory

But...

    $ ls /tmp/a.qcow2
    /tmp/a.qcow2

    $ qemu-img info /tmp/a.qcow2
    image: /tmp/a.qcow2
    file format: qcow2
    virtual size: 8.0G (8589934592 bytes)
    disk size: 196K
    cluster_size: 65536
    backing file: /tmp/b.qcow2

Because...

    $ ls /tmp/b.qcow2
    ls: cannot access /tmp/b.qcow2: No such file or directory

This is not intuitive. It's better to have the missing file's name in
the error message. With this patch:

    $ qemu-io -c 'read 0 512' /tmp/a.qcow2
    qemu-io: can't open device /tmp/a.qcow2: Could not open backing
    file: Could not open '/stor/vm/arch.raw': No such file or directory
    no file open, try 'help open'

Which is a little bit better.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-14 13:09:06 +01:00
Jeff Cody
3412f7b1bd block: vhdx - add .bdrv_create() support
This adds support for VHDX image creation, for images of type "Fixed"
and "Dynamic".  "Differencing" types (i.e., VHDX images with backing
files) are currently not supported.

Options for image creation include:
    * log size:
        The size of the journaling log for VHDX.  Minimum is 1MB,
        and it must be a multiple of 1MB. Invalid log sizes will be
        silently fixed by rounding up to the nearest MB.

        Default is 1MB.

    * block size:
        This is the size of a payload block.  The range is 1MB to 256MB,
        inclusive, and must be a multiple of 1MB as well.  Invalid sizes
        and multiples will be silently fixed.  If '0' is passed, then
        a sane size is chosen (depending on virtual image size).

        Default is 0 (Auto-select).

    * subformat:
        - "dynamic"
            An image without data pre-allocated.
        - "fixed"
            An image with data pre-allocated.

        Default is "dynamic"

When creating the image file, the lettered sections are created:

-----------------------------------------------------------------.
|   (A)    |   (B)    |    (C)    |     (D)       |     (E)
|  File ID |  Header1 |  Header 2 |  Region Tbl 1 |  Region Tbl 2
|          |          |           |               |
.-----------------------------------------------------------------.
0         64KB      128KB       192KB           256KB          320KB

.---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
|     (F)     |     (G)       |    (H)    |
| Journal Log |  BAT / Bitmap |  Metadata |  .... data ......
|             |               |           |
.---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
1MB         (var.)          (var.)      (var.)

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody
61c02e5687 block: vhdx - fix comment typos in header, fix incorrect struct fields
VHDXPage83Data and VHDXParentLocatorHeader both incorrectly had their
MSGUID fields set as arrays of 16.  This is incorrect (it stems from
an early version where those fields were uint_8 arrays).  Those fields
were, up to this patch, unused.

Also, there were a couple of typos and incorrect wording in comments,
and those have been fixed up as well.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody
1e74a971cb block: vhdx - break out code operations to functions
This is preperation for vhdx_create().  The ability to write headers,
and calculate the number of BAT entries will be needed within the
create() functions, so move this relevant code into helper functions.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody
c325ee1de8 block: vhdx - move more endian translations to vhdx-endian.c
In preparation for vhdx_create(), move more endian translation
functions out to vhdx-endian.c.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody
0b7da092b4 block: vhdx - remove BAT file offset bit shifting
Bit shifting can be fun, but in this case it was unnecessary.  The
upper 44 bits of the 64-bit BAT entry is specifies the File Offset,
so we shifted the bits to get access to the value.

However, per the spec the value is in MB.  So we dutifully shifted back
to the left by 20 bits, to convert to a true uint64_t file offset.

This replaces those steps with just a bit mask, to get rid of the lower
20 bits instead.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody
d92aa8833c block: vhdx write support
This adds support for writing to VHDX image files, using coroutines.
Writes into the BAT table goes through the VHDX log.  Currently, BAT
table writes occur when expanding a dynamic VHDX file, and allocating a
new BAT entry.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody
8adc52336d block: vhdx - add log write support
This adds support for writing to the VHDX log.

For spec details, see VHDX Specification Format v1.00:
https://www.microsoft.com/en-us/download/details.aspx?id=34750

There are a few limitations to this log support:
1.) There is no caching yet
2.) The log is flushed after each entry

The primary write interface, vhdx_log_write_and_flush(), performs a log
write followed by an immediate flush of the log.

As each log entry sector is a minimum of 4KB, partial sector writes are
filled in with data from the disk write destination.

If the current file log GUID is 0, a new GUID is generated and updated
in the header.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody
1a848fd451 block: vhdx - add region overlap detection for image files
Regions in the image file cannot overlap - the log, region tables,
and metdata must all be unique and non-overlapping.

This adds region checking by means of a QLIST; there can be a variable
number of regions and metadata (there may be metadata or region tables
that we do not recognize / know about, but are not required).

This adds the capability to register a region for later checking, and
to check against registered regions for any overlap.

Also, if neither the BAT or Metadata region tables are found, return
error.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody
0a43a1b5d7 block: vhdx - log parsing, replay, and flush support
This adds support for VHDX v0 logs, as specified in Microsoft's
VHDX Specification Format v1.00:
https://www.microsoft.com/en-us/download/details.aspx?id=34750

The following support is added:

* Log parsing, and validation - validate that an existing log
  is correct.

* Log search - search through an existing log, to find any valid
  sequence of entries.

* Log replay and flush - replay an existing log, and flush/clear
  the log when complete.

The VHDX log is a circular buffer, with elements (sectors) of 4KB.

A log entry is a variably-length number of sectors, that is
comprised of a header and 'descriptors', that describe each sector.

A log may contain multiple entries, know as a log sequence.  In a log
sequence, each log entry immediately follows the previous entry, with an
incrementing sequence number.  There can only ever be one active and
valid sequence in the log.

Each log entry must match the file log GUID in order to be valid (along
with other criteria).  Once we have flushed all valid log entries, we
marked the file log GUID to be zero, which indicates a buffer with no
valid entries.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:58 +01:00
Jeff Cody
c46415afc2 block: vhdx code movement - move vhdx_close() above vhdx_open()
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:58 +01:00
Jeff Cody
c3906c5e82 block: vhdx - update log guid in header, and first write tracker
Allow tracking of first file write in the VHDX image, as well as
the ability to update the GUID in the header.  This is in preparation
for log support.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:58 +01:00
Jeff Cody
0f48e8f097 block: vhdx - break endian translation functions out
This moves the endian translation functions out from the vhdx.c source,
into a separate source file. In addition to the previously defined
endian functions, new endian translation functions for log support are
added as well.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:58 +01:00
Jeff Cody
625565d27e block: vhdx - log support struct and defines
This adds some magic number defines, and internal structure definitions
for VHDX log replay support.  The struct VHDXLogEntries does not reflect
an on-disk data structure, and thus does not need to be packed.

Some minor code style fixes are applied as well.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:58 +01:00
Jeff Cody
28541d4628 block: vhdx code movement - VHDXMetadataEntries and BDRVVHDXState to header.
In preparation for VHDX log support, move these structures to the
header.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:58 +01:00
Jeff Cody
4f18b7824a block: vhdx - add header update capability.
This adds the ability to update the headers in a VHDX image, including
generating a new MS-compatible GUID.

As VHDX depends on uuid.h, VHDX is now a configurable build option.  If
VHDX support is enabled, that will also enable uuid as well.  The
default is to have VHDX enabled.

To enable/disable VHDX:  --enable-vhdx, --disable-vhdx

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:58 +01:00
Jeff Cody
6e9d290bf6 block: vhdx - minor comments and typo correction.
Just a couple of minor comments to help note where allocated
buffers are freed, and a typo fix.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:58 +01:00
Peter Lieven
0173e7bbf3 block/vpc: fix virtual size for images created with disk2vhd
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:58 +01:00
Andreas Tobler
511018e4b4 block/raw-posix: fix FreeBSD compilation
The below patch is needed to compile qemu trunk on FreeBSD with gcc48,
clang will fail.... ;). Host x84_64-freebsd.

Signed-off-by: Andreas Tobler <andreast@FreeBSD.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:53:30 +01:00
Peter Maydell
e4ef9f465c bswap.h: Remove cpu_to_be64wu()
Replace the legacy cpu_to_be64wu() with stq_be_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-9-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:47 -08:00
Anthony Liguori
a126050a10 Block patches for 1.7.0-rc0 (v2)
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJScnrnAAoJEH8JsnLIjy/WhhsP/2No1yEGNzfhw0WLDsEGBJI7
 zjG+QkRMO4q2t256SxNr84KBFJlYKBvGrx+W8xC66AdvR1feL5hmWdXAMTJovx6Z
 3Qt59RI9iISZ2OEtc9FhdsC+dSdM/3qie17XuuSCqifsi4xLjIZK/s18+RnLa0t/
 nRObYP4prRl0c3o1gKaUvNz2wkIqctQAIe8UQkn6R1vPC6D60m/H9dDj4Kj68HO0
 ICsF4AXBR/V2a8gU36/PGexBVyfgC4HOeuN0qNSTgYOKxLuNR+SrlzzhHE+jZTs5
 GASm3vg/vUgBOO1759X5T8hveO6yu8XL82l+/d5nIK4gYGORIQZT74dyV5JgQIlF
 Y47d0cF28+C/fuL1jh7c+2HY5WmmJQosMi9CaCBj0lvH0k5caEjqwPeHtRBmEyu3
 1wAcLQJowZrWB5ez9MjezsaL4sPCymvB/4F443xdz5V19mE41bLZGW2EIT7MXHY7
 IcwLU/opx76GMOFfWVMA7jeQkjiPaqGeaQHJzdnGUzIthqyiTigQMfi5P3nXGDic
 uQi+KrqP9lNpJlZk4xGQnFovHNmKZrnLhUvqOIPk7/wKMvlU6ewdzp5Fnwzqw4MW
 uJ/6eBJYolMyY+q37AH3Q6ZUkwTJi9O1drCPA0Ogr/dJiCyAiOoKuL0N74VabpcD
 AahXw+yYV0qh6H4YjOzW
 =wGCx
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging

Block patches for 1.7.0-rc0 (v2)

# gpg: Signature made Thu 31 Oct 2013 04:44:39 PM CET using RSA key ID C88F2FD6
# gpg: Can't check signature: public key not found

* kwolf/tags/for-anthony: (30 commits)
  vmdk: Implment bdrv_get_specific_info
  qapi: Add optional field 'compressed' to ImageInfo
  qemu-iotests: prefill some data to test image
  sheepdog: check simultaneous create in resend_aioreq
  sheepdog: cancel aio requests if possible
  sheepdog: make add_aio_request and send_aioreq void functions
  sheepdog: try to reconnect to sheepdog after network error
  coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
  sheepdog: reload inode outside of resend_aioreq
  sheepdog: handle vdi objects in resend_aio_req
  sheepdog: check return values of qemu_co_recv/send correctly
  qemu-iotests: Test case for backing file deletion
  qemu-iotests: drop duplicated "create_image"
  qemu-iotests: Fix 051 reference output
  block: Avoid unecessary drv->bdrv_getlength() calls
  block: Disable BDRV_O_COPY_ON_READ for the backing file
  ahci: fix win7 hang on boot
  sheepdog: pass copy_policy in the request
  sheepdog: explicitly set copies as type uint8_t
  block: Don't copy backing file name on error
  ...

Message-id: 1383064269-27720-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-31 17:02:26 +01:00
Fam Zheng
f4c129a38a vmdk: Implment bdrv_get_specific_info
Implement .bdrv_get_specific_info to return the extent information.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-31 14:44:43 +01:00
MORITA Kazutaka
80308d33ec sheepdog: check simultaneous create in resend_aioreq
After reconnection happens, all the inflight requests are moved to the
failed request list.  As a result, sd_co_rw_vector() can send another
create request before resend_aioreq() resends a create request from
the failed list.

This patch adds a helper function check_simultaneous_create() and
checks simultaneous create requests more strictly in resend_aioreq().

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:24 +01:00
MORITA Kazutaka
35200687a1 sheepdog: cancel aio requests if possible
This patch tries to cancel aio requests in pending queue and failed
queue.  When the sheepdog driver cannot cancel the requests, it waits
for them to be completed.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:20 +01:00
MORITA Kazutaka
a37dcdf9ae sheepdog: make add_aio_request and send_aioreq void functions
These functions no longer return errors.  We can make them void
functions and simplify the codes.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:15 +01:00
MORITA Kazutaka
011603cacf sheepdog: try to reconnect to sheepdog after network error
This introduces a failed request queue and links all the inflight
requests to the list after network error happens.  After QEMU
reconnects to the sheepdog server successfully, the sheepdog block
driver will retry all the requests in the failed queue.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:12 +01:00
MORITA Kazutaka
72e0996c41 sheepdog: reload inode outside of resend_aioreq
This prepares for using resend_aioreq() after reconnecting to the
sheepdog server.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:05 +01:00
MORITA Kazutaka
2412aec745 sheepdog: handle vdi objects in resend_aio_req
The current resend_aio_req() doesn't work when the request is against
vdi objects.  This fixes the problem.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:01 +01:00
MORITA Kazutaka
80731d9da5 sheepdog: check return values of qemu_co_recv/send correctly
If qemu_co_recv/send doesn't return the specified length, it means
that an error happened.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:21:44 +01:00
Kevin Wolf
b94a261057 block: Avoid unecessary drv->bdrv_getlength() calls
The block layer generally keeps the size of an image cached in
bs->total_sectors so that it doesn't have to perform expensive
operations to get the size whenever it needs it.

This doesn't work however when using a backend that can change its size
without qemu being aware of it, i.e. passthrough of removable media like
CD-ROMs or floppy disks. For this reason, the caching is disabled when a
removable device is used.

It is obvious that checking whether the _guest_ device has removable
media isn't the right thing to do when we want to know whether the size
of the host backend can change. To make things worse, non-top-level
BlockDriverStates never have any device attached, which makes qemu
assume they are removable, so drv->bdrv_getlength() is always called on
the protocol layer. In the case of raw-posix, this causes unnecessary
lseek() system calls, which turned out to be rather expensive.

This patch completely changes the logic and disables bs->total_sectors
caching only for certain block driver types, for which a size change is
expected: host_cdrom and host_floppy on POSIX, host_device on win32; also
the raw format in case it sits on top of one of these protocols, but in
the common case the nested bdrv_getlength() call on the protocol driver
will use the cache again and avoid an expensive drv->bdrv_getlength()
call.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-29 13:10:26 +01:00
Liu Yuan
1841f8801c sheepdog: pass copy_policy in the request
Currently copy_policy isn't used. Recent sheepdog supports erasure coding, which
make use of copy_policy internally, but require client explicitly passing
copy_policy from base inode to newly creately inode for snapshot related
operations.

If connected sheep daemon doesn't utilize copy_policy, passing it to sheep
daemon is just one extra null effect operation. So no compatibility problem.

With this patch, sheepdog can provide erasure coded volume for QEMU VM.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:40:00 +01:00
Liu Yuan
29a67f7e92 sheepdog: explicitly set copies as type uint8_t
'copies' is actually uint8_t since day one, but request headers and some helper
functions parameterize it as uint32_t for unknown reasons and effectively
reserve 24 bytes for possible future use. This patch explicitly set the correct
for copies and reserve the left bytes.

This is a preparation patch that allow passing copy_policy in request header.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:39:56 +01:00
Max Reitz
ba2ab2f2ca qcow2: Flush image after creation
Opening the qcow2 image with BDRV_O_NO_FLUSH prevents any flushes during
the image creation. This means that the image has not yet been flushed
to disk when qemu-img create exits. This flush is delayed until the next
operation on the image involving opening it without BDRV_O_NO_FLUSH and
closing (or directly flushing) it. For large images and/or images with a
small cluster size and preallocated metadata, this flush may take a
significant amount of time and may occur unexpectedly.

Reopening the image without BDRV_O_NO_FLUSH right before the end of
qcow2_create2() results in hoisting the potentially costly flush into
the image creation, which is expected to take some time (whereas
successive image operations may be not).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:34:32 +01:00
Stefan Weil
73f395fa88 misc: New spelling fixes in comments
compatiblity -> compatibility
continously -> continuously
existance -> existence
usefull -> useful
shoudl -> should

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-26 13:01:57 +04:00
Peter Lieven
fb8fe35f63 block/vpc: check that the image has not been truncated
this adds a check that a dynamic VHD file has not been
accidently truncated (e.g. during transfer or upload).

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 17:34:48 +02:00
Max Reitz
6e13610aa4 qcow2: Unset zero_beyond_eof in save_vmstate
Saving the VM state is done using bdrv_pwrite. This function may perform
a read-modify-write, which in this case results in data being read from
beyond the end of the virtual disk. Since we are actually trying to
access an area which is not a part of the virtual disk, zero_beyond_eof
has to be set to false before performing the partial write, otherwise
the VM state may become corrupted.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 11:50:51 +02:00
Max Reitz
eedff66f21 qcow2: Restore total_sectors value in save_vmstate
Since df2a6f29a5, bdrv_co_do_writev increases the total_sectors value of
a growable block devices on writes after the current end. This leads to
the virtual disk apparently growing in qcow2_save_vmstate, which in turn
affects the disk size captured by the internal snapshot taken directly
afterwards through e.g. the HMP savevm command. Such a "grown" snapshot
cannot be loaded after reopening the qcow2 image, since its disk size
differs from the actual virtual disk size (writing a VM state does not
actually increase the virtual disk size).

Fix this by restoring total_sectors at the end of qcow2_save_vmstate.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 11:45:06 +02:00
Fam Zheng
dbbcaa8d43 vmdk: fix VMFS extent parsing
The VMFS extent line in description file doesn't have start offset as
FLAT lines does, and it should be defaulted to 0. The flat_offset
variable is initialized to -1, so we need to set it in this case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-18 13:41:36 +02:00
Fam Zheng
c338b6ad60 vmdk: Only read cid from image file when opening
Previously cid of parent is parsed from image file for every IO request.
We already have L1/L2 cache and don't have assumption that parent image
can be updated behind us, so remove this to get more efficiency.

The parent CID is checked only for once after opening.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-18 13:39:59 +02:00
Max Reitz
45d57f6e71 block/raw-win32: Always use -errno in hdev_open
On one occasion, hdev_open() returned -1 in case of an unknown error
instead of a proper -errno value. Adjust this to match the behavior of
raw_open() (in raw-win32), which is to return -EINVAL in this case.
Also, change the call to error_setg*() to match the one in raw_open() as
well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-17 14:55:46 +02:00
Fam Zheng
899f1ae219 vmdk: Fix vmdk_parse_extents
An extra 'p++' after while loop when *p == '\n' will move p to unknown
data position, risking parsing junk data or memory access violation.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:02 +02:00
Fam Zheng
52c8d629ca vmdk: refuse enabling zeroed grain with flat images
This is a header flag and we needs sparse for the header.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:01 +02:00
Fam Zheng
4823970bcb vmdk: convert error code to use errp
Convert "fprintf(stderr,..." and standardize error messages:

Remove a few local_error's and use errp.

Remove "VMDK:" or "Vmdk:" prefixes in error message and fix to upper
case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:01 +02:00
Max Reitz
e428e439df block/raw-posix: Employ error parameter
Make use of the error parameter in the opening and creating functions in
block/raw-posix.c.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
ca2884087a blkverify: Employ error parameter
Make use of the error parameter in blkverify_open.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
10ffa72fae blkdebug: Employ error parameter
Make use of the error parameter in blkdebug_open.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
c6252b7cea block/raw-win32: Employ error parameter
Make use of the error parameter in the opening and creating functions in
block/raw-win32.c.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
92f1deec31 block/raw_bsd: Employ error parameter
Propagate errors in raw_create rather than directly reporting and
afterwards discarding them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
1fa5cc839a qcow2: Evaluate overlap check options
Evaluate the runtime overlap check options and set
BDRVQcowState.overlap_check appropriately.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
4a273c398b qcow2: Add more overlap check bitmask macros
Introduces the macros QCOW2_OL_CONSTANT and QCOW2_OL_ALL in addition to
the already existing QCOW2_OL_CACHED, signifying all metadata overlap
checks that can be performed in constant time (regardless of image size
etc.) and truly all available overlap checks, respectively.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
4092e99d93 qcow2: Array assigning options to OL check bits
Add an array which assigns the option string to its corresponding
overlap check bit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
05de7e86ca qcow2: Add overlap-check options
Add runtime options to tune the overlap checks to be performed before
write accesses.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
3e3553905c qcow2: Make overlap check mask variable
Replace the QCOW2_OL_DEFAULT macro by a variable overlap_check in
BDRVQcowState.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
231bb26764 qcow2: Use negated overflow check mask
In qcow2_check_metadata_overlap and qcow2_pre_write_overlap_check,
change the parameter signifying the checks to perform from its current
positive form to a negative one, i.e., it will no longer explicitly
specify every check to perform but rather a mask of checks not to
perform.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
00c49b21e7 qcow2: Use better type for numerical snapshot ID
When trying to find a new snapshot ID, the existing ones are converted
to integers using strtoul. This function returns an unsigned long,
therefore its result should be saved in an unsigned long as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
84757f7e67 qcow2: Fix snapshot restoration in snapshot_create
If the new snapshot table could not be written in qcow2_snapshot_create,
the old snapshot table has to be restored in memory and the new one
released. This should include restoration of the old snapshot count as
well, which is added by this patch.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
f9bff97143 qcow2: Remove wrong metadata overlap check
In qcow2_write_compressed, if the compression fails, a normal cluster is
written to disk. This is done through bdrv_write on the qcow2 BDS
itself (using the guest offset), thus it is wrong to do a metadata
overlap check before.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
9e3f08923a qcow2: Add missing space in error message
The error message in qcow2_downgrade about an unsupported refcount
order is missing a space. This patch adds it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Benoît Canet
f6186f49e2 block: Add BlockDriver.bdrv_check_ext_snapshot.
This field is used by blkverify to disable external snapshots creation.
It will also be used by block filters like quorum to disable external
snapshot creation.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Peter Lieven
92bc50a5ad block/get_block_status: avoid redundant callouts on raw devices
if a raw device like an iscsi target or host device is used
the current implementation makes a second call out to get
the block status of bs->file.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
88fb153512 qcow2: Assert against snapshot name/ID overflow
qcow2_write_snapshots relies on the length of every snapshot ID and name
fitting into an unsigned 16 bit integer. This is currently ensured by
QEMU through generally only allowing 128 byte IDs and 256 byte names.
However, if this should change in the future, the length written to the
image file should not be silently truncated (though the name itself
would be written completely).

Since this is currently not an issue but might require attention due to
internal QEMU changes in the future, an assert ensuring sanity is enough
for now.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
9186ad9658 qcow2: Free allocated snapshot table on error
If an error occurs during qcow2_write_snapshots, the newly allocated
snapshot table clusters are leaked and should thus be freed.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
37d41f0a04 qcow2: Always use error path on writing snapshots
qcow2_write_snapshots does contain a fail label and there is no reason
not to use it on some errors; therefore, we should always jump there on
error.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
8f730dd24e qcow2: Free preallocated zero clusters
In qcow2_free_any_clusters, preallocated zero clusters should be freed
just as normal clusters are.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
998b959c1e qcow2: Use pread for inactive L1 in overlap check
Currently, qcow2_check_metadata_overlap uses bdrv_read to read inactive
L1 tables from disk. The number of sectors to read is calculated through
a truncating integer division, therefore, if the L1 table size is not a
multiple of the sector size, the final entries will not be read and
their entries in memory remain undefined (from the g_malloc).
Using bdrv_pread fixes this.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
37764dfb71 qcow2: Add support for ImageInfoSpecific
Add a new ImageInfoSpecificQCow2 type as a subtype of ImageInfoSpecific.
This contains the compatibility level as a string and an optional
lazy_refcounts boolean (optional means mandatory for compat >= 1.1 and
not available for compat == 0.10).

Also, add qcow2_get_specific_info, which returns this information.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 14:03:57 +02:00
Max Reitz
a8d8ecb77f block/qapi: Human-readable ImageInfoSpecific dump
Add a function for generically dumping the ImageInfoSpecific information
in a human-readable format to block/qapi.c.

Use this function in bdrv_image_info_dump and qemu-io-cmds.c:info_f to
allow qemu-img info resp. qemu-io -c info to print that format specific
information.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Max Reitz
eae041fe6f block: Add bdrv_get_specific_info
Add a function for retrieving an ImageInfoSpecific object from a block
driver.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Fam Zheng
79e14bf778 qapi: make use of new BlockJobType
Switch the string to enum type BlockJobType in BlockJobDriver.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Fam Zheng
3fc4b10af0 blockjob: rename BlockJobType to BlockJobDriver
We will use BlockJobType as the enum type name of block jobs in QAPI,
rename current BlockJobType to BlockJobDriver, which will eventually
become a set of operations, similar to block drivers.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Anthony Liguori
634ebf4b17 Merge remote-tracking branch 'bonzini/scsi-next' into staging
# By Asias He (1) and Peter Lieven (1)
# Via Paolo Bonzini
* bonzini/scsi-next:
  scsi: Allocate SCSITargetReq r->buf dynamically [CVE-2013-4344]
  block/iscsi: reenable iscsi_co_get_block_status

Message-id: 1381332391-8781-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-10 10:03:00 -07:00
Peter Lieven
24c7608a5d block/iscsi: reenable iscsi_co_get_block_status
Commit f35c934a accidently disabled iscsi_co_get_block_status for all
libiscsi versions. Its not possible to check for enumeration constants
in the C preprocessor. This patch changes the check to the preprocessor
constant LIBISCSI_FEATURE_IOVECTOR which was introduced shortly after
get_lba_status support was added to libiscsi.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-09 10:43:42 +02:00
Max Reitz
e3b21ef9e0 qcow2: Free allocated L2 cluster on error
If an error occurs in l2_allocate, the allocated (but unused) L2 cluster
should be freed.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-07 13:23:19 +02:00
Max Reitz
fda74f826b qcow2: Switch L1 table in a single sequence
Switching the L1 table in memory should be an atomic operation, as far
as possible. Calling qcow2_free_clusters on the old L1 table on disk is
not a good idea when the old L1 table is no longer valid and the address
to the new one hasn't yet been written into the corresponding
BDRVQcowState field. To be more specific, this can lead to segfaults due
to qcow2_check_metadata_overlap trying to access the L1 table during the
free operation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-02 15:38:29 +02:00
Jeff Cody
5641bf4056 block: vhdx - add migration blocker
This blocks migration for VHDX image files, until the
functionality can be supported.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-02 15:24:39 +02:00
Max Reitz
db0749012b qcow2: CHECK_OFLAG_COPIED is obsolete
CHECK_OFLAG_COPIED as a parameter to check_refcounts_l1 and
check_refcounts_l2 is obselete now, since the OFLAG_COPIED consistency
check is actually no longer performed by these functions (but by
check_oflag_copied).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-02 11:40:41 +02:00
Max Reitz
1e242b5544 qcow2: Correct endianness in overlap check
If an inactive L1 table is loaded from disk, its entries are in big
endian and have to be converted to host byte order before using them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-02 11:06:35 +02:00
Kevin Wolf
61653008ad qcow2: Remove useless count_contiguous_clusters() parameter
All callers pass start = 0, and it's doubtful if any other value would
actually do what you expect. Remove the parameter.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2013-09-27 17:22:43 +02:00
Max Reitz
22f0dd29af qcow2: COMPRESSED on count_contiguous_clusters
Compressed clusters can never be contiguous, therefore the corresponding
flag does not need to be given explicitly to count_contiguous_clusters.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 17:22:43 +02:00
Max Reitz
15684a4742 qcow2: count_contiguous_clusters and compression
The function is not intended to be used on compressed clusters and will
not work correctly, if used anyway, since L2E_OFFSET_MASK is not the
right mask for determining the offset of compressed clusters. Therefore,
assert that the first cluster is not compressed and always include the
compression flag in the mask of significant flags, i.e., stop the search
as soon as a compressed cluster occurs.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 17:22:43 +02:00
Max Reitz
320c706666 qcow2: Free only newly allocated clusters on error
In expand_zero_clusters_in_l1, a new cluster is only allocated if it was
not already preallocated. On error, such preallocated clusters should
not be freed, but only the newly allocated ones.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 17:22:43 +02:00
Max Reitz
be0b742ee3 qcow2: Always use error path in l2_allocate
Just returning -errno in some cases prevents
trace_qcow2_l2_allocate_done from being executed (and, in one case, also
the unused allocated L2 table from being freed). Always going down the
error path fixes this.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 17:22:43 +02:00
Max Reitz
8585afd813 qcow2: Don't put invalid L2 table into cache
In l2_allocate, the fail path is executed if qcow2_cache_flush fails.
However, the L2 table has not yet been fetched from the L2 table cache.
The qcow2_cache_put in the fail path therefore basically gives an
undefined argument as the L2 table address (in this case).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 11:31:59 +02:00
Max Reitz
e390cf5a97 qcow2: Correct bitmap size in zero expansion
Since the expanded_clusters bitmap is addressed using host offsets in
the underlying image file, the correct size to use for allocating the
bitmap is not determined by the guest disk image but by the underlying
host image file.

Furthermore, this size may change during the expansion due to cluster
allocations on growable image files. In this case, the bitmap needs to
be resized as well to reflect the growth.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 11:16:35 +02:00
Max Reitz
c01dbccbad qcow2: Assert against currently impossible overflow
If qcow2_alloc_cluster_link_l2 is called with a QCowL2Meta describing a
request crossing L2 boundaries, a buffer overflow will occur. This is
impossible right now since such requests are never generated (every
request is shortened to L2 boundaries before) and probably also
completely unintended (considering the name "QCowL2Meta"), however, it
is still worth an assertion.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 21:57:44 +02:00
Jeff Cody
687fb89366 block: qed - use QEMU_PACKED for on-disk structures
QEDHeader is read, and written, directly from on-disk images
via bdrv_pread()/write().  To avoid any unintentional padding,
these structs should be packed.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:15 +02:00
Jeff Cody
c4217f645d block: qcow2 - used QEMU_PACKED for on-disk structures
QCowHeader and QCowExtension are structs that reside in the on-disk
image format, and are read and written directly via bdrv_pread()/write(),
and as such should be packed to avoid any unintentional struct padding.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:13 +02:00
Jeff Cody
e54835c06d block: vpc - use QEMU_PACKED for on-disk structures
The VHD footer and header structs (vhd_footer and vhd_dyndisk_header)
are on-disk structures for the image format, and as such should be
packed.

Go ahead and make these typedefs as well, with the preferred QEMU
naming convention, so that the packed attribute is used consistently
with the struct.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:10 +02:00
Jeff Cody
8368febd81 block: vdi - use QEMU_PACKED for on-disk structures
The header struct VdiHeader is an on-disk structure for the image
format, and as such should be packed.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:05 +02:00
Stefan Hajnoczi
9e6337d081 rbd: avoid qemu_rbd_snap_list() memory leaks
When there are no snapshots qemu_rbd_snap_list() returns 0 and the
snapshot table pointer is NULL.  Don't forget to free the snaps buffer
we allocated for librbd rbd_snap_list().

When the function succeeds don't forget to free the snaps buffer after
calling rbd_snap_list_end().

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:22:00 +02:00
Stefan Weil
c3e4f43a99 block: Fix compiler warning (-Werror=uninitialized)
The patch fixes a warning from gcc (Debian 4.6.3-14+rpi1) 4.6.3:

block/stream.c:141:22: error:
‘copy’ may be used uninitialized in this function [-Werror=uninitialized]

This is not a real bug - a better compiler would not complain.

Now 'copy' has always a defined value, so the check for ret >= 0
can be removed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:21:28 +02:00
Benoît Canet
030be32184 block: introduce BlockDriver.bdrv_needs_filename to enable some drivers.
Some drivers will have driver specifics options but no filename.
This new bool allow the block layer to treat them correctly.

The .bdrv_needs_filename is set in drivers not having .bdrv_parse_filename and
not having .bdrv_open.

The first exception to this rule will be the quorum driver.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:21:28 +02:00
Fam Zheng
301c7d38a0 vmdk: fix cluster size check for flat extents
We use the extent size as cluster size for flat extents (where no L1/L2
table is allocated so it's safe) reuse sector calculating code with
sparse extents.

Don't pass in the cluster size for adding flat extent, just set it to
sectors later, then the cluster size checking will not fail.

The cluster_sectors is changed to int64_t to allow big flat extent.

Without this, flat extent opening is broken:

    # qemu-img create -f vmdk -o subformat=monolithicFlat /tmp/a.vmdk 100G
    Formatting '/tmp/a.vmdk', fmt=vmdk size=107374182400 compat6=off subformat='monolithicFlat' zeroed_grain=off
    # qemu-img info /tmp/a.vmdk
    image: /tmp/a.vmdk
    file format: raw
    virtual size: 0 (0 bytes)
    disk size: 4.0K

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:21:28 +02:00
Max Reitz
7454d60045 qcow2: Don't shadow return value
When trying to update the refcounts for a snapshot, the return value of
update_refcount on a compressed cluster was pretty much ignored,
cancelling the update on error but returning 0. This is caused by an
inner "ret" variable shadowing the outer one (the latter is used in the
return statement).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 10:08:56 +02:00
Anthony Liguori
16121fa39e Merge remote-tracking branch 'stefanha/block' into staging
# By Stefan Hajnoczi (4) and others
# Via Stefan Hajnoczi
* stefanha/block:
  virtio-blk: do not relay a previous driver's WCE configuration to the current
  blockdev: do not default cache.no-flush to true
  block: don't lose data from last incomplete sector
  qcow2: Correct snapshots size for overlap check
  coroutine: fix /perf/nesting coroutine benchmark
  coroutine: add qemu_coroutine_yield benchmark
  qemu-timer: do not take the lock in timer_pending
  qemu-timer: make qemu_timer_mod_ns() and qemu_timer_del() thread-safe
  qemu-timer: drop outdated signal safety comments
  osdep: warn if open(O_DIRECT) on fails with EINVAL
  libcacard: link against qemu-error.o for error_report()

Message-id: 1379698931-946-1-git-send-email-stefanha@redhat.com
2013-09-23 11:53:05 -05:00
Anthony Liguori
f3ca508f00 Merge remote-tracking branch 'bonzini/scsi-next' into staging
# By Hervé Poussineau (5) and Stefan Weil (1)
# Via Paolo Bonzini
* bonzini/scsi-next:
  block/iscsi: Drop iscsi_co_get_block_status for older versions of libiscsi
  lsi: add 53C810 variant
  lsi: remove todo
  lsi: ignore write accesses to CTEST0 registers
  lsi: check ssid versus sdid only if ssid is valid
  lsi: use constant name instead of its value
2013-09-23 11:52:32 -05:00
Max Reitz
0f39ac9a07 qcow2: Correct snapshots size for overlap check
Using s->snapshots_size instead of snapshots_size for the metadata
overlap check in qcow2_write_snapshots leads to the detection of an
overlap with the main qcow2 image header when deleting the last
snapshot, since s->snapshots_size has not yet been updated and is
therefore non-zero. However, the offset returned by qcow2_alloc_clusters
will be zero since snapshots_size is zero. Therefore, an overlap is
detected albeit no such will occur.

This patch fixes this by replacing s->snapshots_size by snapshots_size
when calling qcow2_pre_write_overlap_check.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-20 12:48:03 +02:00
Stefan Weil
f35c934a5a block/iscsi: Drop iscsi_co_get_block_status for older versions of libiscsi
Debian wheezy includes libiscsi-dev 1.4.0 which does not provide
SCSI_PROVISIONING_TYPE_DEALLOCATED. Drop iscsi_co_get_block_status
in this case to allow compilation without errors.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-18 01:28:50 +02:00
Anthony Liguori
5dc11192b2 Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Max Reitz (16) and others
# Via Kevin Wolf
* kwolf/for-anthony: (33 commits)
  qemu-iotests: Fix test 038
  block: Assert validity of BdrvActionOps
  qemu-iotests: Cleanup test image in test number 007
  qemu-img: fix invalid JSON
  coroutine: add ./configure --disable-coroutine-pool
  qemu-iotests: Adjustments due to error propagation
  qcow2: Use Error parameter
  qemu-img create: Emit filename on error
  block: Error parameter for create functions
  block: Error parameter for open functions
  bdrv: Use "Error" for creating images
  bdrv: Use "Error" for opening images
  qemu-iotests: add 057 internal snapshot for block device test case
  hmp: add interface hmp_snapshot_delete_blkdev_internal
  hmp: add interface hmp_snapshot_blkdev_internal
  qmp: add interface blockdev-snapshot-delete-internal-sync
  qmp: add interface blockdev-snapshot-internal-sync
  qmp: add internal snapshot support in qmp_transaction
  snapshot: distinguish id and name in snapshot delete
  snapshot: new function bdrv_snapshot_find_by_id_and_name()
  ...

Message-id: 1379073063-14963-1-git-send-email-kwolf@redhat.com
2013-09-17 09:51:40 -05:00
Peter Lieven
65f3e33964 iscsi: split discard requests in multiple parts
Replace .bdrv_aio_discard with .bdrv_co_discard so that discard
requests can be split in multiple parts, each for a small amount
of sectors.

This is useful because we expose a generic API with no limit
on the amount of sectors that can be unmapped in one request.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 13:14:19 +02:00
Max Reitz
3ef6c40ad0 qcow2: Use Error parameter
Employ usage of the new Error ** parameter in qcow2_open, qcow2_create
and associated functions.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
cc84d90ff5 block: Error parameter for create functions
Add an Error ** parameter to bdrv_create and its associated functions to
allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
34b5d2c68e block: Error parameter for open functions
Add an Error ** parameter to bdrv_open, bdrv_file_open and associated
functions to allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
d5124c00d8 bdrv: Use "Error" for creating images
Add an Error ** parameter to BlockDriver.bdrv_create to allow more
specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
015a1036a7 bdrv: Use "Error" for opening images
Add an Error ** parameter to BlockDriver.bdrv_open and
BlockDriver.bdrv_file_open to allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
a89d89d3e6 snapshot: distinguish id and name in snapshot delete
Snapshot creation actually already distinguish id and name since it take
a structured parameter *sn, but delete can't. Later an accurate delete
is needed in qmp_transaction abort and blockdev-snapshot-delete-sync,
so change its prototype. Also *errp is added to tip error, but return
value is kepted to let caller check what kind of error happens. Existing
caller for it are savevm, delvm and qemu-img, they are not impacted by
introducing a new function bdrv_snapshot_delete_by_id_or_name(), which
check the return value and do the operation again.

Before this patch:
  For qcow2, it search id first then name to find the one to delete.
  For rbd, it search name.
  For sheepdog, it does nothing.

After this patch:
  For qcow2, logic is the same by call it twice in caller.
  For rbd, it always fails in delete with id, but still search for name
in second try, no change to user.

Some code for *errp is based on Pavel's patch.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
2ea1dd758c snapshot: new function bdrv_snapshot_find_by_id_and_name()
To make it clear about id and name in searching, add this API
to distinguish them. Caller can choose to search by id or name,
*errp will be set only for exception.

Some code are modified based on Pavel's patch.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Max Reitz
9296b3ed70 qcow2: Implement bdrv_amend_options
Implement bdrv_amend_options for compat, size, backing_file, backing_fmt
and lazy_refcounts.

Downgrading images from compat=1.1 to compat=0.10 is achieved through
handling all incompatible flags accordingly, clearing all compatible and
autoclear flags and expanding all zero clusters.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz
b6481f376b qcow2: Save refcount order in BDRVQcowState
Save the image refcount order in BDRVQcowState. This will be relevant
for future code supporting different refcount orders than four and also
for code that needs to verify a certain refcount order for an opened
image.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz
32b6444d23 qcow2-cluster: Expand zero clusters
Add functionality for expanding zero clusters. This is necessary for
downgrading the image version to one without zero cluster support.

For non-backed images, this function may also just discard zero clusters
instead of truly expanding them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz
e7108feaac qcow2-cache: Empty cache
Add a function for emptying a cache, i.e., flushing it and marking all
elements invalid.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Tal Kain
56e023af80 raw-win32.c: Fix incorrect handling behaviour of small block files
It is a valid case that the read data's size is smaller than the
requested size since there could be files that are smaller than
the minimum block size (For ex. when a VMDK disk descriptor file)

Signed-off-by: Tal Kain <tal.kain@ravellosystems.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Kevin Wolf
1ebf561c11 qcow2: Discard VM state in active L1 after creating snapshot
During savevm, the VM state is written to the active L1 of the image and
then a snapshot is taken. After that, the VM state isn't needed any more
in the active L1 and should be discarded. This is implemented by this
patch.

The impact of not discarding the VM state is that a snapshot can never
become smaller than any previous snapshot (because it would be padded
with old VM state), and more importantly that future savevm operations
cause unnecessary COWs (with associated flushes), which makes subsequent
snapshots much slower.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:46 +02:00
Kevin Wolf
670df5e3b4 qcow2: Pass discard type to qcow2_discard_clusters()
The function will be used internally instead of only being called for
guest discard requests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:46 +02:00
Peter Lieven
54a5c1d5db iscsi: add .bdrv_get_block_status
this patch adds a coroutine for .bdrv_co_block_status as well as
a generic framework that can be used to build coroutines in block/iscsi.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Peter Lieven
f18a7cbb09 iscsi: add logical block provisioning information to iscsilun
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Paolo Bonzini
5accc8408f scsi: prefer UUID to VM name for the initiator name
The UUID is unique even across multiple hosts, thus it is
better than a VM name even if it is less user-friendly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Paolo Bonzini
f5f7abcfd5 raw-posix: report unwritten extents as zero
These are created for example with XFS_IOC_ZERO_RANGE.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
63390a8d14 raw-posix: return get_block_status data and flags
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
4bc74be997 block: return get_block_status data and flags for formats
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
b6b8a33354 block: introduce bdrv_get_block_status API
For now, bdrv_get_block_status is just another name for bdrv_is_allocated.
The next patches will add more flags.

This also touches all block drivers with a mostly mechanical rename.  The
sole exception is cow; because it calls cow_co_is_allocated from the read
code, we keep that function and make cow_co_get_block_status a wrapper.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
d663640c04 block: expect errors from bdrv_co_is_allocated
Some bdrv_is_allocated callers do not expect errors, but the fallback
in qcow2.c might make other callers trip on assertion failures or
infinite loops.

Fix the callers to always look for errors.

Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
4f5786376e block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above distinction
Now that bdrv_is_allocated detects coroutine context, the two can
use the same code.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
bdad13b9de block: make bdrv_co_is_allocated static
bdrv_is_allocated can detect coroutine context and go through a fast
path, similar to other block layer functions.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Paolo Bonzini
e641c1e81e cow: do not call bdrv_co_is_allocated
As we change bdrv_is_allocated to gather more information from bs and
bs->file, it will become a bit slower.  It is still appropriate for online
jobs, but not for reads/writes.  Call the internal function instead.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Paolo Bonzini
26ae980492 cow: make writes go at a less indecent speed
Only sync once per write, rather than once per sector.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Paolo Bonzini
276cbc7f2f cow: make reads go at a decent speed
Do not do two reads for each sector; load each sector of the bitmap
and use bitmap operations to process it.

Writes are still dog slow!

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
4f6fd3491c block: make bdrv_delete() static
Manage BlockDriverState lifecycle with refcnt, so bdrv_delete() is no
longer public and should be called by bdrv_unref() if refcnt is
decreased to 0.

This is an identical change because effectively, there's no multiple
reference of BDS now: no caller of bdrv_ref() yet, only bdrv_new() sets
bs->refcnt to 1, so all bdrv_unref() now actually delete the BDS.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
13c91cb7e2 iscsi: use bdrv_new() instead of stack structure
BlockDriverState structure needs bdrv_new() to initialize refcnt, don't
allocate a local structure variable and memset to 0, becasue with coming
refcnt implementation, bdrv_unref will crash if bs->refcnt not
initialized to 1.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
3d34c6cd99 vvfat: use bdrv_new() to allocate BlockDriverState
we need bdrv_new() to properly initialize BDS, don't allocate memory
manually.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Stefan Weil
68dc036488 w32: Fix access to host devices (regression)
QEMU failed to open host devices like \\.\PhysicalDrive0 (first hard disk)
since some time (commit 8a79380b8ef1b02d2abd705dd026a18863b09020?).

Those devices use hdev_open which did not use the latest API for options.
This resulted in a fatal runtime error:

  Block protocol 'host_device' doesn't support the option 'filename'

Duplicate code from raw_open to fix this.

Cc: qemu-stable@nongnu.org
Reported-by: David Brenner <david.brenner3@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Benoît Canet
2024c1df43 block: Add iops_size to do the iops accounting for a given io size.
This feature can be used in case where users are avoiding the iops limit by
doing jumbo I/Os hammering the storage backend.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Benoît Canet
3e9fab690d block: Add support for throttling burst max in QMP and the command line.
The max parameter of the leaky bucket throttling algorithm can be used to
allow the guest to do bursts.
The max value is a pool of I/O that the guest can use without being throttled
at all. Throttling is triggered once this pool is empty.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Benoît Canet
cc0681c454 block: Enable the new throttling code in the block layer.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Anthony Liguori
bb7d4d82b6 Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Max Reitz (11) and others
# Via Kevin Wolf
* kwolf/for-anthony: (26 commits)
  qemu-iotests: Overlapping cluster allocations
  qcow2_check: Mark image consistent
  qcow2-refcount: Repair shared refcount blocks
  qcow2-refcount: Repair OFLAG_COPIED errors
  qcow2-refcount: Move OFLAG_COPIED checks
  qcow2: Employ metadata overlap checks
  qcow2: Metadata overlap checks
  qcow2: Add corrupt bit
  qemu-iotests: Snapshotting zero clusters
  qcow2-refcount: Snapshot update for zero clusters
  option: Add assigned flag to QEMUOptionParameter
  gluster: Abort on AIO completion failure
  block: Remove old raw driver
  switch raw block driver from "raw.o" to "raw_bsd.o"
  raw_bsd: register bdrv_raw
  raw_bsd: add raw_create_options
  raw_bsd: introduce "special members"
  raw_bsd: add raw_create()
  raw_bsd: emit debug events in bdrv_co_readv() and bdrv_co_writev()
  add skeleton for BSD licensed "raw" BlockDriver
  ...

Message-id: 1378111792-20436-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-03 12:32:46 -05:00
Max Reitz
24530f3e06 qcow2_check: Mark image consistent
If no corruptions remain after an image repair (and no errors have been
encountered), clear the corrupt flag in qcow2_check.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-02 10:15:15 +02:00
Max Reitz
afa50193cd qcow2-refcount: Repair shared refcount blocks
If the refcount of a refcount block is greater than one, we can at least
try to repair that problem by duplicating the affected block.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-02 10:06:59 +02:00
Stefan Hajnoczi
5b21a2ae4d curl: qemu_bh_new() can never return NULL
Drop error code path which cannot be taken since qemu_bh_new() does not
return NULL.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:11:56 +04:00
Stefan Weil
4c293dc6e4 misc: Fix some typos in names and comments
Most typos were found using a modified version of codespell:

accross -> across
issueing -> issuing
TICNT_THRESHHOLD -> TICNT_THRESHOLD
bandwith -> bandwidth
VCARD_7816_PROPIETARY -> VCARD_7816_PROPRIETARY
occured -> occurred
gaurantee -> guarantee
sofware -> software

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 18:59:24 +04:00
Max Reitz
e23e400ec6 qcow2-refcount: Repair OFLAG_COPIED errors
Since the OFLAG_COPIED checks are now executed after the refcounts have
been repaired (if repairing), it is safe to assume that they are correct
but the OFLAG_COPIED flag may be not. Therefore, if its value differs
from what it should be (considering the according refcount), that
discrepancy can be repaired by correctly setting (or clearing that flag.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:44 +02:00
Max Reitz
4f6ed88c03 qcow2-refcount: Move OFLAG_COPIED checks
Move the OFLAG_COPIED checks out of check_refcounts_l1 and
check_refcounts_l2 and after the actual refcount checks/fixes (since the
refcounts might actually change there).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:44 +02:00
Max Reitz
cf93980e77 qcow2: Employ metadata overlap checks
The pre-write overlap check function is now called before most of the
qcow2 writes (aborting it on collision or other error).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Max Reitz
a40f1c2add qcow2: Metadata overlap checks
Two new functions are added; the first one checks a given range in the
image file for overlaps with metadata (main header, L1 tables, L2
tables, refcount table and blocks).

The second one should be used immediately before writing to the image
file as it calls the first function and, upon collision, marks the
image as corrupt and makes the BDS unusable, thereby preventing
further access.

Both functions take a bitmask argument specifying the structures which
should be checked for overlaps, making it possible to also check
metadata writes against colliding with other structures.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Max Reitz
69c9872653 qcow2: Add corrupt bit
This adds an incompatible bit indicating corruption to qcow2. Any image
with this bit set may not be written to unless for repairing (and
subsequently clearing the bit if the repair has been successful).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Max Reitz
8b81a7b6ba qcow2-refcount: Snapshot update for zero clusters
Account for all cluster types in qcow2_update_snapshot_refcounts;
this prevents this function from updating the refcount of unallocated
zero clusters which effectively led to wrong adjustments of the refcount
of cluster 0 (the main qcow2 header). This in turn resulted in images
with (unallocated) zero clusters having a cluster 0 refcount greater
than one after creating a snapshot.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Bharata B Rao
9faa574f7d gluster: Abort on AIO completion failure
Currently if gluster AIO callback thread fails to notify the QEMU thread about
AIO completion, we try graceful recovery by marking the disk drive as
inaccessible. This error recovery code is race-prone as found by Asias and
Stefan. However as found out by Paolo, this kind of error is impossible and
hence simplify the code that handles this error recovery.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Kevin Wolf
e5b1d99f55 block: Remove old raw driver
This is unused code now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
7a6d3fc594 switch raw block driver from "raw.o" to "raw_bsd.o"
"Incoming" function prototypes and "outgoing" function calls must match
reality. Implemented using the "struct BlockDriver" definition in
"include/block/block_int.h", and gcc errors & warnings.

v1->v2:

On 08/20/13 09:51, Kevin Wolf wrote:
> Am 18.08.2013 um 16:29 hat Paolo Bonzini geschrieben:
>> Il 16/08/2013 16:15, Laszlo Ersek ha scritto:
>>> +static int raw_reopen_prepare(BDRVReopenState *reopen_state,
>>> +                              BlockReopenQueue *queue, Error **errp)
>>>  {
>>> -    return bdrv_reopen_prepare(bs->file);
>>> +    BDRVReopenState tmp = *reopen_state;
>>> +
>>> +    tmp.bs = tmp.bs->file;
>>> +    return bdrv_reopen_prepare(&tmp, queue, errp);
>>>  }
>>
>> This should just return zero, my fault.
>
> Which is because bdrv_reopen_queue() already queues bs->file for reopen.
> The simple return 0; implementation is shared by all other format drivers
> that support reopening images.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
775d6afd5c raw_bsd: register bdrv_raw
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 5) Formats are registered with bdrv_register (takes a BlockDriver*). You
> also need to pass the caller of bdrv_register to block_init.

Fill in the BlockDriver structure with the raw_*() functions that have
been added to "block/raw_bsd.c", in the order the fields are defined in
"include/block/block_int.h".

I needed more explanation / naming examples for registering the driver
than what Paolo gave me, so I copied / adapted from "block/qcow2.c". The
parts I took as basis for modification are blamed on

    commit 5efa9d5a8b
    Author: Anthony Liguori <aliguori@us.ibm.com>
    Date:   Sat May 9 17:03:42 2009 -0500

        Convert block infrastructure to use new module init functionality

    commit 20d97356c9
    Author: Blue Swirl <blauwirbel@gmail.com>
    Date:   Fri Apr 23 20:19:47 2010 +0000

        Fix OpenBSD build

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
ff369a483d raw_bsd: add raw_create_options
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 4) There is another member, .create_options, which is an array of
> QEMUOptionParameter structs, terminated by an all-zero item.  The only
> option you need is for the virtual disk size.  You will find something
> to copy from in other block drivers, for example block/qcow2.c.

Code taken and adapted from "block/qcow2.c", as suggested. The code being
copied/modified is blamed on

    commit 20d97356c9
    Author: Blue Swirl <blauwirbel@gmail.com>
    Date:   Fri Apr 23 20:19:47 2010 +0000

        Fix OpenBSD build

and

    commit 7c80ab3f21
    Author: Jes Sorensen <Jes.Sorensen@redhat.com>
    Date:   Fri Dec 17 16:02:39 2010 +0100

        block/qcow2.c: rename qcow_ functions to qcow2_

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00