Commit Graph

1043 Commits

Author SHA1 Message Date
Paolo Bonzini
1288844e7c iscsi: reorganize iscsi_readcapacity_sync
Avoid the goto, and use the same retry logic for the 10- and 16-
byte versions.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-18 12:43:03 +02:00
Paolo Bonzini
f0d2a4d4d6 iscsi: simplify freeing of tasks
Always free them in the iscsi_aio_*_acb functions and remove the
checks in their callers.  Remove ifs when the task struct was
previously dereferenced (spotted by Coverity).

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-18 12:43:03 +02:00
Ján Tomko
2330790879 nbd: strip braces from literal IPv6 address in URI
Otherwise they would get passed to getaddrinfo and fail with:
address resolution failed for [::1]🔢 Name or service not known

(Broken by commit v1.4.0-736-gf17c90b)

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-18 11:43:00 +02:00
Anthony Liguori
21a885a7e2 Merge remote-tracking branch 'luiz/queue/qmp' into staging
# By Luiz Capitulino
# Via Luiz Capitulino
* luiz/queue/qmp:
  qerror: drop QERR_OPEN_FILE_FAILED macro
  block: bdrv_reopen_prepare(): don't use QERR_OPEN_FILE_FAILED
  savevm: qmp_xen_save_devices_state(): use error_setg_file_open()
  dump: qmp_dump_guest_memory(): use error_setg_file_open()
  cpus: use error_setg_file_open()
  blockdev: use error_setg_file_open()
  block: mirror_complete(): use error_setg_file_open()
  rng-random: use error_setg_file_open()
  error: add error_setg_file_open() helper

Message-id: 1371484631-29510-1-git-send-email-lcapitulino@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-06-17 13:14:46 -05:00
Evgeny Budilovsky
0bed087df2 vmdk: Allow reading variable size descriptor files
the hard-coded 2k buffer on the stack won't allow reading big descriptor
files which can be generated when storing big images. For example 500G
vmdk splitted to 2G chunks.

Signed-off-by: Evgeny Budilovsky <evgeny.budilovsky@ravellosystems.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:59 +02:00
Richard W.M. Jones
8da1aa15db curl: Don't set curl options on the handle just before it's going to be deleted.
(Found by Kamil Dudka)

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:59 +02:00
Stefan Hajnoczi
5a394b9e96 vmdk: byteswap VMDK4Header.desc_offset field
Remember to byteswap VMDK4Header.desc_offset on big-endian machines.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:59 +02:00
Richard W.M. Jones
a7cea2ba47 block/curl.c: Refuse to open the handle for writes.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:59 +02:00
Liu Yuan
cede621ffc sheepdog: support 'qemu-img snapshot -a'
Just call sd_create_branch() in the snapshot_goto to rollback the image is good
enough. With this patch, 'loadvm' process for sheepdog is modified:

Suppose we have a snapshot chain A --> B --> C, we do 'loadvm A' so as to get
a new chain,

A --> B
|
V
C1

in the old code:

1 reload inode of A (in snapshot_goto)
2 read vmstate via A's vdi_id (loadvm_state)
3 delete C and create C1, reload inode of C1 (sd_create_branch on write)

with this patch applied:

1 reload inode of A, delete C and create C1  (in snapshot_goto)
2 read vmstate via C1's parent, that is A's vdi_id (loadvm_state)

This will fix the possible bug that QEMU exit between 2 and 3 in the old code

Cc: qemu-devel@nongnu.org
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:58 +02:00
Liu Yuan
b579ffb3fd sheepdog: fix snapshot tag initialization
This is an old and obvious bug. We should pass snapshot_id to the
tag. Or simple command like 'qemu-img snapshot -a tag sheepdog:image' will fail

Cc: qemu-devel@nongnu.org
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:58 +02:00
Luiz Capitulino
dacc26aae5 block: mirror_complete(): use error_setg_file_open()
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 11:01:14 -04:00
Richard W.M. Jones
9e5e2b23d3 curl: Whitespace only changes.
Trivial patch to remove odd whitespace.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-06-11 23:45:43 +04:00
Wenchao Xia
553a7e8718 qmp: add ImageInfo in BlockDeviceInfo used by query-block
Now image info will be retrieved as an embbed json object inside
BlockDeviceInfo, backing chain info and all related internal snapshot
info can be got in the enhanced recursive structure of ImageInfo. New
recursive member *backing-image is added to reflect the backing chain
status.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-07 13:37:45 +02:00
Wenchao Xia
43526ec8d1 block: add image info query function bdrv_query_image_info()
This patch adds function bdrv_query_image_info(), which will
retrieve image info in qmp object format. The implementation is
based on the code moved from qemu-img.c, but uses block layer
function to get snapshot info.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-07 13:37:45 +02:00
Wenchao Xia
fb0ed4539c block: add snapshot info query function bdrv_query_snapshot_info_list()
This patch adds function bdrv_query_snapshot_info_list(), which will
retrieve snapshot info of an image in qmp object format. The implementation
is based on the code moved from qemu-img.c with modification to fit more
for qmp based block layer API.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-07 13:37:45 +02:00
Kevin Wolf
bf736fe34c blkdebug: Add BLKDBG_FLUSH_TO_OS/DISK events
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-06 11:27:22 +02:00
Wenchao Xia
5b91704469 block: dump snapshot and image info to specified output
bdrv_snapshot_dump() and bdrv_image_info_dump() do not dump to a buffer now,
some internal buffers are still used for format control, which have no
chance to be truncated. As a result, these two functions have no more issue
of truncation, and they can be used by both qemu and qemu-img with correct
parameter specified.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-04 13:56:30 +02:00
Wenchao Xia
f364ec65b5 block: move qmp and info dump related code to block/qapi.c
This patch is a pure code move patch, except following modification:
1 get_human_readable_size() is changed to static function.
2 dump_human_image_info() is renamed to bdrv_image_info_dump().
3 in qmp_query_block() and qmp_query_blockstats, use bdrv_next(bs)
instead of direct traverse of global array 'bdrv_states'.
4 collect_snapshots() and collect_image_info() are renamed, unused parameter
*fmt in collect_image_info() is removed.
5 code style fix.

To avoid conflict and tip better, macro in header file is BLOCK_QAPI_H
instead of QAPI_H. Now block.h and snapshot.h are at the same level in
include path, block_int.h and qapi.h will both include them.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-04 13:56:30 +02:00
Wenchao Xia
de08c606f9 block: move snapshot code in block.c to block/snapshot.c
All snapshot related code, except bdrv_snapshot_dump() and
bdrv_is_snapshot(), is moved to block/snapshot.c. bdrv_snapshot_dump()
will be moved to another file later. bdrv_is_snapshot() is not related
with internal snapshot. It also fixes small code style errors reported
by check script.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-04 13:56:30 +02:00
Qiao Nuohan
ce3a4718fe Remove twice include of qemu-common.h
This patch is used to remove twice include of "qemu-common.h" in
block/win32-aio.c

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-05-18 16:35:11 +04:00
Kevin Wolf
2cf7cfa1cd qcow2: Catch some L1 table index overflows
This catches the situation that is described in the bug report at
https://bugs.launchpad.net/qemu/+bug/865518 and goes like this:

    $ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    $ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512"
    Segmentation fault

With this patch applied the segfault will be avoided, however the case
will still fail, though gracefully:

    $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    qemu-img: The image size is too large for file format 'qcow2'

Note that even long before these overflow checks kick in, you get
insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for
the L1 table), so with somewhat smaller image sizes you'll probably see
qemu aborting for a failed g_malloc().

If you need huge image sizes, you should increase the cluster size to
the maximum of 2 MB in order to get higher limits.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-14 16:44:33 +02:00
Dong Xu Wang
c7e775e4dd remove double semicolons
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-05-12 13:25:55 +04:00
Fam Zheng
cdeaf1f159 vmdk: add bdrv_co_write_zeroes
Use special offset to write zeroes efficiently, when zeroed-grain GTE is
available. If zero-write an allocated cluster, cluster is leaked because
its offset pointer is overwritten by "0x1".

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:49 +02:00
Fam Zheng
e304e8e5a0 vmdk: store fields of VmdkMetaData in cpu endian
Previously VmdkMetaData.offset is stored little endian while other
fields are cpu endian. This changes offset to cpu endian and convert
before writing to image.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:46 +02:00
Fam Zheng
95b0aa4231 vmdk: change magic number to macro
Two hard coded flag bits are changed to macros.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:43 +02:00
Fam Zheng
69e0b6dfa4 vmdk: Add option to create zeroed-grain image
Add image create option "zeroed-grain" to enable zeroed-grain GTE
feature of vmdk sparse extents. When this option is on, header version
of newly created extent will be 2 and VMDK4_FLAG_ZERO_GRAIN flag bit
will be set.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:41 +02:00
Fam Zheng
14ead646fe vmdk: add support for “zeroed‐grain” GTE
Introduced support for zeroed-grain GTE, as specified in Virtual Disk
Format 5.0[1].

    Recent VMware hosted platform products support a new “zeroed‐grain”
    grain table entry (GTE). The zeroed‐grain GTE returns all zeros on
    read.  In other words, the zeroed‐grain GTE indicates that a grain
    in the child disk is zero‐filled but does not actually occupy space
    in storage.  A sparse extent with zeroed‐grain GTE has the following
    in its header:

     * SparseExtentHeader.version = 2
     * SparseExtentHeader.flags has bit 2 set

    Other than the new flag and the possibly zeroed‐grain GTE, version 2
    sparse extents are identical to version 1.  Also, a zeroed‐grain GTE
    has value 0x1 in the GT table.

[1] Virtual Disk Format 5.0, http://www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf?src=vmdk
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:38 +02:00
Fam Zheng
65f7472577 vmdk: named return code.
Internal routines in vmdk.c previously return -1 on error and 0 on
success. More return values are useful for future changes such as
zeroed-grain GTE. Change all the magic `return 0` and `return -1` to
macro names:

 * VMDK_OK      0
 * VMDK_ERROR   (-1)
 * VMDK_UNALLOC (-2)
 * VMDK_ZEROED  (-3)

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:35 +02:00
Jeff Cody
059e2fbbca block: add read-only support to VHDX image format.
This adds in read-only support to the VHDX image format.  This supports
reads for fixed-size, and dynamic sized VHDX images.

Differencing files are still unsupported.

The image must be opened without BDRV_O_RDWR set, because we do not
yet update the headers.  I.e., pass 'readonly=on' in the drive image
options from the QEMU commandline.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:31:58 +02:00
Jeff Cody
e8d4e5ffdb block: initial VHDX driver support framework - supports open and probe
This is the initial block driver framework for VHDX image support
(i.e. Hyper-V image file formats), that supports opening VHDX files, and
parsing the headers.

This commit does not yet enable:
    - reading
    - writing
    - updating the header
    - differencing files (images with parents)
    - log replay / dirty logs (only clean images)

This is based on Microsoft's VHDX specification:
    "VHDX Format Specification v0.95", published 4/12/2012
    https://www.microsoft.com/en-us/download/details.aspx?id=29681

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:31:58 +02:00
Jeff Cody
203cdba3bc block: vhdx header for the QEMU support of VHDX images
This is based on Microsoft's VHDX specification:
    "VHDX Format Specification v0.95", published 4/12/2012
    https://www.microsoft.com/en-us/download/details.aspx?id=29681

These structures define the various header, metadata, and other
block structures defined in the VHDX specification.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:31:58 +02:00
Liu Yuan
859e5553a4 sheepdog: fix loadvm operation
Currently the 'loadvm' opertaion works as following:
1. switch to the snapshot
2. mark current working VDI as a snapshot
3. rely on sd_create_branch to create a new working VDI based on the snapshot

This works not the same as other format as QCOW2. For e.g,

qemu > savevm # get a live snapshot snap1
qemu > savevm # snap2
qemu > loadvm 1 # This will steally create snap3 of the working VDI

Which will result in following snapshot chain:

base <-- snap1 <-- snap2 <-- snap3
          ^
          |
      working VDI

snap3 was unnecessarily created and might be annoying users.

This patch discard the unnecessary 'snap3' creation. and implement
rollback(loadvm) operation to the specified snapshot by
1. switch to the snapshot
2. delete working VDI
3. rely on sd_create_branch to create a new working VDI based on the snapshot

The snapshot chain for above example will be:

base <-- snap1 <-- snap2
          ^
          |
      working VDI

Cc: qemu-devel@nongnu.org
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:37:51 +02:00
MORITA Kazutaka
13c31de2fd sheepdog: resend write requests when SD_RES_READONLY is received
When a snapshot is taken from out side of qemu (e.g. qemu-img
snapshot), write requests to the current vdi return SD_RES_READONLY.
In this case, the sheepdog block driver needs to update the current
inode to the latest one and resend the write requests.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
MORITA Kazutaka
9ff53a0eb8 sheepdog: add helper function to reload inode
This adds a helper function to update the current inode state with the
specified vdi object.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
MORITA Kazutaka
6a0b549033 sheepdog: add SD_RES_READONLY result code
Sheepdog returns SD_RES_READONLY when qemu sends write requests to the
snapshot vdi.  This adds the result code and makes sd_strerror() print
its error reason.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
MORITA Kazutaka
982dcbf4cb sheepdog: cleanup find_vdi_name
This makes 'filename' and 'tag' constant variables, and renames
'for_snapshot' to 'lock' to clear how it works.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
Kevin Wolf
c3ca988d2b rbd: Fix use after free in rbd_open()
Commit a9ccedc3 frees the QemuOpts for the driver-specific options
immediately, even though it still needs the filename string that is
contained there. This doesn't work. Move the deletion of the QemuOpts to
the end of the function where its content isn't needed any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
Liu Yuan
8d71c63137 sheepdog: implement .bdrv_co_is_allocated()
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
Liu Yuan
e8bfaa2fae sheepdog: use BDRV_SECTOR_SIZE
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
Liu Yuan
cac8f4a60f sheepdog: add discard/trim support for sheepdog
The 'TRIM' command from VM that is to release underlying data storage for
better thin-provision is already supported by the Sheepdog.

This patch adds the TRIM support at QEMU part.

For older Sheepdog that doesn't support it, we return 0(success) to upper layer.

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:27 +02:00
Anthony Liguori
f1ab7a5acf Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Kevin Wolf (16) and Stefan Hajnoczi (4)
# Via Kevin Wolf
* kwolf/for-anthony:
  qemu-iotests: add 053 unaligned compressed image size test
  block: Allow overriding backing.file.filename
  block: Remove filename parameter from .bdrv_file_open()
  vvfat: Use bdrv_open options instead of filename
  sheepdog: Use bdrv_open options instead of filename
  rbd: Use bdrv_open options instead of filename
  iscsi: Use bdrv_open options instead of filename
  gluster: Use bdrv_open options instead of filename
  curl: Use bdrv_open options instead of filename
  blkverify: Use bdrv_open options instead of filename
  blkdebug: Use bdrv_open options instead of filename
  raw-win32: Use bdrv_open options instead of filename
  raw-posix: Use bdrv_open options instead of filename
  block: Enable filename option
  block: Add driver-specific options for backing files
  block: Fail gracefully when using a format driver on protocol level
  qemu-iotests: Fix _filter_qemu
  qemu-img: do not zero-pad the compressed write buffer
  qcow: allow sub-cluster compressed write to last cluster
  qcow2: allow sub-cluster compressed write to last cluster

Message-id: 1366630294-18984-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-04-22 08:08:22 -05:00
Anthony Liguori
25690739f1 Merge remote-tracking branch 'bonzini/nbd-next' into staging
# By Stefan Hajnoczi
# Via Paolo Bonzini
* bonzini/nbd-next:
  nbd: set TCP_NODELAY
  nbd: use TCP_CORK in nbd_co_send_request()
  nbd: unlock mutex in nbd_co_send_request() error path

Message-id: 1366381830-11267-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-04-22 08:05:14 -05:00
Kevin Wolf
56d1b4d21d block: Remove filename parameter from .bdrv_file_open()
It is unused now in all block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 11:34:35 +02:00
Kevin Wolf
7ad9be64e8 vvfat: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
c8c96350e0 sheepdog: Use bdrv_open options instead of filename
This is only to convert the internal interface that is used for passing
the "filename" to be parsed, but converting to actual fine grained
options is left for another day, as it doesn't look trivial.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
a9ccedc3da rbd: Use bdrv_open options instead of filename
This is only to convert the internal interface that is used for passing
the "filename" to be parsed, but converting to actual fine grained
options is left for another day, as it doesn't look trivial.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
60beb3412d iscsi: Use bdrv_open options instead of filename
This is only to convert the internal interface that is used for passing
the "filename" to be parsed, but converting to actual fine grained
options is left for another day, as it doesn't look trivial.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
b489477653 gluster: Use bdrv_open options instead of filename
This is only to convert the internal interface that is used for passing
the "filename" to be parsed, but converting to actual fine grained
options is left for another day, as it doesn't look trivial.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
8e6d58cd5b curl: Use bdrv_open options instead of filename
As a bonus, going through the QemuOpts QEMU_OPT_SIZE parser for the
readahead option gives us proper error reporting that the previous use
of atoi() lacked.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
16c790926b blkverify: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
f468121290 blkdebug: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
8a79380b8e raw-win32: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
c66a615723 raw-posix: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
31ca6d077c block: Add driver-specific options for backing files
Options starting in "backing." are passed to the backing file now. If
you don't need to specify the filename for the backing file, you can add
it on the command line instead of in the image file:

$ qemu-nbd -t /tmp/test.img
$ qemu-img create -f qcow2 empty.qcow2 1G
$ qemu-system-x86_64 -drive file=empty.qcow2,backing.file.driver=nbd,\
    backing.file.host=localhost

Note that this doesn't override the backing filename from the image. If
the image has one, this will fail because NBD doesn't want the options
and a filename at the same time.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Stefan Hajnoczi
16b3c5cd9f qcow: allow sub-cluster compressed write to last cluster
Compression in qcow requires image length to be a multiple of the
cluster size.  Lift this requirement by zero-padding the final cluster
when necessary.  The virtual disk size is still not cluster-aligned, so
the guest cannot access the zero sectors.

Note that this is almost identical to the qcow2 version of this code.
qcow2's compression code is drawn from qcow.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-22 10:27:58 +02:00
Stefan Hajnoczi
f4d38bef7c qcow2: allow sub-cluster compressed write to last cluster
Compression in qcow2 requires image length to be a multiple of the
cluster size.  Lift this requirement by zero-padding the final cluster
when necessary.  The virtual disk size is still not cluster-aligned, so
the guest cannot access the zero sectors.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-22 10:27:58 +02:00
Richard W.M. Jones
c7a101f529 ssh: Remove unnecessary use of strlen function.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-19 11:45:38 +02:00
Stefan Weil
6ae7d660a0 block/ssh: Add missing gcc format attributes
Now gcc will check whether format string and variable arguments match.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-19 11:44:16 +02:00
Stefan Hajnoczi
97ebbab0e3 nbd: set TCP_NODELAY
Disable the Nagle algorithm to reduce latency.  Note this means we must
also use TCP_CORK when sending header followed by payload to avoid
fragmenting lots of little packets.  The previous patch took care of
that.

Suggested-by: Nick Thomas <nick@bytemark.co.uk>
Tested-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-15 16:35:17 +02:00
Stefan Hajnoczi
0fcece25c0 nbd: use TCP_CORK in nbd_co_send_request()
Use TCP_CORK to defer packet transmission until both the header and the
payload have been written.

Suggested-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-15 16:30:40 +02:00
Stefan Hajnoczi
6760c47aa4 nbd: unlock mutex in nbd_co_send_request() error path
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-15 16:29:46 +02:00
Josh Durgin
dc7588c1eb rbd: add an asynchronous flush
The existing bdrv_co_flush_to_disk implementation uses rbd_flush(),
which is sychronous and causes the main qemu thread to block until it
is complete. This results in unresponsiveness and extra latency for
the guest.

Fix this by using an asynchronous version of flush.  This was added to
librbd with a special #define to indicate its presence, since it will
be backported to stable versions. Thus, there is no need to check the
version of librbd.

Implement this as bdrv_aio_flush, since it matches other aio functions
in the rbd block driver, and leave out bdrv_co_flush_to_disk when the
asynchronous version is available.

Reported-by: Oliver Francke <oliver@filoo.de>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 10:18:05 +02:00
Richard W.M. Jones
9a2d462e7b block: ssh: Use libssh2_sftp_fsync (if supported by libssh2) to flush to disk.
libssh2_sftp_fsync is an extension to libssh2 to support fsync(2) over
sftp, which is itself an extension of OpenSSH.

If both libssh2 and the ssh daemon support it, this will allow
bdrv_flush_to_disk to commit changes through to disk on the remote
server.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 10:18:05 +02:00
Richard W.M. Jones
0a12ec87a5 block: Add support for Secure Shell (ssh) block device.
qemu-system-x86_64 -drive file=ssh://hostname/some/image

QEMU will ssh into 'hostname' and open '/some/image' which is made
available as a standard block device.

You can specify a username (ssh://user@host/...) and/or a port number
(ssh://host:port/...).  You can also use an alternate syntax using
properties (file.user, file.host, file.port, file.path).

Current limitations:

- Authentication must be done without passwords or passphrases, using
  ssh-agent.  Other authentication methods are not supported.

- Uses a single connection, instead of concurrent AIO with multiple
  SSH connections.

This is implemented using libssh2 on the client side.  The server just
requires a regular ssh daemon with sftp-server support.  Most ssh
daemons on Unix/Linux systems will work out of the box.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 10:18:05 +02:00
Kevin Wolf
8d3b1a2d0b block: Introduce bdrv_pwritev() for qcow2_save_vmstate
Directly pass the QEMUIOVector on instead of linearising it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Kevin Wolf
cf8074b382 block: Introduce bdrv_writev_vmstate
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Aurelien Jarno
753d9b82c5 aes: move aes.h from include/block to include/qemu
Move aes.h from include/block to include/qemu to show it can be reused
by other subsystems.

Cc: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-13 13:51:57 +02:00
Paolo Bonzini
0d09e41a51 hw: move headers to include/
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-08 18:13:10 +02:00
Kevin Wolf
c2b6ff51e4 qcow2: Fix L1 write error handling in qcow2_update_snapshot_refcount
It ignored the error code, and at least the 'goto fail' is obvious
nonsense as it creates an endless loop (if the next attempt doesn't
magically succeed) and leaves the in-memory L1 table in big-endian
instead of converting it back.

In error cases, there's no point in writing an updated L1 table, so
skip this part for them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-05 18:58:05 +02:00
Kevin Wolf
c2bc78b6a9 qcow2: Return real error in qcow2_update_snapshot_refcount
This fixes the error message triggered by the following script:

    cat > /tmp/blkdebug.cfg <<EOF
    [inject-error]
    event = "cluster_free"
    errno = "28"
    immediately = "off"
    EOF

    $qemu_img create -f qcow2 test.qcow2 10G
    $qemu_img snapshot -c snap test.qcow2
    $qemu_img snapshot -d snap blkdebug:/tmp/blkdebug.cfg:test.qcow2

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-05 18:58:05 +02:00
Stefan Hajnoczi
f9e8cacc55 oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets.
Rename to qemu_set_nonblock() just like qemu_set_cloexec().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-04-02 11:47:37 -04:00
Kevin Wolf
ecdd5333ab qcow2: Gather clusters in a looping loop
Instead of just checking once in exactly this order if there are
dependendies, non-COW clusters and new allocation, this starts looping
around these. This way we can, for example, gather non-COW clusters after
new allocations as long as the host cluster offsets stay contiguous.

Once handle_dependencies() is extended so that COW areas of in-flight
allocations can be overwritten, this allows to continue with gathering
other clusters (we wouldn't be able to do that without this change
because we would have missed a possible second dependency in one of the
next clusters).

This means that in the typical sequential write case, we can combine the
COW overwrite of one cluster with the allocation of the next cluster as
soon as something like Delayed COW gets actually implemented. It is only
by avoiding splitting requests this way that Delayed COW actually starts
improving performance noticably.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf
2c3b32d256 qcow2: Move cluster gathering to a non-looping loop
This patch is mainly to separate the indentation change from the
semantic changes. All that really changes here is that everything moves
into a while loop, all 'goto done' become 'break' and at the end of the
loop a new 'break is inserted.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf
88c6588c51 qcow2: Allow requests with multiple l2metas
Instead of expecting a single l2meta, have a list of them. This allows
to still have a single I/O request for the guest data, even though
multiple l2meta may be needed in order to describe both a COW overwrite
and a new cluster allocation (typical sequential write case).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf
710c2496d8 qcow2: Use byte granularity in qcow2_alloc_cluster_offset()
This gets rid of the nb_clusters and keep_clusters and the associated
complicated calculations. Just advance the number of bytes that have
been processed and everything is fine.

This patch advances the variables even after the last operation even
though they aren't used any more afterwards to make things look more
uniform. A later patch will turn the whole thing into a loop and then
it actually starts making sense.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf
411d62b04b qcow2: Prepare handle_alloc/copied() for byte granularity
This makes handle_alloc() and handle_copied() return byte-granularity
host offsets instead of returning always the cluster start. This is
required so that qcow2_alloc_cluster_offset() can stop aligning
everything to cluster boundaries.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
e62daaf679 qcow2: handle_copied(): Implement non-zero host_offset
Look only for clusters that start at a given physical offset.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
c53ede9f6d qcow2: handle_copied(): Get rid of keep_clusters parameter
Now *bytes is used to return the length of the area that can be written
to without performing an allocation or COW.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
acb0467f8d qcow2: handle_copied(): Get rid of nb_clusters parameter
handle_copied() uses its bytes parameter now to determine how many
clusters it should try to find.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
0af729ec00 qcow2: Factor out handle_copied()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
83baa9a471 qcow2: Clean up handle_alloc()
Things can be simplified a bit now. No semantic changes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
c37f4cd71d qcow2: Finalise interface of handle_alloc()
The interface works completely on a byte granularity now and duplicated
parameters are removed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
3b8e2e260c qcow2: handle_alloc(): Get rid of keep_clusters parameter
handle_alloc() is now called with the offset at which the actual new
allocation starts instead of the offset at which the whole write request
starts, part of which may already be processed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
f5bc635094 qcow2: handle_alloc(): Get rid of nb_clusters parameter
We already communicate the same information in *bytes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
10f0ed8b2f qcow2: Factor out handle_alloc()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
037689d896 qcow2: Decouple cluster allocation from cluster reuse code
This moves some code that prepares the allocation of new clusters to
where the actual allocation happens. This is the minimum required to be
able to move it to a separate function in the next patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
65eb2e35c0 qcow2: Change handle_dependency to byte granularity
This is a more precise description of what really constitutes a
dependency. The behaviour doesn't change at this point because the COW
area of the old request is still aligned to cluster boundaries and
therefore an overlap is detected wheneven the requests touch any part of
the same cluster.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
d9d74f4177 qcow2: Improve check for overlapping allocations
The old code detected an overlapping allocation even when the
allocations didn't actually overlap, but were only adjacent.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
17a71e5823 qcow2: Handle dependencies earlier
Handling overlapping allocations isn't just a detail of cluster
allocation. It is rather one of three ways to get the host cluster
offset for a write request:

1. If a request overlaps an in-flight allocations, the cluster offset
   can be taken from there (this is what handle_dependencies will evolve
   into) or the request must just wait until the allocation has
   completed. Accessing the L2 is not valid in this case, it has
   outdated information.

2. Outside overlapping areas, check the clusters that can be written to
   as they are, with no COW involved.

3. If a COW is required, allocate new clusters

Changing the code to reflect this doesn't change the behaviour because
overlaps cannot exist for clusters that are kept in step 2. It does
however make it easier for later patches to work on clusters that belong
to an allocation that is still in flight.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:42 +01:00
Kevin Wolf
9ee6439e27 qcow2: Remove bogus unlock of s->lock
The unlock wakes up the next coroutine, but the currently running
coroutine will lock it again before it yields, so this doesn't make a
lot of sense.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:42 +01:00
Kevin Wolf
c349ca4bb2 qcow2: Fix "total clusters" number in bdrv_check
This should be based on the virtual disk size, not on the size of the
image.

Interesting observation: With some VM state stored in the image file,
percentages higher than 100% are possible, even though snapshots
themselves are ignored. This is a qcow2 bug to be fixed another day: The
VM state should be discarded in the active L2 tables after completing
the snapshot creation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:42 +01:00
Stefan Weil
ea804cadf8 block: Add options QDict to bdrv_file_open() prototypes (fix MinGW build)
The new parameter is unused yet.

This part was missing in commit 787e4a8500.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-25 09:53:04 +01:00
Liu Yuan
d43731c758 rbd: fix compile error
Commit 787e4a85 [block: Add options QDict to bdrv_file_open() prototypes] didn't
update rbd.c accordingly.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-25 09:51:43 +01:00
Kevin Wolf
681e7ad024 nbd: Check against invalid option combinations
A file name may only specified if no host or socket path is specified.
The latter two may not appear at the same time either.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
bebbf7fa9c nbd: Use default port if only host is specified
The URL method already takes care to apply the default port when none is
specfied. Directly specifying driver-specific options required the port
number until now. Allow leaving it out and apply the default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
f5866fa438 block: Make find_image_format safe with NULL filename
In order to achieve this, the .bdrv_probe callbacks of all drivers must
cope with this. The DMG driver is the only one that bases its decision
on the filename and it needs to be changed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
6963a30d82 block: Introduce .bdrv_parse_filename callback
If a driver needs structured data and not just a string, it can provide
a .bdrv_parse_filename callback now that parses the command line string
into separate options. Keeping this separate from .bdrv_open_filename
ensures that the preferred way of directly specifying the options always
works as well if parsing the string works.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
f53a1febcd nbd: Accept -drive options for the network connection
The existing parsers for the file name now parse everything into the
bdrv_open() options QDict. Instead of using these parsers, you can now
directly specify the options on the command line, like this:

    qemu-system-x86_64 -drive file=nbd:,file.port=1234,file.host=::1

Clearly the file=... part could use further improvement, but it's a
start.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
f17c90bed1 nbd: Keep hostname and port separate
The NBD block supports an URL syntax, for which a URL parser returns
separate hostname and port fields. It also supports the traditional qemu
syntax encoded in a filename. Until now, after parsing the URL to get
each piece of information, a new string is built to be fed to socket
functions.

Instead of building a string in the URL case that is immediately parsed
again, parse the string in both cases and use the QemuOpts interface to
qemu-sockets.c.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:31 +01:00
Kevin Wolf
787e4a8500 block: Add options QDict to bdrv_file_open() prototypes
The new parameter is unused yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:31 +01:00
Kevin Wolf
acdfb480ba qcow2: Fix segfault in qcow2_invalidate_cache
Need to pass an options QDict to qcow2_open() now. This fixes a segfault
on the migration target with qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-19 11:48:36 +01:00
Liu Yuan
fca23f0ad2 sheepdog: show error message for halt status
Sheepdog (neither quorum nor unsafe mode) will refuse to serve IO requests when
number of alive nodes is less than that of copies specified by users. This will
return 0x19 to QEMU client which currently doesn't recognize it.

This patch adds an error description when QEMU client receives it, other than
plainly printing 'Invalid error code'

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-19 11:48:36 +01:00
Stefan Hajnoczi
c4d9d19645 threadpool: drop global thread pool
Now that each AioContext has a ThreadPool and the main loop AioContext
can be fetched with bdrv_get_aio_context(), we can eliminate the concept
of a global thread pool from thread-pool.c.

The submit functions must take a ThreadPool* argument.

block/raw-posix.c and block/raw-win32.c use
aio_get_thread_pool(bdrv_get_aio_context(bs)) to fetch the main loop's
ThreadPool.

tests/test-thread-pool.c must be updated to reflect the new
thread_pool_submit() function prototypes.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-15 16:07:51 +01:00
MORITA Kazutaka
ed9ba72467 sheepdog: set io_flush handler in do_co_req
If an io_flush handler is not set, qemu_aio_wait doesn't invoke
callbacks.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:50 +01:00
MORITA Kazutaka
0d6db300cd sheepdog: use non-blocking fd in coroutine context
Using a blocking socket in the coroutine context reduces the chance of
switching to other work.  This patch makes the sheepdog driver use a
non-blocking fd always.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:50 +01:00
Paolo Bonzini
381b487d54 qcow2: make is_allocated return true for zero clusters
Otherwise, live migration of the top layer will miss zero clusters and
let the backing file show through.  This also matches what is done in qed.

QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files.  Check this
directly in qcow2_get_cluster_offset instead of replicating the test
everywhere.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:50 +01:00
Stefan Hajnoczi
3647917919 qcow2: drop unnecessary flush in qcow2_update_snapshot_refcount()
We already flush when the function completes.  There is no need to flush
after every compressed cluster.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-15 16:07:50 +01:00
Stefan Hajnoczi
f9cb2860bd qcow2: drop flush in update_cluster_refcount()
The update_cluster_refcount() function increments/decrements a cluster's
refcount and then returns the new refcount value.

There is no need to flush since both update_cluster_refcount() callers
already take care of this:

1. qcow2_alloc_bytes() calls update_cluster_refcount() when compressed
   sectors will be appended to an existing cluster with enough free
   space.  qcow2_alloc_bytes() already flushes so there is no need to do
   so in update_cluster_refcount().

2. qcow2_update_snapshot_refcount() sets a cache dependency on refcounts
   if it needs to update L2 entries.  It also flushes before completing.

Removing this flush significantly speeds up qcow2 snapshot creation:

  $ qemu-img create -f qcow2 test.qcow2 -o size=50G,preallocation=metadata
  $ time qemu-img snapshot -c new test.qcow2

Time drops from more than 3 minutes to under 1 second.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-15 16:07:50 +01:00
Stefan Hajnoczi
2154f24e4e qcow2: flush in qcow2_update_snapshot_refcount()
Users of qcow2_update_snapshot_refcount() do not flush consistently.
qcow2_snapshot_create() flushes but qcow2_snapshot_goto() and
qcow2_snapshot_delete() do not.

Solve this by moving the bdrv_flush() into
qcow2_update_snapshot_refcount().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-15 16:07:50 +01:00
Stefan Hajnoczi
c1f5bafd70 qcow2: set L2 cache dependency in qcow2_alloc_bytes()
Compressed writes use qcow2_alloc_bytes() to allocate space with byte
granularity.  The affected clusters' refcounts will be incremented but
we do not need to flush yet.

Set a L2 cache dependency on the refcount block cache, so that the
refcounts get written out before the L2 updates.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-15 16:07:50 +01:00
Stefan Hajnoczi
f6977f1556 qcow2: flush refcount cache correctly in qcow2_write_snapshots()
Since qcow2 metadata is cached we need to flush the caches, not just the
underlying file.  Use bdrv_flush(bs) instead of bdrv_flush(bs->file).

Also add the error return path when bdrv_flush() fails and move the
flush after checking for qcow2_alloc_clusters() failure so that the
qcow2_alloc_clusters() error return value takes precedence.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-15 16:07:50 +01:00
Stefan Hajnoczi
9991923b26 qcow2: flush refcount cache correctly in alloc_refcount_block()
update_refcount() affects the refcount cache, it does not write to disk.
Therefore bdrv_flush(bs->file) does nothing.  We need to flush the
refcount cache in order to write out the refcount updates!

While we're here also add error returns when qcow2_cache_flush() fails.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-15 16:07:49 +01:00
Kevin Wolf
74c4510a3c qcow2: Allow lazy refcounts to be enabled on the command line
qcow2 images now accept a boolean lazy_refcounts options. Use it like
this:

  -drive file=test.qcow2,lazy_refcounts=on

If the option is specified on the command line, it overrides the default
specified by the qcow2 header flags that were set when creating the
image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:49 +01:00
Kevin Wolf
de9c0cec6c block: Add options QDict to bdrv_open() prototype
It doesn't do anything yet except storing the options QDict in the
BlockDriverState.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:49 +01:00
Kevin Wolf
1a86938f04 block: Add options QDict to .bdrv_open()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:49 +01:00
Peter Lieven
cb1b83e740 iscsi: add iscsi_truncate support
this patch adds iscsi_truncate which effectively allows for
online resizing of iscsi volumes. for this to work you have
to resize the volume on your storage and then call
block_resize command in qemu which will issue a
readcapacity16 to update the capacity.

v4:
  - factor out complete readcapacity logic into a separate function
  - handle capacity change check condition in readcapacity function
    (this happens if the block_resize cmd is the first iscsi task
    executed after a resize on the storage)

v3:
  - remove switch statement in iscsi_open
  - create separate patch for brdv_drain_all() in bdrv_truncate()

v2:
  - add a general bdrv_drain_all() before bdrv_truncate() to avoid
    in-flight AIOs while the device is truncated
  - since no AIOs are in flight we can use a sync libiscsi call
    to re-read the capacity
  - factor out the readcapacity16 logic as it is redundant
    to iscsi_open() and iscsi_truncate().

Signed-off-by: Peter Lieven <pl@kamp.de>
[allow any type of unit attention check condition in iscsi_readcapacity_sync(),
 as in Message-ID: <51263A2A.6070304@dlhnet.de> - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-05 17:51:51 +01:00
Peter Lieven
1dde716ed6 iscsi: retry read, write, flush and unmap on unit attention check conditions
the storage might return a check condition status for various reasons.
(e.g. bus reset, capacity change, thin-provisioning info etc.)

currently all these informative status responses lead to an I/O error
which is populated to the guest. this patch introduces a retry mechanism
to avoid this.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-05 17:51:50 +01:00
MORITA Kazutaka
1b8bbb46e7 sheepdog: add support for connecting to unix domain socket
This patch adds support for a unix domain socket for a connection
between qemu and local sheepdog server.  You can use the unix domain
socket with the following syntax:

 $ qemu sheepdog+unix:///<vdiname>?socket=<socket path>[#snapid]

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-04 09:54:17 +01:00
MORITA Kazutaka
25af257d21 sheepdog: use inet_connect to simplify connect code
This uses the form "<host>:<port>" for the representation of the
sheepdog server to use inet_connect.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-04 09:54:17 +01:00
MORITA Kazutaka
5d6768e3b8 sheepdog: accept URIs
The URI syntax is consistent with the NBD and Gluster syntax.  The
syntax is

  sheepdog[+tcp]://[host:port]/vdiname[#snapid|#tag]

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-04 09:54:17 +01:00
MORITA Kazutaka
bf1c852aa9 move socket_set_nodelay to osdep.c
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-04 09:54:17 +01:00
Stefan Hajnoczi
4db35162ea qcow2: support compressed clusters in BlockFragInfo
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-22 21:21:09 +01:00
Stefan Hajnoczi
fba31bae2d qcow2: record fragmentation statistics during check
The qemu-img check command can display fragmentation statistics:
 * Total number of clusters in virtual disk
 * Number of allocated clusters
 * Number of fragmented clusters

This patch adds fragmentation statistics support to qcow2.

Compressed and normal clusters count as allocated.  Zero clusters are
not counted as allocated unless their L2 entry has a non-zero offset
(e.g. preallocation).

Only the current L1 table counts towards the statistics - snapshots are
ignored.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-22 21:21:09 +01:00
Stefan Hajnoczi
801f704452 qcow2: introduce check_refcounts_l1/l2() flags
The check_refcounts_l1/l2() functions have a check_copied argument to
check that the QCOW_O_COPIED flag is consistent with refcount == 1.
This should be a bool, not an int.

However, the next patch introduces qcow2 fragmentation statistics and
also needs to pass an option to check_refcounts_l1/l2().  This is a good
opportunity to use an int flags field.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-22 21:21:09 +01:00
Federico Simoncelli
c6bb9ad198 qemu-img: find the image end offset during check
This patch adds the support for reporting the image end offset (in
bytes). This is particularly useful after a conversion (or a rebase)
where the destination is a block device in order to find the first
unused byte at the end of the image.

Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-02-22 21:21:08 +01:00
Stefan Hajnoczi
8a8f584008 block/curl: only restrict protocols with libcurl>=7.19.4
The curl_easy_setopt(state->curl, CURLOPT_PROTOCOLS, ...) interface was
introduced in libcurl 7.19.4.  Therefore we cannot protect against
CVE-2013-0249 when linking against an older libcurl.

This fixes the build failure introduced by
fb6d1bbd24.

Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Andreas Färber <andreas.faeber@web.de>
Message-id: 1360743934-8337-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-13 11:57:35 -06:00
Stefan Hajnoczi
33ccf6675f Revert "block/vpc: Fix size calculation"
This reverts commit f880defbb0.

Jeff Cody's testing revealed that the interpretation of size differs
even between VirtualPC and HyperV.  Revert this so there is time to
consider the impact of any backwards incompatible behavior this change
creates.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-12 12:25:15 +01:00
Stefan Hajnoczi
da888d37b0 block/raw-posix: detect readonly Linux block devices using BLKROGET
Linux block devices can be set read-only with "blockdev --setro
<device>".  The same thing can be done for LVM volumes using "lvchange
--permission r <volume>".  This read-only setting is independent of
device node permissions.  Therefore the device can still be opened
O_RDWR but actual writes will fail.

This results in odd behavior for QEMU.  bdrv_open() is supposed to fail
if a read-only image is being opened with BDRV_O_RDWR.  By not failing
for Linux block devices, the guest boots up but every write produces an
I/O error.

This patch checks whether the block device is read-only so that Linux
block devices behave like regular files.

Reported-by: Sibiao Luo <sluo@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-02-12 12:22:49 +01:00
Stefan Weil
f880defbb0 block/vpc: Fix size calculation
The size calculated from the CHS values is not the real image (disk) size,
but usually a smaller value. This is caused by rounding effects.

Only older operating systems use CHS. Such guests won't be able to use
the whole disk. All modern operating systems use the real size.

This patch fixes https://bugs.launchpad.net/qemu/+bug/1105670/.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1360265212-22037-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-11 08:14:41 -06:00
Markus Armbruster
312fd5f290 error: Strip trailing '\n' from error string arguments (again)
Commit 6daf194d and be62a2eb got rid of a bunch, but they keep coming
back.  Tracked down with this Coccinelle semantic patch:

    @r@
	expression err, eno, cls, fmt;
	position p;
    @@
    (
	error_report(fmt, ...)@p
    |
	error_set(err, cls, fmt, ...)@p
    |
	error_set_errno(err, eno, cls, fmt, ...)@p
    |
	error_setg(err, fmt, ...)@p
    |
	error_setg_errno(err, eno, fmt, ...)@p
    )
    @script:python@
	fmt << r.fmt;
	p << r.p;
    @@
    if "\\n" in str(fmt):
	print "%s:%s:%s:%s" % (p[0].file, p[0].line, p[0].column, fmt)

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1360354939-10994-4-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-11 08:13:19 -06:00
Stefan Hajnoczi
fb6d1bbd24 block/curl: disable extra protocols to prevent CVE-2013-0249
There is a buffer overflow in libcurl POP3/SMTP/IMAP.  The workaround is
simple: disable extra protocols so that they cannot be exploited.  Full
details here:

  http://curl.haxx.se/docs/adv_20130206.html

QEMU only cares about HTTP, HTTPS, FTP, FTPS, and TFTP.  I have tested
that this fix prevents the exploit on my host with
libcurl-7.27.0-5.fc18.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-08 11:14:20 -06:00
Andreas Färber
fdf263f63f block/raw-posix: Build fix for O_ASYNC
Commit eeb6b45d48 (block: raw-posix image
file reopen) broke the build on OpenIndiana.

illumos has no O_ASYNC. Exclude it from flags to be compared
and instead assert that it is not set where defined.

Cf. e61ab1da7e for qemu-ga.

Cc: qemu-stable@nongnu.org (1.3.x)
Cc: Jeff Cody <jcody@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 15:11:12 +01:00
Philipp Hahn
cd92347575 vmdk: Allow space in file name
The previous scanf() format string stopped parsing the file name on the
first white white space, which seems to be allowed at least by VMware
Workstation.

Change the format string to collect everything between the first and
second quote as the file name, disallowing line breaks.

Signed-off-by: Philipp Hahn <hahn@univention.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 14:58:29 +01:00
Kevin Wolf
46536235d8 parallels: Fix bdrv_open() error handling
Return -errno instead of -1 on errors. Hey, no memory leak to fix here
while we're touching it!

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 14:58:29 +01:00
Kevin Wolf
4f8aa2e19f dmg: Use g_free instead of free
The buffers are allocated with g_(re)alloc, so use g_free to free them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 14:58:29 +01:00
Kevin Wolf
69d34a360d dmg: Fix bdrv_open() error handling
Return -errno instead of -1 on errors and add error checks in some
places that didn't have one. Passing things by reference requires more
correct typing, replaced a few off_ts therefore - with a 32-bit off_t
this is even a fix for truncation bugs.

While touching the code, fix even some more memory leaks than in the
other drivers...

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 14:58:29 +01:00
Kevin Wolf
59294e4659 vpc: Fix bdrv_open() error handling
Return -errno instead of -1 on errors. While touching the
code, fix a memory leak.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 14:58:28 +01:00
Kevin Wolf
1a60657f57 cloop: Fix bdrv_open() error handling
Return -errno instead of -1 on errors. While touching the
code, fix a memory leak.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 14:58:28 +01:00
Kevin Wolf
5b7d7dfd19 bochs: Fix bdrv_open() error handling
Return -errno instead of -1 on errors. While touching the
code, fix a memory leak.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 14:58:28 +01:00
Liu Yuan
6f74c260b4 sheepdog: pass vdi_id to sheep daemon for sd_close()
Sheep daemon needs vdi_id to identify which vdi is closed to release resources
such as object cache.

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 14:58:28 +01:00
Othmar Pasteka
7f2039f611 vmdk: Allow selecting SCSI adapter in image creation
Introduce a new option "adapter_type" when converting to vmdk images.
It can be one of the following: ide (default), buslogic, lsilogic
or legacyESX (according to the vmdk spec from vmware).

In case of a non-ide adapter, heads is set to 255 instead of the 16.
The latter is used for "ide".

Also see LP#545089

Signed-off-by: Othmar Pasteka <pasteka@kabsi.at>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01 14:58:28 +01:00
Markus Armbruster
6528499fa4 g_malloc(0) and g_malloc0(0) return NULL; simplify
Once upon a time, it was decided that qemu_malloc(0) should abort.
Switching to glib retired that bright idea.  Some code that was added
to cope with it (e.g. in commits 702ef63, b76b6e9) is still around.
Bury it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-30 11:14:46 +01:00
Paolo Bonzini
88ff0e48ee mirror: do nothing on zero-sized disk
On a zero-sized disk we need to break out of the job successfully
before bdrv_dirty_iter_init is called, otherwise you will get an
assertion failure with the next patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:35 +01:00
Stefan Weil
0e87ba2ccb block/vdi: Check for bad signature
vdi_open did not check for a bad signature.
This check was only in vdi_probe.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:35 +01:00
Stefan Weil
8937f8222c block/vdi: Improved return values from vdi_open
vdi_open returned -1 in case of any error, but it should return an
error code (negative value of errno or -EMEDIUMTYPE).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:35 +01:00
Stefan Weil
9f0470bb2d block/vdi: Improve debug output for signature
The signature is a 32 bit value and needs up to 8 hex digits for printing.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:35 +01:00
Stefan Weil
15bac0d54f block: Use error code EMEDIUMTYPE for wrong format in some block drivers
This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk
when a file with the wrong format is selected.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:35 +01:00
Paolo Bonzini
884fea4e87 mirror: support arbitrarily-sized iterations
Yet another optimization is to extend the mirroring iteration to include more
adjacent dirty blocks.  This limits the number of I/O operations and makes
mirroring efficient even with a small granularity.  Most of the infrastructure
is already in place; we only need to put a loop around the computation of
the origin and sector count of the iteration.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:35 +01:00
Paolo Bonzini
402a47411b mirror: support more than one in-flight AIO operation
With AIO support in place, we can start copying more than one chunk
in parallel.  This patch introduces the required infrastructure for
this: the buffer is split into multiple granularity-sized chunks,
and there is a free list to access them.

Because of copy-on-write, a single operation may already require
multiple chunks to be available on the free list.

In addition, two different iterations on the HBitmap may want to
copy the same cluster.  We avoid this by keeping a bitmap of in-flight
I/O operations, and blocking until the previous iteration completes.
This should be a pretty rare occurrence, though; as long as there is
no overlap the next iteration can start before the previous one finishes.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:35 +01:00
Paolo Bonzini
08e4ed6cde mirror: add buf-size argument to drive-mirror
This makes sense when the next commit starts using the extra buffer space
to perform many I/O operations asynchronously.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:34 +01:00