# By Stefan Hajnoczi (11) and others
# Via Kevin Wolf
* kwolf/for-anthony:
cmd646: fix build when DEBUG_IDE is enabled.
block: change default of .has_zero_init to 0
vpc: Implement .bdrv_has_zero_init
vmdk: remove wrong calculation of relative path
gluster: Return bdrv_has_zero_init = 0
block/ssh: Set bdrv_has_zero_init according to the file type.
block: Make BlockJobTypes const
qemu-iotests: add 055 drive-backup test case
qemu-iotests: extract wait_until_completed() into iotests.py
blockdev: add Abort transaction
blockdev: add DriveBackup transaction
blockdev: allow BdrvActionOps->commit() to be NULL
blockdev: rename BlkTransactionStates to singular
block: add drive-backup QMP command
blockdev: use bdrv_getlength() in qmp_drive_mirror()
blockdev: drop redundant proto_drv check
block: add basic backup support to block driver
block: add bdrv_add_before_write_notifier()
notify: add NotiferWithReturn so notifier list can abort
raw-posix: Fix /dev/cdrom magic on OS X
Message-id: 1372429509-29642-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Make sure we use the correct TARGET/PRI macros in the debug statements.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
.has_zero_init defaults to 1 for all formats and protocols.
this is a dangerous default since this means that all
new added drivers need to manually overwrite it to 0 if
they do not ensure that a device is zero initialized
after bdrv_create().
if a driver needs to explicitly set this value to
1 its easier to verify the correctness in the review process.
during review of the existing drivers it turned out
that ssh and gluster had a wrong default of 1.
both protocols support host_devices as backend
which are not by default zero initialized. this
wrong assumption will lead to possible corruption
if qemu-img convert is used to write to such a backend.
vpc and vmdk also defaulted to 1 altough they support
fixed respectively flat extends. this has to be addresses
in separate patches. both formats as well as the mentioned
ssh and gluster are turned to the default of 0 with this
patch for safety.
a similar problem with the wrong default existed for
iscsi most likely because the driver developer did
oversee the default value of 1.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use it for all targets, but be careful not to pass invalid CPUState.
cpu_single_env can be NULL, e.g. on Xen.
Signed-off-by: Andreas Färber <afaerber@suse.de>
While not normally needed for *-user, it can safely be used there since
always based on uint64_t, to avoid ifdeffery.
To avoid accidental uses, move the guards from exec/hwaddr.h to its
inclusion sites. No need for them in include/hw/.
Prepares for hwaddr use in qom/cpu.h.
Signed-off-by: Andreas Färber <afaerber@suse.de>
This allows to move the call into CPUState's realizefn.
Therefore move the stub into libqemustub.a.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Pass it to qemu_dummy_cpu_thread_fn().
Use CPUState::env_ptr for cpu_single_env.
Prepares for changing qemu_init_vcpu() argument to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Pass it on to qemu_kvm_cpu_thread_fn().
Prepares for changing qemu_init_vcpu() argument to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
CPUArchState is no longer needed.
Prepares for changing qemu_kvm_cpu_thread_fn() opaque to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Use CPUState::env_ptr for now.
Prepares for changing cpu_handle_guest_debug() argument to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer uses CPUArchState.
Prepares for changing qemu_kvm_cpu_thread_fn() opaque to CPUState.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer uses CPUArchState.
Prepares for changing kvm_cpu_exec() argument to CPUState.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Make cpustats monitor command available unconditionally.
Prepares for changing kvm_handle_internal_error() and kvm_cpu_exec()
arguments to CPUState.
Signed-off-by: Andreas Färber <afaerber@suse.de>
CPUArchState is no longer needed.
Prepares for changing qemu_kvm_cpu_thread_fn() opaque to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
CPUArchState is no longer needed.
Prepares for changing qemu_kvm_init_cpu_signals() argument to CPUState.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer uses CPUArchState.
Prepares for changing qemu_kvm_cpu_thread_fn() opaque to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer needs CPUArchState.
Prepares for changing all_cpu_threads_idle() CPU loop to CPUState and
needed for changing qemu_kvm_wait_io_event() argument to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer depends on CPUArchState, so move it to qom/cpu.c.
Prepares for changing GDBState::c_cpu to CPUState.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Change Monitor::mon_cpu to CPUState as well.
Reviewed-by: liguang <lig.fnst@cn.fujitsu.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer relies on CPUArchState since 20d695a.
Reviewed-by: liguang <lig.fnst@cn.fujitsu.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
A few targets already managed to implement cpu_save() and cpu_load()
without defining CPU_SAVE_VERSION that causes them to be registered.
Guard the prototypes with CPU_SAVE_VERSION to avoid this happening again
until all targets are converted to VMState (or QIDL).
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Since commit e67db06e9f (target-or32: Add
target stubs and QOM cpu) a VMStateDescription existed, but
CPU_SAVE_VERSION was not set, so it was never registered.
Drop cpu_{save,load}() and register VMStateDescription via DeviceState.
Use a version_id of 1 and specify minimum versions as well.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Commit b758aca1f6 (target-alpha: Enable
the alpha-softmmu target.) introduced cpu_{save,load}() functions but
didn't define CPU_SAVE_VERSION, so they were never registered.
Drop cpu_{save,load}() and register the VMStateDescription via DeviceClass.
This operates on the AlphaCPU object instead of CPUAlphaState.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
To be used to embed common CPU state into CPU subclasses.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It's the equivalent to cpu_class_set_vmsd(), to assign
DeviceClass::vmsd. It wasn't needed before since only static,
unmigratable VMStateDescriptions were assigned so far.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It's CPUClass::vmsd, not CPUState::vmsd.
Reviewed-by: liguang <lig.fnst@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Depending on the subformat, has_zero_init on VHD must behave like raw
and query the underlying storage (fixed) or like other sparse formats
that can always return 1 (dynamic, differencing).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When creating image with backing file, the driver tries to calculate the
relative path from created image file to backing file, but the path
computation is incorrect. e.g.:
$ qemu-img create -f vmdk -b vmdk-data-disk.vmdk vmdk-data-snapshot1
Formatting 'vmdk-data-snapshot1', fmt=vmdk size=10737418240
backing_file='vmdk-data-disk.vmdk' compat6=off zeroed_grain=off
$ qemu-img info vmdk-data-snapshot1
image: vmdk-data-snapshot1
file format: vmdk
virtual size: 10G (10737418240 bytes)
disk size: 12K
-> backing file: disk.vmdk
The common part in file names, "vmdk-data-", is incorrectly forgotten by
relative_path(). As the VMDK specification has no restriction on
parentNameHint to be relative path, we simply remove this by using the
backing_file option.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
GlusterFS volumes can be backed by block devices, in which case
bdrv_create() doesn't make sure that the image is zeroed out. It is
currently not possibly to detect whether a given image is backed by a
file or a block device, and incorrectly assuming that it is zeroed
corrupts images during qemu-img convert, so let's err on the side of
caution and always return 0.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the remote is a regular file, set it to true (ie. reads of
uninitialized areas in a newly created file will return zeroes).
If we can't prove that, return false (a safe default).
Tested by adding a debugging print statement [not part of this commit]
and creating a remote file and a remote block device:
$ ./qemu-img create ssh://localhost/tmp/new 100M
Formatting 'ssh://localhost/tmp/new', fmt=raw size=104857600
filename ssh://localhost/tmp/new: has_zero_init = 1
$ sudo lvcreate -L 1G -n tmp /dev/fedora
Logical volume "tmp" created
$ ./qemu-img create ssh://localhost/dev/fedora/tmp 1G
Formatting 'ssh://localhost/dev/fedora/tmp', fmt=raw size=1073741824
filename ssh://localhost/dev/fedora/tmp: has_zero_init = 0
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Testing drive-backup is similar to image streaming and drive mirroring.
This test case is based on 041.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The 'drive-mirror' tests often issue 'block-job-complete' and wait for
the QMP completion event. Other types of block jobs also want to wait
for completion but they may not need to issue 'block-job-complete'.
Extract wait_until_completed() from 041 and put it into iotests.py.
Return the QMP event object so the caller can make additional
assertions, if necessary.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The Abort action can be used to test QMP 'transaction' failure. Add it
as the last action to exercise the .abort() and .cleanup() code paths
for all previous actions.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds a transactional version of the drive-backup QMP command.
It allows atomic snapshots of multiple drives along with automatic
cleanup if there is a failure to start one of the backup jobs.
Note that QMP events are emitted for block job completion/cancellation
and the block job will be listed by query-block-jobs.
@device: the name of the device whose writes should be mirrored.
@target: the target of the new image. If the file exists, or if it
is a device, the existing file/device will be used as the new
destination. If it does not exist, a new file will be created.
@format: #optional the format of the new destination, default is to
probe if @mode is 'existing', else the format of the source
@mode: #optional whether and how QEMU should create a new image, default is
'absolute-paths'.
@speed: #optional the maximum speed, in bytes per second
@on-source-error: #optional the action to take on an error on the source,
default 'report'. 'stop' and 'enospc' can only be used
if the block device supports io-status (see BlockInfo).
@on-target-error: #optional the action to take on an error on the target,
default 'report' (no limitations, since this applies to
a different block device than @device).
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some QMP 'transaction' types don't need to do anything on .commit().
Make .commit() optional just like .abort().
The "drive-backup" action will take advantage of this, it only needs to
cancel the block job on .abort(). Other block job actions will probably
follow the same pattern, so allow .commit() to be NULL.
Suggested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The QMP 'transaction' command keeps a list of in-flight transactions.
The transaction state structure is called BlkTransactionStates even
though it only deals with a single transaction. The only plural thing
is the linked list of transaction states.
I find it confusing to call the single structure "States". This patch
renames it to "State", just like BlockDriverState is singular.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
@drive-backup
Start a point-in-time copy of a block device to a new destination. The
status of ongoing drive-backup operations can be checked with
query-block-jobs where the BlockJobInfo.type field has the value 'backup'.
The operation can be stopped before it has completed using the
block-job-cancel command.
@device: the name of the device which should be copied.
@target: the target of the new image. If the file exists, or if it
is a device, the existing file/device will be used as the new
destination. If it does not exist, a new file will be created.
@format: #optional the format of the new destination, default is to
probe if @mode is 'existing', else the format of the source
@mode: #optional whether and how QEMU should create a new image, default is
'absolute-paths'.
@speed: #optional the maximum speed, in bytes per second
@on-source-error: #optional the action to take on an error on the source,
default 'report'. 'stop' and 'enospc' can only be used
if the block device supports io-status (see BlockInfo).
@on-target-error: #optional the action to take on an error on the target,
default 'report' (no limitations, since this applies to
a different block device than @device).
Note that @on-source-error and @on-target-error only affect background I/O.
If an error occurs during a guest write request, the device's rerror/werror
actions will be used.
Returns: nothing on success
If @device is not a valid block device, DeviceNotFound
Since 1.6
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use bdrv_getlength() for its byte units and error return instead of
bdrv_get_geometry().
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It is not necessary to check that we can find a protocol block driver
since we create or open the image file. This produces the error that we
need anyway.
Besides, the QERR_INVALID_BLOCK_FORMAT is inappropriate since the
protocol is incorrect rather than the format.
Also drop an empty line between bdrv_open() and checking its return
value. This may be due to copy-pasting from earlier code that performed
other operations before handling errors.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
backup_start() creates a block job that copies a point-in-time snapshot
of a block device to a target block device.
We call backup_do_cow() for each write during backup. That function
reads the original data from the block device before it gets
overwritten. The data is then written to the target device.
Currently backup cluster size is hardcoded to 65536 bytes.
[I made a number of changes to Dietmar's original patch and folded them
in to make code review easy. Here is the full list:
* Drop BackupDumpFunc interface in favor of a target block device
* Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
* Use 0 delay instead of 1us, like other block jobs
* Unify creation/start functions into backup_start()
* Simplify cleanup, free bitmap in backup_run() instead of cb
* function
* Use HBitmap to avoid duplicating bitmap code
* Use bdrv_getlength() instead of accessing ->total_sectors
* directly
* Delete the backup.h header file, it is no longer necessary
* Move ./backup.c to block/backup.c
* Remove #ifdefed out code
* Coding style and whitespace cleanups
* Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
* Keep our own in-flight CowRequest list instead of using block.c
tracked requests. This means a little code duplication but is much
simpler than trying to share the tracked requests list and use the
backup block size.
* Add on_source_error and on_target_error error handling.
* Use trace events instead of DPRINTF()
-- stefanha]
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The bdrv_add_before_write_notifier() function installs a callback that
is invoked before a write request is processed. This will be used to
implement copy-on-write point-in-time snapshots where we need to copy
out old data before overwriting it.
Note that BdrvTrackedRequest is moved to block_int.h since it is passed
to .notify() functions.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
notifier_list_notify() has no return value. This is fine when we just
want to invoke side-effects.
Sometimes it's useful for notifiers to produce a return value. This
allows notifiers to "veto" an operation and will be used by the block
layer before-write notifier.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The raw-posix driver has code to provide a /dev/cdrom on OS X even
though it doesn't really exist. However, since commit c66a6157 the real
filename is dismissed after finding it, so opening /dev/cdrom fails.
Put the filename back into the options QDict to make this work again.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
# By Michael R. Hines (9) and others
# Via Juan Quintela
* quintela/migration.next:
rdma: introduce capability x-rdma-pin-all
rdma: new QEMUFileOps hooks
rdma: introduce qemu_ram_foreach_block()
rdma: export qemu_fflush()
rdma: introduce qemu_file_mode_is_not_valid()
rdma: export throughput w/ MigrationStats QMP
rdma: export yield_until_fd_readable()
rdma: introduce qemu_update_position()
rdma: add documentation
migration: do not overwrite zero pages
Revert "migration: do not sent zero pages in bulk stage"
arch_init/ram_load: add error message for block length mismatch
Message-id: 1372329455-5995-1-git-send-email-quintela@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This capability allows you to disable dynamic chunk registration
for better throughput on high-performance links.
For example, using an 8GB RAM virtual machine with all 8GB of memory in
active use and the VM itself is completely idle using a 40 gbps infiniband link:
1. x-rdma-pin-all disabled total time: approximately 7.5 seconds @ 9.5 Gbps
2. x-rdma-pin-all enabled total time: approximately 4 seconds @ 26 Gbps
These numbers would of course scale up to whatever size virtual machine
you have to migrate using RDMA.
Enabling this feature does *not* have any measurable affect on
migration *downtime*. This is because, without this feature, all of the
memory will have already been registered already in advance during
the bulk round and does not need to be re-registered during the successive
iteration rounds.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
These are the prototypes and implementation of new hooks that
RDMA takes advantage of to perform dynamic page registration.
An optional hook is also introduced for a custom function
to be able to override the default save_page function.
Also included are the prototypes and accessor methods used by
arch_init.c which invoke funtions inside savevm.c to call out
to the hooks that may or may not have been overridden
inside of QEMUFileOps.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This is used during RDMA initialization in order to
transmit a description of all the RAM blocks to the
peer for later dynamic chunk registration purposes.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>