Commit Graph

134 Commits

Author SHA1 Message Date
Paolo Bonzini
ea987c2c21 vmstate: type-check sub-arrays
While we cannot check against the type of the full array, we can check
against the type of the fields.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-01-16 13:06:17 +05:30
Dr. David Alan Gilbert
e1a8c9b67f socket shutdown
Add QEMUFile interface to allow a socket to be 'shut down' - i.e. any
reads/writes will fail (and any blocking read/write will be woken).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-01-16 13:06:17 +05:30
ChenLiang
27af7d6ea5 xbzrle: optimize XBZRLE to decrease the cache misses
Avoid hot pages being replaced by others to remarkably decrease cache
misses

Sample results with the test program which quote from xbzrle.txt ran in
vm:(migrate bandwidth:1GE and xbzrle cache size 8MB)

the test program:

include <stdlib.h>
include <stdio.h>
int main()
 {
        char *buf = (char *) calloc(4096, 4096);
        while (1) {
            int i;
            for (i = 0; i < 4096 * 4; i++) {
                buf[i * 4096 / 4]++;
            }
            printf(".");
        }
 }

before this patch:
virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
{"return":{"expected-downtime":1020,"xbzrle-cache":{"bytes":1108284,
"cache-size":8388608,"cache-miss-rate":0.987013,"pages":18297,"overflow":8,
"cache-miss":1228737},"status":"active","setup-time":10,"total-time":52398,
"ram":{"total":12466991104,"remaining":1695744,"mbps":935.559472,
"transferred":5780760580,"dirty-sync-counter":271,"duplicate":2878530,
"dirty-pages-rate":29130,"skipped":0,"normal-bytes":5748592640,
"normal":1403465}},"id":"libvirt-706"}

18k pages sent compressed in 52 seconds.
cache-miss-rate is 98.7%, totally miss.

after optimizing:
virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
{"return":{"expected-downtime":2054,"xbzrle-cache":{"bytes":5066763,
"cache-size":8388608,"cache-miss-rate":0.485924,"pages":194823,"overflow":0,
"cache-miss":210653},"status":"active","setup-time":11,"total-time":18729,
"ram":{"total":12466991104,"remaining":3895296,"mbps":937.663549,
"transferred":1615042219,"dirty-sync-counter":98,"duplicate":2869840,
"dirty-pages-rate":58781,"skipped":0,"normal-bytes":1588404224,
"normal":387794}},"id":"libvirt-266"}

194k pages sent compressed in 18 seconds.
The value of cache-miss-rate decrease to 48.59%.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-01-15 17:49:43 +05:30
Eduardo Habkost
e68dd36596 qemu-file: Make qemu_file_is_writable() non-static
The QEMUFileStdio code will use qemu_file_is_writable() and will be
moved to a separate file.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 10:28:12 +02:00
Alexey Kardashevskiy
94ed706d53 vmstate: Allow dynamic allocation for VBUFFER during migration
This extends use of VMS_ALLOC flag from arrays to VBUFFER as well.

This defines VMSTATE_VBUFFER_ALLOC_UINT32 which makes use of VMS_ALLOC
and uses uint32_t type for a size.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 09:35:48 +02:00
Dr. David Alan Gilbert
deb22f9a44 QEMUSizedBuffer based QEMUFile
This is based on Stefan and Joel's patch that creates a QEMUFile that goes
to a memory buffer; from:

http://lists.gnu.org/archive/html/qemu-devel/2013-03/msg05036.html

Using the QEMUFile interface, this patch adds support functions for
operating on in-memory sized buffers that can be written to or read from.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Joel Schopp <jschopp@linux.vnet.ibm.com>

For fixes/tweeks I've done:
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-10-14 09:17:06 +02:00
Alexey Kardashevskiy
f32935ea22 vmstate: Add preallocation for migrating arrays (VMS_ALLOC flag)
There are few helpers already to support array migration. However they all
require the destination side to preallocate arrays before migration which
is not always possible due to unknown array size as it might be some
sort of dynamic state. One of the examples is an array of MSIX-enabled
devices in SPAPR PHB - this array may vary from 0 to 65536 entries and
its size depends on guest's ability to enable MSIX or do PCI hotplug.

This adds new VMSTATE_VARRAY_STRUCT_ALLOC macro which is pretty similar to
VMSTATE_STRUCT_VARRAY_POINTER_INT32 but it can alloc memory for migratign
array on the destination side.

This defines VMS_ALLOC flag for a field.

This changes vmstate_base_addr() to do the allocation when receiving
migration.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Amit Shah
abfd9ce341 migration: dump vmstate info as a json file for static analysis
This commit adds a new command, '-dump-vmstate', that takes a filename
as an argument.  When executed, QEMU will dump the vmstate information
for the machine type it's invoked with to the file, and quit.

The JSON-format output can then be used to compare the vmstate info for
different QEMU versions, specifically to test whether live migration
would break due to changes in the vmstate data.

A Python script that compares the output of such JSON dumps is included
in the following commit.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-06-23 19:14:50 +02:00
Jason Wang
508e1180d3 migration: introduce self_announce_delay()
This patch introduces self_announce_delay() to calculate the delay for
the next announce round. This could be used by other device e.g
virtio-net who wants to do announcing by itself.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:54 +03:00
Jason Wang
110f463062 migration: export SELF_ANNOUNCE_ROUNDS
Export it for other users.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 16:41:54 +03:00
ChenLiang
8bc3923343 migration: expose xbzrle cache miss rate
expose xbzrle cache miss rate

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-05-05 22:15:03 +02:00
ChenLiang
58570ed894 migration: expose the bitmap_sync_count to the end
expose the count that logs the times of updating the dirty bitmap to
end user.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-05-05 22:15:03 +02:00
Dr. David Alan Gilbert
0d6ab3ab91 Provide init function for ram migration
Provide ram_mig_init (like blk_mig_init) for vl.c to initialise stuff
to do with ram migration (currently in arch_init.c).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-05-05 22:15:03 +02:00
Dr. David Alan Gilbert
548f52ea06 Make qemu_peek_buffer loop until it gets it's data
Make qemu_peek_buffer repeatedly call fill_buffer until it gets
all the data it requires, or until there is an error.

  At the moment, qemu_peek_buffer will try one qemu_fill_buffer if there
  isn't enough data waiting, however the kernel is entitled to return
  just a few bytes, and still leave qemu_peek_buffer with less bytes
  than it needed.  I've seen this fail in a dev world, and I think it
  could theoretically fail in the peeking of the subsection headers in
  the current world.

Comment qemu_peek_byte to point out it's not guaranteed to work for
  non-continuous peeks

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: ChenLiang <chenliang0016@icloud.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-05-05 22:15:03 +02:00
Michael S. Tsirkin
3476436a44 vmstate: s/VMSTATE_INT32_LE/VMSTATE_INT32_POSITIVE_LE/
As the macro verifies the value is positive, rename it
to make the function clearer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-05-05 22:15:03 +02:00
Michael S. Tsirkin
4082f0889b vmstate: add VMSTATE_VALIDATE
Validate state using VMS_ARRAY with num = 0 and VMS_MUST_EXIST

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-05-05 14:15:10 +02:00
Michael S. Tsirkin
5bf81c8d63 vmstate: add VMS_MUST_EXIST
Can be used to verify a required field exists or validate
state in some other way.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-05-05 14:15:10 +02:00
Fam Zheng
2e323f03bf scsi: Fix migration of scsi sense data
c5f52875 changed the size of sense array in vmstate_scsi_device by
mistake. This patch restores the old size, and add a subsection for the
remaining part of the buffer size. So that migration is not broken.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-14 10:06:55 +01:00
Dr. David Alan Gilbert
6d3cb1f970 Fix two XBZRLE corruption issues
Push zero'd pages into the XBZRLE cache
    A page that was cached by XBZRLE, zero'd and then XBZRLE'd again
    was being compared against a stale cache value

Don't use 'qemu_put_buffer_async' to put pages from the XBZRLE cache
    Since the cache might change before the data hits the wire

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-02-25 14:30:28 +01:00
Christoffer Dall
a1b1d277cd vmstate: Add uint32 2D-array support
Add support for saving VMState of 2D arrays of uint32 values.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-08 14:50:48 +00:00
Orit Wasserman
89db9987c0 Don't abort on memory allocation error
It is better to fail migration in case of failure to
allocate new cache item

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-02-04 16:50:37 +01:00
Gonglei (Arei)
905f26f222 migration:fix free XBZRLE decoded_buf wrong
When qemu do live migration with xbzrle, qemu malloc decoded_buf
at destination end but free it at source end. It will crash qemu
by double free error in some scenarios. Splitting the XBZRLE structure
for clear logic distinguishing src/dst side.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: GongLei <arei.gonglei@huawei.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-02-04 16:49:24 +01:00
Peter Maydell
20bcf73fa8 vmstate: Make VMSTATE_STRUCT_POINTER take type, not ptr-to-type
The VMSTATE_STRUCT_POINTER macros are a bit odd in that they
must be passed an argument "FooType *" rather than just taking
the FooType. They're only used in one place, so it's easy to
tidy this up. This also lets us use the macro to replace the
hand-rolled VMSTATE_PTIMER.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-02-04 15:51:45 +01:00
Eduardo Habkost
b5503338ed migration: Move QEMU_VM_* defines to migration/migration.h
The VMState code will be moved to vmstate.c and it uses some of the
QEMU_VM_* constants, so move it to a header.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-01-13 12:39:48 +01:00
Eduardo Habkost
c961514fd9 qemu-file: Make a few functions non-static
The QEMUFile code will be moved to qemu-file.c. This will require making
the following functions non-static because they are used by the savevm.c
code:

 * qemu_peek_byte()
 * qemu_peek_buffer()
 * qemu_file_skip()
 * qemu_file_set_error()

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2014-01-13 12:39:48 +01:00
Peter Maydell
a1f05e79f2 vmstate: Add support for an array of ptimer_state *
Add support for defining a vmstate field which is an array
of pointers to structures, and use this to define a
VMSTATE_PTIMER_ARRAY() which allows an array of ptimer_state*
to be used by devices.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1387159292-10436-2-git-send-email-lig.fnst@cn.fujitsu.com
2013-12-17 20:12:51 +00:00
Alexey Kardashevskiy
7102400d40 migration: add version supporting macros for struct pointer
This adds version supporting macros VMSTATE_STRUCT_POINTER_TEST_V
and VMSTATE_STRUCT_POINTER_V in addition to the already existing
VMSTATE_STRUCT_POINTER and VMSTATE_STRUCT_POINTER_TEST macros.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:50 +02:00
Stefan Hajnoczi
02edd2e766 migration: fix spice migration
Commit 29ae8a4133 ("rdma: introduce
MIG_STATE_NONE and change MIG_STATE_SETUP state transition") changed the
state transitions during migration setup.

Spice used to be notified with MIG_STATE_ACTIVE and it detected this
using migration_is_active().  Spice is now notified with
MIG_STATE_SETUP and migration_is_active() no longer works.

Replace migration_is_active() with migration_in_setup() to fix spice
migration.

Cc: Michael R. Hines <mrhines@us.ibm.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-29 17:19:02 +02:00
Michael R. Hines
ed4fbd1082 rdma: account for the time spent in MIG_STATE_SETUP through QMP
Using the previous patches, we're now able to timestamp the SETUP
state. Once we have this time, let the user know about it in the
schema.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-07-23 13:06:37 +02:00
Michael R. Hines
2da776db48 rdma: core logic
Code that does need to be visible is kept
well contained inside this file and this is the only
new additional file to the entire patch.

This file includes the entire protocol and interfaces
required to perform RDMA migration.

Also, the configure and Makefile modifications to link
this file are included.

Full documentation is in docs/rdma.txt

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-07-23 11:12:00 +02:00
Michael R. Hines
44c3b58cf9 rdma: introduce ram_handle_compressed()
This gives RDMA shared access to madvise() on the destination side
when an entire chunk is found to be zero.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-07-23 11:11:59 +02:00
Peter Lieven
323004a39d block-migration: efficiently encode zero blocks
this patch adds a efficient encoding for zero blocks by
adding a new flag indicating a block is completely zero.

additionally bdrv_write_zeros() is used at the destination
to efficiently write these zeroes. depending on the implementation
this avoids that the destination target gets fully provisioned.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-19 12:29:21 +08:00
Chegu Vinod
bde1e2ec21 Add 'auto-converge' migration capability
The auto-converge migration capability allows the user to specify if they
choose live migration seqeunce to automatically detect and force convergence.

Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-07-12 20:34:58 +02:00
Peter Maydell
ec3f8c9913 linux-user: Fix compilation failure
Fix compilation failures for linux-user targets following recent
migration related commits bd2fa51fcd and 43487c67.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1372362818-4740-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-06-27 15:38:35 -05:00
Michael R. Hines
60d9222c8f rdma: introduce capability x-rdma-pin-all
This capability allows you to disable dynamic chunk registration
for better throughput on high-performance links.

For example, using an 8GB RAM virtual machine with all 8GB of memory in
active use and the VM itself is completely idle using a 40 gbps infiniband link:

1. x-rdma-pin-all disabled total time: approximately 7.5 seconds @ 9.5 Gbps
2. x-rdma-pin-all enabled total time: approximately 4 seconds @ 26 Gbps

These numbers would of course scale up to whatever size virtual machine
you have to migrate using RDMA.

Enabling this feature does *not* have any measurable affect on
migration *downtime*. This is because, without this feature, all of the
memory will have already been registered already in advance during
the bulk round and does not need to be re-registered during the successive
iteration rounds.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-06-27 02:38:36 +02:00
Michael R. Hines
43487c678d rdma: new QEMUFileOps hooks
These are the prototypes and implementation of new hooks that
RDMA takes advantage of to perform dynamic page registration.

An optional hook is also introduced for a custom function
to be able to override the default save_page function.

Also included are the prototypes and accessor methods used by
arch_init.c which invoke funtions inside savevm.c to call out
to the hooks that may or may not have been overridden
inside of QEMUFileOps.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-06-27 02:38:36 +02:00
Michael R. Hines
be903b2ae7 rdma: export qemu_fflush()
RDMA uses this to flush the control channel before sending its
own message to handle page registrations.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-06-27 02:38:36 +02:00
Michael R. Hines
bc1256f7f1 rdma: introduce qemu_file_mode_is_not_valid()
QEMUFileRDMA also has read and write modes. This function is now
shared to reduce code duplication.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-06-27 02:38:36 +02:00
Michael R. Hines
7e114f8cf2 rdma: export throughput w/ MigrationStats QMP
This exposes throughput (in megabits/sec) through QMP.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-06-27 02:38:36 +02:00
Michael R. Hines
2b0ce0797d rdma: introduce qemu_update_position()
RDMA writes happen asynchronously, and thus the performance accounting
also needs to be able to occur asynchronously. This allows anybody
to call into savevm.c to update both f->pos as well as into arch_init.c
to update the acct_info structure with up-to-date values when
the RDMA transfer actually completes.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-06-27 02:38:35 +02:00
Kevin Wolf
05fcc84888 savevm: Implement block_writev_buffer()
Instead of breaking up RAM state into many small chunks, pass the iovec
to the block layer for better performance.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Peter Maydell
bd7f92e59e vmstate: Add support for two dimensional arrays
Add support for migrating two dimensional arrays, by defining
a set of new macros VMSTATE_*_2DARRAY paralleling the existing
VMSTATE_*_ARRAY macros. 2D arrays are handled the same for actual
state serialization; the only difference is that the type check
has to change for a 2D array.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mitsyanko <i.mitsyanko@gmail.com>
Message-id: 1363975375-3166-2-git-send-email-peter.maydell@linaro.org
2013-04-05 16:17:59 +01:00
Igor Mitsyanko
8070568b9a vmstate.h: introduce VMSTATE_BUFFER_POINTER_UNSAFE macro
Macro could be used to migrate a dynamically allocated buffer of known size.

Signed-off-by: Igor Mitsyanko <i.mitsyanko@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1362923278-4080-2-git-send-email-i.mitsyanko@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-04-05 16:17:58 +01:00
Orit Wasserman
6181ec2455 Add qemu_put_buffer_async
This allows us to add a buffer to the iovec to send without copying it
into the static buffer, the buffer will be sent later when qemu_fflush is called.

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-26 13:32:33 +01:00
Orit Wasserman
d913829f0f Add QemuFileWritevBuffer QemuFileOps
This will allow us to write an iovec

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-26 13:32:33 +01:00
Peter Lieven
f1c72795af migration: do not sent zero pages in bulk stage
during bulk stage of ram migration if a page is a
zero page do not send it at all.
the memory at the destination reads as zero anyway.

even if there is an madvise with QEMU_MADV_DONTNEED
at the target upon receipt of a zero page I have observed
that the target starts swapping if the memory is overcommitted.
it seems that the pages are dropped asynchronously.

this patch also updates QMP to return the number of
skipped pages in MigrationStats.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-26 13:32:33 +01:00
David Gibson
377e2cb96b savevm: Fix bugs in the VMSTATE_VBUFFER_MULTIPLY definition
The VMSTATE_BUFFER_MULTIPLY macro is misnamed - it actually specifies
a variably sized buffer with VMS_VBUFFER, so should be named
VMSTATE_VBUFFER_MULTIPLY.  This patch fixes this (the macro had no current
users under either name).

In addition, unlike the other VMSTATE_VBUFFER variants, this macro did not
specify VMS_POINTER.  This patch fixes this bug as well.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-26 13:30:49 +01:00
David Gibson
8474a9dd67 savevm: Add VMSTATE_STRUCT_VARRAY_POINTER_UINT32
Currently the savevm code contains a VMSTATE_STRUCT_VARRAY_POINTER_INT32
helper (a variably sized array with the number of elements in an int32_t),
but not VMSTATE_STRUCT_VARRAY_POINTER_UINT32 (... with the number of
elements in a uint32_t).  This patch (trivially) fixes the deficiency.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-26 13:30:49 +01:00
David Gibson
213945e4d7 savevm: Add VMSTATE_FLOAT64 helpers
The current savevm code includes VMSTATE helpers for a number of commonly
used data types, but not for the float64 type used by the internal floating
point emulation code.  This patch fixes the deficiency.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-26 13:30:49 +01:00
David Gibson
d58f559834 savevm: Add VMSTATE_UINTTL_EQUAL helper
This adds an _EQUAL VMSTATE helper for target_ulongs, defined in terms of
VMSTATE_UINT32_EQUAL or VMSTATE_UINT64_EQUAL as appropriate.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-26 13:30:49 +01:00
David Gibson
e344b8a16d savevm: Add VMSTATE_UINT64_EQUAL helpers
The savevm code already includes a number of *_EQUAL helpers which act as
sanity checks verifying that the configuration of the saved state matches
that of the machine we're loading into to work.  Variants already exist
for 8 bit 16 bit and 32 bit integers, but not 64 bit integers.  This patch
fills that hole, adding a UINT64 version.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-26 13:30:48 +01:00
Andreas Färber
c71c3e99b8 stubs: Add a vmstate_dummy struct for CONFIG_USER_ONLY
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12 10:35:54 +01:00
Andreas Färber
d7650eab42 vmstate: Make vmstate_register() static inline
This avoids adding a duplicate stub for CONFIG_USER_ONLY.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12 10:35:54 +01:00
Peter Lieven
ee0b44aa9d page_cache: dup memory on insert
The page cache frees all data on finish, on resize and
if there is collision on insert. So it should be the caches
responsibility to dup the data that is stored in the cache.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:03 +01:00
Paolo Bonzini
b352365f5a migration: eliminate s->migration_file
The indirection is useless now.  Backends can open s->file directly.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:03 +01:00
Paolo Bonzini
1964a39706 migration: move rate limiting to QEMUFile
Rate limiting is now simply a byte counter; client call
qemu_file_rate_limit() manually to determine if they have to exit.
So it is possible and simple to move the functionality to QEMUFile.

This makes the remaining functionality of s->file redundant;
in the next patch we can remove it and write directly to s->migration_file.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:02 +01:00
Paolo Bonzini
e6a1cf2132 migration: use QEMUFile for writing outgoing migration data
Second, drop the file descriptor indirection, and write directly to the
QEMUFile.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:02 +01:00
Paolo Bonzini
f8bbc12863 migration: use QEMUFile for migration channel lifetime
As a start, use QEMUFile to store the destination and close it.
qemu_get_fd gets a file descriptor that will be used by the write
callbacks.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:02 +01:00
Paolo Bonzini
0cc3f3ccc9 qemu-file: add writable socket QEMUFile
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:02 +01:00
Paolo Bonzini
817b9ed5eb migration: merge qemu_popen_cmd with qemu_popen
There is no reason for outgoing exec migration to do popen manually
anymore (the reason used to be that we needed the FILE* to make it
non-blocking).  Use qemu_popen_cmd.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:02 +01:00
Paolo Bonzini
05f28b837c qemu-file: make qemu_fflush and qemu_file_set_error private again
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:02 +01:00
Paolo Bonzini
edaae611f6 migration: yay, buffering is gone
Buffering was needed because blocking writes could take a long time
and starve other threads seeking to grab the big QEMU mutex.

Now that all writes (except within _complete callbacks) are done
outside the big QEMU mutex, we do not need buffering at all.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:02 +01:00
Paolo Bonzini
9b09503752 migration: run setup callbacks out of big lock
Only the migration_bitmap_sync() call needs the iothread lock.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
32c835ba39 migration: run pending/iterate callbacks out of big lock
This makes it possible to do blocking writes directly to the socket,
with no buffer in the middle.  For RAM, only the migration_bitmap_sync()
call needs the iothread lock.  For block migration, it is needed by
the block layer (including bdrv_drain_all and dirty bitmap access),
but because some code is shared between iterate and complete, all of
mig_save_device_dirty is run with the lock taken.

In the savevm case, the iterate callback runs within the big lock.
This is annoying because it complicates the rules.  Luckily we do not
need to do anything about it: the RAM iterate callback does not need
the iothread lock, and block migration never runs during savevm.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
8c8de19d93 migration: reorder SaveVMHandlers members
This groups together the callbacks that later will have similar
locking rules.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
bb1fadc444 migration: cleanup migration (including thread) in the iothread
Perform final cleanup in a bottom half, and add joining the thread to
the series of cleanup actions.

migrate_fd_error remains for connection error, but it doesn't need
to cleanup anything anymore.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
dba433c03a migration: simplify error handling
Always use qemu_file_get_error to detect errors, since that is how
QEMUFile itself drops I/O after an error occurs.  There is no need
to propagate and check return values all the time.

Also remove the "complete" member, since we know that it is set (via
migrate_fd_cleanup) only when the state changes.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
4eb938102b qemu-file: temporarily expose qemu_file_set_error and qemu_fflush
Right now, migration cannot entirely rely on QEMUFile's automatic
drop of I/O after an error, because it does its "real" I/O outside
the put_buffer callback.  To fix this until buffering is gone, expose
qemu_file_set_error which we will use in buffered_flush.

Similarly, buffered_flush is not a complete flush because some data may
still reside in the QEMUFile's own buffer.  This somewhat complicates the
process of closing the migration thread.  Again, when buffering is gone
buffered_flush will disappear and calling qemu_fflush will not be needed;
in the meanwhile, we expose the function for use in migration.c.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
fd7f0d6617 hw: move fifo.[ch] to libqemuutil
fifo.c is generic code that can be easily unit tested.  So it
belongs in libqemuutil.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-01 13:53:10 +01:00
Juan Quintela
90f8ae724a migration: calculate expected_downtime
We removed the calculation in commit e4ed1541ac

Now we add it back.  We need to create dirty_bytes_rate because we
can't include cpu-all.h from migration.c, and there is no other way to
include TARGET_PAGE_SIZE.

Signed-off-by: Juan Quintela <quintela@redhat.com>


Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2013-02-22 10:12:52 +01:00
Stefan Hajnoczi
ad55ab42d4 migration: make qemu_ftell() public and support writable files
Migration .save_live_iterate() functions return the number of bytes
transferred.  The easiest way of doing this is by calling qemu_ftell(f)
at the beginning and end of the function to calculate the difference.

Make qemu_ftell() public so that block-migration will be able to use it.
Also adjust the ftell calculation for writable files where buf_offset
does not include buf_size.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-id: 1360661835-28663-2-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-12 16:26:44 -06:00
Juan Quintela
76f5933aea migration: move beginning stage to the migration thread
Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2013-01-17 13:54:18 +01:00
Paolo Bonzini
b9c961a8ff migration: make function static
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Reviewed-by: Eric Blake <eblake@redhat.com>
2013-01-17 13:54:16 +01:00
Juan Quintela
9848a40427 migration: merge QEMUFileBuffered into MigrationState
Avoid splitting the state of outgoing migration, more or less arbitrarily,
between two data structures.  QEMUFileBuffered anyway is used only during
migration.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:09:40 +01:00
Juan Quintela
2e45086533 migration: Inline qemu_fopen_ops_buffered into migrate_fd_connect
Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:09:39 +01:00
Juan Quintela
0e288fa369 migration: move migration_fd_put_ready()
Put it near its use and un-export it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:09:39 +01:00
Juan Quintela
0d82d0e8b9 migration: move buffered_file.c code into migration.c
This only moves the code (also from buffered_file.h to migration.h).
Fix whitespace until checkpatch is happy.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:09:36 +01:00
Juan Quintela
e4ed1541ac savevm: New save live migration method: pending
Code just now does (simplified for clarity)

    if (qemu_savevm_state_iterate(s->file) == 1) {
       vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
       qemu_savevm_state_complete(s->file);
    }

Problem here is that qemu_savevm_state_iterate() returns 1 when it
knows that remaining memory to sent takes less than max downtime.

But this means that we could end spending 2x max_downtime, one
downtime in qemu_savevm_iterate, and the other in
qemu_savevm_state_complete.

Changed code to:

    pending_size = qemu_savevm_state_pending(s->file, max_size);
    DPRINTF("pending size %lu max %lu\n", pending_size, max_size);
    if (pending_size >= max_size) {
        ret = qemu_savevm_state_iterate(s->file);
     } else {
        vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
        qemu_savevm_state_complete(s->file);
     }

So what we do is: at current network speed, we calculate the maximum
number of bytes we can sent: max_size.

Then we ask every save_live section how much they have pending.  If
they are less than max_size, we move to complete phase, otherwise we
do an iterate one.

This makes things much simpler, because now individual sections don't
have to caluclate the bandwidth (it was implossible to do right from
there).

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-20 23:09:25 +01:00
Juan Quintela
188a428559 migration: remove unfreeze logic
Now that we have a thread, and blocking writes, we don't need it.

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-20 23:09:25 +01:00
Juan Quintela
dd217b8732 migration: make writes blocking
Move all the writes to the migration_thread, and make writings
blocking.  Notice that are still using the iothread for everything
that we do.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:09:25 +01:00
Juan Quintela
766bd1769e migration: move migration thread init code to migrate_fd_put_ready
This way everything related with migration is run on the migration
thread and no locking is needed.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:09:25 +01:00
Juan Quintela
edfa1af52f migration: make qemu_fopen_ops_buffered() return void
We want the file assignment to happen before the thread is created to
avoid locking, so we just do it before creating the thread.

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2012-12-20 23:09:25 +01:00
Paolo Bonzini
1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini
caf71f86a3 migration: move include files to include/migration/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:32 +01:00