For migration destination, we also need to know its state,
we will use it in COLO.
Here we add a new member 'state' for MigrationIncomingState,
and also use migrate_set_state() to modify its value.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
dgilbert: Fixed early free of MigraitonIncomingState
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1450266458-3178-3-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Change the first parameter of migrate_set_state(), and export it.
We will use it in a later patch to update incoming state.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert: Updated comment as per Juan's review
Message-Id: <1450266458-3178-2-git-send-email-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Now that we guarantee the user doesn't have any enum values
beginning with a single underscore, we can use that for our
own purposes. Renaming ENUM_MAX to ENUM__MAX makes it obvious
that the sentinel is generated.
This patch was mostly generated by applying a temporary patch:
|diff --git a/scripts/qapi.py b/scripts/qapi.py
|index e6d014b..b862ec9 100644
|--- a/scripts/qapi.py
|+++ b/scripts/qapi.py
|@@ -1570,6 +1570,7 @@ const char *const %(c_name)s_lookup[] = {
| max_index = c_enum_const(name, 'MAX', prefix)
| ret += mcgen('''
| [%(max_index)s] = NULL,
|+// %(max_index)s
| };
| ''',
| max_index=max_index)
then running:
$ cat qapi-{types,event}.c tests/test-qapi-types.c |
sed -n 's,^// \(.*\)MAX,s|\1MAX|\1_MAX|g,p' > list
$ git grep -l _MAX | xargs sed -i -f list
The only things not generated are the changes in scripts/qapi.py.
Rejecting enum members named 'MAX' is now useless, and will be dropped
in the next patch.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447836791-369-23-git-send-email-eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Rebased to current master, commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Dividing integer expressions transferred_bytes and time_spent, and then converting
the integer quotient to type double. Any remainder, or fractional part of the
quotient, is ignored. Fix this.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The check is unneccesary, we read the value at the start of the
thread, use it, and never change it. The value is checked to be
non-NULL before thread creation.
Spotted by coverity, CID 1339211
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
I set current_time before the postcopy test but never use it;
(I think this was from the original version where it was time based).
Spotted by coverity, CID 1339208
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Peter reported a lock error on MacOS after my a82d593b
patch.
migrate_get_current does one-time initialisation of
a bunch of variables.
migrate_init does reinitialisation even on a 2nd
migrate after a cancel.
The problem here was that I'd initialised the mutex
in migrate_get_current, and the memset in migrate_init
corrupted it.
Remove the memset and replace it by explicit initialisation
of fields that need initialising; this also turns out to be simpler
than the old code that had to preserve some fields.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: a82d593b
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Improve the text in both the qapi-schema and hmp help to point out
you need to set the postcopy-ram capability prior to issuing
migrate-start-postcopy.
Also fix the text of the migrate_start_postcopy error that
deals with capabilities.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Where we have iterable, but non-postcopiable devices (e.g. htab
or block migration), complete them before forming the 'package'
but with the CPUs stopped. This stops them filling up the package.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Tweak the end of migration cleanup; we don't want to close stuff down
at the end of the main stream, since the postcopy is still sending pages
on the other thread.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The loading of a device state (during postcopy) may access guest
memory that's still on the source machine and thus might need
a page fill; split off a separate thread that handles the incoming
page data so that the original incoming migration code can finish
off the device data.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
During the postcopy phase we must not call the iterate method on
precopy-only devices, since they may have done some cleanup during
the _complete call at the end of the precopy phase.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
On receiving MIG_RPCOMM_REQ_PAGES look up the address and
queue the page.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add MIG_RP_MSG_REQ_PAGES command on Return path for the postcopy
destination to request a page from the source.
Two versions exist:
MIG_RP_MSG_REQ_PAGES_ID that includes a RAMBlock name and start/len
MIG_RP_MSG_REQ_PAGES that just has start/len for use with the same
RAMBlock as a previous MIG_RP_MSG_REQ_PAGES_ID
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The end of migration in postcopy is a bit different since some of
the things normally done at the end of migration have already been
done on the transition to postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Rework the migration thread to setup and start postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Soon we'll be in either ACTIVE or POSTCOPY_ACTIVE when we
complete migration, and we need to know which we expect to be
in to change state safely.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
'MIGRATION_STATUS_POSTCOPY_ACTIVE' is entered after migrate_start_postcopy
'migration_in_postcopy' is provided for other sections to know if
they're in postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Soon we'll be in either ACTIVE or POSTCOPY_ACTIVE when we
complete migration, and we need to know which we expect to be
in to change state safely.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Once postcopy is enabled (with migrate_set_capability), the migration
will still start on precopy mode. To cause a transition into postcopy
the:
migrate_start_postcopy
command must be issued. Postcopy will start sometime after this
(when it's next checked in the migration loop).
Issuing the command before migration has started will error,
and issuing after it has finished is ignored.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Modify save_live_pending to return separate postcopiable and
non-postcopiable counts.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The state of the postcopy process is managed via a series of messages;
* Add wrappers and handlers for sending/receiving these messages
* Add state variable that track the current state of postcopy
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The 'postcopy ram' capability allows postcopy migration of RAM;
note that the migration starts off in precopy mode until
postcopy mode is triggered (see the migrate_start_postcopy
patch later in the series).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy needs to have two migration streams loading concurrently;
one from memory (with the device state) and the other from the fd
with the memory transactions.
Split the core of qemu_loadvm_state out so we can use it for both.
Allow the inner loadvm loop to quit and cause the parent loops to
exit as well.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Open a return path, and handle messages that are received upon it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add 'migration_is_setup_or_active' utility function to check state.
(It gets postcopy added to it's list later on in the series)
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add migrate_send_rp_message to send a message from destination to source along the return path.
(It uses a mutex to let it be called from multiple threads)
Add migrate_send_rp_shut to send a 'shut' message to indicate
the destination is finished with the RP.
Add migrate_send_rp_ack to send a 'PONG' message in response to a PING
Use it in the MSG_RP_PING handler
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In postcopy we're going to need to perform the complete phase
for postcopiable devices at a different point, start out by
renaming all of the 'complete's to make the difference obvious.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Suspend to file is very much like a migrate, and it makes life
easier if we have the Migration state available, so initialise it
in the savevm.c code for suspending.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewd-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
'file' becomes confusing when you have flows in each direction;
rename to make it clear.
This leaves just the main forward direction ms->file, which is used
in a lot of places and is probably not worth renaming given the churn.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The function qemu_savevm_state_cancel is called after the migration
in migration_thread, it seems strange to 'cancel' it after completion,
rename it to qemu_savevm_state_cleanup looks better.
Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>al3
Reviewed-by: Amit Shah <amit.shah@redhat.com>al3
Signed-off-by: Juan Quintela <quintela@redhat.com>al3
Because of the patch 3ea3b7fa9af067982f34b of kvm, which introduces a
lazy collapsing of small sptes into large sptes mechanism, now
migration_end() is a time consuming operation because it calls
memroy_global_dirty_log_stop(), which will trigger the dropping of small
sptes operation and takes about dozens of milliseconds, so call
migration_end() before all the vmsate data has already been transferred
to the destination will prolong VM downtime. This operation should be
deferred after all the data has been transferred to the destination.
blk_mig_cleanup() can be deferred too.
For a VM with 8G RAM, this patch can reduce the VM downtime about 30 ms.
Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>al3
Reviewed-by: Amit Shah <amit.shah@redhat.com>al3
Signed-off-by: Juan Quintela <quintela@redhat.com>al3
We were announcing the dest host's IP as our new IP a bit too soon -- if
there were errors detected after this announcement was done, the
migration is failed and the VM could continue running on the src host --
causing problems later.
Move around the qemu_announce_self() call so it's done just before the
VM is runnable.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The current migration-completed event is generated a bit too early,
which means that an eager libvirt that's ready to go as soon
as it sees the event ends up racing with the actual end of migration.
This corresponds to RH bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1271145
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
xSigned-off-by: Juan Quintela <quintela@redhat.com>
Migration has a define for MAX_THROTTLE. Update comment to clarify that this is
used for throttling transfer speed. Hopefully this will prevent it from being
confused with a guest cpu throttling entity.
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Report throttle percentage in info migrate and query-migrate responses when
cpu throttling is active.
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Remove traditional auto-converge static 30ms throttling code and replace it
with a dynamic throttling algorithm.
Additionally, be more aggressive when deciding when to start throttling.
Previously we waited until four unproductive memory passes. Now we begin
throttling after only two unproductive memory passes. Four seemed quite
arbitrary and only waiting for two passes allows us to complete the migration
faster.
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Add migration parameters to allow the user to adjust the parameters
that control cpu throttling when auto-converge is in effect. The added
parameters are as follows:
x-cpu-throttle-initial : Initial percantage of time guest cpus are throttled
when migration auto-converge is activated.
x-cpu-throttle-increment: throttle percantage increase each time
auto-converge detects that migration is not making progress.
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
This commit only touches allocations with size arguments of the form
sizeof(T). Same Coccinelle semantic patch as in commit b45c03f.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1442231491-23352-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
The code that gets run at the end of the migration process
is getting large, and I'm about to add more for postcopy.
Split it into a separate function.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1439463094-5394-3-git-send-email-dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
When doing migration via the QMP command xen_save_devices_state, the
current runstate is not store into the global state section. Also the
current runstate is not the one we want on the receiver side.
During migration, the Xen toolstack paused QEMU before save the devices
state. Also, the toolstack expect QEMU to autostart when the migration is
finished.
So this patch store "running" as it's current runstate.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Commit df4b102452 introduced global_state
section. But it only filled the state while doing migration. While
doing a savevm, we stored an empty string as state. So when we did a
loadvm, it complained that state was invalid.
Fedora 21, 4.1.1, qemu 2.4.0-rc0
> ../../configure --target-list="x86_64-softmmu"
068 2s ... - output mismatch (see 068.out.bad)
--- /home/bos/jhuston/src/qemu/tests/qemu-iotests/068.out 2015-07-08
17:56:18.588164979 -0400
+++ 068.out.bad 2015-07-09 17:39:58.636651317 -0400
@@ -6,6 +6,8 @@
QEMU X.Y.Z monitor - type 'help' for more information
(qemu) savevm 0
(qemu) quit
+qemu-system-x86_64: Unknown savevm section or instance 'globalstate' 0
+qemu-system-x86_64: Error -22 while loading VM state
QEMU X.Y.Z monitor - type 'help' for more information
(qemu) quit
*** done
Failures: 068
Failed 1 of 1 tests
Actually, there were two problems here:
- we registered global_state too late for load_vm (fixed on another
patch on the list)
- we didn't store a valid state for savevm (fixed by this patch).
Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
We can want the trace event even without migration events enabled.
Reported-by: Wen Congyang <ghostwcy@gmail.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
On previous change, we changed state at post load time if it was not
running, special casing the "running" change. Now, we change any states
at the end of the migration.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
We reuse the migration events from the source side, sending them on the
appropiate place.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Make check fails with events. THis is due to the parser/lexer that it
uses. Just in case that they are more broken parsers, just only send
events when there are capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We have one argument that tells us what event has happened.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We now use the helper everywhere, so no need to call this on this two
places. See on previous commit that there were a place where we missed
to mark the trace. Now all tracing is done in migrate_set_state().
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
There were three places that were not using the migrate_set_state()
helper, just fix that.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>