So far, live migration with shared storage meant that the image is in a
not-really-ready don't-touch-me state on the destination while the
source is still actively using it, but after completing the migration,
the image was fully opened on both sides. This is bad.
This patch adds a block driver callback to inactivate images on the
source before completing the migration. Inactivation means that it goes
to a state as if it was just live migrated to the qemu instance on the
source (i.e. BDRV_O_INACTIVE is set). You're then supposed to continue
either on the source or on the destination, which takes ownership of the
image.
A typical migration looks like this now with respect to disk images:
1. Destination qemu is started, the image is opened with
BDRV_O_INACTIVE. The image is fully opened on the source.
2. Migration is about to complete. The source flushes the image and
inactivates it. Now both sides have the image opened with
BDRV_O_INACTIVE and are expecting the other side to still modify it.
3. One side (the destination on success) continues and calls
bdrv_invalidate_all() in order to take ownership of the image again.
This removes BDRV_O_INACTIVE on the resuming side; the flag remains
set on the other side.
This ensures that the same image isn't written to by both instances
(unless both are resumed, but then you get what you deserve). This is
important because .bdrv_close for non-BDRV_O_INACTIVE images could write
to the image file, which is definitely forbidden while another host is
using the image.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
This allows to send a partial array where the size is another
structure field multiplied by a constant.
Signed-off-by: Juan Quintela <quintela@redhat.com>
[PMM: updated to current master]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Add vmstate support for migrating arrays of CPU_DoubleU via
VMSTATE_CPUDOUBLE_ARRAY.
Signed-off-by: Juan Quintela <quintela@redhat.com>
[PMM: rebased, since files have all moved since 2012;
added VMSTATE_CPUDOUBLE_ARRAY_V for consistency with FLOAT64]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWll18AAoJEDhwtADrkYZTLL8QAKB2zTF8/9QwIA46T/nNuQKV
ZckiADC6Aeh0Ksu5DAS7fZmfgPDmlwYYCN3x5KGeKGKIIPiVrddEYwyHqa6eTCOu
pbJBu5WeVamre8/9SH7u2VC/RMU0OZ+OhhJJf174Fc2mTALDtK1JJO4kXCzSUA5V
Iop04YtliH5dnDhCdIHH2tByDLMf1Iaq8NYJ0xWb3btNGX6iIT8F3EsbD9rGiE1m
c+F0qPRFDIrE+OseafrTHeKy/4D9biWnP9CmOGv49m+OxqYs33B26DhaIq41TvYv
/1sECCz2GmIFbpL1B0MvxNjKtj08btrz4EkpU4YBHxK+8EhOX2nJdfrZEhcone7A
c92esN8ATFbsG3AP1Vnt/dxG0YzQB8/azGP/MgVczYaj0m7WZ89etqendj1GeYAZ
2xXewICcmexBeMOodxthHxyQaUQ9oZyk8+sK5T9O6JKvb3uCHKJ6MeRwurHUEtL8
rzPLzKw8Tdalfa7AhQevVquH0QCmm4IEUC7xalHmfsFuqqTU95zfLa+DbdhzdIG+
KdRkCv4+yX8//kUM5LwiqSd7ruMDEMQPQz3pbegrKrUJDCcTt5TccZ6NxiccCpC3
6YXaUG2HqBNH5hznhR1Lf+gRdLeCW8WjI3fWHsAuyTGvl6z8qHm5/Q944UrIlJ8A
Ea1BUSMwgFqx5xp6KYjB
=OVhB
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-01-13' into staging
Error reporting patches for 2016-01-13
# gpg: Signature made Wed 13 Jan 2016 14:21:48 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
* remotes/armbru/tags/pull-error-2016-01-13: (41 commits)
checkpatch: Detect newlines in error_report and other error functions
error: Consistently name Error * objects err, and not errp
s390/sclp: Simplify control flow in sclp_realize()
hw/s390x: Rename local variables Error *l_err to just err
error: Clean up errors with embedded newlines (again)
vhdx: Fix "log that needs to be replayed" error message
pci-assign: Clean up "Failed to assign" error messages
vmdk: Clean up "Invalid extent lines" error message
vmdk: Clean up control flow in vmdk_parse_extents() a bit
error: Strip trailing '\n' from error string arguments (again)
qemu-io qemu-nbd: Use error_report() etc. instead of fprintf()
migration: Use error_reportf_err() instead of monitor_printf()
spapr: Use error_reportf_err()
error: Use error_prepend() where it makes obvious sense
error: Use error_reportf_err() where it makes obvious sense
error: Don't decorate original error message when adding to it
error: New error_prepend(), error_reportf_err()
test-throttle: Simplify qemu_init_main_loop() error handling
qemu-nbd: Clean up "Failed to load snapshot" error message
block: Clean up "Could not create temporary overlay" error message
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 6daf194d, be62a2eb and 312fd5f got rid of a bunch, but they
keep coming back. Tracked down with the Coccinelle semantic patch
from commit 312fd5f.
Cc: Fam Zheng <famz@redhat.com>
Cc: Peter Crosthwaite <crosthwaitepeter@gmail.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Cc: Stefan Berger <stefanb@linux.vnet.ibm.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Changchun Ouyang <changchun.ouyang@intel.com>
Cc: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Markus Armbruster <armbru@pond.sub.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-17-git-send-email-armbru@redhat.com>
Both error_reportf_err() and monitor_printf() print to the same
destination when monitor_printf() is used correctly, i.e. within an
HMP monitor. Elsewhere, monitor_printf() does nothing, while
error_reportf_err() reports to stderr.
Both changed functions are HMP command handlers. These should only
run within an HMP monitor.
Unlike monitor_printf(), error_reportf_err() uses the error whole
instead of just its message obtained with error_get_pretty(). This
avoids suppressing its hint (see commit 50b7b00), but I don't think
the errors touched in this commit can come with hints.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-15-git-send-email-armbru@redhat.com>
Both error_report_err() and monitor_printf() print to the same
destination when monitor_printf() is used correctly, i.e. within an
HMP monitor. Elsewhere, monitor_printf() does nothing, while
error_report_err() reports to stderr.
Most changed functions are HMP command handlers. These should only
run within an HMP monitor. The one exception is bdrv_password_cb(),
which should also only run within an HMP monitor.
Four command handlers prefix the error message with the command name:
balloon, migrate_set_capability, migrate_set_parameter, migrate.
Pointless, drop.
Unlike monitor_printf(), error_report_err() uses the error whole
instead of just its message obtained with error_get_pretty(). This
avoids suppressing its hint (see commit 50b7b00). Example:
(qemu) device_add ivshmem,id=666
Parameter 'id' expects an identifier
Identifiers consist of letters, digits, '-', '.', '_', starting with a letter.
Try "help device_add" for more information
The "Identifiers consist of..." line is new with this patch.
Coccinelle semantic patch:
@@
expression M, E;
@@
- monitor_printf(M, "%s\n", error_get_pretty(E));
- error_free(E);
+ error_report_err(E);
@r1@
expression M, E;
format F;
position p;
@@
- monitor_printf(M, "...%@F@\n", error_get_pretty(E));@p
- error_free(E);
+ error_report_err(E);
@script:python@
p << r1.p;
@@
print "%s:%s:%s: prefix dropped" % (p[0].file, p[0].line, p[0].column)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-4-git-send-email-armbru@redhat.com>
qemu_get_buffer does a copy, we can avoid the memcpy, and
we can then remove the extra buffer.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1450266458-3178-7-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Avoid a data copy (if we're lucky) in the xbzrle code.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1450266458-3178-6-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Emit an event each time we sync the dirty bitmap on the source;
this helps libvirt use postcopy by giving it a kick when it
might be a good idea to start the postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1450266458-3178-5-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
I missed the calls to send migration events on the destination side
as we enter postcopy.
Take care when adding them not to do it after state has been freed.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1450266458-3178-4-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
For migration destination, we also need to know its state,
we will use it in COLO.
Here we add a new member 'state' for MigrationIncomingState,
and also use migrate_set_state() to modify its value.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
dgilbert: Fixed early free of MigraitonIncomingState
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1450266458-3178-3-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Change the first parameter of migrate_set_state(), and export it.
We will use it in a later patch to update incoming state.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert: Updated comment as per Juan's review
Message-Id: <1450266458-3178-2-git-send-email-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Now that we guarantee the user doesn't have any enum values
beginning with a single underscore, we can use that for our
own purposes. Renaming ENUM_MAX to ENUM__MAX makes it obvious
that the sentinel is generated.
This patch was mostly generated by applying a temporary patch:
|diff --git a/scripts/qapi.py b/scripts/qapi.py
|index e6d014b..b862ec9 100644
|--- a/scripts/qapi.py
|+++ b/scripts/qapi.py
|@@ -1570,6 +1570,7 @@ const char *const %(c_name)s_lookup[] = {
| max_index = c_enum_const(name, 'MAX', prefix)
| ret += mcgen('''
| [%(max_index)s] = NULL,
|+// %(max_index)s
| };
| ''',
| max_index=max_index)
then running:
$ cat qapi-{types,event}.c tests/test-qapi-types.c |
sed -n 's,^// \(.*\)MAX,s|\1MAX|\1_MAX|g,p' > list
$ git grep -l _MAX | xargs sed -i -f list
The only things not generated are the changes in scripts/qapi.py.
Rejecting enum members named 'MAX' is now useless, and will be dropped
in the next patch.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447836791-369-23-git-send-email-eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Rebased to current master, commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
My fix (84e7b80a) replaced the last_sent_block update that I'd
removed earlier; however it was too aggressive in the xbzrle case.
save_xbzrle_page might return '0' to mean that the page didn't
need sending since it was the same as the last sent version;
in this case we can't update 'last_sent_block' since we didn't
actually send it.
Symptom: 'Illegal RAM offset 1018000' as we try and send a page
to the wrong RAMBlock; potentially that could be a data
corruption if you were really unlucky.
Fixes: 84e7b80a05
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-id: 1449765106-6528-1-git-send-email-dgilbert@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Dividing integer expressions transferred_bytes and time_spent, and then converting
the integer quotient to type double. Any remainder, or fractional part of the
quotient, is ignored. Fix this.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
socket_writev_buffer() writes in a loop, using g_poll() to block. If
g_poll() fails, it tries to write more before the file descriptor is
ready. In theory, this could go into a tight loop. In practice,
errors other than EINTR are really unlikely, and when they happen,
we're probably screwed anyway, so we can just as well loop.
Clean it up a bit: retry poll on EINTR, keep ignoring other errors.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If we set migration speed in a very large value, block-migration will try to read
all data to the memory. Because
(block_mig_state.submitted + block_mig_state.read_done) * BLOCK_SIZE
will be overflow, and it will be always less than rate limit.
There is no need to read too many data into memory when the rate limit is very large.
So limit the memory usage can fix the overflow problem.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
madvise() returns EINVAL in the case of many failures, but also
returns it in cases where the host kernel doesn't have THP enabled.
Postcopy only really cares that THP is off before it detects faults,
and turns it back on afterwards; so we're going to have
to assume that if the madvise fails then the host just doesn't do
THP and we can carry on with the postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
basically all bdrv_* operations must be called under aio_context_acquire
except ones with bdrv_all prefix.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The patch also ensures proper locking for the operation.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
State deletion can be performed on running VM which reduces VM downtime
This approach looks a bit more natural.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
to create snapshot for all loaded block drivers.
The patch also ensures proper locking.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
There is no much sense to do the check and write warning.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
to check that snapshot is available for all loaded block drivers.
The check bs != bs1 in hmp_info_snapshots is an optimization. The check
for availability of this snapshot will return always true as the list
of snapshots was collected from that image.
The patch also ensures proper locking.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We should check that all inserted and not read-only images support
snapshotting. This could be made using already invented helper
bdrv_all_can_snapshot().
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
to switch to snapshot on all loaded block drivers.
The patch also ensures proper locking.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
to delete snapshots from all loaded block drivers.
The patch also ensures proper locking.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The patch enforces proper locking for this operation.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The check is unneccesary, we read the value at the start of the
thread, use it, and never change it. The value is checked to be
non-NULL before thread creation.
Spotted by coverity, CID 1339211
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
I set current_time before the postcopy test but never use it;
(I think this was from the original version where it was time based).
Spotted by coverity, CID 1339208
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In a82d593b61 I accidentally removed the setting of
last_sent_block, put it back.
Symptoms:
Multithreaded compression only uses one thread.
Migration is a bit less efficient since it won't use 'cont' flags.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: a82d593b61
Signed-off-by: Juan Quintela <quintela@redhat.com>
Peter reported a lock error on MacOS after my a82d593b
patch.
migrate_get_current does one-time initialisation of
a bunch of variables.
migrate_init does reinitialisation even on a 2nd
migrate after a cancel.
The problem here was that I'd initialised the mutex
in migrate_get_current, and the memset in migrate_init
corrupted it.
Remove the memset and replace it by explicit initialisation
of fields that need initialising; this also turns out to be simpler
than the old code that had to preserve some fields.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: a82d593b
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Improve the text in both the qapi-schema and hmp help to point out
you need to set the postcopy-ram capability prior to issuing
migrate-start-postcopy.
Also fix the text of the migrate_start_postcopy error that
deals with capabilities.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Where the target page size is different from the host page
we special case it, but I messed up on the zero case check.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Where we have iterable, but non-postcopiable devices (e.g. htab
or block migration), complete them before forming the 'package'
but with the CPUs stopped. This stops them filling up the package.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Rest of the file already use that trick. 64bit offsets make no sense in
32bit archs, but that is ram_addr_t for you.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
since commit
commit 94f5a43704
Author: Liang Li <liang.z.li@intel.com>
Date: Mon Nov 2 15:37:00 2015 +0800
migration: defer migration_end & blk_mig_cleanup
when actual .cleanup callbacks calling was removed from complete operations.
The patch fixes regression introduced by the commit above results in
100% reliable assert for virtio-scsi VM with iothreads enabled during
'virsh create-snapshot' operation:
assert(i != mr->ioeventfd_nb);
memory_region_del_eventfd
virtio_pci_set_host_notifier_internal
virtio_pci_set_host_notifier
virtio_scsi_dataplane_start
virtio_scsi_handle_cmd
virtio_queue_notify_vq
virtio_queue_host_notifier_read
aio_dispatch
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy detects accesses to pages that haven't been transferred yet
using userfaultfd, and it causes exceptions on pages that are 'not
present'.
Ballooning also causes pages to be marked as 'not present' when the
guest inflates the balloon.
Potentially a balloon could be inflated to discard pages that are
currently inflight during postcopy and that may be arriving at about
the same time.
To avoid this confusion, disable ballooning during postcopy.
When disabled we drop balloon requests from the guest. Since ballooning
is generally initiated by the host, the management system should avoid
initiating any balloon instructions to the guest during migration,
although it's not possible to know how long it would take a guest to
process a request made prior to the start of migration.
Guest initiated ballooning will not know if it's really freed a page
of host memory or not.
Queueing the requests until after migration would be nice, but is
non-trivial, since the set of inflate/deflate requests have to
be compared with the state of the page to know what the final
outcome is allowed to be.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Userfault doesn't work with mlock; mlock is designed to nail down pages
so they don't move, userfault is designed to tell you when they're not
there.
munlock the pages we userfault protect before postcopy.
mlock everything again at the end if mlock is enabled.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Tweak the end of migration cleanup; we don't want to close stuff down
at the end of the main stream, since the postcopy is still sending pages
on the other thread.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Prior to servicing userfault requests we must ensure we've not got
huge pages in the area that might include non-transferred memory,
since a hugepage could incorrectly mark the whole huge page as present.
We mark the area as non-huge page (nhp) just before we perform
discards; the discard code now tells us to discard any areas
that haven't been sent (as well as any that are redirtied);
any already formed transparent-huge-pages get fragmented
by this discard process if they cotnain any discards.
Transparent huge pages that have been entirely transferred
and don't contain any discards are not broken by this mechanism;
they stay as huge pages.
By starting postcopy after a full precopy pass, many of the pages
then stay as huge pages; this is important for maintaining performance
after the end of the migration.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Wire up more of the handlers for the commands on the destination side,
in particular loadvm_postcopy_handle_run now has enough to start the
guest running.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The loading of a device state (during postcopy) may access guest
memory that's still on the source machine and thus might need
a page fill; split off a separate thread that handles the incoming
page data so that the original incoming migration code can finish
off the device data.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
userfaultfd is a Linux syscall that gives an fd that receives a stream
of notifications of accesses to pages registered with it and allows
the program to acknowledge those stalls and tell the accessing
thread to carry on.
We convert the requests from the kernel into messages back to the
source asking for the pages.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Prior to the start of postcopy, ensure that everything that will
be transferred later is a whole host-page in size.
This is accomplished by discarding partially transferred host pages
and marking any that are partially dirty as fully dirty.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
During the postcopy phase we must not call the iterate method on
precopy-only devices, since they may have done some cleanup during
the _complete call at the end of the precopy phase.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Once we're in postcopy the source processors are stopped and memory
shouldn't change any more, so there's no need to look at the dirty
map.
There are two notes to this:
1) If we do resync and a page had changed then the page would get
sent again, which the destination wouldn't allow (since it might
have also modified the page)
2) Before disabling this I'd seen very rare cases where a page had been
marked dirtied although the memory contents are apparently identical
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Ensure that target pages received within a host page are in order.
This shouldn't trigger, but in the cases where the sender goes
wrong and sends stuff out of order it produces a corruption that's
really nasty to debug.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>