Introduce the design of COLO, and how to test it.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
If users require SVM to takeover work, COLO incoming thread should
exit from loop while failover BH helps backing to migration incoming
coroutine.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
For primary side, if COLO gets failover request from users.
To be exact, gets 'x_colo_lost_heartbeat' command.
COLO thread will exit the loop while the failover BH does the
cleanup work and resumes VM.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
When handling failover, COLO processes differently according to
the different stage of failover process, here we introduce a global
atomic variable to record the status of failover.
We add four failover status to indicate the different stage of failover process.
You should use the helpers to get and set the value.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We leave users to choose whatever heartbeat solution they want,
if the heartbeat is lost, or other errors they detect, they can use
experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover,
COLO will do operations accordingly.
For example, if the command is sent to the Primary side,
the Primary side will exit COLO mode, does cleanup work,
and then, PVM will take over the service work. If sent to the Secondary side,
the Secondary side will run failover work, then takes over PVM's service work.
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Do checkpoint periodically, the default interval is 200ms.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Add checkpoint-delay parameter for migrate-set-parameters, so that
we can control the checkpoint frequency when COLO is in periodic mode.
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We should not destroy the state of SVM (Secondary VM) until we receive
the complete data of PVM's state, in case the primary fails in the process
of sending the state, so we cache the VM's state in secondary side before
load it into SVM.
Besides, we should call qemu_system_reset() before load VM state,
which can ensure the data is intact.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
VM checkpointing is to synchronize the state of PVM to SVM, just
like migration does, we re-use save helpers to achieve migrating
PVM's state to Secondary side.
COLO need to cache the data of VM's state in the secondary side before
synchronize it to SVM. COLO need the size of the data to determine
how much data should be read in the secondary side.
So here, we can get the size of the data by saving it into I/O channel
before send it to the secondary side.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Guest will enter this state when paused to save/restore VM state
under COLO checkpoint.
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We need communications protocol of user-defined to control
the checkpointing process.
The new checkpointing request is started by Primary VM,
and the interactive process like below:
Checkpoint synchronizing points:
Primary Secondary
initial work
'checkpoint-ready' <-------------------- @
'checkpoint-request' @ -------------------->
Suspend (Only in hybrid mode)
'checkpoint-reply' <-------------------- @
Suspend&Save state
'vmstate-send' @ -------------------->
Send state Receive state
'vmstate-received' <-------------------- @
Release packets Load state
'vmstate-load' <-------------------- @
Resume Resume (Only in hybrid mode)
Start Comparing (Only in hybrid mode)
NOTE:
1) '@' who sends the message
2) Every sync-point is synchronized by two sides with only
one handshake(single direction) for low-latency.
If more strict synchronization is required, a opposite direction
sync-point should be added.
3) Since sync-points are single direction, the remote side may
go forward a lot when this side just receives the sync-point.
4) For now, we only support 'periodic' checkpoint, for which
the Secondary VM is not running, later we will support 'hybrid' mode.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
This new communication path will be used for returning messages
from Secondary side to Primary side.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Switch from normal migration loadvm process into COLO checkpoint process if
COLO mode is enabled.
We add three new members to struct MigrationIncomingState,
'have_colo_incoming_thread' and 'colo_incoming_thread' record the COLO
related thread for secondary VM, 'migration_incoming_co' records the
original migration incoming coroutine.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Add a new migration state: MIGRATION_STATUS_COLO. Migration source side
enters this state after the first live migration successfully finished
if COLO is enabled by command 'migrate_set_capability x-colo on'.
We reuse migration thread, so the process of checkpointing will be handled
in migration thread.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We can determine whether or not VM in destination should go into COLO mode
by referring to the info that was migrated.
We skip this section if COLO is not enabled (i.e.
migrate_set_capability colo off), so that, It doesn't break compatibility
with migration no matter whether users configure the --enable-colo/disable-colo
on the source/destination side or not;
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We add helper function colo_supported() to indicate whether
colo is supported or not, with which we use to control whether or not
showing 'x-colo' string to users, they can use qmp command
'query-migrate-capabilities' or hmp command 'info migrate_capabilities'
to learn if colo is supported.
The default value for COLO (COarse-Grain LOck Stepping) is disabled.
Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
This pull request supersedes and extends the one from 2016-10-26
(which had a build bug).
Highlights:
* SLOF (pseries guest firmware) update
* Enable a number of extra testcases on ppc / pseries
* Added the 'powernv' machine type
- Almost enough to be minimally usable
- But still missing necessary interrupt controller updates
* Cleanup and consolidation of NVRAM handling on several platforms
with related firmware
* Substantial cleanup to device tree construction
* Some more POWER9 instruction emulation
* Cleanup to handling of pseries option vectors and CAS reboot
handling (host/guest feature negotiation mechanism)
* Significant cleanups to handling of PCI devices in test cases
* New hotplug event infrastructure
* Memory hot unplug support for pseries
* Several bug fixes
The NVRAM cleanup affects some Sun sparc platforms as well as ppc
ones, but have been tested by the sparc maintainer (Mark Cave-Ayland).
The test additions also include substantial general changes to the
test framework that aren't strictly ppc related. They don't seem to
break tests on other platforms, they're for the benefit of enabling
tests on ppc and there isn't a specific maintainer for them, so
they're included in this tree.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYEqvPAAoJEGw4ysog2bOSQiwP/jOr5bxwZmGDrOwAIzqhagib
lo0+W1E6OCttbBhAp1inWZP6RnexAZ7orb+Q7DHQFbtYukUbPYwB/VzmWhaws6XV
jxK/lVB3A+XlRIEKUUc8bWWGRN+QBMnQIcUhNlKuC4AVKMC1aZY9ZLT6LvilV6X7
QtxAlBPmI2od2kyDHt/ibG9FkROFMi9ybbQG+D7Pu32NlTPgF06R6NPKtpkjEpUU
dRYAUB+VTB4eofjzyVqsL+QB7uX5g0V9aPmYWBaXqjTG61ivHMJJ7zHta+GdckJM
fk3S68ftPmf4EE+uL1Ff2fy+2Sxjh4QeNwxl1rppzrgvW8VkNOX6969idxSERUb8
I2/RVM7F1mJ7+4fNIjenAru8qu3O981lU9+t7R5mmTcEsSk28FOkOv6Io+6JnGA2
32qgOXwihsUDaH2pDagZ+ySaOqjWMD9WGQTfQgFMthGkcs6heG7ByvFrcpcacl5a
kbMl7cj+zkgusLuQHx0dp669R7Ch7bxSigQC11iMCpAmFhXl8qJ37ACPJn8NlzOq
NJdZwXOp9plYZ70a2CgKVXB6j+jhxOeJHg2v08jfF0rUILovBwH0WOrl1m0fb2gB
1u+ua2FED+5rtwpGJ7pL/oE20H11QDfHNDvqEpagvHAHSSu5nqGxd/falYRYE59C
wdMXPqJYQqkSYuA6XkgO
=3Z5V
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.8-20161028' into staging
ppc patch queue 2016-10-28
This pull request supersedes and extends the one from 2016-10-26
(which had a build bug).
Highlights:
* SLOF (pseries guest firmware) update
* Enable a number of extra testcases on ppc / pseries
* Added the 'powernv' machine type
- Almost enough to be minimally usable
- But still missing necessary interrupt controller updates
* Cleanup and consolidation of NVRAM handling on several platforms
with related firmware
* Substantial cleanup to device tree construction
* Some more POWER9 instruction emulation
* Cleanup to handling of pseries option vectors and CAS reboot
handling (host/guest feature negotiation mechanism)
* Significant cleanups to handling of PCI devices in test cases
* New hotplug event infrastructure
* Memory hot unplug support for pseries
* Several bug fixes
The NVRAM cleanup affects some Sun sparc platforms as well as ppc
ones, but have been tested by the sparc maintainer (Mark Cave-Ayland).
The test additions also include substantial general changes to the
test framework that aren't strictly ppc related. They don't seem to
break tests on other platforms, they're for the benefit of enabling
tests on ppc and there isn't a specific maintainer for them, so
they're included in this tree.
# gpg: Signature made Fri 28 Oct 2016 02:37:19 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.8-20161028: (73 commits)
ppc: allow certain HV interrupts to be delivered to guests
spapr: Memory hot-unplug support
spapr: use count+index for memory hotplug
spapr: Add DRC count indexed hotplug identifier type
spapr: add hotplug interrupt machine options
spapr_events: add support for dedicated hotplug event source
spapr: update spapr hotplug documentation
target-ppc: Add xvcmpnesp, xvcmpnedp instructions
target-ppc: add xscmp[eq,gt,ge,ne]dp instructions
tests: Add pseries machine to the prom-env-test, too
spapr_nvram: Pre-initialize the NVRAM to support the -prom-env parameter
libqos: Change PCI accessors to take opaque BAR handle
tests: Don't assume structure of PCI IO base in ahci-test
tests: Use qpci_mem{read,write} in ivshmem-test
libqos: Add 64-bit PCI IO accessors
tests: Clean up IO handling in ide-test
libqos: Implement mmio accessors in terms of mem{read,write}
libqos: Add streaming accessors for PCI MMIO
tests: Adjust tco-test to use qpci_legacy_iomap()
libqos: Better handling of PCI legacy IO
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJYEfjrAAoJEL6G67QVEE/fdU4P/i7yBJo436OpkdgeWS8AWuFr
ptZ+Fj/weGka5GU9E3KQu36kbSgrtfcgwTHphCMXnZ0YCeKQDuM57f7LNiN6qheB
nqgJvJioLbUvLTQvCHOISM7bWOnYvASBmYtLJFtUcP/jhdOy61KaADnJ+7MbliNv
yJSW2RN+s/y9nUb+dxEpIXXUVMRa6BX+wHW3O44c1oLn6/Pe20aJeHTyDx3qiBhD
8RYXUgRZopH2bouBSzXgMQTbn/QMD/dC81WQlHKlt4swffyei2D/1pciOcuc0SXz
+SZdkTre5JB5Kd6DU8zQ6PrrIt1nPmLSptSyhQvNxm+uWNWHnFcW1s2aYmf/ikjl
4boW37ayJx09mns8yv7TerzEPbL5qJvVX8Dsnb6telkvrS9hy9S1xuIB5xHbt6/h
vwFmCdwaZoGpDDaoXRL+9k9TOI9BbEMKX33nAPDqvEXLMIf+og4fmweTKcY4XTRL
/Fdg1H71v8Ayv+r5TJOKwFg3PNNjnvqkbk1psS+aaW7dup43iaYGIKWy+VFaCufk
hPXLOtR5lUsYC2qm+nkjPIgoP7D8oZx4AGkCHbYsqzi+l1lynZH3rBIs8ggLr72o
FFk4g0sNYe1ccAa89jFEgWIQbS0N6ckUXCv12g3eyF/UIC1F35/mGGugSRnTXuc2
a/WsvgU7pGBrtqXcg7lF
=gsxL
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2016-10-27-1' into staging
Merge qio 2016/10/27 v1
# gpg: Signature made Thu 27 Oct 2016 13:54:03 BST
# gpg: using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/pull-qio-2016-10-27-1:
main: set names for main loop sources created
vnc: set name for all I/O channels created
migration: set name for all I/O channels created
char: set name for all I/O channels created
nbd: set name for all I/O channels created
io: add ability to set a name for IO channels
io: Add a QIOChannelSocket cleanup test
io: set LISTEN flag explicitly for listen sockets
io: Introduce a qio_channel_set_feature() helper
io: Use qio_channel_has_feature() where applicable
io: Fix double shift usages on QIOChannel features
Conflicts:
qemu-char.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use ncursesw package instead of curses on non-mingw, and check a few
functions.
Also take cflags from pkg-config, since cursesw headers may be in a
separate, non-default directory.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-id: 20161015195308.20473-3-samuel.thibault@ens-lyon.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In default VGA font, left/right arrow are glyphs 0x1a and 0x1b, not 0x0a and
0x0b.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-id: 20161015195308.20473-2-samuel.thibault@ens-lyon.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
GTK generates key events for the delete key with key->string[0] = 0x7f
... but this does not work right with the readline_handle_byte()
function in util/readline.c, since this treats the keycode 127 as
backspace. So let's add a special case for the GTK delete key to make
this key behave right in the monitor interface of the GTK ui.
Buglink: https://bugs.launchpad.net/qemu/+bug/1619438
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1477570647-7100-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
gdk_screen_get_width() is deprecated since gtk 3.22.2, use
gdk_monitor_get_geometry() instead if it's available.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20161026152108.12364-1-berto@igalia.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
We do not want to catch the BrlAPI input/ouput immediately, but only
when the guest has started discussing withour virtual device.
This notably fixes input before the guest driver has started.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
ppc hypervisors have delivered system reset and machine check exception
interrupts to guests in some situations (e.g., see FWNMI feature of LoPAPR,
or NMI injection in QEMU).
These exceptions are architected to set the HV bit in hardware, however
when injected into a guest, the HV bit should be cleared. Current code
masks off the HV bit before setting the new MSR, however this happens after
the interrupt delivery model has calculated delivery mode for the exception.
This can result in the guest's MSR LE bit being lost.
Account for this in the exception handler and don't set HV bit for guest
delivery.
Also add another sanity check to ensure similar bugs get caught.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add support to hot remove pc-dimm memory devices.
Since we're introducing a machine-level unplug_request hook, we also
had handling for CPU unplug there as well to ensure CPU unplug
continues to work as it did before.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
* add hooks to CAS/cmdline enablement of hotplug ACR support
* add hook for CPU unplug
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>