Commit Graph

354 Commits

Author SHA1 Message Date
Peter Maydell
eab9e9629c Migration bits from the COLO project
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQItBAABCAAXBQJYFc37EBxhbWl0QGtlcm5lbC5vcmcACgkQ6wtN/GV+9nDdqQ/9
 H+di+q/zF6aOEuXRCN0ud9R81fZ7nLve3e9EbKCNZXnxN3M3+zQj0At+e/SBEoTc
 rRwqJmNlRX3TWViWsAYmddgtopZ9R9DWwW/VsKjS2Ng230YShrA/o20hu2dJkwl3
 CN4vAObzc/gxM59NWUlMnTXOG+Z9fI1NlEf0vZ2484a59KPVIE7W1zZccT1F8MNq
 sfa3RlOGBchNO2Rfzrr6cFGGH7UTfWiftPs7EDOfN/YBcpld6V4DfPWdPP8r2DSX
 gG7D3DJuuqPxZCjl0Nm1OjaIunYfVrRpMPMeNyo1+kTVbvvhDPFjah/MSWu8XJ2c
 N5lSqikIAWfMbQVW9gDpQ0495eRQTWA7VIlCbwN6mqNxyBvQMA+licFq6UFFrCMp
 quC3gO+daz3fKvUhi23TpebbqKLHA5OZA5ZvkjGDgkKPMCJyQRBZHLb81t5xsulZ
 cXuzAOeRcXK9aEJvcrDgWwxzi3PN8zg74RF1ZV8gxM4DkHKohlsnbgshWqGkFh3M
 +S5tEPqVlOlDO9juf6rlwNnVbWhFDFEGMKjI9XMTWwVWTREbqyP86fP66h2C4qc6
 34yAHi5G2i43dxzHgpHC0MpU0XenO0EYdu+8Tcx35LSBSfkOeD7pU1DeZQgbI51m
 ZQSnLDJqv2HVdoZT7vaIjwuhtEl84xqelFdVg+cY7Gw=
 =sur7
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.8' into staging

Migration bits from the COLO project

# gpg: Signature made Sun 30 Oct 2016 10:39:55 GMT
# gpg:                using RSA key 0xEB0B4DFC657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"
# Primary key fingerprint: 48CA 3722 5FE7 F4A8 B337  2735 1E9A 3B5F 8540 83B6
#      Subkey fingerprint: CC63 D332 AB8F 4617 4529  6534 EB0B 4DFC 657E F670

* remotes/amit-migration/tags/migration-for-2.8:
  MAINTAINERS: Add maintainer for COLO framework related files
  configure: Support enable/disable COLO feature
  docs: Add documentation for COLO feature
  COLO: Implement failover work for secondary VM
  COLO: Implement the process of failover for primary VM
  COLO: Introduce state to record failover process
  COLO: Add 'x-colo-lost-heartbeat' command to trigger failover
  COLO: Synchronize PVM's state to SVM periodically
  COLO: Add checkpoint-delay parameter for migrate-set-parameters
  COLO: Load VMState into QIOChannelBuffer before restore it
  COLO: Send PVM state to secondary side when do checkpoint
  COLO: Add a new RunState RUN_STATE_COLO
  COLO: Introduce checkpointing protocol
  COLO: Establish a new communicating path for COLO
  migration: Switch to COLO process after finishing loadvm
  migration: Enter into COLO mode after migration if COLO is enabled
  COLO: migrate COLO related info to secondary node
  migration: Introduce capability 'x-colo' to migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 13:06:38 +00:00
Peter Maydell
277d44f5a6 trivial patches for 2016-10-28
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJYE2wfAAoJEHAbT2saaT5ZGYUH/3QWJ4OFWbqGo1YYN5AIAheF
 v1bQGTh1HGbLk46ajhUvzB0bMHb1FC1KoOruU2wFYuKK/J5zQ+4X9EmaC/fD7hyx
 nGTcPWAyxKOlqOq3In9ro+xWQNzEhfoypKCQQVC4Y3quzub48wAro8fuFSNXLyBq
 ERvAsjgj0TrLEHoWtJl2bPYiqSd6KAHZAKPFW3Jw8MmsBcTLmnF2PVW3LBfdcHe7
 6vlhqX7lPzVlHRaUsaxRkFxYd2YGisbe3bPRDw2fTxrtOYyEkopQq7xi2Q6Yq5N0
 z0yM2oJ7o1QtUOXYa7KBf03WZ7e119HimaUkGLg+0LVhQNbeG3hd3gNwApXa5og=
 =tYml
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging

trivial patches for 2016-10-28

# gpg: Signature made Fri 28 Oct 2016 16:17:51 BST
# gpg:                using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* remotes/mjt/tags/trivial-patches-fetch: (23 commits)
  Fix build for less common build directories names
  clean-up: removed duplicate #includes
  scripts/clean-includes: added duplicate #include check
  monitor: deprecate 'default' option
  qemu-ga: Remove stray 'q' in documentation
  Makefile: Fix help text for target 'installer'
  s390: avoid always-true comparison in s390_pci_generate_fid()
  migration: Remove unneeded NULL check from migrate_fd_error()
  scripts/hxtool: fix undefined behavour of echo
  qemu-options.hx: set: fix copy-paste error
  usb: Change *_exitfn return type from int to void
  MAINTAINERS: qemu-trivial information
  colo-compare: remove unused struct CompareChardevProps and 'props' variable
  milkymist-pfpu: fix potential integer overflow
  hw/block/nvme: Simplify if-statements a little bit
  target-lm32: rewrite gen_compare()
  lm32: milkymist-tmu2: fix integer overflow
  target-lm32: disable asm logging via LOG_DIS()
  target-lm32: swap operand of wcsr in LOG_DIS()
  target-lm32: fix LOG_DIS operand order
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 11:58:30 +00:00
zhanghailiang
180fb75000 configure: Support enable/disable COLO feature
configure --enable-colo/--disable-colo to switch COLO
support on/off.

COLO feature doesn't depend on any other external libraries,
So here it is reasonable to enable COLO by default, to
avoid re-compile QEMU if users want to use this capability.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
9d2db3760b COLO: Implement failover work for secondary VM
If users require SVM to takeover work, COLO incoming thread should
exit from loop while failover BH helps backing to migration incoming
coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
b3f7f0c5e6 COLO: Implement the process of failover for primary VM
For primary side, if COLO gets failover request from users.
To be exact, gets 'x_colo_lost_heartbeat' command.
COLO thread will exit the loop while the failover BH does the
cleanup work and resumes VM.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
aef060850b COLO: Introduce state to record failover process
When handling failover, COLO processes differently according to
the different stage of failover process, here we introduce a global
atomic variable to record the status of failover.

We add four failover status to indicate the different stage of failover process.
You should use the helpers to get and set the value.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
d89e666e06 COLO: Add 'x-colo-lost-heartbeat' command to trigger failover
We leave users to choose whatever heartbeat solution they want,
if the heartbeat is lost, or other errors they detect, they can use
experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover,
COLO will do operations accordingly.

For example, if the command is sent to the Primary side,
the Primary side will exit COLO mode, does cleanup work,
and then, PVM will take over the service work. If sent to the Secondary side,
the Secondary side will run failover work, then takes over PVM's service work.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
18cc23d72c COLO: Synchronize PVM's state to SVM periodically
Do checkpoint periodically, the default interval is 200ms.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
68b5359187 COLO: Add checkpoint-delay parameter for migrate-set-parameters
Add checkpoint-delay parameter for migrate-set-parameters, so that
we can control the checkpoint frequency when COLO is in periodic mode.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
4291d372e2 COLO: Load VMState into QIOChannelBuffer before restore it
We should not destroy the state of SVM (Secondary VM) until we receive
the complete data of PVM's state, in case the primary fails in the process
of sending the state, so we cache the VM's state in secondary side before
load it into SVM.

Besides, we should call qemu_system_reset() before load VM state,
which can ensure the data is intact.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
a91246c95f COLO: Send PVM state to secondary side when do checkpoint
VM checkpointing is to synchronize the state of PVM to SVM, just
like migration does, we re-use save helpers to achieve migrating
PVM's state to Secondary side.

COLO need to cache the data of VM's state in the secondary side before
synchronize it to SVM. COLO need the size of the data to determine
how much data should be read in the secondary side.
So here, we can get the size of the data by saving it into I/O channel
before send it to the secondary side.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
4f97558e10 COLO: Introduce checkpointing protocol
We need communications protocol of user-defined to control
the checkpointing process.

The new checkpointing request is started by Primary VM,
and the interactive process like below:

Checkpoint synchronizing points:

                   Primary               Secondary
                                            initial work
'checkpoint-ready'    <-------------------- @

'checkpoint-request'  @ -------------------->
                                            Suspend (Only in hybrid mode)
'checkpoint-reply'    <-------------------- @
                      Suspend&Save state
'vmstate-send'        @ -------------------->
                      Send state            Receive state
'vmstate-received'    <-------------------- @
                      Release packets       Load state
'vmstate-load'        <-------------------- @
                      Resume                Resume (Only in hybrid mode)

                      Start Comparing (Only in hybrid mode)
NOTE:
 1) '@' who sends the message
 2) Every sync-point is synchronized by two sides with only
    one handshake(single direction) for low-latency.
    If more strict synchronization is required, a opposite direction
    sync-point should be added.
 3) Since sync-points are single direction, the remote side may
    go forward a lot when this side just receives the sync-point.
 4) For now, we only support 'periodic' checkpoint, for which
   the Secondary VM is not running, later we will support 'hybrid' mode.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
56ba83d2a8 COLO: Establish a new communicating path for COLO
This new communication path will be used for returning messages
from Secondary side to Primary side.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
25d0c16f62 migration: Switch to COLO process after finishing loadvm
Switch from normal migration loadvm process into COLO checkpoint process if
COLO mode is enabled.

We add three new members to struct MigrationIncomingState,
'have_colo_incoming_thread' and 'colo_incoming_thread' record the COLO
related thread for secondary VM, 'migration_incoming_co' records the
original migration incoming coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
0b827d5e72 migration: Enter into COLO mode after migration if COLO is enabled
Add a new migration state: MIGRATION_STATUS_COLO. Migration source side
enters this state after the first live migration successfully finished
if COLO is enabled by command 'migrate_set_capability x-colo on'.

We reuse migration thread, so the process of checkpointing will be handled
in migration thread.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
5821ebf93b COLO: migrate COLO related info to secondary node
We can determine whether or not VM in destination should go into COLO mode
by referring to the info that was migrated.

We skip this section if COLO is not enabled (i.e.
migrate_set_capability colo off), so that, It doesn't break compatibility
with migration no matter whether users configure the --enable-colo/disable-colo
on the source/destination side or not;

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang
35a6ed4f71 migration: Introduce capability 'x-colo' to migration
We add helper function colo_supported() to indicate whether
colo is supported or not, with which we use to control whether or not
showing 'x-colo' string to users, they can use qmp command
'query-migrate-capabilities' or hmp command 'info migrate_capabilities'
to learn if colo is supported.

The default value for COLO (COarse-Grain LOck Stepping) is disabled.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
Peter Maydell
25174055f4 migration: Remove unneeded NULL check from migrate_fd_error()
All the callers of migrate_fd_error() pass a non-NULL
error parameter, and if any did pass NULL then we would
segfault in error_copy(), so remove the unnecessary
NULL check earlier in the function.
(Spotted by Coverity.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-10-28 18:17:23 +03:00
Peter Maydell
01b601f061 Merge qio 2016/10/27 v1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYEfjrAAoJEL6G67QVEE/fdU4P/i7yBJo436OpkdgeWS8AWuFr
 ptZ+Fj/weGka5GU9E3KQu36kbSgrtfcgwTHphCMXnZ0YCeKQDuM57f7LNiN6qheB
 nqgJvJioLbUvLTQvCHOISM7bWOnYvASBmYtLJFtUcP/jhdOy61KaADnJ+7MbliNv
 yJSW2RN+s/y9nUb+dxEpIXXUVMRa6BX+wHW3O44c1oLn6/Pe20aJeHTyDx3qiBhD
 8RYXUgRZopH2bouBSzXgMQTbn/QMD/dC81WQlHKlt4swffyei2D/1pciOcuc0SXz
 +SZdkTre5JB5Kd6DU8zQ6PrrIt1nPmLSptSyhQvNxm+uWNWHnFcW1s2aYmf/ikjl
 4boW37ayJx09mns8yv7TerzEPbL5qJvVX8Dsnb6telkvrS9hy9S1xuIB5xHbt6/h
 vwFmCdwaZoGpDDaoXRL+9k9TOI9BbEMKX33nAPDqvEXLMIf+og4fmweTKcY4XTRL
 /Fdg1H71v8Ayv+r5TJOKwFg3PNNjnvqkbk1psS+aaW7dup43iaYGIKWy+VFaCufk
 hPXLOtR5lUsYC2qm+nkjPIgoP7D8oZx4AGkCHbYsqzi+l1lynZH3rBIs8ggLr72o
 FFk4g0sNYe1ccAa89jFEgWIQbS0N6ckUXCv12g3eyF/UIC1F35/mGGugSRnTXuc2
 a/WsvgU7pGBrtqXcg7lF
 =gsxL
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2016-10-27-1' into staging

Merge qio 2016/10/27 v1

# gpg: Signature made Thu 27 Oct 2016 13:54:03 BST
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/pull-qio-2016-10-27-1:
  main: set names for main loop sources created
  vnc: set name for all I/O channels created
  migration: set name for all I/O channels created
  char: set name for all I/O channels created
  nbd: set name for all I/O channels created
  io: add ability to set a name for IO channels
  io: Add a QIOChannelSocket cleanup test
  io: set LISTEN flag explicitly for listen sockets
  io: Introduce a qio_channel_set_feature() helper
  io: Use qio_channel_has_feature() where applicable
  io: Fix double shift usages on QIOChannel features

Conflicts:
	qemu-char.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-28 15:30:55 +01:00
Daniel P. Berrange
6f01f136af migration: set name for all I/O channels created
Ensure that all I/O channels created for migration are given names
to distinguish their respective roles.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-27 09:13:10 +02:00
Peter Maydell
59811a320d migration/savevm.c: migrate non-default page size
Add a subsection to vmstate_configuration which is present
only if the guest is using a target page size which is
different from the default. This allows us to helpfully
diagnose attempts to migrate between machines which
are using different target page sizes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-10-24 16:26:50 +01:00
Vijaya Kumar K
adb65dec4b migration: Remove static allocation of xzblre cache buffer
Allocate xzblre zero page cache buffer dynamically.
Remove dependency on TARGET_PAGE_SIZE to make run-time
page size detection for arm platforms.

Signed-off-by: Vijaya Kumar K <vijayak@cavium.com>
Message-id: 1465808915-4887-2-git-send-email-vijayak@caviumnetworks.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-24 16:26:49 +01:00
Ashijeet Acharya
2ff3025797 migrate: move max-bandwidth and downtime-limit to migrate_set_parameter
Mark the old commands 'migrate_set_speed' and 'migrate_set_downtime' as
deprecated.
Move max-bandwidth and downtime-limit into migrate-set-parameters for
setting maximum migration speed and expected downtime limit parameters
respectively.
Change downtime units to milliseconds (only for new-command) and set
its upper bound limit to 2000 seconds.
Update the query part in both hmp and qmp qemu control interfaces.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Dr. David Alan Gilbert
9308ae5485 migration: Fix seg with missing port
The command :
   migrate tcp:localhost:

   currently segs; fix it so it now says:

   error parsing address 'localhost:'

and the same for -incoming.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Dr. David Alan Gilbert
5cf0f48d2a migration/postcopy: Explicitly disallow huge pages
At the moment postcopy will fail as soon as qemu tries to register
userfault on the RAMBlock pages that are backed by hugepages.
However, the kernel is going to get userfault support for hugepage
at some point, and we've not got the rest of the QEMU code to support
it yet, so fail neatly with an error like:

Postcopy doesn't support hugetlbfs yet (/objects/mem1)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Dr. David Alan Gilbert
2ebeaec012 Postcopy vs xbzrle: Don't send xbzrle pages once in postcopy [for 2.8]
xbzrle relies on reading pages that have already been sent
to the destination and then applying the modifications; we can't
do that in postcopy because the destination may well have
modified the page already or the page has been discarded.

I already didn't allow reception of xbzrle pages, but I
forgot to add the test to stop them being sent.

Enabling both xbzrle and postcopy can make some sense;
if you think that your migration might finish if you
have xbzrle, then when it doesn't complete you flick
over to postcopy and stop xbzrle'ing.

This corresponds to RH bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1368422

Symptom is:

Unknown combination of migration flags: 0x60 (postcopy mode)
(either 0x60 or 0x40)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Ashijeet Acharya
091ecc8b69 migrate: Fix bounds check for migration parameters in migration.c
This patch fixes the out-of-bounds check of migration parameters in
qmp_migrate_set_parameters() for cpu-throttle-initial and
cpu-throttle-increment by adding a return statement for both as they
were broken since their introduction in 2.5 via commit 1626fee.
Due to the missing return statements, parameters were getting set to
out-of-bounds values despite the error.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Eric Blake
7f375e0446 migrate: Use boxed qapi for migrate-set-parameters
Now that QAPI makes it easy to pass a struct around, we don't
have to declare as many parameters or local variables.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Eric Blake
de63ab6124 migrate: Share common MigrationParameters struct
It is rather verbose, and slightly error-prone, to repeat
the same set of parameters for input (migrate-set-parameters)
as for output (query-migrate-parameters), where the only
difference is whether the members are optional.  We can just
document that the optional members will always be present
on output, and then share a common struct between both
commands.  The next patch can then reduce the amount of
code needed on input.

Also, we made a mistake in qemu 2.7 of returning an empty
string during 'query-migrate-parameters' when there is no
TLS, rather than omitting TLS details entirely.  Technically,
this change risks breaking any 2.7 client that is hard-coded
to expect the parameter's existence; on the other hand, clients
that are portable to 2.6 already must be prepared for those
members to not be present.

And this gets rid of yet one more place where the QMP output
visitor is silently converting a NULL string into "" (which
is a hack I ultimately want to kill off).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Dr. David Alan Gilbert
cd5ea07064 migration/rdma: Don't flag an error when we've been told about one
If the other side tells us there's been an error and we fail
the migration, we don't need to signal that failure to the other
side because it already knew.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael R. Hines <michael@hinespot.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:22:38 +02:00
Dr. David Alan Gilbert
ccb783c312 migration: Make failed migration load set file error
If an error occurs in a section load, set the file error flag
so that the transport can get notified to do a cleanup.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael R. Hines <michael@hinespot.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:22:38 +02:00
Dr. David Alan Gilbert
12c67ffb1f migration/rdma: Pass qemu_file errors across link
If we fail for some reason (e.g. a mismatched RAMBlock)
and it's set the qemu_file error flag, pass that error back to the
peer so it can clean up rather than waiting for some higher level
progress.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael R. Hines <michael@hinespot.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:22:38 +02:00
Dr. David Alan Gilbert
49228e17ed migration: Report values for comparisons
Report the values when a comparison fails; together with
the previous patch that prints the device and field names
this should give a good idea of why loading the migration failed.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:22:38 +02:00
Dr. David Alan Gilbert
a1771070e7 migration: report an error giving the failed field
When a field fails to load (typically due to a limit
check, or a call to a get/put) report the device and field
to give an indication of the cause.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:22:38 +02:00
Paolo Bonzini
9c1f8f4493 migration: sync all address spaces
Migrating a VM during reboot sometimes results in differences
between the source and destination in the SMRAM area.

This is because migration_bitmap_sync() only fetches from KVM
the dirty log of address_space_memory.  SMRAM memory slots
are ignored and the modifications to SMRAM are not sent to the
destination.

Reported-by: He Rongguang <herongguang.he@huawei.com>
Reviewed-by: He Rongguang <herongguang.he@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Richard Henderson
a1febc4950 cutils: Export only buffer_is_zero
Since the two users don't make use of the returned offset,
beyond ensuring that the entire buffer is zero, consider the
can_use_buffer_find_nonzero_offset and buffer_find_nonzero_offset
functions internal.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1472496380-19706-4-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13 19:09:45 +02:00
Laurent Vivier
e723b87103 trace-events: fix first line comment in trace-events
Documentation is docs/tracing.txt instead of docs/trace-events.txt.

find . -name trace-events -exec \
     sed -i "s?See docs/trace-events.txt for syntax documentation.?See docs/tracing.txt for syntax documentation.?" \
     {} \;

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 1470669081-17860-1-git-send-email-lvivier@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-08-12 10:36:01 +01:00
Cao jin
474c624ddf migration/socket: fix typo in file header
Code of inet socket & unix socket is merged together.
Also add some newlines, make code block well separated.

Cc: Daniel P. Berrange <berrange@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1469696074-12744-4-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-08-11 17:03:51 +05:30
Liang Li
787d134fb1 migration: fix live migration failure with compression
Because of commit 11808bb0c4, which remove some condition checks
of 'f->ops->writev_buffer', 'qemu_put_qemu_file' should be enhanced
to clear the 'f_src->iovcnt', or 'f_src->iovcnt' may exceed the
MAX_IOV_SIZE which will break live migration. This should be fixed.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Reported-by: Jinshi Zhang <jinshi.c.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1470702146-24399-1-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-08-11 16:59:53 +05:30
Evgeny Yakovlev
0e8b3cdfbc migration: mmap error check fix
mmap man page:
"On success, mmap() returns a pointer to the mapped area. On error, the
value MAP_FAILED (that is, (void *) -1) is returned, and errno  is  set
to indicate the cause of the error."

The check in postcopy_get_tmp_page is definitely wrong and should be
fixed.

Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
Message-Id: <1469785705-16670-1-git-send-email-den@openvz.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-08-11 16:59:38 +05:30
Cao jin
e110aa919a migration/ram: fix typo
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Message-Id: <1469776231-23820-1-git-send-email-caoj.fnst@cn.fujitsu.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-08-11 16:59:33 +05:30
Marc-André Lureau
df37dd6ffe qjson: free str
Release the qstring allocated in qjson_new().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-08-08 00:00:24 +04:00
Dr. David Alan Gilbert
42da5550d6 migration: set state to post-migrate on failure
If a migration fails/is cancelled during the postcopy stage we currently
end up with the runstate as finish-migrate, where it should be post-migrate.
There's a small window in precopy where I think the same thing can
happen, but I've never seen it.

It rarely matters; the only postcopy case is if you restart a migration, which
again is a case that rarely matters in postcopy because it's only
safe to restart the migration if you know the destination hasn't
been running (which you might if you started the destination with -S
and hadn't got around to 'c' ing it before the postcopy failed).
Even then it's a small window but potentially you could hit if
there's a problem loading the devices on the destination.

This corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1355683

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1468601086-32117-1-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-22 13:23:09 +05:30
Lin Ma
0c204cc810 hmp: show all of snapshot info on every block dev in output of 'info snapshots'
Currently, the output of 'info snapshots' shows fully available snapshots.
It's opaque, hides some snapshot information to users. It's not convenient
if users want to know more about all of snapshot information on every block
device via monitor.

Follow Kevin's and Max's proposals, The patch makes the output more detailed:
(qemu) info snapshots
List of snapshots present on all disks:
 ID        TAG                 VM SIZE                DATE       VM CLOCK
 --        checkpoint-1           165M 2016-05-22 16:58:07   00:02:06.813

List of partial (non-loadable) snapshots on 'drive_image1':
 ID        TAG                 VM SIZE                DATE       VM CLOCK
 1         snap1                     0 2016-05-22 16:57:31   00:01:30.567

Signed-off-by: Lin Ma <lma@suse.com>
Message-id: 1467869164-26688-3-git-send-email-lma@suse.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-07-13 13:41:39 +02:00
Lin Ma
3a1ee71190 hmp: use snapshot name to determine whether a snapshot is 'fully available'
Currently qemu uses snapshot id to determine whether a snapshot is fully
available, It causes incorrect output in some scenario.

For instance:
(qemu) info block
drive_image1 (#block113): /opt/vms/SLES12-SP1-JeOS-x86_64-GM/disk0.qcow2
(qcow2)
    Cache mode:       writeback

drive_image2 (#block349): /opt/vms/SLES12-SP1-JeOS-x86_64-GM/disk1.qcow2
(qcow2)
    Cache mode:       writeback
(qemu)
(qemu) info snapshots
There is no snapshot available.
(qemu)
(qemu) snapshot_blkdev_internal drive_image1 snap1
(qemu)
(qemu) info snapshots
There is no suitable snapshot available
(qemu)
(qemu) savevm checkpoint-1
(qemu)
(qemu) info snapshots
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         snap1                     0 2016-05-22 16:57:31   00:01:30.567
(qemu)

$ qemu-img snapshot -l disk0.qcow2
Snapshot list:
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         snap1                     0 2016-05-22 16:57:31   00:01:30.567
2         checkpoint-1           165M 2016-05-22 16:58:07   00:02:06.813

$ qemu-img snapshot -l disk1.qcow2
Snapshot list:
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         checkpoint-1              0 2016-05-22 16:58:07   00:02:06.813

The patch uses snapshot name instead of snapshot id to determine whether a
snapshot is fully available and uses '--' instead of snapshot id in output
because the snapshot id is not guaranteed to be the same on all images.
For instance:
(qemu) info snapshots
List of snapshots present on all disks:
 ID        TAG                 VM SIZE                DATE       VM CLOCK
 --        checkpoint-1           165M 2016-05-22 16:58:07   00:02:06.813

Signed-off-by: Lin Ma <lma@suse.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1467869164-26688-2-git-send-email-lma@suse.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-07-13 13:41:39 +02:00
Paolo Bonzini
0b8b8753e4 coroutine: move entry argument to qemu_coroutine_create
In practice the entry argument is always known at creation time, and
it is confusing that sometimes qemu_coroutine_enter is used with a
non-NULL argument to re-enter a coroutine (this happens in
block/sheepdog.c and tests/test-coroutine.c).  So pass the opaque value
at creation time, for consistency with e.g. aio_bh_new.

Mostly done with the following semantic patch:

@ entry1 @
expression entry, arg, co;
@@
- co = qemu_coroutine_create(entry);
+ co = qemu_coroutine_create(entry, arg);
  ...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);

@ entry2 @
expression entry, arg;
identifier co;
@@
- Coroutine *co = qemu_coroutine_create(entry);
+ Coroutine *co = qemu_coroutine_create(entry, arg);
  ...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);

@ entry3 @
expression entry, arg;
@@
- qemu_coroutine_enter(qemu_coroutine_create(entry), arg);
+ qemu_coroutine_enter(qemu_coroutine_create(entry, arg));

@ reentry @
expression co;
@@
- qemu_coroutine_enter(co, NULL);
+ qemu_coroutine_enter(co);

except for the aforementioned few places where the semantic patch
stumbled (as expected) and for test_co_queue, which would otherwise
produce an uninitialized variable warning.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-07-13 13:26:02 +02:00
Daniel P. Berrange
521d47c6dc trace: split out trace events for migration/ directory
Move all trace-events for files in the migration/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-6-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:14 +01:00
Liang Li
0d9f9a5c52 migration: code clean up
Use 'QemuMutex comp_done_lock' and 'QemuCond comp_done_cond' instead
of 'QemuMutex *comp_done_lock' and 'QemuCond comp_done_cond'. To keep
consistent with 'QemuMutex decomp_done_lock' and
'QemuCond comp_done_cond'.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-10-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:31 +05:30
Liang Li
33d151f418 migration: refine the decompression code
The current code for multi-thread decompression is not clear,
especially in the aspect of using lock. Refine the code
to make it clear.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-9-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:28 +05:30
Liang Li
a7a9a88f9d migration: refine the compression code
The current code for multi-thread compression is not clear,
especially in the aspect of using lock. Refine the code
to make it clear.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-8-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:26 +05:30