qemu/qapi/migration.json

# -*- Mode: Python -*-
# vim: filetype=python
#
##
# = Migration
##
{ 'include': 'common.json' }
{ 'include': 'sockets.json' }
##
# @MigrationStats:
#
# Detailed migration status.
#
# @transferred: amount of bytes already transferred to the target VM
#
# @remaining: amount of bytes remaining to be transferred to the
# target VM
#
# @total: total amount of bytes involved in the migration process
#
# @duplicate: number of duplicate (zero) pages (since 1.2)
#
# @skipped: number of skipped zero pages. Always zero, only provided for
# compatibility (since 1.5)
#
# @normal: number of normal pages (since 1.2)
#
# @normal-bytes: number of normal bytes sent (since 1.2)
#
# @dirty-pages-rate: number of pages dirtied per second by the guest
# (since 1.3)
#
# @mbps: throughput in megabits/sec. (since 1.6)
#
# @dirty-sync-count: number of times that dirty ram was synchronized
# (since 2.1)
#
# @postcopy-requests: The number of page requests received from the
# destination (since 2.7)
#
# @page-size: The number of bytes per page for the various page-based
# statistics (since 2.10)
#
# @multifd-bytes: The number of bytes sent through multifd (since 3.0)
#
# @pages-per-second: the number of memory pages transferred per second
# (Since 4.0)
#
# @precopy-bytes: The number of bytes sent in the pre-copy phase
# (since 7.0).
#
# @downtime-bytes: The number of bytes sent while the guest is paused
# (since 7.0).
#
# @postcopy-bytes: The number of bytes sent during the post-copy phase
# (since 7.0).
#
# @dirty-sync-missed-zero-copy: Number of times dirty RAM
# synchronization could not avoid copying dirty pages. This is
# between 0 and @dirty-sync-count * @multifd-channels. (since
# 7.1)
#
# Features:
#
# @deprecated: Member @skipped is always zero since 1.5.3
#
# Since: 0.14
#
##
{ 'struct': 'MigrationStats',
  'data': {'transferred': 'int', 'remaining': 'int', 'total': 'int',
           'duplicate': 'int',
           'skipped': { 'type': 'int', 'features': [ 'deprecated' ] },
           'normal': 'int',
           'normal-bytes': 'int', 'dirty-pages-rate': 'int',
           'mbps': 'number', 'dirty-sync-count': 'int',
           'postcopy-requests': 'int', 'page-size': 'int',
           'multifd-bytes': 'uint64', 'pages-per-second': 'uint64',
           'precopy-bytes': 'uint64', 'downtime-bytes': 'uint64',
           'postcopy-bytes': 'uint64',
           'dirty-sync-missed-zero-copy': 'uint64' } }
##
# @XBZRLECacheStats:
#
# Detailed XBZRLE migration cache statistics
#
# @cache-size: XBZRLE cache size
#
# @bytes: amount of bytes already transferred to the target VM
#
# @pages: amount of pages transferred to the target VM
#
# @cache-miss: number of cache misses
#
# @cache-miss-rate: rate of cache misses (since 2.1)
#
# @encoding-rate: rate of encoded bytes (since 5.1)
#
# @overflow: number of overflows
#
# Since: 1.2
##
{ 'struct': 'XBZRLECacheStats',
  'data': {'cache-size': 'size', 'bytes': 'int', 'pages': 'int',
           'cache-miss': 'int', 'cache-miss-rate': 'number',
           'encoding-rate': 'number', 'overflow': 'int' } }
##
# @CompressionStats:
#
# Detailed migration compression statistics
#
# @pages: amount of pages compressed and transferred to the target VM
#
# @busy: count of times that no free thread was available to compress
# data
#
# @busy-rate: rate of thread busy
#
# @compressed-size: amount of bytes after compression
#
# @compression-rate: rate of compressed size
#
# Since: 3.1
##
{ 'struct': 'CompressionStats',
  'data': {'pages': 'int', 'busy': 'int', 'busy-rate': 'number',
           'compressed-size': 'int', 'compression-rate': 'number' } }
##
# @MigrationStatus:
#
# An enumeration of migration status.
#
# @none: no migration has ever happened.
#
# @setup: migration process has been initiated.
#
# @cancelling: in the process of cancelling migration.
#
# @cancelled: cancelling migration is finished.
#
# @active: in the process of doing migration.
#
# @postcopy-active: like active, but now in postcopy mode. (since
# 2.5)
#
# @postcopy-paused: during postcopy but paused. (since 3.0)
#
# @postcopy-recover: trying to recover from a paused postcopy. (since
# 3.0)
#
# @completed: migration is finished.
#
# @failed: some error occurred during migration process.
#
# @colo: VM is in the process of fault tolerance. The VM cannot enter
# this state unless the colo capability is enabled for migration.
# (since 2.8)
#
# @pre-switchover: Paused before device serialisation. (since 2.11)
#
# @device: During device serialisation when pause-before-switchover is
# enabled (since 2.11)
#
# @wait-unplug: wait for device unplug request by guest OS to be
# completed. (since 4.2)
#
# Since: 2.3
##
{ 'enum': 'MigrationStatus',
  'data': [ 'none', 'setup', 'cancelling', 'cancelled',
            'active', 'postcopy-active', 'postcopy-paused',
            'postcopy-recover', 'completed', 'failed', 'colo',
            'pre-switchover', 'device', 'wait-unplug' ] }
##
# @VfioStats:
#
# Detailed VFIO devices migration statistics
#
# @transferred: amount of bytes transferred to the target VM by VFIO
# devices
#
# Since: 5.2
##
{ 'struct': 'VfioStats',
'data': {'transferred': 'int' } }
##
# @MigrationInfo:
#
# Information about current migration process.
#
# @status: @MigrationStatus describing the current migration status.
# If this field is not returned, no migration process has been
# initiated
#
# @ram: @MigrationStats containing detailed migration status, only
# returned if status is 'active' or 'completed' (since 1.2)
#
# @disk: @MigrationStats containing detailed disk migration status,
# only returned if status is 'active' and it is a block migration
#
# @xbzrle-cache: @XBZRLECacheStats containing detailed XBZRLE
# migration statistics, only returned if XBZRLE feature is on and
# status is 'active' or 'completed' (since 1.2)
#
# @total-time: total amount of milliseconds since migration started.
# If migration has ended, it returns the total migration time.
# (since 1.2)
#
# @downtime: total downtime in milliseconds for the guest. Only
# present when migration finishes correctly. (since 1.3)
#
# @expected-downtime: expected downtime in milliseconds for the guest
# in the last walk of the dirty bitmap. Only present while migration
# is active. (since 1.3)
#
# @setup-time: amount of setup time in milliseconds *before* the
# iterations begin but *after* the QMP command is issued. This is
# designed to provide an accounting of any activities (such as
# RDMA pinning) which may be expensive, but do not actually occur
# during the iterative migration rounds themselves. (since 1.6)
#
# @cpu-throttle-percentage: percentage of time guest cpus are being
# throttled during auto-converge. This is only present when
# auto-converge has started throttling guest cpus. (Since 2.7)
#
# @error-desc: the human readable error description string. Clients
# should not attempt to parse the error strings. (Since 2.7)
#
# @postcopy-blocktime: total time when all vCPU were blocked during
# postcopy live migration. This is only present when the
# postcopy-blocktime migration capability is enabled. (Since 3.0)
#
# @postcopy-vcpu-blocktime: list of the postcopy blocktime per vCPU.
# This is only present when the postcopy-blocktime migration
# capability is enabled. (Since 3.0)
#
# @compression: migration compression statistics, only returned if
# compression feature is on and status is 'active' or 'completed'
# (Since 3.1)
#
# @socket-address: Only used for tcp, to know what the real port is
# (Since 4.0)
#
# @vfio: @VfioStats containing detailed VFIO devices migration
# statistics, only returned if VFIO device is present, migration
# is supported by all VFIO devices and status is 'active' or
# 'completed' (since 5.2)
#
# @blocked-reasons: A list of reasons an outgoing migration is
# blocked. Present and non-empty when migration is blocked.
# (since 6.0)
#
# @dirty-limit-throttle-time-per-round: Maximum throttle time
# (in microseconds) of virtual CPUs each dirty ring full round,
# which shows how MigrationCapability dirty-limit affects the
# guest during live migration. (Since 8.1)
#
# @dirty-limit-ring-full-time: Estimated average dirty ring full time
# (in microseconds) for each dirty ring full round. The value
# equals the dirty ring memory size divided by the average dirty
# page rate of the virtual CPU, which can be used to observe the
# average memory load of the virtual CPU indirectly. Note that
# zero means guest doesn't dirty memory. (Since 8.1)
#
# Features:
#
# @deprecated: Member @disk is deprecated because block migration is.
# Member @compression is deprecated because it is unreliable and
# untested. It is recommended to use multifd migration, which
# offers an alternative compression implementation that is
# reliable and tested.
#
# Since: 0.14
##
{ 'struct': 'MigrationInfo',
  'data': {'*status': 'MigrationStatus', '*ram': 'MigrationStats',
           '*disk': { 'type': 'MigrationStats', 'features': [ 'deprecated' ] },
           '*vfio': 'VfioStats',
           '*xbzrle-cache': 'XBZRLECacheStats',
           '*total-time': 'int',
           '*expected-downtime': 'int',
           '*downtime': 'int',
           '*setup-time': 'int',
           '*cpu-throttle-percentage': 'int',
           '*error-desc': 'str',
           '*blocked-reasons': ['str'],
           '*postcopy-blocktime': 'uint32',
           '*postcopy-vcpu-blocktime': ['uint32'],
           '*compression': { 'type': 'CompressionStats', 'features': [ 'deprecated' ] },
           '*socket-address': ['SocketAddress'],
           '*dirty-limit-throttle-time-per-round': 'uint64',
           '*dirty-limit-ring-full-time': 'uint64'} }
##
# @query-migrate:
#
# Returns information about the current migration process. If a
# migration is active, the result contains an additional JSON object
# with the RAM migration status. If a block migration is active, it
# contains another one with the block migration status.
#
# Returns: @MigrationInfo
#
# Since: 0.14
#
# Examples:
#
# 1. Before the first migration
#
# -> { "execute": "query-migrate" }
# <- { "return": {} }
#
# 2. Migration is done and has succeeded
#
# -> { "execute": "query-migrate" }
# <- { "return": {
#        "status": "completed",
#        "total-time": 12345,
#        "setup-time": 12345,
#        "downtime": 12345,
#        "ram": {
#          "transferred": 123,
#          "remaining": 123,
#          "total": 246,
#          "duplicate": 123,
#          "normal": 123,
#          "normal-bytes": 123456,
#          "dirty-sync-count": 15
#        }
#      }
#    }
#
# 3. Migration is done and has failed
#
# -> { "execute": "query-migrate" }
# <- { "return": { "status": "failed" } }
#
# 4. Migration is being performed and is not a block migration:
#
# -> { "execute": "query-migrate" }
# <- {
#      "return": {
#        "status": "active",
#        "total-time": 12345,
#        "setup-time": 12345,
#        "expected-downtime": 12345,
#        "ram": {
#          "transferred": 123,
#          "remaining": 123,
#          "total": 246,
#          "duplicate": 123,
#          "normal": 123,
#          "normal-bytes": 123456,
#          "dirty-sync-count": 15
#        }
#      }
#    }
#
# 5. Migration is being performed and is a block migration:
#
# -> { "execute": "query-migrate" }
# <- {
#      "return": {
#        "status": "active",
#        "total-time": 12345,
#        "setup-time": 12345,
#        "expected-downtime": 12345,
#        "ram": {
#          "total": 1057024,
#          "remaining": 1053304,
#          "transferred": 3720,
#          "duplicate": 123,
#          "normal": 123,
#          "normal-bytes": 123456,
#          "dirty-sync-count": 15
#        },
#        "disk": {
#          "total": 20971520,
#          "remaining": 20880384,
#          "transferred": 91136
#        }
#      }
#    }
#
# 6. Migration is being performed and XBZRLE is active:
#
# -> { "execute": "query-migrate" }
# <- {
#      "return": {
#        "status": "active",
#        "total-time": 12345,
#        "setup-time": 12345,
#        "expected-downtime": 12345,
#        "ram": {
#          "total": 1057024,
#          "remaining": 1053304,
#          "transferred": 3720,
#          "duplicate": 10,
#          "normal": 3333,
#          "normal-bytes": 3412992,
#          "dirty-sync-count": 15
#        },
#        "xbzrle-cache": {
#          "cache-size": 67108864,
#          "bytes": 20971520,
#          "pages": 2444343,
#          "cache-miss": 2244,
#          "cache-miss-rate": 0.123,
#          "encoding-rate": 80.1,
#          "overflow": 34434
#        }
#      }
#    }
##
{ 'command': 'query-migrate', 'returns': 'MigrationInfo' }
##
# @MigrationCapability:
#
# Migration capabilities enumeration
#
# @xbzrle: Migration supports xbzrle (Xor Based Zero Run Length
# Encoding). This feature allows us to minimize migration traffic
# for certain workloads by sending the compressed difference of the
# pages.
#
# @rdma-pin-all: Controls whether or not the entire VM memory
# footprint is mlock()'d on demand or all at once. Refer to
# docs/rdma.txt for usage. Disabled by default. (since 2.0)
#
# @zero-blocks: During storage migration encode blocks of zeroes
# efficiently. This essentially saves 1MB of zeroes per block on
# the wire. Enabling requires source and target VM to support
# this feature. To enable it, it is sufficient to enable the
# capability on the source VM. The feature is disabled by default.
# (since 1.6)
#
# @compress: Use multiple compression threads to accelerate live
# migration. This feature can help to reduce the migration
# traffic by sending compressed pages. Please note that if
# compress and xbzrle are both on, compress only takes effect in
# the RAM bulk stage. After that, it is disabled and only xbzrle
# takes effect, which can help to minimize migration traffic. The
# feature is disabled by default. (since 2.4)
#
# @events: generate events for each migration state change (since 2.4)
#
# @auto-converge: If enabled, QEMU will automatically throttle down
# the guest to speed up convergence of RAM migration. (since 1.6)
#
# @postcopy-ram: Start executing on the migration target before all of
# RAM has been migrated, pulling the remaining pages along as
# needed. The capability must have the same setting on both source
# and target or migration will not even start. NOTE: If the
# migration fails during postcopy the VM will fail. (since 2.6)
#
# @x-colo: If enabled, migration will never end, and the state of the
# VM on the primary side will be migrated continuously to the VM
# on the secondary side. This process is called COarse-Grain LOck
# Stepping (COLO) for Non-stop Service. (since 2.8)
#
# @release-ram: If enabled, QEMU will free the migrated RAM pages on
# the source during postcopy-ram migration. (since 2.9)
#
# @block: If enabled, QEMU will also migrate the contents of all block
# devices. Default is disabled. A possible alternative uses
# mirror jobs to a builtin NBD server on the destination, which
# offers more flexibility. (Since 2.10)
#
# @return-path: If enabled, migration will use the return path even
# for precopy. (since 2.10)
#
# @pause-before-switchover: Pause outgoing migration before
# serialising device state and before disabling block IO (since
# 2.11)
#
# @multifd: Use more than one fd for migration (since 4.0)
#
# @dirty-bitmaps: If enabled, QEMU will migrate named dirty bitmaps.
# (since 2.12)
#
# @postcopy-blocktime: Calculate downtime for postcopy live migration
# (since 3.0)
#
# @late-block-activate: If enabled, the destination will not activate
# block devices (and thus take locks) immediately at the end of
# migration. (since 3.0)
#
# @x-ignore-shared: If enabled, QEMU will not migrate shared memory
# that is accessible on the destination machine. (since 4.0)
#
# @validate-uuid: Send the UUID of the source to allow the destination
# to ensure it is the same. (since 4.2)
#
# @background-snapshot: If enabled, the migration stream will be a
# snapshot of the VM exactly at the point when the migration
# procedure starts. The VM RAM is saved while the VM is running.
# (since 6.0)
#
# @zero-copy-send: Controls behavior on sending memory pages on
# migration. When true, enables a zero-copy mechanism for sending
# memory pages, if host supports it. Requires that QEMU be
# permitted to use locked memory for guest RAM pages. (since 7.1)
#
# @postcopy-preempt: If enabled, the migration process will allow
# postcopy requests to preempt precopy stream, so postcopy
# requests will be handled faster. This is a performance feature
# and should not affect the correctness of postcopy migration.
# (since 7.1)
#
# @switchover-ack: If enabled, migration will not stop the source VM
# and complete the migration until an ACK is received from the
# destination that it's OK to do so. Exactly when this ACK is
# sent depends on the migrated devices that use this feature. For
# example, a device can use it to make sure some of its data is
# sent and loaded in the destination before doing switchover.
# This can reduce downtime if devices that support this capability
# are present. 'return-path' capability must be enabled to use
# it. (since 8.1)
#
# @dirty-limit: If enabled, migration will throttle vCPUs as needed to
# keep their dirty page rate within @vcpu-dirty-limit. This can
# improve responsiveness of large guests during live migration,
# and can result in more stable read performance. Requires KVM
# with accelerator property "dirty-ring-size" set. (Since 8.1)
#
# @mapped-ram: Migrate using fixed offsets in the migration file for
# each RAM page. Requires a migration URI that supports seeking,
# such as a file. (since 9.0)
#
# Features:
#
# @deprecated: Member @block is deprecated. Use blockdev-mirror with
# NBD instead. Member @compress is deprecated because it is
# unreliable and untested. It is recommended to use multifd
# migration, which offers an alternative compression
# implementation that is reliable and tested.
#
# @unstable: Members @x-colo and @x-ignore-shared are experimental.
#
# Since: 1.2
##
{ 'enum': 'MigrationCapability',
  'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
           { 'name': 'compress', 'features': [ 'deprecated' ] },
           'events', 'postcopy-ram',
           { 'name': 'x-colo', 'features': [ 'unstable' ] },
           'release-ram',
           { 'name': 'block', 'features': [ 'deprecated' ] },
           'return-path', 'pause-before-switchover', 'multifd',
           'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
           { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
           'validate-uuid', 'background-snapshot',
           'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
           'dirty-limit', 'mapped-ram'] }
##
# @MigrationCapabilityStatus:
#
# Migration capability information
#
# @capability: capability enum
#
# @state: capability state bool
#
# Since: 1.2
##
{ 'struct': 'MigrationCapabilityStatus',
'data': { 'capability': 'MigrationCapability', 'state': 'bool' } }
##
# @migrate-set-capabilities:
#
# Enable or disable the given migration capabilities (such as xbzrle).
#
# @capabilities: JSON array of capability modifications to make
#
# Since: 1.2
#
# Example:
#
# -> { "execute": "migrate-set-capabilities" , "arguments":
# { "capabilities": [ { "capability": "xbzrle", "state": true } ] } }
# <- { "return": {} }
##
{ 'command': 'migrate-set-capabilities',
'data': { 'capabilities': ['MigrationCapabilityStatus'] } }
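#
# A further sketch, based on the note that @switchover-ack requires
# 'return-path': since @capabilities is a list, both capabilities can
# be enabled in a single call. The reply shown assumes success.
#
# -> { "execute": "migrate-set-capabilities", "arguments":
#      { "capabilities": [
#          { "capability": "return-path", "state": true },
#          { "capability": "switchover-ack", "state": true } ] } }
# <- { "return": {} }
#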
##
# @query-migrate-capabilities:
#
# Returns information about the current migration capabilities status
#
# Returns: a list of @MigrationCapabilityStatus
#
# Since: 1.2
#
# Example:
#
# -> { "execute": "query-migrate-capabilities" }
# <- { "return": [
# {"state": false, "capability": "xbzrle"},
# {"state": false, "capability": "rdma-pin-all"},
# {"state": false, "capability": "auto-converge"},
# {"state": false, "capability": "zero-blocks"},
# {"state": false, "capability": "compress"},
# {"state": true, "capability": "events"},
# {"state": false, "capability": "postcopy-ram"},
# {"state": false, "capability": "x-colo"}
# ]}
##
{ 'command': 'query-migrate-capabilities', 'returns': ['MigrationCapabilityStatus']}
##
# @MultiFDCompression:
#
# An enumeration of multifd compression methods.
#
# @none: no compression.
#
# @zlib: use zlib compression method.
#
# @zstd: use zstd compression method.
#
# Since: 5.0
##
{ 'enum': 'MultiFDCompression',
'data': [ 'none', 'zlib',
{ 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] }
##
# @MigMode:
#
# @normal: the original form of migration. (since 8.2)
#
# @cpr-reboot: The migrate command stops the VM and saves state to
# the URI. After quitting QEMU, the user resumes by running
# QEMU -incoming.
#
# This mode allows the user to quit QEMU, optionally update and
# reboot the OS, and restart QEMU. If the user reboots, the URI
# must persist across the reboot, such as by using a file.
#
# Unlike normal mode, the use of certain local storage options
# does not block the migration, but the user must not modify the
# contents of guest block devices between the quit and restart.
#
# This mode supports VFIO devices provided the user first puts
# the guest in the suspended runstate, such as by issuing
# guest-suspend-ram to the QEMU guest agent.
#
# Best performance is achieved when the memory backend is shared
# and the @x-ignore-shared migration capability is set, but this
# is not required. Further, if the user reboots before restarting
# such a configuration, the shared memory must persist across the
# reboot, such as by backing it with a dax device.
#
# @cpr-reboot may not be used with postcopy, background-snapshot,
# or COLO.
#
# (since 8.2)
##
{ 'enum': 'MigMode',
'data': [ 'normal', 'cpr-reboot' ] }
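#
# A minimal QMP sketch of the @cpr-reboot flow described above. It
# assumes the 'mode' migration parameter and the 'uri' argument of the
# 'migrate' and 'migrate-incoming' commands, which are defined
# elsewhere in the schema. The file name vm.state is only a
# placeholder, and the replies shown assume success.
#
# On the source, before quitting QEMU:
#
# -> { "execute": "migrate-set-parameters",
#      "arguments": { "mode": "cpr-reboot" } }
# <- { "return": {} }
# -> { "execute": "migrate", "arguments": { "uri": "file:vm.state" } }
# <- { "return": {} }
#
# After restarting QEMU with -incoming defer:
#
# -> { "execute": "migrate-set-parameters",
#      "arguments": { "mode": "cpr-reboot" } }
# <- { "return": {} }
# -> { "execute": "migrate-incoming",
#      "arguments": { "uri": "file:vm.state" } }
# <- { "return": {} }
#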
##
# @ZeroPageDetection:
#
# @none: Do not perform zero page checking.
#
# @legacy: Perform zero page checking in main migration thread.
#
# @multifd: Perform zero page checking in multifd sender thread if
# multifd migration is enabled, else in the main migration
# thread as for @legacy.
#
# Since: 9.0
#
##
{ 'enum': 'ZeroPageDetection',
'data': [ 'none', 'legacy', 'multifd' ] }
##
# @BitmapMigrationBitmapAliasTransform:
#
# @persistent: If present, the bitmap will be made persistent or
# transient depending on this parameter.
#
# Since: 6.0
##
{ 'struct': 'BitmapMigrationBitmapAliasTransform',
'data': {
'*persistent': 'bool'
} }
##
# @BitmapMigrationBitmapAlias:
#
# @name: The name of the bitmap.
#
# @alias: An alias name for migration (for example the bitmap name on
# the opposite site).
#
# @transform: Allows the modification of the migrated bitmap. (since
# 6.0)
#
# Since: 5.2
##
{ 'struct': 'BitmapMigrationBitmapAlias',
'data': {
'name': 'str',
'alias': 'str',
'*transform': 'BitmapMigrationBitmapAliasTransform'
} }
##
# @BitmapMigrationNodeAlias:
#
# Maps a block node name and the bitmaps it has to aliases for dirty
# bitmap migration.
#
# @node-name: A block node name.
#
# @alias: An alias block node name for migration (for example the node
# name on the opposite site).
#
# @bitmaps: Mappings for the bitmaps on this node.
#
# Since: 5.2
##
{ 'struct': 'BitmapMigrationNodeAlias',
'data': {
'node-name': 'str',
'alias': 'str',
'bitmaps': [ 'BitmapMigrationBitmapAlias' ]
} }
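#
# For illustration, a value of type BitmapMigrationNodeAlias could
# look like the following. All node, bitmap and alias names here are
# hypothetical placeholders, and the optional @transform member may be
# omitted:
#
#   { "node-name": "node0",
#     "alias": "disk0",
#     "bitmaps": [
#       { "name": "bitmap0",
#         "alias": "bmap0",
#         "transform": { "persistent": true } } ] }
#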
##
# @MigrationParameter:
#
# Migration parameters enumeration
#
# @announce-initial: Initial delay (in milliseconds) before sending
# the first announce (Since 4.0)
#
# @announce-max: Maximum delay (in milliseconds) between packets in
# the announcement (Since 4.0)
#
# @announce-rounds: Number of self-announce packets sent after
# migration (Since 4.0)
#
# @announce-step: Increase in delay (in milliseconds) between
# subsequent packets in the announcement (Since 4.0)
#
# @compress-level: Set the compression level to be used in live
# migration. The compression level is an integer between 0 and 9,
# where 0 means no compression, 1 means the best compression
# speed, and 9 means the best compression ratio, which will consume
# more CPU.
#
# @compress-threads: Set compression thread count to be used in live
# migration. The compression thread count is an integer between 1
# and 255.
#
# @compress-wait-thread: Controls behavior when all compression
# threads are currently busy. If true (default), wait for a free
# compression thread to become available; otherwise, send the page
# uncompressed. (Since 3.1)
#
# @decompress-threads: Set decompression thread count to be used in
# live migration. The decompression thread count is an integer
# between 1 and 255. Usually, decompression is at least 4 times as
# fast as compression, so setting decompress-threads to about 1/4
# of compress-threads is adequate.
#
# @throttle-trigger-threshold: The ratio of bytes_dirty_period and
# bytes_xfer_period to trigger throttling. It is expressed as a
# percentage. The default value is 50. (Since 5.0)
#
# @cpu-throttle-initial: Initial percentage of time guest cpus are
# throttled when migration auto-converge is activated. The
# default value is 20. (Since 2.7)
#
# @cpu-throttle-increment: throttle percentage increase each time
# auto-converge detects that migration is not making progress.
# The default value is 10. (Since 2.7)
#
# @cpu-throttle-tailslow: Make CPU throttling slower at the tail
# stage. At the tail stage of throttling, the guest is very
# sensitive to CPU percentage, while the @cpu-throttle-increment
# is usually excessive at the tail stage. If this parameter is
# true, we will compute the ideal CPU percentage used by the guest,
# which may exactly make the dirty rate match the dirty rate
# threshold. Then we will choose the smaller throttle increment
# between the one specified by @cpu-throttle-increment and the one
# generated by the ideal CPU percentage. Therefore, it is
# compatible with traditional throttling, while the throttle
# increment won't be excessive at the tail stage. The default
# value is false. (Since 5.1)
#
# @tls-creds: ID of the 'tls-creds' object that provides credentials
# for establishing a TLS connection over the migration data
# channel. On the outgoing side of the migration, the credentials
# must be for a 'client' endpoint, while for the incoming side the
# credentials must be for a 'server' endpoint. Setting this to a
# non-empty string enables TLS for all migrations. An empty
# string means that QEMU will use plain text mode for migration,
# rather than TLS. (Since 2.7)
#
# @tls-hostname: migration target's hostname for validating the
# server's x509 certificate identity. If empty, QEMU will use the
# hostname from the migration URI, if any. A non-empty value is
# required when using x509 based TLS credentials and the migration
# URI does not include a hostname, such as fd: or exec: based
# migration. (Since 2.7)
#
# Note: empty value works only since 2.9.
#
# @tls-authz: ID of the 'authz' object subclass that provides access
# control checking of the TLS x509 certificate distinguished name.
# This object is only resolved at time of use, so can be deleted
# and recreated on the fly while the migration server is active.
# If missing, it will default to denying access. (Since 4.0)
#
# @max-bandwidth: maximum speed for migration, in bytes per second.
# (Since 2.8)
#
# @avail-switchover-bandwidth: the bandwidth assumed to be available
# to migration during the switchover phase. This does not limit the
# bandwidth used during switchover; it is only used in the
# calculations that decide when to switch over. The default is
# zero, which means QEMU estimates the bandwidth automatically.
# Set this when the estimate is inaccurate and the user can
# guarantee that the given bandwidth is available when switching
# over. When specified correctly, this can make the switchover
# decision much more accurate. (Since 8.2)
#
# @downtime-limit: maximum tolerated downtime for migration, in
# milliseconds. (Since 2.8)
#
# @x-checkpoint-delay: The delay time (in ms) between two COLO
# checkpoints in periodic mode. (Since 2.8)
#
# @block-incremental: Affects how much storage is migrated when the
# block migration capability is enabled. When false, the entire
# storage backing chain is migrated into a flattened image at the
# destination; when true, only the active qcow2 layer is migrated
# and the destination must already have access to the same backing
# chain as was used on the source. (since 2.10)
#
# @multifd-channels: Number of channels used to migrate data in
# parallel. This is the same as the number of sockets used for
# migration. The default value is 2. (Since 4.0)
#
# @xbzrle-cache-size: cache size to be used by XBZRLE migration. It
# needs to be a multiple of the target page size and a power of 2
# (Since 2.11)
#
# @max-postcopy-bandwidth: Background transfer bandwidth during
# postcopy. Defaults to 0 (unlimited). In bytes per second.
# (Since 3.0)
#
# @max-cpu-throttle: maximum cpu throttle percentage. Defaults to 99.
# (Since 3.1)
#
# @multifd-compression: Which compression method to use. Defaults to
# none. (Since 5.0)
#
# @multifd-zlib-level: Compression level to be used in live
# migration. The level is an integer between 0 and 9, where 0
# means no compression, 1 means the best compression speed, and 9
# means the best compression ratio, which will consume more CPU.
# Defaults to 1. (Since 5.0)
#
# @multifd-zstd-level: Compression level to be used in live
# migration. The level is an integer between 0 and 20, where 0
# means no compression, 1 means the best compression speed, and 20
# means the best compression ratio, which will consume more CPU.
# Defaults to 1. (Since 5.0)
#
# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
# aliases for the purpose of dirty bitmap migration. Such aliases
# may for example be the corresponding names on the opposite site.
# The mapping must be one-to-one, but not necessarily complete: On
# the source, unmapped bitmaps and all bitmaps on unmapped nodes
# will be ignored. On the destination, encountering an unmapped
# alias in the incoming migration stream will result in a report,
# and all further bitmap migration data will then be discarded.
# Note that the destination does not know about bitmaps it does
# not receive, so there is no limitation or requirement regarding
# the number of bitmaps received, or how they are named, or on
# which nodes they are placed. By default (when this parameter
# has never been set), bitmap names are mapped to themselves.
# Nodes are mapped to their block device name if there is one, and
# to their node name otherwise. (Since 5.2)
#
# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
# limit during live migration. Should be in the range 1 to 1000ms.
# Defaults to 1000ms. (Since 8.1)
#
# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
# Defaults to 1. (Since 8.1)
#
# @mode: Migration mode. See description in @MigMode. Default is 'normal'.
# (Since 8.2)
#
# @zero-page-detection: Whether and how to detect zero pages.
# See description in @ZeroPageDetection. Default is 'multifd'.
# (since 9.0)
#
# Features:
#
# @deprecated: Member @block-incremental is deprecated. Use
# blockdev-mirror with NBD instead. Members @compress-level,
# @compress-threads, @decompress-threads and @compress-wait-thread
# are deprecated because @compression is deprecated.
#
# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
# are experimental.
#
# Since: 2.4
##
{ 'enum': 'MigrationParameter',
'data': ['announce-initial', 'announce-max',
'announce-rounds', 'announce-step',
{ 'name': 'compress-level', 'features': [ 'deprecated' ] },
{ 'name': 'compress-threads', 'features': [ 'deprecated' ] },
{ 'name': 'decompress-threads', 'features': [ 'deprecated' ] },
{ 'name': 'compress-wait-thread', 'features': [ 'deprecated' ] },
'throttle-trigger-threshold',
'cpu-throttle-initial', 'cpu-throttle-increment',
'cpu-throttle-tailslow',
'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
'avail-switchover-bandwidth', 'downtime-limit',
{ 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
{ 'name': 'block-incremental', 'features': [ 'deprecated' ] },
'multifd-channels',
'xbzrle-cache-size', 'max-postcopy-bandwidth',
'max-cpu-throttle', 'multifd-compression',
'multifd-zlib-level', 'multifd-zstd-level',
'block-bitmap-mapping',
{ 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] },
'vcpu-dirty-limit',
'mode',
'zero-page-detection'] }
##
# @MigrateSetParameters:
#
# @announce-initial: Initial delay (in milliseconds) before sending
# the first announce (Since 4.0)
#
# @announce-max: Maximum delay (in milliseconds) between packets in
# the announcement (Since 4.0)
#
# @announce-rounds: Number of self-announce packets sent after
# migration (Since 4.0)
#
# @announce-step: Increase in delay (in milliseconds) between
# subsequent packets in the announcement (Since 4.0)
#
# @compress-level: Compression level to be used in live migration.
# The level is an integer between 0 and 9, where 0 means no
# compression, 1 means the best compression speed, and 9 means the
# best compression ratio, which will consume more CPU.
#
# @compress-threads: Compression thread count to be used in live
# migration. The count is an integer between 1 and 255.
#
# @compress-wait-thread: Controls behavior when all compression
# threads are currently busy. If true (default), wait for a free
# compression thread to become available; otherwise, send the page
# uncompressed. (Since 3.1)
#
# @decompress-threads: Decompression thread count to be used in
# live migration. The count is an integer between 1 and 255.
# Usually, decompression is at least 4 times as fast as
# compression, so setting decompress-threads to about 1/4 of
# compress-threads is adequate.
#
# @throttle-trigger-threshold: The ratio of bytes_dirty_period to
# bytes_xfer_period that triggers throttling. It is expressed as
# a percentage. The default value is 50. (Since 5.0)
#
# @cpu-throttle-initial: Initial percentage of time guest cpus are
# throttled when migration auto-converge is activated. The
# default value is 20. (Since 2.7)
#
# @cpu-throttle-increment: throttle percentage increase each time
# auto-converge detects that migration is not making progress.
# The default value is 10. (Since 2.7)
#
# @cpu-throttle-tailslow: Make CPU throttling slower at the tail
# stage. At the tail stage of throttling, the guest is very
# sensitive to CPU percentage, while @cpu-throttle-increment is
# usually excessive at that stage. If this parameter is true,
# QEMU computes the ideal CPU percentage used by the guest, which
# may make the dirty rate exactly match the dirty rate threshold.
# It then chooses the smaller throttle increment between the one
# specified by @cpu-throttle-increment and the one derived from
# the ideal CPU percentage. This remains compatible with
# traditional throttling, while the throttle increment will not
# be excessive at the tail stage. The default value is false.
# (Since 5.1)
#
# @tls-creds: ID of the 'tls-creds' object that provides credentials
# for establishing a TLS connection over the migration data
# channel. On the outgoing side of the migration, the credentials
# must be for a 'client' endpoint, while for the incoming side the
# credentials must be for a 'server' endpoint. Setting this to a
# non-empty string enables TLS for all migrations. An empty
# string means that QEMU will use plain text mode for migration,
# rather than TLS. This is the default. (Since 2.7)
#
# @tls-hostname: migration target's hostname for validating the
# server's x509 certificate identity. If empty, QEMU will use the
# hostname from the migration URI, if any. A non-empty value is
# required when using x509 based TLS credentials and the migration
# URI does not include a hostname, such as fd: or exec: based
# migration. (Since 2.7)
#
# Note: empty value works only since 2.9.
#
# @tls-authz: ID of the 'authz' object subclass that provides access
# control checking of the TLS x509 certificate distinguished name.
# This object is only resolved at time of use, so can be deleted
# and recreated on the fly while the migration server is active.
# If missing, it will default to denying access. (Since 4.0)
#
# @max-bandwidth: maximum speed for migration, in bytes per second.
# (Since 2.8)
#
# @avail-switchover-bandwidth: the bandwidth assumed to be available
# to migration during the switchover phase. This does not limit the
# bandwidth used during switchover; it is only used in the
# calculations that decide when to switch over. The default is
# zero, which means QEMU estimates the bandwidth automatically.
# Set this when the estimate is inaccurate and the user can
# guarantee that the given bandwidth is available when switching
# over. When specified correctly, this can make the switchover
# decision much more accurate. (Since 8.2)
#
# @downtime-limit: maximum tolerated downtime for migration, in
# milliseconds. (Since 2.8)
#
# @x-checkpoint-delay: The delay time (in ms) between two COLO
# checkpoints in periodic mode. (Since 2.8)
#
# @block-incremental: Affects how much storage is migrated when the
# block migration capability is enabled. When false, the entire
# storage backing chain is migrated into a flattened image at the
# destination; when true, only the active qcow2 layer is migrated
# and the destination must already have access to the same backing
# chain as was used on the source. (since 2.10)
#
# @multifd-channels: Number of channels used to migrate data in
# parallel. This is the same as the number of sockets used for
# migration. The default value is 2. (Since 4.0)
#
# @xbzrle-cache-size: cache size to be used by XBZRLE migration. It
# needs to be a multiple of the target page size and a power of 2
# (Since 2.11)
#
# @max-postcopy-bandwidth: Background transfer bandwidth during
# postcopy. Defaults to 0 (unlimited). In bytes per second.
# (Since 3.0)
#
# @max-cpu-throttle: maximum cpu throttle percentage. Defaults to 99.
# (Since 3.1)
#
# @multifd-compression: Which compression method to use. Defaults to
# none. (Since 5.0)
#
# @multifd-zlib-level: Compression level to be used in live
# migration. The level is an integer between 0 and 9, where 0
# means no compression, 1 means the best compression speed, and 9
# means the best compression ratio, which will consume more CPU.
# Defaults to 1. (Since 5.0)
#
# @multifd-zstd-level: Compression level to be used in live
# migration. The level is an integer between 0 and 20, where 0
# means no compression, 1 means the best compression speed, and 20
# means the best compression ratio, which will consume more CPU.
# Defaults to 1. (Since 5.0)
#
# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
# aliases for the purpose of dirty bitmap migration. Such aliases
# may for example be the corresponding names on the opposite site.
# The mapping must be one-to-one, but not necessarily complete: On
# the source, unmapped bitmaps and all bitmaps on unmapped nodes
# will be ignored. On the destination, encountering an unmapped
# alias in the incoming migration stream will result in a report,
# and all further bitmap migration data will then be discarded.
# Note that the destination does not know about bitmaps it does
# not receive, so there is no limitation or requirement regarding
# the number of bitmaps received, or how they are named, or on
# which nodes they are placed. By default (when this parameter
# has never been set), bitmap names are mapped to themselves.
# Nodes are mapped to their block device name if there is one, and
# to their node name otherwise. (Since 5.2)
#
# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
# limit during live migration. Should be in the range 1 to 1000ms.
# Defaults to 1000ms. (Since 8.1)
#
# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
# Defaults to 1. (Since 8.1)
#
# @mode: Migration mode. See description in @MigMode. Default is 'normal'.
# (Since 8.2)
#
# @zero-page-detection: Whether and how to detect zero pages.
# See description in @ZeroPageDetection. Default is 'multifd'.
# (since 9.0)
#
# Features:
#
# @deprecated: Member @block-incremental is deprecated. Use
# blockdev-mirror with NBD instead. Members @compress-level,
# @compress-threads, @decompress-threads and @compress-wait-thread
# are deprecated because @compression is deprecated.
#
# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
# are experimental.
#
# TODO: either fuse back into MigrationParameters, or make
# MigrationParameters members mandatory
#
# Since: 2.4
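#
# Example:
#
#     A minimal sketch, assuming this struct is passed as the
#     arguments of the migrate-set-parameters command; the values
#     shown are illustrative assumptions, not recommendations.
#
#     -> { "execute": "migrate-set-parameters",
#          "arguments": { "multifd-channels": 4,
#                         "downtime-limit": 300 } }
#     <- { "return": {} }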
##
{ 'struct': 'MigrateSetParameters',
'data': { '*announce-initial': 'size',
'*announce-max': 'size',
'*announce-rounds': 'size',
'*announce-step': 'size',
'*compress-level': { 'type': 'uint8',
'features': [ 'deprecated' ] },
'*compress-threads': { 'type': 'uint8',
'features': [ 'deprecated' ] },
'*compress-wait-thread': { 'type': 'bool',
'features': [ 'deprecated' ] },
'*decompress-threads': { 'type': 'uint8',
'features': [ 'deprecated' ] },
'*throttle-trigger-threshold': 'uint8',
'*cpu-throttle-initial': 'uint8',
'*cpu-throttle-increment': 'uint8',
'*cpu-throttle-tailslow': 'bool',
'*tls-creds': 'StrOrNull',
'*tls-hostname': 'StrOrNull',
'*tls-authz': 'StrOrNull',
'*max-bandwidth': 'size',
'*avail-switchover-bandwidth': 'size',
'*downtime-limit': 'uint64',
'*x-checkpoint-delay': { 'type': 'uint32',
'features': [ 'unstable' ] },
'*block-incremental': { 'type': 'bool',
'features': [ 'deprecated' ] },
'*multifd-channels': 'uint8',
'*xbzrle-cache-size': 'size',
'*max-postcopy-bandwidth': 'size',
'*max-cpu-throttle': 'uint8',
'*multifd-compression': 'MultiFDCompression',
'*multifd-zlib-level': 'uint8',
'*multifd-zstd-level': 'uint8',
'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
'*x-vcpu-dirty-limit-period': { 'type': 'uint64',
'features': [ 'unstable' ] },
'*vcpu-dirty-limit': 'uint64',
'*mode': 'MigMode',
'*zero-page-detection': 'ZeroPageDetection'} }
##
# @migrate-set-parameters:
#
# Set various migration parameters.
#
# Since: 2.4
#
# Example:
#
# -> { "execute": "migrate-set-parameters" ,
# "arguments": { "multifd-channels": 5 } }
# <- { "return": {} }
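#
# A further sketch, assuming a dedicated 10 Gbps migration link and
# assuming the value is given in bytes per second, as for
# @max-bandwidth (the number 1250000000 is illustrative only):
#
# -> { "execute": "migrate-set-parameters",
# "arguments": { "avail-switchover-bandwidth": 1250000000 } }
# <- { "return": {} }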
##
{ 'command': 'migrate-set-parameters', 'boxed': true,
'data': 'MigrateSetParameters' }
##
# @MigrationParameters:
#
# The optional members aren't actually optional.
#
# @announce-initial: Initial delay (in milliseconds) before sending
# the first announce (Since 4.0)
#
# @announce-max: Maximum delay (in milliseconds) between packets in
# the announcement (Since 4.0)
#
# @announce-rounds: Number of self-announce packets sent after
# migration (Since 4.0)
#
# @announce-step: Increase in delay (in milliseconds) between
# subsequent packets in the announcement (Since 4.0)
#
# @compress-level: compression level
#
# @compress-threads: compression thread count
#
# @compress-wait-thread: Controls behavior when all compression
# threads are currently busy. If true (default), wait for a free
# compression thread to become available; otherwise, send the page
# uncompressed. (Since 3.1)
#
# @decompress-threads: decompression thread count
#
# @throttle-trigger-threshold: The ratio of bytes_dirty_period and
# bytes_xfer_period to trigger throttling. It is expressed as a
# percentage. The default value is 50. (Since 5.0)
#
# @cpu-throttle-initial: Initial percentage of time guest cpus are
# throttled when migration auto-converge is activated. (Since
# 2.7)
#
# @cpu-throttle-increment: throttle percentage increase each time
# auto-converge detects that migration is not making progress.
# (Since 2.7)
#
# @cpu-throttle-tailslow: Make CPU throttling slower at the tail
# stage. At the tail stage of throttling, the guest is very
# sensitive to CPU percentage, while @cpu-throttle-increment is
# usually excessive at that stage. If this parameter is true, QEMU
# computes the ideal CPU percentage used by the guest, which may
# exactly make the dirty rate match the dirty rate threshold. It
# then chooses the smaller throttle increment between the one
# specified by @cpu-throttle-increment and the one derived from
# the ideal CPU percentage. This stays compatible with traditional
# throttling while keeping the throttle increment from being
# excessive at the tail stage. The default value is false. (Since
# 5.1)
#
# @tls-creds: ID of the 'tls-creds' object that provides credentials
# for establishing a TLS connection over the migration data
# channel. On the outgoing side of the migration, the credentials
# must be for a 'client' endpoint, while for the incoming side the
# credentials must be for a 'server' endpoint. An empty string
# means that QEMU will use plain text mode for migration, rather
# than TLS. (Since 2.7)
#
# Note: 2.8 omits empty @tls-creds instead.
#
# @tls-hostname: migration target's hostname for validating the
# server's x509 certificate identity. If empty, QEMU will use the
# hostname from the migration URI, if any. (Since 2.7)
#
# Note: 2.8 omits empty @tls-hostname instead.
#
# @tls-authz: ID of the 'authz' object subclass that provides access
# control checking of the TLS x509 certificate distinguished name.
# (Since 4.0)
#
# @max-bandwidth: maximum speed for migration, in bytes per second.
# (Since 2.8)
#
# @avail-switchover-bandwidth: the available bandwidth that
# migration can use during the switchover phase. NOTE! This does not
# limit the bandwidth during switchover; it is only used in
# calculations when deciding whether to switch over. By default, this
# value is zero, which means QEMU will estimate the bandwidth
# automatically. This can be set when the estimated value is not
# accurate and the user is able to guarantee that such bandwidth is
# available when switching over. When specified correctly, this can
# make the switchover decision much more accurate. (Since 8.2)
#
# @downtime-limit: maximum tolerated downtime for migration, in
# milliseconds (Since 2.8)
#
# @x-checkpoint-delay: the delay time between two COLO checkpoints.
# (Since 2.8)
#
# @block-incremental: Affects how much storage is migrated when the
# block migration capability is enabled. When false, the entire
# storage backing chain is migrated into a flattened image at the
# destination; when true, only the active qcow2 layer is migrated
# and the destination must already have access to the same backing
# chain as was used on the source. (since 2.10)
#
# @multifd-channels: Number of channels used to migrate data in
# parallel. This is the same as the number of sockets used for
# migration. The default value is 2. (since 4.0)
#
# @xbzrle-cache-size: cache size to be used by XBZRLE migration. It
# needs to be a multiple of the target page size and a power of 2
# (Since 2.11)
#
# @max-postcopy-bandwidth: Background transfer bandwidth during
# postcopy. Defaults to 0 (unlimited). In bytes per second.
# (Since 3.0)
#
# @max-cpu-throttle: maximum cpu throttle percentage. Defaults to 99.
# (Since 3.1)
#
# @multifd-compression: Which compression method to use. Defaults to
# none. (Since 5.0)
#
# @multifd-zlib-level: Set the compression level to be used in live
# migration, the compression level is an integer between 0 and 9,
# where 0 means no compression, 1 means the best compression
# speed, and 9 means best compression ratio which will consume
# more CPU. Defaults to 1. (Since 5.0)
#
# @multifd-zstd-level: Set the compression level to be used in live
# migration, the compression level is an integer between 0 and 20,
# where 0 means no compression, 1 means the best compression
# speed, and 20 means best compression ratio which will consume
# more CPU. Defaults to 1. (Since 5.0)
#
# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
# aliases for the purpose of dirty bitmap migration. Such aliases
# may for example be the corresponding names on the opposite site.
# The mapping must be one-to-one, but not necessarily complete: On
# the source, unmapped bitmaps and all bitmaps on unmapped nodes
# will be ignored. On the destination, encountering an unmapped
# alias in the incoming migration stream will result in a report,
# and all further bitmap migration data will then be discarded.
# Note that the destination does not know about bitmaps it does
# not receive, so there is no limitation or requirement regarding
# the number of bitmaps received, or how they are named, or on
# which nodes they are placed. By default (when this parameter
# has never been set), bitmap names are mapped to themselves.
# Nodes are mapped to their block device name if there is one, and
# to their node name otherwise. (Since 5.2)
#
# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
# limit during live migration. Should be in the range 1 to 1000ms.
# Defaults to 1000ms. (Since 8.1)
#
# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
# Defaults to 1. (Since 8.1)
#
# @mode: Migration mode. See description in @MigMode. Default is 'normal'.
# (Since 8.2)
#
# @zero-page-detection: Whether and how to detect zero pages.
# See description in @ZeroPageDetection. Default is 'multifd'.
# (since 9.0)
#
# Features:
#
# @deprecated: Member @block-incremental is deprecated. Use
# blockdev-mirror with NBD instead. Members @compress-level,
# @compress-threads, @decompress-threads and @compress-wait-thread
# are deprecated because @compression is deprecated.
#
# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
# are experimental.
#
# Since: 2.4
##
{ 'struct': 'MigrationParameters',
'data': { '*announce-initial': 'size',
'*announce-max': 'size',
'*announce-rounds': 'size',
'*announce-step': 'size',
'*compress-level': { 'type': 'uint8',
'features': [ 'deprecated' ] },
'*compress-threads': { 'type': 'uint8',
'features': [ 'deprecated' ] },
'*compress-wait-thread': { 'type': 'bool',
'features': [ 'deprecated' ] },
'*decompress-threads': { 'type': 'uint8',
'features': [ 'deprecated' ] },
'*throttle-trigger-threshold': 'uint8',
'*cpu-throttle-initial': 'uint8',
'*cpu-throttle-increment': 'uint8',
'*cpu-throttle-tailslow': 'bool',
'*tls-creds': 'str',
'*tls-hostname': 'str',
'*tls-authz': 'str',
'*max-bandwidth': 'size',
'*avail-switchover-bandwidth': 'size',
'*downtime-limit': 'uint64',
'*x-checkpoint-delay': { 'type': 'uint32',
'features': [ 'unstable' ] },
'*block-incremental': { 'type': 'bool',
'features': [ 'deprecated' ] },
'*multifd-channels': 'uint8',
'*xbzrle-cache-size': 'size',
'*max-postcopy-bandwidth': 'size',
'*max-cpu-throttle': 'uint8',
'*multifd-compression': 'MultiFDCompression',
'*multifd-zlib-level': 'uint8',
'*multifd-zstd-level': 'uint8',
'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
'*x-vcpu-dirty-limit-period': { 'type': 'uint64',
'features': [ 'unstable' ] },
'*vcpu-dirty-limit': 'uint64',
'*mode': 'MigMode',
'*zero-page-detection': 'ZeroPageDetection'} }
##
# @query-migrate-parameters:
#
# Returns information about the current migration parameters
#
# Returns: @MigrationParameters
#
# Since: 2.4
#
# Example:
#
# -> { "execute": "query-migrate-parameters" }
# <- { "return": {
# "multifd-channels": 2,
# "cpu-throttle-increment": 10,
# "cpu-throttle-initial": 20,
# "max-bandwidth": 33554432,
# "downtime-limit": 300
# }
# }
##
{ 'command': 'query-migrate-parameters',
'returns': 'MigrationParameters' }
##
# @migrate-start-postcopy:
#
# Followup to a migration command to switch the migration to postcopy
# mode. The postcopy-ram capability must be set on both source and
# destination before the original migration command.
#
# Since: 2.5
#
# Example:
#
# -> { "execute": "migrate-start-postcopy" }
# <- { "return": {} }
##
{ 'command': 'migrate-start-postcopy' }
##
# @MIGRATION:
#
# Emitted when a migration event happens
#
# @status: @MigrationStatus describing the current migration status.
#
# Since: 2.4
#
# Example:
#
# <- {"timestamp": {"seconds": 1432121972, "microseconds": 744001},
# "event": "MIGRATION",
# "data": {"status": "completed"} }
##
{ 'event': 'MIGRATION',
'data': {'status': 'MigrationStatus'}}
##
# @MIGRATION_PASS:
#
# Emitted from the source side of a migration at the start of each
# pass (when it syncs the dirty bitmap)
#
# @pass: An incrementing count (starting at 1 on the first pass)
#
# Since: 2.6
#
# Example:
#
# <- { "timestamp": {"seconds": 1449669631, "microseconds": 239225},
# "event": "MIGRATION_PASS", "data": {"pass": 2} }
##
{ 'event': 'MIGRATION_PASS',
'data': { 'pass': 'int' } }
##
# @COLOMessage:
#
# The message transmission between Primary side and Secondary side.
#
# @checkpoint-ready: Secondary VM (SVM) is ready for checkpointing
#
# @checkpoint-request: Primary VM (PVM) tells SVM to prepare for
# checkpointing
#
# @checkpoint-reply: SVM gets PVM's checkpoint request
#
# @vmstate-send: VM's state will be sent by PVM.
#
# @vmstate-size: The total size of VMstate.
#
# @vmstate-received: VM's state has been received by SVM.
#
# @vmstate-loaded: VM's state has been loaded by SVM.
#
# Since: 2.8
##
{ 'enum': 'COLOMessage',
'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
'vmstate-send', 'vmstate-size', 'vmstate-received',
'vmstate-loaded' ] }
##
# @COLOMode:
#
# The current COLO mode.
#
# @none: COLO is disabled.
#
# @primary: COLO node in primary side.
#
# @secondary: COLO node in secondary side.
#
# Since: 2.8
##
{ 'enum': 'COLOMode',
'data': [ 'none', 'primary', 'secondary'] }
##
# @FailoverStatus:
#
# An enumeration of COLO failover status
#
# @none: no failover has ever happened
#
# @require: a failover request was received but has not been handled
#
# @active: in the process of doing failover
#
# @completed: the failover process has finished
#
# @relaunch: restart the failover process, from 'none' -> 'completed'
# (Since 2.9)
#
# Since: 2.8
##
{ 'enum': 'FailoverStatus',
'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
##
# @COLO_EXIT:
#
# Emitted when VM finishes COLO mode due to some errors happening or
# at the request of users.
#
# @mode: report COLO mode when COLO exited.
#
# @reason: describes the reason for the COLO exit.
#
# Since: 3.1
#
# Example:
#
# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
# "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
##
{ 'event': 'COLO_EXIT',
'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
##
# @COLOExitReason:
#
# The reason for a COLO exit.
#
# @none: failover has never happened. This state does not occur in
# the COLO_EXIT event, and is only visible in the result of
# query-colo-status.
#
# @request: COLO exit is due to an external request.
#
# @error: COLO exit is due to an internal error.
#
# @processing: COLO is currently handling a failover (since 4.0).
#
# Since: 3.1
##
{ 'enum': 'COLOExitReason',
'data': [ 'none', 'request', 'error' , 'processing' ] }
##
# @x-colo-lost-heartbeat:
#
# Tell QEMU that the heartbeat is lost and request it to perform
# takeover procedures. If this command is sent to the PVM, the
# Primary side will exit COLO mode. If sent to the Secondary, the
# Secondary side will run failover work, then take over server
# operation to become the service VM.
#
# Features:
#
# @unstable: This command is experimental.
#
# Since: 2.8
#
# Example:
#
# -> { "execute": "x-colo-lost-heartbeat" }
# <- { "return": {} }
##
{ 'command': 'x-colo-lost-heartbeat',
'features': [ 'unstable' ],
'if': 'CONFIG_REPLICATION' }
##
# @migrate_cancel:
#
# Cancel the currently executing migration process.
#
# Notes: This command succeeds even if there is no migration process
# running.
#
# Since: 0.14
#
# Example:
#
# -> { "execute": "migrate_cancel" }
# <- { "return": {} }
##
{ 'command': 'migrate_cancel' }
##
# @migrate-continue:
#
# Continue migration when it's in a paused state.
#
# @state: The state the migration is currently expected to be in
#
# Since: 2.11
#
# Example:
#
# -> { "execute": "migrate-continue" , "arguments":
# { "state": "pre-switchover" } }
# <- { "return": {} }
##
{ 'command': 'migrate-continue', 'data': {'state': 'MigrationStatus'} }
##
# @MigrationAddressType:
#
# The migration stream transport mechanisms.
#
# @socket: Migrate via socket.
#
# @exec: Direct the migration stream to another process.
#
# @rdma: Migrate via RDMA.
#
# @file: Direct the migration stream to a file.
#
# Since: 8.2
##
{ 'enum': 'MigrationAddressType',
'data': [ 'socket', 'exec', 'rdma', 'file' ] }
##
# @FileMigrationArgs:
#
# @filename: The file to receive the migration stream
#
# @offset: The file offset where the migration stream will start
#
# Since: 8.2
##
{ 'struct': 'FileMigrationArgs',
'data': { 'filename': 'str',
'offset': 'uint64' } }
##
# @MigrationExecCommand:
#
# @args: command (list head) and arguments to execute.
#
# Since: 8.2
##
{ 'struct': 'MigrationExecCommand',
'data': {'args': [ 'str' ] } }
##
# @MigrationAddress:
#
# Migration endpoint configuration.
#
# @transport: The migration stream transport mechanism
#
# Since: 8.2
##
{ 'union': 'MigrationAddress',
'base': { 'transport' : 'MigrationAddressType'},
'discriminator': 'transport',
'data': {
'socket': 'SocketAddress',
'exec': 'MigrationExecCommand',
'rdma': 'InetSocketAddress',
'file': 'FileMigrationArgs' } }
##
# @MigrationChannelType:
#
# The migration channel-type request options.
#
# @main: Main outbound migration channel.
#
# Since: 8.1
##
{ 'enum': 'MigrationChannelType',
'data': [ 'main' ] }
##
# @MigrationChannel:
#
# Migration stream channel parameters.
#
# @channel-type: Channel type for transferring packet information.
#
# @addr: Migration endpoint configuration on destination interface.
#
# Since: 8.1
##
{ 'struct': 'MigrationChannel',
'data': {
'channel-type': 'MigrationChannelType',
'addr': 'MigrationAddress' } }
##
# @migrate:
#
# Migrates the currently running guest to another Virtual Machine.
#
# @uri: the Uniform Resource Identifier of the destination VM
#
# @channels: list of migration stream channels with each stream in the
# list connected to a destination interface endpoint.
#
# @blk: do block migration (full disk copy)
#
# @inc: incremental disk copy migration
#
# @detach: this argument exists only for compatibility reasons and is
# ignored by QEMU
#
# @resume: resume a paused migration; defaults to false. (since 3.0)
#
# Features:
#
# @deprecated: Members @inc and @blk are deprecated. Use
# blockdev-mirror with NBD instead.
#
# Since: 0.14
#
# Notes:
#
# 1. The 'query-migrate' command should be used to check
# migration's progress and final result (this information is
# provided by the 'status' member)
#
# 2. All boolean arguments default to false
#
# 3. The user Monitor's "detach" argument is invalid in QMP and
# should not be used
#
# 4. The uri argument should have the Uniform Resource Identifier
# of default destination VM. This connection will be bound to
# default network.
#
# 5. For now, number of migration streams is restricted to one,
# i.e. number of items in 'channels' list is just 1.
#
# 6. The 'uri' and 'channels' arguments are mutually exclusive;
# exactly one of the two should be present.
#
# Example:
#
# -> { "execute": "migrate", "arguments": { "uri": "tcp:0:4446" } }
# <- { "return": {} }
#
# -> { "execute": "migrate",
# "arguments": {
# "channels": [ { "channel-type": "main",
# "addr": { "transport": "socket",
# "type": "inet",
# "host": "10.12.34.9",
# "port": "1050" } } ] } }
# <- { "return": {} }
#
# -> { "execute": "migrate",
# "arguments": {
# "channels": [ { "channel-type": "main",
# "addr": { "transport": "exec",
# "args": [ "/bin/nc", "-p", "6000",
# "/some/sock" ] } } ] } }
# <- { "return": {} }
#
# -> { "execute": "migrate",
# "arguments": {
# "channels": [ { "channel-type": "main",
# "addr": { "transport": "rdma",
# "host": "10.12.34.9",
# "port": "1050" } } ] } }
# <- { "return": {} }
#
# -> { "execute": "migrate",
# "arguments": {
# "channels": [ { "channel-type": "main",
# "addr": { "transport": "file",
# "filename": "/tmp/migfile",
# "offset": 4096 } } ] } }
# <- { "return": {} }
#
##
{ 'command': 'migrate',
'data': {'*uri': 'str',
'*channels': [ 'MigrationChannel' ],
'*blk': { 'type': 'bool', 'features': [ 'deprecated' ] },
'*inc': { 'type': 'bool', 'features': [ 'deprecated' ] },
'*detach': 'bool', '*resume': 'bool' } }
##
# @migrate-incoming:
#
# Start an incoming migration. QEMU must have been started with
# -incoming defer.
#
# @uri: The Uniform Resource Identifier identifying the source or
# address to listen on
#
# @channels: list of migration stream channels with each stream in the
# list connected to a destination interface endpoint.
#
# Since: 2.3
#
# Notes:
#
# 1. It's a bad idea to use a string for the uri, but it needs to
# stay compatible with -incoming and the format of the uri is
# already exposed above libvirt.
#
# 2. QEMU must be started with -incoming defer to allow
# migrate-incoming to be used.
#
# 3. The uri format is the same as for -incoming
#
# 4. For now, number of migration streams is restricted to one,
# i.e. number of items in 'channels' list is just 1.
#
# 5. The 'uri' and 'channels' arguments are mutually exclusive;
# exactly one of the two should be present.
#
# Example:
#
# -> { "execute": "migrate-incoming",
# "arguments": { "uri": "tcp:0:4446" } }
# <- { "return": {} }
#
# -> { "execute": "migrate-incoming",
# "arguments": {
# "channels": [ { "channel-type": "main",
# "addr": { "transport": "socket",
# "type": "inet",
# "host": "10.12.34.9",
# "port": "1050" } } ] } }
# <- { "return": {} }
#
# -> { "execute": "migrate-incoming",
# "arguments": {
# "channels": [ { "channel-type": "main",
# "addr": { "transport": "exec",
# "args": [ "/bin/nc", "-p", "6000",
# "/some/sock" ] } } ] } }
# <- { "return": {} }
#
# -> { "execute": "migrate-incoming",
# "arguments": {
# "channels": [ { "channel-type": "main",
# "addr": { "transport": "rdma",
# "host": "10.12.34.9",
# "port": "1050" } } ] } }
# <- { "return": {} }
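#
# A sketch of the 'file' transport on the incoming side, assuming
# it is accepted here as it is for 'migrate'; the filename and
# offset are illustrative only, and @offset is written as a JSON
# number because @FileMigrationArgs declares it as 'uint64':
#
# -> { "execute": "migrate-incoming",
# "arguments": {
# "channels": [ { "channel-type": "main",
# "addr": { "transport": "file",
# "filename": "/tmp/migfile",
# "offset": 4096 } } ] } }
# <- { "return": {} }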
##
{ 'command': 'migrate-incoming',
'data': {'*uri': 'str',
'*channels': [ 'MigrationChannel' ] } }
##
# @xen-save-devices-state:
#
# Save the state of all devices to file. The RAM and the block
# devices of the VM are not saved by this command.
#
# @filename: the file to save the state of the devices to as binary
# data. See xen-save-devices-state.txt for a description of the
# binary format.
#
# @live: Optional argument to ask QEMU to treat this command as part
# of a live migration. Defaults to true. (since 2.11)
#
# Since: 1.1
#
# Example:
#
# -> { "execute": "xen-save-devices-state",
# "arguments": { "filename": "/tmp/save" } }
# <- { "return": {} }
##
{ 'command': 'xen-save-devices-state',
'data': {'filename': 'str', '*live':'bool' } }
##
# @xen-set-global-dirty-log:
#
# Enable or disable the global dirty log mode.
#
# @enable: true to enable, false to disable.
#
# Since: 1.3
#
# Example:
#
# -> { "execute": "xen-set-global-dirty-log",
# "arguments": { "enable": true } }
# <- { "return": {} }
##
{ 'command': 'xen-set-global-dirty-log', 'data': { 'enable': 'bool' } }
##
# @xen-load-devices-state:
#
# Load the state of all devices from file. The RAM and the block
# devices of the VM are not loaded by this command.
#
# @filename: the file to load the state of the devices from as binary
# data. See xen-save-devices-state.txt for a description of the
# binary format.
#
# Since: 2.7
#
# Example:
#
# -> { "execute": "xen-load-devices-state",
# "arguments": { "filename": "/tmp/resume" } }
# <- { "return": {} }
##
{ 'command': 'xen-load-devices-state', 'data': {'filename': 'str'} }
##
# @xen-set-replication:
#
# Enable or disable replication.
#
# @enable: true to enable, false to disable.
#
# @primary: true for primary or false for secondary.
#
# @failover: true to do failover, false to stop. Cannot be specified
# if 'enable' is true. Default value is false.
#
# Example:
#
# -> { "execute": "xen-set-replication",
# "arguments": {"enable": true, "primary": false} }
# <- { "return": {} }
#
# Since: 2.9
##
{ 'command': 'xen-set-replication',
'data': { 'enable': 'bool', 'primary': 'bool', '*failover': 'bool' },
'if': 'CONFIG_REPLICATION' }
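# The rule that @failover cannot be combined with 'enable': true can
# be checked on the client before a request is sent. The sketch below
# is illustrative only: the helper name xen_set_replication is
# hypothetical and not part of QEMU, and it only builds the JSON
# message without talking to a QMP socket.

```python
import json


def xen_set_replication(enable, primary, failover=None):
    """Build a QMP 'xen-set-replication' request as a JSON string.

    Hypothetical client-side helper: it mirrors the documented rule
    that 'failover' must be omitted when 'enable' is true.
    """
    if enable and failover is not None:
        raise ValueError("'failover' cannot be specified when 'enable' is true")
    args = {"enable": enable, "primary": primary}
    if failover is not None:
        # Optional member: only sent when the caller supplied it.
        args["failover"] = failover
    return json.dumps({"execute": "xen-set-replication", "arguments": args})


if __name__ == "__main__":
    print(xen_set_replication(True, False))
    print(xen_set_replication(False, True, failover=True))
```

# Leaving @failover out of the message, rather than sending false,
# keeps the request valid in both the enable and disable cases.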
##
# @ReplicationStatus:
#
# The result format for 'query-xen-replication-status'.
#
# @error: true if an error happened, false if replication is normal.
#
# @desc: the human readable error description string, when @error is
# 'true'.
#
# Since: 2.9
##
{ 'struct': 'ReplicationStatus',
'data': { 'error': 'bool', '*desc': 'str' },
'if': 'CONFIG_REPLICATION' }
##
# @query-xen-replication-status:
#
# Query replication status while the vm is running.
#
# Returns: A @ReplicationStatus object showing the status.
#
# Example:
#
# -> { "execute": "query-xen-replication-status" }
# <- { "return": { "error": false } }
#
# Since: 2.9
##
{ 'command': 'query-xen-replication-status',
'returns': 'ReplicationStatus',
'if': 'CONFIG_REPLICATION' }
##
# @xen-colo-do-checkpoint:
#
# Xen uses this command to notify replication to trigger a checkpoint.
#
# Example:
#
# -> { "execute": "xen-colo-do-checkpoint" }
# <- { "return": {} }
#
# Since: 2.9
##
{ 'command': 'xen-colo-do-checkpoint',
'if': 'CONFIG_REPLICATION' }
##
# @COLOStatus:
#
# The result format for 'query-colo-status'.
#
# @mode: COLO running mode. If COLO is running, this field will
# return 'primary' or 'secondary'.
#
# @last-mode: COLO last running mode. While COLO is running, this
# field returns the same value as @mode; after failover it can be
# used to get the last COLO mode. (since 4.0)
#
# @reason: describes the reason for the COLO exit.
#
# Since: 3.1
##
{ 'struct': 'COLOStatus',
'data': { 'mode': 'COLOMode', 'last-mode': 'COLOMode',
'reason': 'COLOExitReason' },
'if': 'CONFIG_REPLICATION' }
##
# @query-colo-status:
#
# Query COLO status while the vm is running.
#
# Returns: A @COLOStatus object showing the status.
#
# Example:
#
# -> { "execute": "query-colo-status" }
# <- { "return": { "mode": "primary", "last-mode": "none", "reason": "request" } }
#
# Since: 3.1
##
{ 'command': 'query-colo-status',
'returns': 'COLOStatus',
'if': 'CONFIG_REPLICATION' }
##
# @migrate-recover:
#
# Provide a recovery migration stream URI.
#
# @uri: the URI to be used for the recovery of migration stream.
#
# Example:
#
# -> { "execute": "migrate-recover",
# "arguments": { "uri": "tcp:192.168.1.200:12345" } }
# <- { "return": {} }
#
# Since: 3.0
##
{ 'command': 'migrate-recover',
'data': { 'uri': 'str' },
'allow-oob': true }
##
# @migrate-pause:
#
# Pause a migration. Currently it only supports postcopy.
#
# Example:
#
# -> { "execute": "migrate-pause" }
# <- { "return": {} }
#
# Since: 3.0
##
{ 'command': 'migrate-pause', 'allow-oob': true }
##
# @UNPLUG_PRIMARY:
#
# Emitted from the source side of a migration when the migration
# state is WAIT_UNPLUG. The device was unplugged by the guest
# operating system. Device resources in QEMU are kept on standby so
# that the device can be re-plugged in case of migration failure.
#
# @device-id: QEMU device id of the unplugged device
#
# Since: 4.2
#
# Example:
#
# <- { "event": "UNPLUG_PRIMARY",
# "data": { "device-id": "hostdev0" },
# "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
##
{ 'event': 'UNPLUG_PRIMARY',
'data': { 'device-id': 'str' } }
##
# @DirtyRateVcpu:
#
# Dirty rate of vcpu.
#
# @id: vcpu index.
#
# @dirty-rate: dirty rate.
#
# Since: 6.2
##
{ 'struct': 'DirtyRateVcpu',
'data': { 'id': 'int', 'dirty-rate': 'int64' } }
##
# @DirtyRateStatus:
#
# Dirty page rate measurement status.
#
# @unstarted: measuring thread has not been started yet
#
# @measuring: measuring thread is running
#
# @measured: dirty page rate is measured and the results are available
#
# Since: 5.2
##
{ 'enum': 'DirtyRateStatus',
'data': [ 'unstarted', 'measuring', 'measured'] }
##
# @DirtyRateMeasureMode:
#
# Method used to measure dirty page rate. Differences between
# available methods are explained in @calc-dirty-rate.
#
# @page-sampling: use page sampling
#
# @dirty-ring: use dirty ring
#
# @dirty-bitmap: use dirty bitmap
#
# Since: 6.2
##
{ 'enum': 'DirtyRateMeasureMode',
'data': ['page-sampling', 'dirty-ring', 'dirty-bitmap'] }
migration/calc-dirty-rate: millisecond-granularity period This patch allows to measure dirty page rate for sub-second intervals of time. An optional argument is introduced -- calc-time-unit. For example: {"execute": "calc-dirty-rate", "arguments": {"calc-time": 500, "calc-time-unit": "millisecond"} } Millisecond granularity allows to make predictions whether migration will succeed or not. To do this, calculate dirty rate with calc-time set to max allowed downtime (e.g. 300ms), convert measured rate into volume of dirtied memory, and divide by network throughput. If the value is lower than max allowed downtime, then migration will converge. Measurement results for single thread randomly writing to a 1/4/24GiB memory region: +----------------+-----------------------------------------------+ | calc-time | dirty rate MiB/s | | (milliseconds) +----------------+---------------+--------------+ | | theoretical | page-sampling | dirty-bitmap | | | (at 3M wr/sec) | | | +----------------+----------------+---------------+--------------+ | 1GiB | +----------------+----------------+---------------+--------------+ | 100 | 6996 | 7100 | 3192 | | 200 | 4606 | 4660 | 2655 | | 300 | 3305 | 3280 | 2371 | | 400 | 2534 | 2525 | 2154 | | 500 | 2041 | 2044 | 1871 | | 750 | 1365 | 1341 | 1358 | | 1000 | 1024 | 1052 | 1025 | | 1500 | 683 | 678 | 684 | | 2000 | 512 | 507 | 513 | +----------------+----------------+---------------+--------------+ | 4GiB | +----------------+----------------+---------------+--------------+ | 100 | 10232 | 8880 | 4070 | | 200 | 8954 | 8049 | 3195 | | 300 | 7889 | 7193 | 2881 | | 400 | 6996 | 6530 | 2700 | | 500 | 6245 | 5772 | 2312 | | 750 | 4829 | 4586 | 2465 | | 1000 | 3865 | 3780 | 2178 | | 1500 | 2694 | 2633 | 2004 | | 2000 | 2041 | 2031 | 1789 | +----------------+----------------+---------------+--------------+ | 24GiB | +----------------+----------------+---------------+--------------+ | 100 | 11495 | 8640 | 5597 | | 200 | 11226 | 8616 | 3527 | | 300 | 
10965 | 8386 | 2355 | | 400 | 10713 | 8370 | 2179 | | 500 | 10469 | 8196 | 2098 | | 750 | 9890 | 7885 | 2556 | | 1000 | 9354 | 7506 | 2084 | | 1500 | 8397 | 6944 | 2075 | | 2000 | 7574 | 6402 | 2062 | +----------------+----------------+---------------+--------------+ Theoretical values are computed according to the following formula: size * (1 - (1-(4096/size))^(time*wps)) / (time * 2^20), where size is in bytes, time is in seconds, and wps is number of writes per second. Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <d802e6b8053eb60fbec1a784cf86f67d9528e0a8.1693895970.git.gudkov.andrei@huawei.com> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
2023-09-05 10:05:43 +03:00
##
# @TimeUnit:
#
# Specifies the unit in which a time-related value is expressed.
#
# @second: value is in seconds
#
# @millisecond: value is in milliseconds
#
# Since: 8.2
#
##
{ 'enum': 'TimeUnit',
'data': ['second', 'millisecond'] }
##
# @DirtyRateInfo:
#
# Information about measured dirty page rate.
#
# @dirty-rate: an estimate of the dirty page rate of the VM in units
# of MiB/s. Value is present only when @status is 'measured'.
#
# @status: current status of dirty page rate measurements
#
# @start-time: start time of the measurement, in seconds
#
# @calc-time: time period for which dirty page rate was measured,
# expressed and rounded down to @calc-time-unit.
#
# @calc-time-unit: time unit of @calc-time (Since 8.2)
#
# @sample-pages: number of sampled pages per GiB of guest memory.
# Valid only in page-sampling mode (Since 6.1)
#
# @mode: mode that was used to measure dirty page rate (Since 6.2)
#
# @vcpu-dirty-rate: dirty rate for each vCPU if dirty-ring mode was
# specified (Since 6.2)
#
# Since: 5.2
##
{ 'struct': 'DirtyRateInfo',
'data': {'*dirty-rate': 'int64',
'status': 'DirtyRateStatus',
'start-time': 'int64',
'calc-time': 'int64',
'calc-time-unit': 'TimeUnit',
'sample-pages': 'uint64',
'mode': 'DirtyRateMeasureMode',
'*vcpu-dirty-rate': [ 'DirtyRateVcpu' ] } }
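# A client reading a DirtyRateInfo reply has to check @status before
# using @dirty-rate and has to honour @calc-time-unit when converting
# @calc-time. The sketch below shows one way to do that and then
# applies the rough convergence heuristic from the commit message of
# the calc-time-unit change (dirtied volume divided by throughput,
# compared with the allowed downtime). The function names, the sample
# reply and the throughput figure are assumptions made for
# illustration, not QEMU output.

```python
# Number of @calc-time units per second, keyed by TimeUnit value.
UNITS_PER_SECOND = {"second": 1, "millisecond": 1000}


def dirtied_mib(info):
    """Estimate the MiB dirtied during the measured period, or None.

    'info' is a dict shaped like DirtyRateInfo; @dirty-rate is only
    present once @status is 'measured'.
    """
    if info["status"] != "measured" or "dirty-rate" not in info:
        return None
    per_second = UNITS_PER_SECOND[info["calc-time-unit"]]
    # rate [MiB/s] * period [s]
    return info["dirty-rate"] * info["calc-time"] / per_second


def fits_downtime(info, throughput_mib_s, max_downtime_s):
    """Heuristic: can the dirtied volume be sent within the downtime?"""
    volume = dirtied_mib(info)
    if volume is None:
        return None
    return volume / throughput_mib_s < max_downtime_s


if __name__ == "__main__":
    # Hypothetical reply, not captured from a real VM.
    reply = {"status": "measured", "dirty-rate": 2000, "start-time": 100,
             "calc-time": 300, "calc-time-unit": "millisecond",
             "sample-pages": 512, "mode": "page-sampling"}
    print(dirtied_mib(reply))
    print(fits_downtime(reply, throughput_mib_s=4000, max_downtime_s=0.3))
```

# The heuristic is only as good as the measurement: with
# page-sampling the rate is itself an estimate.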
##
# @calc-dirty-rate:
#
# Start measuring dirty page rate of the VM. Results can be retrieved
# with @query-dirty-rate after measurements are completed.
#
# Dirty page rate is the number of pages changed in a given time
# period expressed in MiB/s. The following methods of calculation are
# available:
#
# 1. In page sampling mode, a random subset of pages is selected and
# hashed twice: once at the beginning of measurement time period,
# and once again at the end. If two hashes for some page are
# different, the page is counted as changed. Since this method
# relies on sampling and hashing, calculated dirty page rate is
# only an estimate of its true value. Increasing @sample-pages
# improves estimation quality at the cost of higher computational
# overhead.
#
# 2. Dirty bitmap mode captures writes to memory (for example by
# temporarily revoking write access to all pages) and counts page
# faults. Information about modified pages is collected into a
# bitmap, where each bit corresponds to one guest page. This mode
# requires that KVM accelerator property "dirty-ring-size" is *not*
# set.
#
# 3. Dirty ring mode is similar to dirty bitmap mode, but the
# information about modified pages is collected into a ring buffer.
# This mode tracks page modification for each vCPU separately. It
# requires that KVM accelerator property "dirty-ring-size" is set.
#
# @calc-time: time period for which dirty page rate is calculated.
# By default it is specified in seconds, but the unit can be set
# explicitly with @calc-time-unit. Note that larger @calc-time
# values will typically result in smaller dirty page rates because
# page dirtying is a one-time event. Once some page is counted
# as dirty during @calc-time period, further writes to this page
# will not increase dirty page rate anymore.
#
# @calc-time-unit: time unit in which @calc-time is specified.
# By default it is seconds. (Since 8.2)
#
# @sample-pages: number of sampled pages per GiB of guest memory.
# Default value is 512. For 4KiB guest pages this corresponds to
# a sampling ratio of about 0.2%. This argument is used only in
# page sampling mode. (Since 6.1)
#
# @mode: mechanism for tracking dirty pages. Default value is
# 'page-sampling'. Others are 'dirty-bitmap' and 'dirty-ring'.
# (Since 6.1)
#
# Since: 5.2
#
# Example:
#
# -> {"execute": "calc-dirty-rate", "arguments": {"calc-time": 1,
# "sample-pages": 512} }
# <- { "return": {} }
#
# Measure dirty rate using dirty bitmap for 500 milliseconds:
#
# -> {"execute": "calc-dirty-rate", "arguments": {"calc-time": 500,
# "calc-time-unit": "millisecond", "mode": "dirty-bitmap"} }
#
# <- { "return": {} }
##
{ 'command': 'calc-dirty-rate', 'data': {'calc-time': 'int64',
'*calc-time-unit': 'TimeUnit',
'*sample-pages': 'int',
'*mode': 'DirtyRateMeasureMode'} }
##
# @query-dirty-rate:
#
# Query results of the most recent invocation of @calc-dirty-rate.
#
# @calc-time-unit: time unit in which to report calculation time.
# By default it is reported in seconds. (Since 8.2)
#
# Since: 5.2
#
# Examples:
#
# 1. Measurement is in progress:
#
# <- {"status": "measuring", "sample-pages": 512,
# "mode": "page-sampling", "start-time": 1693900454, "calc-time": 10,
# "calc-time-unit": "second"}
#
# 2. Measurement has been completed:
#
# <- {"status": "measured", "sample-pages": 512, "dirty-rate": 108,
# "mode": "page-sampling", "start-time": 1693900454, "calc-time": 10,
# "calc-time-unit": "second"}
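#
# 3. A sketch of the same completed measurement queried with
# @calc-time-unit set to "millisecond" (values are illustrative and
# assumed, not captured from a real run; the reply is abbreviated
# like the ones above):
#
# -> {"execute": "query-dirty-rate",
# "arguments": {"calc-time-unit": "millisecond"}}
# <- {"status": "measured", "sample-pages": 512, "dirty-rate": 108,
# "mode": "page-sampling", "start-time": 1693900454,
# "calc-time": 10000, "calc-time-unit": "millisecond"}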
##
{ 'command': 'query-dirty-rate', 'data': {'*calc-time-unit': 'TimeUnit' },
'returns': 'DirtyRateInfo' }
##
# @DirtyLimitInfo:
#
# Dirty page rate limit information of a virtual CPU.
#
# @cpu-index: index of a virtual CPU.
#
# @limit-rate: upper limit of dirty page rate (MB/s) for a virtual
# CPU, 0 means unlimited.
#
# @current-rate: current dirty page rate (MB/s) for a virtual CPU.
#
# Since: 7.1
##
{ 'struct': 'DirtyLimitInfo',
'data': { 'cpu-index': 'int',
'limit-rate': 'uint64',
'current-rate': 'uint64' } }
##
# @set-vcpu-dirty-limit:
#
# Set the upper limit of dirty page rate for virtual CPUs.
#
# Requires KVM with accelerator property "dirty-ring-size" set. A
# virtual CPU's dirty page rate is a measure of its memory load. To
# observe dirty page rates, use @calc-dirty-rate.
#
# @cpu-index: index of a virtual CPU, default is all.
#
# @dirty-rate: upper limit of dirty page rate (MB/s) for virtual CPUs.
#
# Since: 7.1
#
# Example:
#
# -> {"execute": "set-vcpu-dirty-limit",
# "arguments": { "dirty-rate": 200,
# "cpu-index": 1 } }
# <- { "return": {} }
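#
# A further sketch, assuming the same limit should apply to every
# virtual CPU: omit the optional @cpu-index (values illustrative):
#
# -> {"execute": "set-vcpu-dirty-limit",
# "arguments": { "dirty-rate": 200 } }
# <- { "return": {} }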
##
{ 'command': 'set-vcpu-dirty-limit',
'data': { '*cpu-index': 'int',
'dirty-rate': 'uint64' } }
##
# @cancel-vcpu-dirty-limit:
#
# Cancel the upper limit of dirty page rate for virtual CPUs.
#
# Cancel the dirty page limit previously set for the vCPU with
# @set-vcpu-dirty-limit. Like @set-vcpu-dirty-limit, this command
# requires dirty ring support.
#
# @cpu-index: index of a virtual CPU, default is all.
#
# Since: 7.1
#
# Example:
#
# -> {"execute": "cancel-vcpu-dirty-limit",
# "arguments": { "cpu-index": 1 } }
# <- { "return": {} }
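#
# A further sketch, assuming the limit should be cancelled for every
# virtual CPU: omit the optional @cpu-index:
#
# -> {"execute": "cancel-vcpu-dirty-limit"}
# <- { "return": {} }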
##
{ 'command': 'cancel-vcpu-dirty-limit',
'data': { '*cpu-index': 'int'} }
##
# @query-vcpu-dirty-limit:
#
# Returns information about virtual CPU dirty page rate limits, if
# any.
#
# Since: 7.1
#
# Example:
#
# -> {"execute": "query-vcpu-dirty-limit"}
# <- {"return": [
# { "limit-rate": 60, "current-rate": 3, "cpu-index": 0},
# { "limit-rate": 60, "current-rate": 3, "cpu-index": 1}]}
##
{ 'command': 'query-vcpu-dirty-limit',
'returns': [ 'DirtyLimitInfo' ] }
##
# @MigrationThreadInfo:
#
# Information about a migration thread.
#
# @name: the name of the migration thread
#
# @thread-id: ID of the underlying host thread
#
# Since: 7.2
##
{ 'struct': 'MigrationThreadInfo',
'data': {'name': 'str',
'thread-id': 'int'} }
##
# @query-migrationthreads:
#
# Returns information about migration threads.
#
# Returns: a list of @MigrationThreadInfo
#
# Since: 7.2
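#
# Example (a sketch; the thread names and IDs shown are assumed for
# illustration and depend on the host and migration setup):
#
# -> {"execute": "query-migrationthreads"}
# <- {"return": [
# { "name": "live_migration", "thread-id": 10345 },
# { "name": "multifdsend_0", "thread-id": 10346 }]}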
##
{ 'command': 'query-migrationthreads',
'returns': ['MigrationThreadInfo'] }
##
# @snapshot-save:
#
# Save a VM snapshot
#
# @job-id: identifier for the newly created job
#
# @tag: name of the snapshot to create
#
# @vmstate: block device node name to save vmstate to
#
# @devices: list of block device node names to save a snapshot to
#
# Applications should not assume that the snapshot save is complete
# when this command returns. The job commands / events must be used
# to determine completion and to fetch details of any errors that
# arise.
#
# Note that execution of the guest CPUs may be stopped during the time
# it takes to save the snapshot. A future version of QEMU may ensure
# CPUs are executing continuously.
#
# It is strongly recommended that @devices contain all writable block
# device nodes if a consistent snapshot is required.
#
# If a snapshot named @tag already exists, an error is reported.
#
# Example:
#
# -> { "execute": "snapshot-save",
# "arguments": {
# "job-id": "snapsave0",
# "tag": "my-snap",
# "vmstate": "disk0",
# "devices": ["disk0", "disk1"]
# }
# }
# <- { "return": { } }
# <- {"event": "JOB_STATUS_CHANGE",
# "timestamp": {"seconds": 1432121972, "microseconds": 744001},
# "data": {"status": "created", "id": "snapsave0"}}
# <- {"event": "JOB_STATUS_CHANGE",
# "timestamp": {"seconds": 1432122172, "microseconds": 744001},
# "data": {"status": "running", "id": "snapsave0"}}
# <- {"event": "STOP",
# "timestamp": {"seconds": 1432122372, "microseconds": 744001} }
# <- {"event": "RESUME",
# "timestamp": {"seconds": 1432122572, "microseconds": 744001} }
# <- {"event": "JOB_STATUS_CHANGE",
# "timestamp": {"seconds": 1432122772, "microseconds": 744001},
# "data": {"status": "waiting", "id": "snapsave0"}}
# <- {"event": "JOB_STATUS_CHANGE",
# "timestamp": {"seconds": 1432122972, "microseconds": 744001},
# "data": {"status": "pending", "id": "snapsave0"}}
# <- {"event": "JOB_STATUS_CHANGE",
# "timestamp": {"seconds": 1432123172, "microseconds": 744001},
# "data": {"status": "concluded", "id": "snapsave0"}}
# -> {"execute": "query-jobs"}
# <- {"return": [{"current-progress": 1,
# "status": "concluded",
# "total-progress": 1,
# "type": "snapshot-save",
# "id": "snapsave0"}]}
#
# Since: 6.0
##
{ 'command': 'snapshot-save',
'data': { 'job-id': 'str',
'tag': 'str',
'vmstate': 'str',
'devices': ['str'] } }
##
# @snapshot-load:
#
# Load a VM snapshot
#
# @job-id: identifier for the newly created job
#
# @tag: name of the snapshot to load.
#
# @vmstate: block device node name to load vmstate from
#
# @devices: list of block device node names to load a snapshot from
#
# Applications should not assume that the snapshot load is complete
# when this command returns. The job commands / events must be used
# to determine completion and to fetch details of any errors that
# arise.
#
# Note that execution of the guest CPUs will be stopped during the
# time it takes to load the snapshot.
#
# It is strongly recommended that @devices contain all writable block
# device nodes that may have changed since the original @snapshot-save
# command was executed.
#
# Example:
#
# -> { "execute": "snapshot-load",
#      "arguments": {
#         "job-id": "snapload0",
#         "tag": "my-snap",
#         "vmstate": "disk0",
#         "devices": ["disk0", "disk1"]
#      }
#    }
# <- { "return": { } }
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1472124172, "microseconds": 744001},
#     "data": {"status": "created", "id": "snapload0"}}
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1472125172, "microseconds": 744001},
#     "data": {"status": "running", "id": "snapload0"}}
# <- {"event": "STOP",
#     "timestamp": {"seconds": 1472125472, "microseconds": 744001} }
# <- {"event": "RESUME",
#     "timestamp": {"seconds": 1472125872, "microseconds": 744001} }
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1472126172, "microseconds": 744001},
#     "data": {"status": "waiting", "id": "snapload0"}}
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1472127172, "microseconds": 744001},
#     "data": {"status": "pending", "id": "snapload0"}}
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1472128172, "microseconds": 744001},
#     "data": {"status": "concluded", "id": "snapload0"}}
# -> {"execute": "query-jobs"}
# <- {"return": [{"current-progress": 1,
#                 "status": "concluded",
#                 "total-progress": 1,
#                 "type": "snapshot-load",
#                 "id": "snapload0"}]}
#
# Since: 6.0
##
{ 'command': 'snapshot-load',
  'data': { 'job-id': 'str',
            'tag': 'str',
            'vmstate': 'str',
            'devices': ['str'] } }
##
# @snapshot-delete:
#
# Delete a VM snapshot
#
# @job-id: identifier for the newly created job
#
# @tag: name of the snapshot to delete.
#
# @devices: list of block device node names to delete a snapshot from
#
# Applications should not assume that the snapshot delete is complete
# when this command returns. The job commands / events must be used
# to determine completion and to fetch details of any errors that
# arise.
#
# Example:
#
# -> { "execute": "snapshot-delete",
#      "arguments": {
#         "job-id": "snapdelete0",
#         "tag": "my-snap",
#         "devices": ["disk0", "disk1"]
#      }
#    }
# <- { "return": { } }
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1442124172, "microseconds": 744001},
#     "data": {"status": "created", "id": "snapdelete0"}}
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1442125172, "microseconds": 744001},
#     "data": {"status": "running", "id": "snapdelete0"}}
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1442126172, "microseconds": 744001},
#     "data": {"status": "waiting", "id": "snapdelete0"}}
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1442127172, "microseconds": 744001},
#     "data": {"status": "pending", "id": "snapdelete0"}}
# <- {"event": "JOB_STATUS_CHANGE",
#     "timestamp": {"seconds": 1442128172, "microseconds": 744001},
#     "data": {"status": "concluded", "id": "snapdelete0"}}
# -> {"execute": "query-jobs"}
# <- {"return": [{"current-progress": 1,
#                 "status": "concluded",
#                 "total-progress": 1,
#                 "type": "snapshot-delete",
#                 "id": "snapdelete0"}]}
#
# Since: 6.0
##
{ 'command': 'snapshot-delete',
  'data': { 'job-id': 'str',
            'tag': 'str',
            'devices': ['str'] } }