tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
#
|
|
|
|
# Migration test main engine
|
|
|
|
#
|
|
|
|
# Copyright (c) 2016 Red Hat, Inc.
|
|
|
|
#
|
|
|
|
# This library is free software; you can redistribute it and/or
|
|
|
|
# modify it under the terms of the GNU Lesser General Public
|
|
|
|
# License as published by the Free Software Foundation; either
|
2020-11-10 21:42:21 +03:00
|
|
|
# version 2.1 of the License, or (at your option) any later version.
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
#
|
|
|
|
# This library is distributed in the hope that it will be useful,
|
|
|
|
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
|
|
|
# Lesser General Public License for more details.
|
|
|
|
#
|
|
|
|
# You should have received a copy of the GNU Lesser General Public
|
|
|
|
# License along with this library; if not, see <http://www.gnu.org/licenses/>.
|
|
|
|
#
|
|
|
|
|
|
|
|
|
|
|
|
import os
|
|
|
|
import re
|
|
|
|
import sys
|
|
|
|
import time
|
|
|
|
|
|
|
|
from guestperf.progress import Progress, ProgressStats
|
|
|
|
from guestperf.report import Report
|
|
|
|
from guestperf.timings import TimingRecord, Timings
|
|
|
|
|
2019-02-06 19:29:01 +03:00
|
|
|
sys.path.append(os.path.join(os.path.dirname(__file__),
|
|
|
|
'..', '..', '..', 'python'))
|
2019-06-28 00:28:14 +03:00
|
|
|
from qemu.machine import QEMUMachine
|
2019-02-06 19:29:01 +03:00
|
|
|
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
|
|
|
|
class Engine(object):
|
|
|
|
|
|
|
|
def __init__(self, binary, dst_host, kernel, initrd, transport="tcp",
|
|
|
|
sleep=15, verbose=False, debug=False):
|
|
|
|
|
|
|
|
self._binary = binary # Path to QEMU binary
|
|
|
|
self._dst_host = dst_host # Hostname of target host
|
|
|
|
self._kernel = kernel # Path to kernel image
|
|
|
|
self._initrd = initrd # Path to stress initrd
|
|
|
|
self._transport = transport # 'unix' or 'tcp' or 'rdma'
|
|
|
|
self._sleep = sleep
|
|
|
|
self._verbose = verbose
|
|
|
|
self._debug = debug
|
|
|
|
|
|
|
|
if debug:
|
|
|
|
self._verbose = debug
|
|
|
|
|
|
|
|
def _vcpu_timing(self, pid, tid_list):
|
|
|
|
records = []
|
|
|
|
now = time.time()
|
|
|
|
|
|
|
|
jiffies_per_sec = os.sysconf(os.sysconf_names['SC_CLK_TCK'])
|
|
|
|
for tid in tid_list:
|
|
|
|
statfile = "/proc/%d/task/%d/stat" % (pid, tid)
|
|
|
|
with open(statfile, "r") as fh:
|
|
|
|
stat = fh.readline()
|
|
|
|
fields = stat.split(" ")
|
|
|
|
stime = int(fields[13])
|
|
|
|
utime = int(fields[14])
|
|
|
|
records.append(TimingRecord(tid, now, 1000 * (stime + utime) / jiffies_per_sec))
|
|
|
|
return records
|
|
|
|
|
|
|
|
def _cpu_timing(self, pid):
|
|
|
|
records = []
|
|
|
|
now = time.time()
|
|
|
|
|
|
|
|
jiffies_per_sec = os.sysconf(os.sysconf_names['SC_CLK_TCK'])
|
|
|
|
statfile = "/proc/%d/stat" % pid
|
|
|
|
with open(statfile, "r") as fh:
|
|
|
|
stat = fh.readline()
|
|
|
|
fields = stat.split(" ")
|
|
|
|
stime = int(fields[13])
|
|
|
|
utime = int(fields[14])
|
|
|
|
return TimingRecord(pid, now, 1000 * (stime + utime) / jiffies_per_sec)
|
|
|
|
|
|
|
|
def _migrate_progress(self, vm):
|
|
|
|
info = vm.command("query-migrate")
|
|
|
|
|
|
|
|
if "ram" not in info:
|
|
|
|
info["ram"] = {}
|
|
|
|
|
|
|
|
return Progress(
|
|
|
|
info.get("status", "active"),
|
|
|
|
ProgressStats(
|
|
|
|
info["ram"].get("transferred", 0),
|
|
|
|
info["ram"].get("remaining", 0),
|
|
|
|
info["ram"].get("total", 0),
|
|
|
|
info["ram"].get("duplicate", 0),
|
|
|
|
info["ram"].get("skipped", 0),
|
|
|
|
info["ram"].get("normal", 0),
|
|
|
|
info["ram"].get("normal-bytes", 0),
|
|
|
|
info["ram"].get("dirty-pages-rate", 0),
|
|
|
|
info["ram"].get("mbps", 0),
|
|
|
|
info["ram"].get("dirty-sync-count", 0)
|
|
|
|
),
|
|
|
|
time.time(),
|
|
|
|
info.get("total-time", 0),
|
|
|
|
info.get("downtime", 0),
|
|
|
|
info.get("expected-downtime", 0),
|
|
|
|
info.get("setup-time", 0),
|
|
|
|
info.get("x-cpu-throttle-percentage", 0),
|
|
|
|
)
|
|
|
|
|
|
|
|
def _migrate(self, hardware, scenario, src, dst, connect_uri):
|
|
|
|
src_qemu_time = []
|
|
|
|
src_vcpu_time = []
|
|
|
|
src_pid = src.get_pid()
|
|
|
|
|
|
|
|
vcpus = src.command("query-cpus")
|
|
|
|
src_threads = []
|
|
|
|
for vcpu in vcpus:
|
|
|
|
src_threads.append(vcpu["thread_id"])
|
|
|
|
|
|
|
|
# XXX how to get dst timings on remote host ?
|
|
|
|
|
|
|
|
if self._verbose:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("Sleeping %d seconds for initial guest workload run" % self._sleep)
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
sleep_secs = self._sleep
|
|
|
|
while sleep_secs > 1:
|
|
|
|
src_qemu_time.append(self._cpu_timing(src_pid))
|
|
|
|
src_vcpu_time.extend(self._vcpu_timing(src_pid, src_threads))
|
|
|
|
time.sleep(1)
|
|
|
|
sleep_secs -= 1
|
|
|
|
|
|
|
|
if self._verbose:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("Starting migration")
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
if scenario._auto_converge:
|
|
|
|
resp = src.command("migrate-set-capabilities",
|
|
|
|
capabilities = [
|
|
|
|
{ "capability": "auto-converge",
|
|
|
|
"state": True }
|
|
|
|
])
|
|
|
|
resp = src.command("migrate-set-parameters",
|
|
|
|
x_cpu_throttle_increment=scenario._auto_converge_step)
|
|
|
|
|
|
|
|
if scenario._post_copy:
|
|
|
|
resp = src.command("migrate-set-capabilities",
|
|
|
|
capabilities = [
|
|
|
|
{ "capability": "postcopy-ram",
|
|
|
|
"state": True }
|
|
|
|
])
|
|
|
|
resp = dst.command("migrate-set-capabilities",
|
|
|
|
capabilities = [
|
|
|
|
{ "capability": "postcopy-ram",
|
|
|
|
"state": True }
|
|
|
|
])
|
|
|
|
|
|
|
|
resp = src.command("migrate_set_speed",
|
|
|
|
value=scenario._bandwidth * 1024 * 1024)
|
|
|
|
|
|
|
|
resp = src.command("migrate_set_downtime",
|
|
|
|
value=scenario._downtime / 1024.0)
|
|
|
|
|
|
|
|
if scenario._compression_mt:
|
|
|
|
resp = src.command("migrate-set-capabilities",
|
|
|
|
capabilities = [
|
|
|
|
{ "capability": "compress",
|
|
|
|
"state": True }
|
|
|
|
])
|
|
|
|
resp = src.command("migrate-set-parameters",
|
|
|
|
compress_threads=scenario._compression_mt_threads)
|
|
|
|
resp = dst.command("migrate-set-capabilities",
|
|
|
|
capabilities = [
|
|
|
|
{ "capability": "compress",
|
|
|
|
"state": True }
|
|
|
|
])
|
|
|
|
resp = dst.command("migrate-set-parameters",
|
|
|
|
decompress_threads=scenario._compression_mt_threads)
|
|
|
|
|
|
|
|
if scenario._compression_xbzrle:
|
|
|
|
resp = src.command("migrate-set-capabilities",
|
|
|
|
capabilities = [
|
|
|
|
{ "capability": "xbzrle",
|
|
|
|
"state": True }
|
|
|
|
])
|
|
|
|
resp = dst.command("migrate-set-capabilities",
|
|
|
|
capabilities = [
|
|
|
|
{ "capability": "xbzrle",
|
|
|
|
"state": True }
|
|
|
|
])
|
|
|
|
resp = src.command("migrate-set-cache-size",
|
|
|
|
value=(hardware._mem * 1024 * 1024 * 1024 / 100 *
|
|
|
|
scenario._compression_xbzrle_cache))
|
|
|
|
|
|
|
|
resp = src.command("migrate", uri=connect_uri)
|
|
|
|
|
|
|
|
post_copy = False
|
|
|
|
paused = False
|
|
|
|
|
|
|
|
progress_history = []
|
|
|
|
|
|
|
|
start = time.time()
|
|
|
|
loop = 0
|
|
|
|
while True:
|
|
|
|
loop = loop + 1
|
|
|
|
time.sleep(0.05)
|
|
|
|
|
|
|
|
progress = self._migrate_progress(src)
|
|
|
|
if (loop % 20) == 0:
|
|
|
|
src_qemu_time.append(self._cpu_timing(src_pid))
|
|
|
|
src_vcpu_time.extend(self._vcpu_timing(src_pid, src_threads))
|
|
|
|
|
|
|
|
if (len(progress_history) == 0 or
|
|
|
|
(progress_history[-1]._ram._iterations <
|
|
|
|
progress._ram._iterations)):
|
|
|
|
progress_history.append(progress)
|
|
|
|
|
|
|
|
if progress._status in ("completed", "failed", "cancelled"):
|
|
|
|
if progress._status == "completed" and paused:
|
|
|
|
dst.command("cont")
|
|
|
|
if progress_history[-1] != progress:
|
|
|
|
progress_history.append(progress)
|
|
|
|
|
|
|
|
if progress._status == "completed":
|
|
|
|
if self._verbose:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("Sleeping %d seconds for final guest workload run" % self._sleep)
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
sleep_secs = self._sleep
|
|
|
|
while sleep_secs > 1:
|
|
|
|
time.sleep(1)
|
|
|
|
src_qemu_time.append(self._cpu_timing(src_pid))
|
|
|
|
src_vcpu_time.extend(self._vcpu_timing(src_pid, src_threads))
|
|
|
|
sleep_secs -= 1
|
|
|
|
|
|
|
|
return [progress_history, src_qemu_time, src_vcpu_time]
|
|
|
|
|
|
|
|
if self._verbose and (loop % 20) == 0:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("Iter %d: remain %5dMB of %5dMB (total %5dMB @ %5dMb/sec)" % (
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
progress._ram._iterations,
|
|
|
|
progress._ram._remaining_bytes / (1024 * 1024),
|
|
|
|
progress._ram._total_bytes / (1024 * 1024),
|
|
|
|
progress._ram._transferred_bytes / (1024 * 1024),
|
|
|
|
progress._ram._transfer_rate_mbs,
|
2018-06-08 15:29:43 +03:00
|
|
|
))
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
|
|
|
|
if progress._ram._iterations > scenario._max_iters:
|
|
|
|
if self._verbose:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("No completion after %d iterations over RAM" % scenario._max_iters)
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
src.command("migrate_cancel")
|
|
|
|
continue
|
|
|
|
|
|
|
|
if time.time() > (start + scenario._max_time):
|
|
|
|
if self._verbose:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("No completion after %d seconds" % scenario._max_time)
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
src.command("migrate_cancel")
|
|
|
|
continue
|
|
|
|
|
|
|
|
if (scenario._post_copy and
|
|
|
|
progress._ram._iterations >= scenario._post_copy_iters and
|
|
|
|
not post_copy):
|
|
|
|
if self._verbose:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("Switching to post-copy after %d iterations" % scenario._post_copy_iters)
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
resp = src.command("migrate-start-postcopy")
|
|
|
|
post_copy = True
|
|
|
|
|
|
|
|
if (scenario._pause and
|
|
|
|
progress._ram._iterations >= scenario._pause_iters and
|
|
|
|
not paused):
|
|
|
|
if self._verbose:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("Pausing VM after %d iterations" % scenario._pause_iters)
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
resp = src.command("stop")
|
|
|
|
paused = True
|
|
|
|
|
|
|
|
def _get_common_args(self, hardware, tunnelled=False):
|
|
|
|
args = [
|
|
|
|
"noapic",
|
|
|
|
"edd=off",
|
|
|
|
"printk.time=1",
|
|
|
|
"noreplace-smp",
|
|
|
|
"cgroup_disable=memory",
|
|
|
|
"pci=noearly",
|
|
|
|
"console=ttyS0",
|
|
|
|
]
|
|
|
|
if self._debug:
|
|
|
|
args.append("debug")
|
|
|
|
else:
|
|
|
|
args.append("quiet")
|
|
|
|
|
|
|
|
args.append("ramsize=%s" % hardware._mem)
|
|
|
|
|
|
|
|
cmdline = " ".join(args)
|
|
|
|
if tunnelled:
|
|
|
|
cmdline = "'" + cmdline + "'"
|
|
|
|
|
|
|
|
argv = [
|
2019-09-04 08:27:39 +03:00
|
|
|
"-accel", "kvm",
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
"-cpu", "host",
|
|
|
|
"-kernel", self._kernel,
|
|
|
|
"-initrd", self._initrd,
|
|
|
|
"-append", cmdline,
|
|
|
|
"-chardev", "stdio,id=cdev0",
|
|
|
|
"-device", "isa-serial,chardev=cdev0",
|
|
|
|
"-m", str((hardware._mem * 1024) + 512),
|
|
|
|
"-smp", str(hardware._cpus),
|
|
|
|
]
|
|
|
|
|
|
|
|
if self._debug:
|
|
|
|
argv.extend(["-device", "sga"])
|
|
|
|
|
|
|
|
if hardware._prealloc_pages:
|
|
|
|
argv_source += ["-mem-path", "/dev/shm",
|
|
|
|
"-mem-prealloc"]
|
|
|
|
if hardware._locked_pages:
|
2020-12-10 18:58:07 +03:00
|
|
|
argv_source += ["-overcommit", "mem-lock=on"]
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
if hardware._huge_pages:
|
|
|
|
pass
|
|
|
|
|
|
|
|
return argv
|
|
|
|
|
|
|
|
def _get_src_args(self, hardware):
|
|
|
|
return self._get_common_args(hardware)
|
|
|
|
|
|
|
|
def _get_dst_args(self, hardware, uri):
|
|
|
|
tunnelled = False
|
|
|
|
if self._dst_host != "localhost":
|
|
|
|
tunnelled = True
|
|
|
|
argv = self._get_common_args(hardware, tunnelled)
|
|
|
|
return argv + ["-incoming", uri]
|
|
|
|
|
|
|
|
@staticmethod
|
|
|
|
def _get_common_wrapper(cpu_bind, mem_bind):
|
|
|
|
wrapper = []
|
|
|
|
if len(cpu_bind) > 0 or len(mem_bind) > 0:
|
|
|
|
wrapper.append("numactl")
|
|
|
|
if cpu_bind:
|
|
|
|
wrapper.append("--physcpubind=%s" % ",".join(cpu_bind))
|
|
|
|
if mem_bind:
|
|
|
|
wrapper.append("--membind=%s" % ",".join(mem_bind))
|
|
|
|
|
|
|
|
return wrapper
|
|
|
|
|
|
|
|
def _get_src_wrapper(self, hardware):
|
|
|
|
return self._get_common_wrapper(hardware._src_cpu_bind, hardware._src_mem_bind)
|
|
|
|
|
|
|
|
def _get_dst_wrapper(self, hardware):
|
|
|
|
wrapper = self._get_common_wrapper(hardware._dst_cpu_bind, hardware._dst_mem_bind)
|
|
|
|
if self._dst_host != "localhost":
|
|
|
|
return ["ssh",
|
|
|
|
"-R", "9001:localhost:9001",
|
|
|
|
self._dst_host] + wrapper
|
|
|
|
else:
|
|
|
|
return wrapper
|
|
|
|
|
|
|
|
def _get_timings(self, vm):
|
|
|
|
log = vm.get_log()
|
|
|
|
if not log:
|
|
|
|
return []
|
|
|
|
if self._debug:
|
2018-06-08 15:29:43 +03:00
|
|
|
print(log)
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
|
|
|
|
regex = r"[^\s]+\s\((\d+)\):\sINFO:\s(\d+)ms\scopied\s\d+\sGB\sin\s(\d+)ms"
|
|
|
|
matcher = re.compile(regex)
|
|
|
|
records = []
|
|
|
|
for line in log.split("\n"):
|
|
|
|
match = matcher.match(line)
|
|
|
|
if match:
|
|
|
|
records.append(TimingRecord(int(match.group(1)),
|
|
|
|
int(match.group(2)) / 1000.0,
|
|
|
|
int(match.group(3))))
|
|
|
|
return records
|
|
|
|
|
|
|
|
def run(self, hardware, scenario, result_dir=os.getcwd()):
|
|
|
|
abs_result_dir = os.path.join(result_dir, scenario._name)
|
|
|
|
|
|
|
|
if self._transport == "tcp":
|
|
|
|
uri = "tcp:%s:9000" % self._dst_host
|
|
|
|
elif self._transport == "rdma":
|
|
|
|
uri = "rdma:%s:9000" % self._dst_host
|
|
|
|
elif self._transport == "unix":
|
|
|
|
if self._dst_host != "localhost":
|
|
|
|
raise Exception("Running use unix migration transport for non-local host")
|
|
|
|
uri = "unix:/var/tmp/qemu-migrate-%d.migrate" % os.getpid()
|
|
|
|
try:
|
|
|
|
os.remove(uri[5:])
|
|
|
|
os.remove(monaddr)
|
|
|
|
except:
|
|
|
|
pass
|
|
|
|
|
|
|
|
if self._dst_host != "localhost":
|
|
|
|
dstmonaddr = ("localhost", 9001)
|
|
|
|
else:
|
|
|
|
dstmonaddr = "/var/tmp/qemu-dst-%d-monitor.sock" % os.getpid()
|
|
|
|
srcmonaddr = "/var/tmp/qemu-src-%d-monitor.sock" % os.getpid()
|
|
|
|
|
2019-06-28 00:28:14 +03:00
|
|
|
src = QEMUMachine(self._binary,
|
|
|
|
args=self._get_src_args(hardware),
|
|
|
|
wrapper=self._get_src_wrapper(hardware),
|
|
|
|
name="qemu-src-%d" % os.getpid(),
|
|
|
|
monitor_address=srcmonaddr)
|
|
|
|
|
|
|
|
dst = QEMUMachine(self._binary,
|
|
|
|
args=self._get_dst_args(hardware, uri),
|
|
|
|
wrapper=self._get_dst_wrapper(hardware),
|
|
|
|
name="qemu-dst-%d" % os.getpid(),
|
|
|
|
monitor_address=dstmonaddr)
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
|
|
|
|
try:
|
|
|
|
src.launch()
|
|
|
|
dst.launch()
|
|
|
|
|
|
|
|
ret = self._migrate(hardware, scenario, src, dst, uri)
|
|
|
|
progress_history = ret[0]
|
|
|
|
qemu_timings = ret[1]
|
|
|
|
vcpu_timings = ret[2]
|
|
|
|
if uri[0:5] == "unix:":
|
|
|
|
os.remove(uri[5:])
|
|
|
|
if self._verbose:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("Finished migration")
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
|
|
|
|
src.shutdown()
|
|
|
|
dst.shutdown()
|
|
|
|
|
|
|
|
return Report(hardware, scenario, progress_history,
|
|
|
|
Timings(self._get_timings(src) + self._get_timings(dst)),
|
|
|
|
Timings(qemu_timings),
|
|
|
|
Timings(vcpu_timings),
|
|
|
|
self._binary, self._dst_host, self._kernel,
|
|
|
|
self._initrd, self._transport, self._sleep)
|
|
|
|
except Exception as e:
|
|
|
|
if self._debug:
|
2018-06-08 15:29:43 +03:00
|
|
|
print("Failed: %s" % str(e))
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
try:
|
|
|
|
src.shutdown()
|
|
|
|
except:
|
|
|
|
pass
|
|
|
|
try:
|
|
|
|
dst.shutdown()
|
|
|
|
except:
|
|
|
|
pass
|
|
|
|
|
|
|
|
if self._debug:
|
2018-06-08 15:29:43 +03:00
|
|
|
print(src.get_log())
|
|
|
|
print(dst.get_log())
|
tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.
The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.
The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.
None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.
While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.
All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.
The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.
While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).
To use this, first you must build the initrd image
$ make tests/migration/initrd-stress.img
To run a a one-shot test with all default parameters
$ ./tests/migration/guestperf.py > result.json
This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes
$ ./tests/migration/guestperf.py \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3 \
> result.json
Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.
To make it run across separate hosts:
$ ./tests/migration/guestperf.py \
--dst-host somehostname > result.json
To request that post-copy is enabled, with switchover
after 5 iterations
$ ./tests/migration/guestperf.py \
--post-copy --post-copy-iters 5 > result.json
Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu result.json
To further include host vCPU utilization and overall QEMU
utilization
$ ./tests/migration/guestperf-plot.py --output result.html \
--migration-iters --split-guest-cpu \
--qemu-cpu --vcpu-cpu result.json
NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do
$ pip install --user plotly
Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in
$HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js
The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.
Finally, to run the entire standardized set of comparisons
$ ./tests/migration/guestperf-batch.py \
--dst-host somehost \
--mem 4 --cpus 2 \
--src-mem-bind 0 --src-cpu-bind 0,1 \
--dst-mem-bind 1 --dst-cpu-bind 2,3
--output tcp-somehost-4gb-2cpu
will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-20 16:23:13 +03:00
|
|
|
raise
|
|
|
|
|