David Hildenbrand 910b25766b virtio-mem: Paravirtualized memory hot(un)plug
This is the very basic/initial version of virtio-mem. An introduction to
virtio-mem can be found in the Linux kernel driver [1]. While it can be
used in the current state for hotplug of a smaller amount of memory, it
will heavily benefit from resizeable memory regions in the future.

Each virtio-mem device manages a memory region (provided via a memory
backend). Once the hypervisor requests a size ("requested-size"), the
guest can try to plug/unplug blocks of memory within that region in order
to reach the requested size. Initially, and after a reboot, all memory is
unplugged (except in special cases, such as a reboot during postcopy).

The guest may only try to plug/unplug blocks of memory within the usable
region size. The usable region size is a little bigger than the
requested size, to give the device driver some flexibility. The usable
region size will only grow, except on reboots or when all memory is
requested to get unplugged. The guest can never plug more memory than
requested. Unplugged memory will get zapped/discarded, as with a
balloon device.
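
The two limits above (stay inside the usable region, never exceed the
requested size) can be sketched in Python as follows. This is only an
illustration under simplifying assumptions, not the device implementation:
the function name, its parameters, and the 512 MiB usable region size in
the usage note are hypothetical.

```python
def can_plug(offset, size, plugged_size, requested_size, usable_region_size):
    """Toy check: would a guest plug request be acceptable?

    All arguments are byte counts. The names are hypothetical and only
    mirror the terms used in the description above.
    """
    if offset < 0 or size <= 0:
        return False
    # The request must lie entirely inside the usable region.
    if offset + size > usable_region_size:
        return False
    # The guest can never plug more memory than requested.
    return plugged_size + size <= requested_size
```

For example, with a requested size of 500 MiB and an assumed usable region
of 512 MiB, plugging one 2 MiB block at offset 0 passes while nothing is
plugged yet, but fails once 500 MiB are already plugged.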

The block size is variable; however, it is always chosen such that
THP splits are avoided (e.g., 2MB). The state of each block
(plugged/unplugged) is tracked in a bitmap.
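
A minimal Python sketch of per-block state tracking in a bitmap. It is a
toy model, not the QEMU code: the class and method names are hypothetical,
and rejecting a plug request that overlaps already plugged blocks (or an
unplug request that overlaps unplugged ones) is a modelling assumption.

```python
BLOCK_SIZE = 2 * 1024 * 1024  # 2 MiB, as in the "block-size" example output


class BlockBitmap:
    """Toy model: one bit per block, 1 = plugged, 0 = unplugged."""

    def __init__(self, region_size, block_size=BLOCK_SIZE):
        if region_size % block_size:
            raise ValueError("region size must be a multiple of the block size")
        self.block_size = block_size
        self.nb_blocks = region_size // block_size
        self.bits = 0  # a Python int serves as an arbitrarily long bit field

    def _mask(self, first_block, nb_blocks):
        # Build a mask with nb_blocks consecutive bits set, starting at first_block.
        if first_block < 0 or nb_blocks <= 0:
            raise ValueError("invalid block range")
        if first_block + nb_blocks > self.nb_blocks:
            raise ValueError("block range outside the region")
        return ((1 << nb_blocks) - 1) << first_block

    def plug(self, first_block, nb_blocks):
        mask = self._mask(first_block, nb_blocks)
        if self.bits & mask:
            raise ValueError("some blocks are already plugged")
        self.bits |= mask

    def unplug(self, first_block, nb_blocks):
        mask = self._mask(first_block, nb_blocks)
        if (self.bits & mask) != mask:
            raise ValueError("some blocks are not plugged")
        self.bits &= ~mask

    def plugged_size(self):
        # Number of set bits times the block size.
        return bin(self.bits).count("1") * self.block_size
```

With an 8 GiB region and 2 MiB blocks the bitmap covers 4096 blocks;
plugging the first 250 of them corresponds to 500 MiB of plugged memory.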

As virtio-mem devices (e.g., virtio-mem-pci) will be memory devices, we now
expose "VirtioMEMDeviceInfo" via "query-memory-devices".
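
As a rough illustration, a management script could issue the QMP command
and pick out the virtio-mem entries as sketched below in Python. The
helper names are hypothetical, the transport (a QMP socket) is left out,
and the reply layout (a list of entries with "type" and "data" members) is
an assumption that should be checked against the QAPI schema.

```python
import json


def build_query():
    """Return the QMP command text for listing memory devices."""
    return json.dumps({"execute": "query-memory-devices"})


def virtio_mem_devices(reply_text):
    """Extract the virtio-mem entries from a QMP reply.

    Assumes each returned entry looks like
    {"type": "virtio-mem", "data": {...}}.
    """
    reply = json.loads(reply_text)
    return [
        entry["data"]
        for entry in reply.get("return", [])
        if entry.get("type") == "virtio-mem"
    ]
```

Sending the text produced by build_query() requires an already negotiated
QMP session, which is outside the scope of this sketch.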

--------------------------------------------------------------------------

There are two important follow-up items that are in the works:
1. Resizeable memory regions: Use resizeable allocations/RAM blocks to
   grow/shrink along with the usable region size. This avoids creating
   initially very big VMAs, RAM blocks, and KVM slots.
2. Protection of unplugged memory: Make sure the guest cannot actually
   make use of unplugged memory.

Other follow-up items that are in the works:
1. Exclude unplugged memory during migration (via precopy notifier).
2. Handle remapping of memory.
3. Support for other architectures.

--------------------------------------------------------------------------

Example usage (virtio-mem-pci is introduced in follow-up patches):

Start QEMU with two virtio-mem devices (one per NUMA node):
 $ qemu-system-x86_64 -m 4G,maxmem=20G \
  -smp sockets=2,cores=2 \
  -numa node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3 \
  [...]
  -object memory-backend-ram,id=mem0,size=8G \
  -device virtio-mem-pci,id=vm0,memdev=mem0,node=0,requested-size=0M \
  -object memory-backend-ram,id=mem1,size=8G \
  -device virtio-mem-pci,id=vm1,memdev=mem1,node=1,requested-size=1G

Query the configuration:
 (qemu) info memory-devices
 Memory device [virtio-mem]: "vm0"
   memaddr: 0x140000000
   node: 0
   requested-size: 0
   size: 0
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem0
 Memory device [virtio-mem]: "vm1"
   memaddr: 0x340000000
   node: 1
   requested-size: 1073741824
   size: 1073741824
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem1

Add some memory to node 0:
 (qemu) qom-set vm0 requested-size 500M

Remove some memory from node 1:
 (qemu) qom-set vm1 requested-size 200M

Query the configuration again:
 (qemu) info memory-devices
 Memory device [virtio-mem]: "vm0"
   memaddr: 0x140000000
   node: 0
   requested-size: 524288000
   size: 524288000
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem0
 Memory device [virtio-mem]: "vm1"
   memaddr: 0x340000000
   node: 1
   requested-size: 209715200
   size: 209715200
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem1
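
The byte values in the output above follow from treating the "M" and "G"
suffixes as MiB and GiB. A small Python check of that arithmetic, including
the number of 2 MiB blocks each size corresponds to (the helper name is
hypothetical):

```python
MiB = 1024 * 1024
GiB = 1024 * MiB
BLOCK_SIZE = 2 * MiB  # "block-size: 2097152" in the output above


def blocks_for(size_bytes, block_size=BLOCK_SIZE):
    """Return how many whole blocks make up size_bytes."""
    if size_bytes % block_size:
        raise ValueError("size is not a multiple of the block size")
    return size_bytes // block_size


# Values reported by "info memory-devices" in the example.
assert 500 * MiB == 524288000    # vm0 requested-size after qom-set
assert 200 * MiB == 209715200    # vm1 requested-size after qom-set
assert 1 * GiB == 1073741824     # vm1 initial requested-size
assert 8 * GiB == 8589934592     # max-size of each device

print(blocks_for(500 * MiB))  # blocks plugged on vm0
print(blocks_for(200 * MiB))  # blocks plugged on vm1
```

So reaching 500M on vm0 means 250 plugged blocks, and shrinking vm1 from
1G to 200M means going from 512 plugged blocks down to 100.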

[1] https://lkml.kernel.org/r/20200311171422.10484-1-david@redhat.com

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200626072248.78761-11-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-07-03 07:57:04 -04:00
.github .github: Enable repo-lockdown bot to refuse GitHub pull requests 2020-04-07 16:19:18 +01:00
.gitlab-ci.d gitlab-ci: Move edk2 and opensbi YAML files to .gitlab-ci.d folder 2020-05-28 11:00:39 +02:00
accel accel/kvm: Convert to ram_block_discard_disable() 2020-07-02 05:54:59 -04:00
audio audio/jack: simplify the re-init code path 2020-06-17 14:44:51 +02:00
authz qom: Drop parameter @errp of object_property_add() & friends 2020-05-15 07:07:58 +02:00
backends tpm: Move backend code under the 'backends/' directory 2020-06-19 07:25:55 -04:00
block block/nvme: support nested aio_poll() 2020-06-23 15:46:08 +01:00
bsd-user exec/cpu-all: Use bool for have_guest_base 2020-05-15 15:25:16 +01:00
capstone@22ead3e0bf disas: Add capstone as submodule 2017-10-26 11:56:20 +02:00
chardev chardev/char.c: Use qemu_co_sleep_ns if in coroutine 2020-06-18 21:05:52 +08:00
contrib libvhost-user: advertise vring features 2020-06-12 10:17:06 -04:00
crypto crypto: Remove use of GCRYPT_VERSION macro. 2020-06-15 11:33:51 +01:00
default-configs hw/rx: Add RX GDB simulator 2020-06-22 18:37:12 +02:00
disas target/mips: Add implementation of GINVT instruction 2020-01-29 19:28:52 +01:00
docs numa: forbid '-numa node, mem' for 5.1 and newer machine types 2020-06-26 09:39:39 -04:00
dtc@85e5d83984 Makefile: dtc: update, build the libfdt target 2020-06-16 14:49:05 +01:00
dump various: Remove suspicious '\' character outside of #define in C code 2020-04-29 08:01:51 +02:00
fpu softfloat: return low bits of quotient from floatx80_modrem 2020-06-26 09:39:38 -04:00
fsdev virtfs-proxy-helper: Make the helper_opts[] array const 2020-03-09 15:59:31 +01:00
gdb-xml target/arm: Use correct GDB XML for M-profile cores 2020-05-14 15:03:08 +01:00
hw virtio-mem: Paravirtualized memory hot(un)plug 2020-07-03 07:57:04 -04:00
include virtio-mem: Paravirtualized memory hot(un)plug 2020-07-03 07:57:04 -04:00
io io/task: Move 'qom/object.h' header to source 2020-06-10 12:09:37 -04:00
libdecnumber build: remove CONFIG_LIBDECNUMBER 2017-10-16 18:03:52 +02:00
linux-headers Linux headers: update 2020-06-18 12:13:36 +02:00
linux-user linux-user: detect overflow of MAP_FIXED mmap 2020-06-08 17:04:19 +01:00
migration migration/colo: Use ram_block_discard_disable() 2020-07-02 05:54:59 -04:00
monitor monitor/hmp-cmds: improvements for the 'info migrate' 2020-06-17 17:48:39 +01:00
nbd nbd/server: Avoid long error message assertions CVE-2020-10761 2020-06-10 12:58:59 -05:00
net net: Drop the NetLegacy structure, always use Netdev instead 2020-06-18 21:05:52 +08:00
pc-bios Update OpenBIOS images to 4704d9eb built from submodule. 2020-05-21 21:00:39 +01:00
plugins qemu/qemu-plugin: Make qemu_plugin_hwaddr_is_io() hwaddr argument const 2020-05-15 15:25:16 +01:00
po translations: Add Swedish language 2020-06-15 20:51:10 +02:00
python/qemu python/qemu/qtest: Check before accessing _qtest 2020-05-31 18:25:31 +02:00
qapi virtio-mem: Paravirtualized memory hot(un)plug 2020-07-03 07:57:04 -04:00
qga qga: Fix qmp_guest_suspend_{disk, ram}() error handling 2020-04-29 08:01:52 +02:00
qobject qobject: Eliminate qdict_iter(), use qdict_first(), qdict_next() 2020-04-30 06:51:15 +02:00
qom hmp: Make json format optional for qom-set 2020-06-17 17:48:39 +01:00
replay replay: synchronize on every virtual timer callback 2020-06-26 06:45:30 -04:00
roms Update OpenBIOS images to 4704d9eb built from submodule. 2020-05-21 21:00:39 +01:00
scripts scripts/performance: Add topN_callgrind.py script 2020-06-27 20:07:59 +02:00
scsi error: Use error_reportf_err() where appropriate 2020-05-27 07:45:30 +02:00
slirp@2faae0f778 slirp: update to fix CVE-2020-1983 2020-04-21 18:39:20 +01:00
softmmu blockdev: Deprecate -drive with bogus interface type 2020-06-23 16:07:07 +02:00
storage-daemon qemu-storage-daemon: Add --monitor option 2020-03-06 17:21:28 +01:00
stubs acpi: move aml builder code for floppy device 2020-06-24 17:18:28 -04:00
target target/i386: sev: Use ram_block_discard_disable() 2020-07-02 05:54:59 -04:00
tcg tcg: call qemu_spin_destroy for tb->jmp_lock 2020-06-16 14:49:05 +01:00
tests Revert "tests/migration: Reduce autoconverge initial bandwidth" 2020-07-02 05:54:58 -04:00
tools/virtiofsd virtiofsd: Whitelist fchmod 2020-06-17 17:48:38 +01:00
trace trace/simple: Fix unauthorized enable 2020-06-24 11:21:00 +01:00
ui audio: Let capture_callback handler use const buffer argument 2020-05-26 08:29:39 +02:00
util * Various fixes 2020-06-26 16:55:20 +01:00
.cirrus.yml cirrus.yml: serialise make check 2020-06-16 14:49:05 +01:00
.dir-locals.el Add .dir-locals.el file to configure emacs coding style 2015-10-08 19:46:01 +03:00
.editorconfig editorconfig: add setting for shell scripts 2019-06-12 17:53:22 +01:00
.exrc
.gdbinit .gdbinit: load QEMU sub-commands when gdb starts 2017-06-07 14:38:45 +01:00
.gitignore .gitignore: Ignore storage-daemon files 2020-06-17 14:53:40 +02:00
.gitlab-ci.yml gitlab-ci: Determine the number of jobs dynamically 2020-05-28 11:01:38 +02:00
.gitmodules hw/ppc/prep: Remove the deprecated "prep" machine and the OpenHackware BIOS 2020-02-02 14:07:57 +11:00
.gitpublish Add a git-publish configuration file 2018-03-05 09:03:17 +00:00
.mailmap Trivial branch pull request 20200610 2020-06-11 19:22:52 +01:00
.patchew.yml ci: store Patchew configuration in the tree 2019-06-03 14:03:02 +02:00
.readthedocs.yml .readthedocs.yml: specify some minimum python requirements 2020-02-07 15:15:16 +01:00
.shippable.yml Revert ".shippable: temporaily disable some cross builds" 2020-06-16 14:49:05 +01:00
.travis.yml configure: Let SLOF be initialized by ./scripts/git-submodule.sh 2020-06-15 18:26:47 +02:00
arch_init.c arch_init: Remove unused 'qapi-commands-misc.h' include 2020-06-05 21:23:22 +02:00
balloon.c virtio-balloon: Rip out qemu_balloon_inhibit() 2020-07-02 05:54:59 -04:00
block.c block: Call attention to truncation of long NBD exports 2020-06-10 12:58:59 -05:00
blockdev-nbd.c blockdev-nbd: Boxed argument type for nbd-server-add 2020-03-06 17:21:28 +01:00
blockdev.c blockdev: Deprecate -drive with bogus interface type 2020-06-23 16:07:07 +02:00
blockjob.c block: Add BdrvChildRole to BdrvChild 2020-05-18 19:05:25 +02:00
bootdevice.c Drop more @errp parameters after previous commit 2020-05-15 07:08:14 +02:00
Changelog Use HTTPS for qemu.org and other domains 2017-11-21 13:34:13 +00:00
CODING_STYLE.rst docs: split the CODING_STYLE doc into distinct groups 2019-09-05 14:41:00 +01:00
configure * Various fixes 2020-06-26 16:55:20 +01:00
COPYING
COPYING.LIB COPYING.LIB: Synchronize the LGPL 2.1 with the version from gnu.org 2019-01-30 11:01:22 +01:00
cpus-common.c cpu: convert queued work to a QSIMPLEQ 2020-06-16 14:49:05 +01:00
cpus.c replay: notify the main loop when there are no instructions 2020-06-26 06:45:30 -04:00
device_tree.c device_tree: Constify compat in qemu_fdt_node_path() 2020-04-30 15:35:41 +01:00
disas.c disas: Let disas::read_memory() handler return EIO on error 2020-06-10 12:10:23 -04:00
dma-helpers.c icount: make dma reads deterministic 2020-06-17 14:53:39 +02:00
exec-vary.c exec: Cache TARGET_PAGE_MASK for TARGET_PAGE_BITS_VARY 2019-10-28 10:35:20 +01:00
exec.c exec: Introduce ram_block_discard_(disable|require)() 2020-07-02 05:54:59 -04:00
gdbstub.c gdbstub/linux-user: support debugging over a unix socket 2020-05-06 09:29:26 +01:00
gitdm.config contrib: gitdm: add a mapping for Janus Technologies 2019-03-12 19:31:29 +00:00
hmp-commands-info.hx memory: Make 'info mtree' not display disabled regions by default 2020-06-10 12:10:49 -04:00
hmp-commands.hx hmp: Make json format optional for qom-set 2020-06-17 17:48:39 +01:00
ioport.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
iothread.c qom: Drop parameter @errp of object_property_add() & friends 2020-05-15 07:07:58 +02:00
job-qmp.c job: take each job's lock individually in job_txn_apply 2020-04-07 14:34:47 +02:00
job.c job: take each job's lock individually in job_txn_apply 2020-04-07 14:34:47 +02:00
Kconfig.host configure: simplify vhost condition with Kconfig 2019-12-17 19:32:48 +01:00
LICENSE tcg/LICENSE: Remove out of date claim about TCG subdirectory licensing 2019-11-11 15:11:21 +01:00
MAINTAINERS MAINTAINERS: Add 'Performance Tools and Tests' subsection 2020-06-27 20:15:07 +02:00
Makefile Makefile: Install qemu-[qmp/ga]-ref.* into the directory "interop" 2020-06-26 09:39:37 -04:00
Makefile.objs tpm: Move backend code under the 'backends/' directory 2020-06-19 07:25:55 -04:00
Makefile.target update syscall numbers to linux 5.5 (with scripts) 2020-03-20 16:00:21 +00:00
memory_ldst.inc.c memory: Single byte swap along the I/O path 2019-09-03 08:30:39 -07:00
memory_mapping.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
memory.c memory: Revert "memory: accept mismatching sizes in memory_region_access_valid" 2020-06-26 06:45:30 -04:00
module-common.c all: Clean up includes 2016-02-04 17:41:30 +00:00
os-posix.c os-posix: simplify os_find_datadir 2019-12-17 19:32:47 +01:00
os-win32.c glib: use portable g_setenv() 2019-12-17 09:05:23 +01:00
qdev-monitor.c qdev: Use qdev_realize() in qdev_device_add() 2020-06-15 22:06:04 +02:00
qemu-bridge-helper.c build: rename CONFIG_LIBCAP to CONFIG_LIBCAP_NG 2019-12-17 19:35:47 +01:00
qemu-edid.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
qemu-img-cmds.hx qemu-img: Add convert --bitmaps option 2020-05-28 13:16:30 -05:00
qemu-img.c qemu-img: Add convert --bitmaps option 2020-05-28 13:16:30 -05:00
qemu-io-cmds.c block-backend: Add flags to blk_truncate() 2020-04-30 17:51:07 +02:00
qemu-io.c qemu-io: adds option to use aio engine 2020-01-30 20:59:42 +00:00
qemu-keymap.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
qemu-nbd.c error: Use error_reportf_err() where appropriate 2020-05-27 07:45:30 +02:00
qemu-options-wrapper.h qemu-img: remove references to GEN_DOCS 2018-05-20 08:35:54 +03:00
qemu-options.h Clean up ill-advised or unusual header guards 2016-07-12 16:20:46 +02:00
qemu-options.hx numa: forbid '-numa node, mem' for 5.1 and newer machine types 2020-06-26 09:39:39 -04:00
qemu-seccomp.c seccomp: report more useful errors from seccomp 2019-03-27 13:11:38 +01:00
qemu-storage-daemon.c qemu-storage-daemon: Fix non-string --object properties 2020-04-30 17:51:07 +02:00
qemu.nsi qemu.nsi: Install Sphinx documentation 2020-03-09 16:45:00 +00:00
qemu.sasl Default to GSSAPI (Kerberos) instead of DIGEST-MD5 for SASL 2017-05-09 14:41:47 +01:00
qtest.c qtest: fix fuzzer-related 80-char limit violations 2020-03-06 10:33:26 +00:00
README.rst docs: merge HACKING.rst contents into CODING_STYLE.rst 2019-09-05 14:27:06 +01:00
replication.c replication: Introduce new APIs to do replication operation 2016-09-13 11:00:56 +01:00
replication.h Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
rules.mak build-sys: Move the print-variable rule to rules.mak 2020-03-09 15:59:31 +01:00
thunk.c thunk: improve readability of allocation loop 2019-03-11 18:48:20 +01:00
tpm.c tpm: Clean up error reporting in tpm_init_tpmdev() 2018-10-19 14:51:34 +02:00
trace-events trace: add mmu_index to mem_info 2019-10-28 15:12:38 +00:00
VERSION Open 5.1 development tree 2020-04-29 15:07:10 +01:00
version.rc Use HTTPS for qemu.org and other domains 2017-11-21 13:34:13 +00:00

===========
QEMU README
===========

QEMU is a generic and open source machine & userspace emulator and
virtualizer.

QEMU is capable of emulating a complete machine in software without any
need for hardware virtualization support. By using dynamic translation,
it achieves very good performance. QEMU can also integrate with the Xen
and KVM hypervisors to provide emulated hardware while allowing the
hypervisor to manage the CPU. With hypervisor support, QEMU can achieve
near native performance for CPUs. When QEMU emulates CPUs directly it is
capable of running operating systems made for one machine (e.g. an ARMv7
board) on a different machine (e.g. an x86_64 PC board).

QEMU is also capable of providing userspace API virtualization for Linux
and BSD kernel interfaces. This allows binaries compiled against one
architecture ABI (e.g. the Linux PPC64 ABI) to be run on a host using a
different architecture ABI (e.g. the Linux x86_64 ABI). This does not
involve any hardware emulation, simply CPU and syscall emulation.

QEMU aims to fit into a variety of use cases. It can be invoked directly
by users wishing to have full control over its behaviour and settings.
It also aims to facilitate integration into higher level management
layers, by providing a stable command line interface and monitor API.
It is commonly invoked indirectly via the libvirt library when using
open source applications such as oVirt, OpenStack and virt-manager.

QEMU as a whole is released under the GNU General Public License,
version 2. For full licensing details, consult the LICENSE file.


Building
========

QEMU is multi-platform software intended to be buildable on all modern
Linux platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety
of other UNIX targets. The simple steps to build QEMU are:


.. code-block:: shell

  mkdir build
  cd build
  ../configure
  make

Additional information can also be found online via the QEMU website:

* `<https://qemu.org/Hosts/Linux>`_
* `<https://qemu.org/Hosts/Mac>`_
* `<https://qemu.org/Hosts/W32>`_


Submitting patches
==================

The QEMU source code is maintained under the Git version control system.

.. code-block:: shell

   git clone https://git.qemu.org/git/qemu.git

When submitting patches, one common approach is to use 'git
format-patch' and/or 'git send-email' to format & send the mail to the
qemu-devel@nongnu.org mailing list. All patches submitted must contain
a 'Signed-off-by' line from the author. Patches should follow the
guidelines set out in the CODING_STYLE.rst file.

Additional information on submitting patches can be found online via
the QEMU website:

* `<https://qemu.org/Contribute/SubmitAPatch>`_
* `<https://qemu.org/Contribute/TrivialPatches>`_

The QEMU website is also maintained under source control.

.. code-block:: shell

  git clone https://git.qemu.org/git/qemu-web.git

* `<https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/>`_

A 'git-publish' utility was created to make the above process less
cumbersome, and is highly recommended for making regular contributions,
or even just for sending consecutive patch series revisions. It also
requires a working 'git send-email' setup, and by default doesn't
automate everything, so you may want to go through the above steps
manually at least once.

For installation instructions, please go to:

*  `<https://github.com/stefanha/git-publish>`_

The workflow with 'git-publish' is:

.. code-block:: shell

  $ git checkout master -b my-feature
  $ # work on new commits, add your 'Signed-off-by' lines to each
  $ git publish

Your patch series will be sent and tagged as my-feature-v1 if you need to refer
back to it in the future.

Sending v2:

.. code-block:: shell

  $ git checkout my-feature # same topic branch
  $ # making changes to the commits (using 'git rebase', for example)
  $ git publish

Your patch series will be sent with a 'v2' tag in the subject and the git tip
will be tagged as my-feature-v2.

Bug reporting
=============

The QEMU project uses Launchpad as its primary upstream bug tracker. Bugs
found when running code built from QEMU git or upstream released sources
should be reported via:

* `<https://bugs.launchpad.net/qemu/>`_

If using QEMU via an operating system vendor's pre-built binary package, it
is preferable to report bugs to the vendor's own bug tracker first. If
the bug is also known to affect the latest upstream code, it can also be
reported via Launchpad.

For additional information on bug reporting consult:

* `<https://qemu.org/Contribute/ReportABug>`_


Contact
=======

The QEMU community can be contacted in a number of ways, with the two
main methods being email and IRC:

* `<mailto:qemu-devel@nongnu.org>`_
* `<https://lists.nongnu.org/mailman/listinfo/qemu-devel>`_
* #qemu on irc.oftc.net

Information on additional methods of contacting the community can be
found online via the QEMU website:

* `<https://qemu.org/Contribute/StartHere>`_