Peter Xu 5504a81261 KVM: Dynamic sized kvm memslots array
Zhiyi reported an infinite loop issue in a VFIO use case.  The cause of
that was a separate discussion, however while profiling it I found a
regression: dirty sync had become slow.

Each KVMMemoryListener maintains an array of kvm memslots.  Currently it's
statically allocated to be the max supported by the kernel.  However after
Linux commit 4fc096a99e ("KVM: Raise the maximum number of user memslots"),
the max supported memslots reported now grows to some number large enough
so that it may not be wise to always statically allocate with the max
reported.

What's worse, the QEMU kvm code still walks all the allocated memslot
entries for any form of lookup.  This can drastically slow down all memslot
operations, because each such loop can run over 32K times on the new
kernels.
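
The cost described above can be sketched as follows.  This is only a
minimal illustration, not QEMU's code: the Slot struct, slot_lookup() and
the visited counter are hypothetical names introduced here to show that a
lookup visits every allocated entry when no match is found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified slot record; not QEMU's real KVMSlot. */
typedef struct {
    uint64_t start;  /* guest physical start address */
    uint64_t size;   /* 0 means the entry is unused */
} Slot;

/*
 * Linear lookup over the allocated entries.  *visited reports how many
 * entries were examined, which makes the cost of the walk visible.
 */
static const Slot *slot_lookup(const Slot *slots, size_t nr_allocated,
                               uint64_t start, uint64_t size,
                               size_t *visited)
{
    assert(slots != NULL && visited != NULL);

    *visited = 0;
    for (size_t i = 0; i < nr_allocated; i++) {
        (*visited)++;
        if (slots[i].size != 0 &&
            slots[i].start == start && slots[i].size == size) {
            return &slots[i];
        }
    }
    return NULL;  /* every allocated entry was examined */
}
```

With such a loop, a failed lookup costs one iteration per allocated entry,
so the size of the allocation, not the number of slots in use, determines
the cost.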

Fix this issue by allocating the memslots dynamically.

Here the initial size is set to 16 because it should cover basic VM
usage, so the hope is that the majority of VM use cases will not need to
grow at all (e.g. a VM started with ./qemu-system-x86_64 by default
consumes 9 memslots), while not being so large as to waste memory.
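
A rough sketch of this kind of dynamically sized array is shown below.  It
is not the code from the patch: SlotArray, slot_array_init(),
slot_array_grow() and slot_array_get_free() are hypothetical names, and
the doubling growth policy is an assumption made for illustration.  Only
the initial size of 16 and the cap at the kernel-reported maximum come
from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SLOTS_DEFAULT 16  /* initial size named in the commit message */

/* Hypothetical, simplified slot record; not QEMU's real KVMSlot. */
typedef struct {
    unsigned long memory_size;  /* 0 means the slot is unused */
} Slot;

typedef struct {
    Slot *slots;
    unsigned int nr_allocated;  /* entries currently allocated */
    unsigned int nr_max;        /* limit reported by the kernel */
} SlotArray;

/* Allocate the initial array, never exceeding the reported maximum. */
static bool slot_array_init(SlotArray *a, unsigned int nr_max)
{
    unsigned int n = nr_max < SLOTS_DEFAULT ? nr_max : SLOTS_DEFAULT;

    assert(a != NULL);
    a->slots = calloc(n, sizeof(Slot));
    if (!a->slots) {
        return false;
    }
    a->nr_allocated = n;
    a->nr_max = nr_max;
    return true;
}

/* Double the array (assumed policy), clamped to nr_max. */
static bool slot_array_grow(SlotArray *a)
{
    unsigned int new_n;
    Slot *p;

    if (a->nr_allocated >= a->nr_max) {
        return false;  /* already at the kernel limit */
    }
    new_n = a->nr_allocated > a->nr_max / 2 ? a->nr_max
                                            : a->nr_allocated * 2;
    p = realloc(a->slots, new_n * sizeof(Slot));
    if (!p) {
        return false;
    }
    /* Mark the newly added tail as unused. */
    memset(p + a->nr_allocated, 0,
           (new_n - a->nr_allocated) * sizeof(Slot));
    a->slots = p;
    a->nr_allocated = new_n;
    return true;
}

/* Return an unused slot, growing the array only when all are in use. */
static Slot *slot_array_get_free(SlotArray *a)
{
    unsigned int old = a->nr_allocated;

    for (unsigned int i = 0; i < old; i++) {
        if (a->slots[i].memory_size == 0) {
            return &a->slots[i];
        }
    }
    return slot_array_grow(a) ? &a->slots[old] : NULL;
}
```

In this sketch a VM that needs fewer than 16 slots never reallocates, and
every walk is bounded by nr_allocated rather than by nr_max.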

There may be even better ways to address this, but so far this is the
simplest and should already be better than things were before the max
supported memslots grew.  For example, in the above issue where VFIO was
attached on a 32GB system, only ~10 memslots are used.  So it could be
good enough for now.

In the above VFIO context, measurements show that the precopy dirty sync
shrank from ~86ms to ~3ms after this patch is applied.  The improvement
should also apply to any KVM enabled VM even without VFIO.

NOTE: we don't have a Fixes tag for this patch because there's no real
commit that regressed this in QEMU. Such behavior has existed for a long
time, but only started to be a problem when the kernel began reporting a
very large nr_slots_max value.  However that's pretty common now (the
kernel change was merged in 2021), so we attached cc:stable because we'll
want this change to be backported to stable branches.

Cc: qemu-stable <qemu-stable@nongnu.org>
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Tested-by: Zhiyi Guo <zhguo@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240917163835.194664-2-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:30 +02:00

===========
QEMU README
===========

QEMU is a generic and open source machine & userspace emulator and
virtualizer.

QEMU is capable of emulating a complete machine in software without any
need for hardware virtualization support. By using dynamic translation,
it achieves very good performance. QEMU can also integrate with the Xen
and KVM hypervisors to provide emulated hardware while allowing the
hypervisor to manage the CPU. With hypervisor support, QEMU can achieve
near native performance for CPUs. When QEMU emulates CPUs directly it is
capable of running operating systems made for one machine (e.g. an ARMv7
board) on a different machine (e.g. an x86_64 PC board).

QEMU is also capable of providing userspace API virtualization for Linux
and BSD kernel interfaces. This allows binaries compiled against one
architecture ABI (e.g. the Linux PPC64 ABI) to be run on a host using a
different architecture ABI (e.g. the Linux x86_64 ABI). This does not
involve any hardware emulation, simply CPU and syscall emulation.

QEMU aims to fit into a variety of use cases. It can be invoked directly
by users wishing to have full control over its behaviour and settings.
It also aims to facilitate integration into higher level management
layers, by providing a stable command line interface and monitor API.
It is commonly invoked indirectly via the libvirt library when using
open source applications such as oVirt, OpenStack and virt-manager.

QEMU as a whole is released under the GNU General Public License,
version 2. For full licensing details, consult the LICENSE file.


Documentation
=============

Documentation can be found hosted online at
`<https://www.qemu.org/documentation/>`_. The documentation for the
current development version that is available at
`<https://www.qemu.org/docs/master/>`_ is generated from the ``docs/``
folder in the source tree, and is built by `Sphinx
<https://www.sphinx-doc.org/en/master/>`_.


Building
========

QEMU is multi-platform software intended to be buildable on all modern
Linux platforms, macOS, Win32 (via the Mingw64 toolchain) and a variety
of other UNIX targets. The simple steps to build QEMU are:


.. code-block:: shell

  mkdir build
  cd build
  ../configure
  make

Additional information can also be found online via the QEMU website:

* `<https://wiki.qemu.org/Hosts/Linux>`_
* `<https://wiki.qemu.org/Hosts/Mac>`_
* `<https://wiki.qemu.org/Hosts/W32>`_


Submitting patches
==================

The QEMU source code is maintained under the Git version control system.

.. code-block:: shell

   git clone https://gitlab.com/qemu-project/qemu.git

When submitting patches, one common approach is to use 'git
format-patch' and/or 'git send-email' to format & send the mail to the
qemu-devel@nongnu.org mailing list. All patches submitted must contain
a 'Signed-off-by' line from the author. Patches should follow the
guidelines set out in the `style section
<https://www.qemu.org/docs/master/devel/style.html>`_ of
the Developers Guide.

Additional information on submitting patches can be found online via
the QEMU website:

* `<https://wiki.qemu.org/Contribute/SubmitAPatch>`_
* `<https://wiki.qemu.org/Contribute/TrivialPatches>`_

The QEMU website is also maintained under source control.

.. code-block:: shell

  git clone https://gitlab.com/qemu-project/qemu-web.git

* `<https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/>`_

A 'git-publish' utility was created to make the above process less
cumbersome, and is highly recommended for making regular contributions,
or even just for sending consecutive patch series revisions. It also
requires a working 'git send-email' setup, and by default doesn't
automate everything, so you may want to go through the above steps
manually once.

For installation instructions, please go to:

*  `<https://github.com/stefanha/git-publish>`_

The workflow with 'git-publish' is:

.. code-block:: shell

  $ git checkout master -b my-feature
  $ # work on new commits, add your 'Signed-off-by' lines to each
  $ git publish

Your patch series will be sent and tagged as my-feature-v1, in case you need
to refer back to it in the future.

Sending v2:

.. code-block:: shell

  $ git checkout my-feature # same topic branch
  $ # making changes to the commits (using 'git rebase', for example)
  $ git publish

Your patch series will be sent with a 'v2' tag in the subject and the git tip
will be tagged as my-feature-v2.

Bug reporting
=============

The QEMU project uses GitLab issues to track bugs. Bugs
found when running code built from QEMU git or upstream released sources
should be reported via:

* `<https://gitlab.com/qemu-project/qemu/-/issues>`_

If using QEMU via an operating system vendor's pre-built binary package, it
is preferable to report bugs to the vendor's own bug tracker first. If
the bug is also known to affect the latest upstream code, it can also be
reported via GitLab.

For additional information on bug reporting consult:

* `<https://wiki.qemu.org/Contribute/ReportABug>`_


ChangeLog
=========

For version history and release notes, please visit
`<https://wiki.qemu.org/ChangeLog/>`_ or look at the git history for
more detailed information.


Contact
=======

The QEMU community can be contacted in a number of ways, with the two
main methods being email and IRC:

* `<mailto:qemu-devel@nongnu.org>`_
* `<https://lists.nongnu.org/mailman/listinfo/qemu-devel>`_
* #qemu on irc.oftc.net

Information on additional methods of contacting the community can be
found online via the QEMU website:

* `<https://wiki.qemu.org/Contribute/StartHere>`_