If "-audio BACKEND" is used without a model, the resulting backend
will be used whenever the audiodev property is not specified.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow an offset option to be specified as part of the file URI, in
the form "file:filename,offset=offset", where offset accepts the common
size suffixes, or the 0x prefix, but not both. Migration data is written
to and read from the file starting at offset. If unspecified, it defaults
to 0.
This is needed by libvirt to store its own data at the head of the file.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1694182931-61390-3-git-send-email-steven.sistare@oracle.com>
Extend the migration URI to support file:<filename>. This can be used for
any migration scenario that does not require a reverse path. It can be
used as an alternative to 'exec:cat > file' in minimized containers that
do not contain /bin/sh, and it is easier to use than the fd:<fdname> URI.
It can be used in HMP commands, and as a qemu command-line parameter.
For best performance, guest ram should be shared and x-ignore-shared
should be true, so guest pages are not written to the file, in which case
the guest may remain running. If ram is not so configured, then the user
is advised to stop the guest first. Otherwise, a busy guest may re-dirty
the same page, causing it to be appended to the file multiple times,
and the file may grow unboundedly. That issue is being addressed in the
"fixed-ram" patch series.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Tested-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1694182931-61390-2-git-send-email-steven.sistare@oracle.com>
These have been deprecated for a long time, and the introduction of
-audio in 7.1.0 has cemented the new way of specifying an audio backend's
parameters. However, there is still a need for simple configuration
of the audio backend in the desktop case; therefore, if no audiodev is
passed to audio_init(), go through a bunch of simple Audiodev* structures
and pick the first that can be initialized successfully.
The only QEMU_AUDIO_* option that is left in, waiting for a better idea,
is QEMU_AUDIO_DRV=none which is used by qtest.
Remove all the parsing code, including the concept of "can_be_default"
audio drivers: now that audio_prio_list[] is only used in a single place,
wav can be excluded directly in that function.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"Host Memory Backends" and "Memory devices" queue ("mem"):
- Support and document VM templating with R/O files using a new "rom"
parameter for memory-backend-file
- Some cleanups and fixes around NVDIMMs and R/O file handling for guest
RAM
- Optimize ioeventfd updates by skipping address spaces that are not
applicable
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmUJdykRHGRhdmlkQHJl
ZGhhdC5jb20ACgkQTd4Q9wD/g1pf2w//akOUoYMuamySGjXtKLVyMKZkjIys+Ama
k2C0xzsWAHBP572ezwHi8uxf5j9kzAjsw6GxDZ7FAamD9MhiohkEvkecloBx6f/c
q3fVHblBNkG7v2urtf4+6PJtJvhzOST2SFXfWeYhO/vaA04AYCDgexv82JN3gA6B
OS8WyOX62b8wILPSY2GLZ8IqpE9XnOYZwzVBn6YB1yo7ZkYEfXO6cA8nykNuNcOE
vppqDo7uVIX6317FWj8ygxmzFfOaj0WT2MT2XFzEIDfg8BInQN8HC4mTn0hcVKMa
N1y+eZH733CQKT+uNBRZ5YOeljOi4d6gEEyvkkA/L7e5D3Qg9hIdvHb4uryCFSWX
Vt07OP1XLBwCZFobOC6sg+2gtTZJxxYK89e6ZzEd0454S24w5bnEteRAaCGOP0XL
ww9xYULqhtZs55UC4rvZHJwdUAk1fIY4VqynwkeQXegvz6BxedNeEkJiiEU0Tizx
N2VpsxAJ7H/LLSFeZoCRESo4azrH6U4n7S/eS1tkCniFqibfe2yIQCDoJVfb42ec
gfg/vThCrDwHkIHzkMmoV8NndA7Q7SIkyMfYeEEBeZMeg8JzYll4DJEw/jQCacxh
KRUa+AZvGlTJUq0mkvyOVfLki+iaehoIUuY1yvMrmdWijPO8n3YybmP9Ljhr8VdR
9MSYZe+I2v8=
=iraT
-----END PGP SIGNATURE-----
Merge tag 'mem-2023-09-19' of https://github.com/davidhildenbrand/qemu into staging
Hi,
"Host Memory Backends" and "Memory devices" queue ("mem"):
- Support and document VM templating with R/O files using a new "rom"
parameter for memory-backend-file
- Some cleanups and fixes around NVDIMMs and R/O file handling for guest
RAM
- Optimize ioeventfd updates by skipping address spaces that are not
applicable
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmUJdykRHGRhdmlkQHJl
# ZGhhdC5jb20ACgkQTd4Q9wD/g1pf2w//akOUoYMuamySGjXtKLVyMKZkjIys+Ama
# k2C0xzsWAHBP572ezwHi8uxf5j9kzAjsw6GxDZ7FAamD9MhiohkEvkecloBx6f/c
# q3fVHblBNkG7v2urtf4+6PJtJvhzOST2SFXfWeYhO/vaA04AYCDgexv82JN3gA6B
# OS8WyOX62b8wILPSY2GLZ8IqpE9XnOYZwzVBn6YB1yo7ZkYEfXO6cA8nykNuNcOE
# vppqDo7uVIX6317FWj8ygxmzFfOaj0WT2MT2XFzEIDfg8BInQN8HC4mTn0hcVKMa
# N1y+eZH733CQKT+uNBRZ5YOeljOi4d6gEEyvkkA/L7e5D3Qg9hIdvHb4uryCFSWX
# Vt07OP1XLBwCZFobOC6sg+2gtTZJxxYK89e6ZzEd0454S24w5bnEteRAaCGOP0XL
# ww9xYULqhtZs55UC4rvZHJwdUAk1fIY4VqynwkeQXegvz6BxedNeEkJiiEU0Tizx
# N2VpsxAJ7H/LLSFeZoCRESo4azrH6U4n7S/eS1tkCniFqibfe2yIQCDoJVfb42ec
# gfg/vThCrDwHkIHzkMmoV8NndA7Q7SIkyMfYeEEBeZMeg8JzYll4DJEw/jQCacxh
# KRUa+AZvGlTJUq0mkvyOVfLki+iaehoIUuY1yvMrmdWijPO8n3YybmP9Ljhr8VdR
# 9MSYZe+I2v8=
# =iraT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Sep 2023 06:25:45 EDT
# gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg: issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A
* tag 'mem-2023-09-19' of https://github.com/davidhildenbrand/qemu:
memory: avoid updating ioeventfds for some address_space
machine: Improve error message when using default RAM backend id
softmmu/physmem: Hint that "readonly=on,rom=off" exists when opening file R/W for private mapping fails
docs: Start documenting VM templating
docs: Don't mention "-mem-path" in multi-process.rst
softmmu/physmem: Never return directories from file_ram_open()
softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true
softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files
softmmu/physmem: Remap with proper protection in qemu_ram_remap()
backends/hostmem-file: Add "rom" property to support VM templating with R/O files
softmmu/physmem: Distinguish between file access mode and mmap protection
nvdimm: Reject writing label data to ROM instead of crashing QEMU
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
For now, "share=off,readonly=on" would always result in us opening the
file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
turning it into ROM.
Especially for VM templating, "share=off" is a common use case. However,
that use case is impossible with files that lack write permissions,
because "share=off,readonly=on" will not give us writable RAM.
The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we
have users (Kata Containers) that rely on the existing behavior --
malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
we cannot change the semantics of "share=off,readonly=on"
So let's add a new "rom" property with on/off/auto values. "auto" is
the default and what most people will use: for historical reasons, to not
change the old semantics, it defaults to the value of the "readonly"
property.
For VM templating, one can now use:
-object memory-backend-file,share=off,readonly=on,rom=off,...
But we'll disallow:
-object memory-backend-file,share=on,readonly=on,rom=off,...
because we would otherwise get an error when trying to mmap the R/O file
shared and writable. An explicit error message is cleaner.
We will also disallow for now:
-object memory-backend-file,share=off,readonly=off,rom=on,...
-object memory-backend-file,share=on,readonly=off,rom=on,...
It's not harmful, but also not really required for now.
Alternatives that were abandoned:
* Make "unarmed=on" for the NVDIMM set the memory region container
readonly. We would still see a change of ROM->RAM and possibly run
into memslot limits with vhost-user. Further, there might be use cases
for "unarmed=on" that should still allow writing to that memory
(temporary files, system RAM, ...).
* Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
as with "unarmed=on".
* Make "readonly" consume "on/off/file" instead of being a 'bool' type.
This would slightly changes the behavior of the "readonly" parameter:
values like true/false (as accepted by a 'bool'type) would no longer be
accepted.
Message-ID: <20230906120503.359863-4-david@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
AF_XDP is a network socket family that allows communication directly
with the network device driver in the kernel, bypassing most or all
of the kernel networking stack. In the essence, the technology is
pretty similar to netmap. But, unlike netmap, AF_XDP is Linux-native
and works with any network interfaces without driver modifications.
Unlike vhost-based backends (kernel, user, vdpa), AF_XDP doesn't
require access to character devices or unix sockets. Only access to
the network interface itself is necessary.
This patch implements a network backend that communicates with the
kernel by creating an AF_XDP socket. A chunk of userspace memory
is shared between QEMU and the host kernel. 4 ring buffers (Tx, Rx,
Fill and Completion) are placed in that memory along with a pool of
memory buffers for the packet data. Data transmission is done by
allocating one of the buffers, copying packet data into it and
placing the pointer into Tx ring. After transmission, device will
return the buffer via Completion ring. On Rx, device will take
a buffer form a pre-populated Fill ring, write the packet data into
it and place the buffer into Rx ring.
AF_XDP network backend takes on the communication with the host
kernel and the network interface and forwards packets to/from the
peer device in QEMU.
Usage example:
-device virtio-net-pci,netdev=guest1,mac=00:16:35:AF:AA:5C
-netdev af-xdp,ifname=ens6f1np1,id=guest1,mode=native,queues=1
XDP program bridges the socket with a network interface. It can be
attached to the interface in 2 different modes:
1. skb - this mode should work for any interface and doesn't require
driver support. With a caveat of lower performance.
2. native - this does require support from the driver and allows to
bypass skb allocation in the kernel and potentially use
zero-copy while getting packets in/out userspace.
By default, QEMU will try to use native mode and fall back to skb.
Mode can be forced via 'mode' option. To force 'copy' even in native
mode, use 'force-copy=on' option. This might be useful if there is
some issue with the driver.
Option 'queues=N' allows to specify how many device queues should
be open. Note that all the queues that are not open are still
functional and can receive traffic, but it will not be delivered to
QEMU. So, the number of device queues should generally match the
QEMU configuration, unless the device is shared with something
else and the traffic re-direction to appropriate queues is correctly
configured on a device level (e.g. with ethtool -N).
'start-queue=M' option can be used to specify from which queue id
QEMU should start configuring 'N' queues. It might also be necessary
to use this option with certain NICs, e.g. MLX5 NICs. See the docs
for examples.
In a general case QEMU will need CAP_NET_ADMIN and CAP_SYS_ADMIN
or CAP_BPF capabilities in order to load default XSK/XDP programs to
the network interface and configure BPF maps. It is possible, however,
to run with no capabilities. For that to work, an external process
with enough capabilities will need to pre-load default XSK program,
create AF_XDP sockets and pass their file descriptors to QEMU process
on startup via 'sock-fds' option. Network backend will need to be
configured with 'inhibit=on' to avoid loading of the program.
QEMU will need 32 MB of locked memory (RLIMIT_MEMLOCK) per queue
or CAP_IPC_LOCK.
There are few performance challenges with the current network backends.
First is that they do not support IO threads. This means that data
path is handled by the main thread in QEMU and may slow down other
work or may be slowed down by some other work. This also means that
taking advantage of multi-queue is generally not possible today.
Another thing is that data path is going through the device emulation
code, which is not really optimized for performance. The fastest
"frontend" device is virtio-net. But it's not optimized for heavy
traffic either, because it expects such use-cases to be handled via
some implementation of vhost (user, kernel, vdpa). In practice, we
have virtio notifications and rcu lock/unlock on a per-packet basis
and not very efficient accesses to the guest memory. Communication
channels between backend and frontend devices do not allow passing
more than one packet at a time as well.
Some of these challenges can be avoided in the future by adding better
batching into device emulation or by implementing vhost-af-xdp variant.
There are also a few kernel limitations. AF_XDP sockets do not
support any kinds of checksum or segmentation offloading. Buffers
are limited to a page size (4K), i.e. MTU is limited. Multi-buffer
support implementation for AF_XDP is in progress, but not ready yet.
Also, transmission in all non-zero-copy modes is synchronous, i.e.
done in a syscall. That doesn't allow high packet rates on virtual
interfaces.
However, keeping in mind all of these challenges, current implementation
of the AF_XDP backend shows a decent performance while running on top
of a physical NIC with zero-copy support.
Test setup:
2 VMs running on 2 physical hosts connected via ConnectX6-Dx card.
Network backend is configured to open the NIC directly in native mode.
The driver supports zero-copy. NIC is configured to use 1 queue.
Inside a VM - iperf3 for basic TCP performance testing and dpdk-testpmd
for PPS testing.
iperf3 result:
TCP stream : 19.1 Gbps
dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
Tx only : 3.4 Mpps
Rx only : 2.0 Mpps
L2 FWD Loopback : 1.5 Mpps
In skb mode the same setup shows much lower performance, similar to
the setup where pair of physical NICs is replaced with veth pair:
iperf3 result:
TCP stream : 9 Gbps
dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
Tx only : 1.2 Mpps
Rx only : 1.0 Mpps
L2 FWD Loopback : 0.7 Mpps
Results in skb mode or over the veth are close to results of a tap
backend with vhost=on and disabled segmentation offloading bridged
with a NIC.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> (docker/lcitool)
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that we have Eager Page Split support added for ARM in the kernel,
enable it in Qemu. This adds,
-eager-split-size to -accel sub-options to set the eager page split chunk size.
-enable KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE.
The chunk size specifies how many pages to break at a time, using a
single allocation. Bigger the chunk size, more pages need to be
allocated ahead of time.
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Message-id: 20230905091246.1931-1-shameerali.kolothum.thodi@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The current description says that these options will create a device
on the IDE bus, which is only true on x86. So rephrase these sentences
a little bit to speak of "default bus" instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
HAX is deprecated since commits 73741fda6c ("MAINTAINERS: Abort
HAXM maintenance") and 90c167a1da ("docs/about/deprecated: Mark
HAXM in QEMU as deprecated"), released in v8.0.0.
Per the latest HAXM release (v7.8 [*]), the latest QEMU supported
is v7.2:
Note: Up to this release, HAXM supports QEMU from 2.9.0 to 7.2.0.
The next commit (https://github.com/intel/haxm/commit/da1b8ec072)
added:
HAXM v7.8.0 is our last release and we will not accept
pull requests or respond to issues after this.
It became very hard to build and test HAXM. Its previous
maintainers made it clear they won't help. It doesn't seem to be
a very good use of QEMU maintainers to spend their time in a dead
project. Save our time by removing this orphan zombie code.
[*] https://github.com/intel/haxm/releases/tag/v7.8.0
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230831082016.60885-1-philmd@linaro.org>
We recently introduced "-run-with" for options that influence the
runtime behavior of QEMU. This option has the big advantage that it
can group related options (so that it is easier for the users to spot
them) and that the options become introspectable via QMP this way.
So let's start moving more switches into this option group, starting
with "-chroot" now.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Message-Id: <20230703074447.17044-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The description of the options starts at column 16, so fix
this in some runaway lines for a more uniform output.
While we're at it, replace the capital "NOTE" with "Note"
since this seems to be the more common capitalization in
qemu-options.hx.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As recent CVE-2023-2861 (fixed by f6b0de53fb) once again showed, the 9p
'proxy' fs driver is in bad shape. Using the 'proxy' backend was already
discouraged for safety reasons before and we recommended to use the
'local' backend (preferably in conjunction with its 'mapped' security
model) instead, but now it is time to officially deprecate the 'proxy'
backend.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1qDkmw-0007M1-8f@lizzy.crudebyte.com>
When we for example have a sparse qcow2 image and discard: unmap is enabled,
there can be a lot of fragmentation in the image after some time. Especially on VM's
that do a lot of writes/deletes.
This causes the qcow2 image to grow even over 110% of its virtual size,
because the free gaps in the image get too small to allocate new
continuous clusters. So it allocates new space at the end of the image.
Disabling discard is not an option, as discard is needed to keep the
incremental backup size as low as possible. Without discard, the
incremental backups would become large, as qemu thinks it's just dirty
blocks but it doesn't know the blocks are unneeded.
So we need to avoid fragmentation but also 'empty' the unneeded blocks in
the image to have a small incremental backup.
In addition, we also want to send the discards further down the stack, so
the underlying blocks are still discarded.
Therefor we introduce a new qcow2 option "discard-no-unref".
When setting this option to true, discards will no longer have the qcow2
driver relinquish cluster allocations. Other than that, the request is
handled as normal: All clusters in range are marked as zero, and, if
pass-discard-request is true, it is passed further down the stack.
The only difference is that the now-zero clusters are preallocated
instead of being unallocated.
This will avoid fragmentation on the qcow2 image.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
Message-Id: <20230605084523.34134-2-jean-louis@dupond.be>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Add an option for hostmem-file to start the memory object at an offset
into the target file. This is useful if multiple memory objects reside
inside the same target file, such as a device node.
In particular, it's useful to map guest memory directly into /dev/mem
for experimentation.
To make this work consistently, also fix up all places in QEMU that
expect fd offsets to be 0.
Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20230403221421.60877-1-graf@amazon.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
qmp-intro.txt is quite small and provides very little information
that isn't already in the documentation elsewhere. Fold the example
command lines into qemu-options.hx, and delete the now-unneeded plain
text document.
While we're touching the qemu-options.hx documentation text,
wordsmith it a little bit and improve the rST formatting.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230515162245.3964307-4-peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Add new -run-with option with an async-teardown=on|off parameter. It is
visible in the output of query-command-line-options QMP command, so it
can be discovered and used by libvirt.
The option -async-teardown is now redundant, deprecate it.
Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Fixes: c891c24b1a ("os-posix: asynchronous teardown for shutdown on Linux")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230505120051.36605-2-imbrenda@linux.ibm.com>
[thuth: Add curly braces to fix error with GCC 8.5, fix bug in deprecated.rst]
Signed-off-by: Thomas Huth <thuth@redhat.com>
This commit adds a new audiodev backend to allow QEMU to use Pipewire as
both an audio sink and source. This backend is available on most systems
Add Pipewire entry points for QEMU Pipewire audio backend
Add wrappers for QEMU Pipewire audio backend in qpw_pcm_ops()
qpw_write function returns the current state of the stream to pwaudio
and Writes some data to the server for playback streams using pipewire
spa_ringbuffer implementation.
qpw_read function returns the current state of the stream to pwaudio and
reads some data from the server for capture streams using pipewire
spa_ringbuffer implementation. These functions qpw_write and qpw_read
are called during playback and capture.
Added some functions that convert pw audio formats to QEMU audio format
and vice versa which would be needed in the pipewire audio sink and
source functions qpw_init_in() & qpw_init_out().
These methods that implement playback and recording will create streams
for playback and capture that will start processing and will result in
the on_process callbacks to be called.
Built a connection to the Pipewire sound system server in the
qpw_audio_init() method.
Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20230417105654.32328-1-dbassey@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Document that the -singlestep command line option is now
deprecated, as it is replaced by either the TCG accelerator
property 'one-insn-per-tb' for system emulation or the new
'-one-insn-per-tb' option for usermode emulation, and remove
the only use of the deprecated syntax from a README.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230417164041.684562-7-peter.maydell@linaro.org
This commit adds 'one-insn-per-tb' as a property on the TCG
accelerator object, so you can enable it with
-accel tcg,one-insn-per-tb=on
It has the same behaviour as the existing '-singlestep' command line
option. We use a different name because 'singlestep' has always been
a confusing choice, because it doesn't have anything to do with
single-stepping the CPU. What it does do is force TCG emulation to
put one guest instruction in each TB, which can be useful in some
situations (such as analysing debug logs).
The existing '-singlestep' commandline options are decoupled from the
global 'singlestep' variable and instead now are syntactic sugar for
setting the accel property. (These can then go away after a
deprecation period.)
The global variable remains for the moment as:
* what the TCG code looks at to change its behaviour
* what HMP and QMP use to query and set the behaviour
In the following commits we'll clean those up to not directly
look at the global variable.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230417164041.684562-2-peter.maydell@linaro.org
A guest only ever experiences, at most, 1 bit of reduced physical
addressing. Update the documentation to reflect this as well as change
the example value on the reduced-phys-bits option.
Fixes: a9b4942f48 ("target/i386: add Secure Encrypted Virtualization (SEV) object")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <13a62ced1808546c1d398e2025cf85f4c94ae123.1664550870.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This commit adds the following description:
1. `memdev` option is recommended over `mem` option (see [1,2])
2. users must specify memory for all NUMA nodes (see [2])
This commit also separates descriptions for `mem` and `memdev` into two
paragraphs. The old doc describes legacy `mem` option first, and it was
a bit confusing.
Related documentation:
[1] https://wiki.qemu.org/ChangeLog/5.1#Incompatible_changes
[2] https://www.qemu.org/docs/master/about/removed-features.html
Signed-off-by: Yohei Kojima <y-koj@outlook.jp>
Message-Id: <TYZPR06MB5418D6B0175A49E8E76988439D8E9@TYZPR06MB5418.apcprd06.prod.outlook.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[AJB: fix documentation in commit message]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230424092249.58552-15-alex.bennee@linaro.org>
We are a bit premature in recommending -blockdev/-device as the best
way to configure block devices. It seems there are times the more
human friendly -drive still makes sense especially when -snapshot is
involved.
Improve the language to hopefully make things clearer.
Suggested-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230424092249.58552-7-alex.bennee@linaro.org>
Our 'file' chardev backend supports both "output from this chardev
is written to a file" and "input from this chardev should be read
from a file" (except on Windows). However, you can only set up
the input file if you're using the QMP interface -- there is no
command line syntax to do it.
Add command line syntax to allow specifying an input file
as well as an output file, using a new 'input-path' suboption.
The specific use case I have is that I'd like to be able to
feed fuzzer reproducer input into qtest without having to use
'-qtest stdio' and put the input onto stdin. Being able to
use a file chardev like this:
-chardev file,id=repro,path=/dev/null,input-path=repro.txt -qtest chardev:repro
means that stdio is free for use by gdb.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230413150724.404304-3-peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[thuth: Replace "input-file=" typo with "input-path="]
Signed-off-by: Thomas Huth <thuth@redhat.com>
LoongArch has enabled CONFIG_SMBIOS, but didn't enable CLI '-smbios'.
Fixes: 3efa6fa1e6 ("hw/loongarch: Add smbios support")
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230227035905.1290953-2-gaosong@loongson.cn>
In stream mode, if the server shuts down there is currently
no way to reconnect the client to a new server without removing
the NIC device and the netdev backend (or to reboot).
This patch introduces a reconnect option that specifies a delay
to try to reconnect with the same parameters.
Add a new test in qtest to test the reconnect option and the
connect/disconnect events.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This has been replaced by the 'password-secret' option,
which references a 'secret' object instance.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The 'password-secret' option was added
commit b189346eb1
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Thu Jan 21 14:19:21 2016 +0000
iscsi: add support for getting CHAP password via QCryptoSecret API
but was not mentioned in the command line docs
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The main reason to do this is to document our O_BINARY implementation
decision somewhere. However I've also moved some of the implementation
details out of qemu-options and added links between the two. As a
bonus I've highlighted the scary warnings about host access with the
appropriate RST tags.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230124180127.1881110-22-alex.bennee@linaro.org>
Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump.
The first one allows the perf tool to map samples to each individual
translation block. The second one adds the ability to resolve symbol
names, line numbers and inspect JITed code.
Example of use:
perf record qemu-x86_64 -perfmap ./a.out
perf report
or
perf record -k 1 qemu-x86_64 -jitdump ./a.out
DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted
perf report -i perf.data.jitted
Co-developed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
Co-developed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230112152013.125680-4-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Rework and improvements of the EINTR handling by Nikita
* Deprecate the -no-hpet command line option
* Disable the qtests in the 32-bit Windows CI job again
* Some other misc fixes here and there
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmO8It8RHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbUwbA//dXgfHy95C1r2nTMDekk09+KkmNB1f6M8
3HK4ROmmrMT/aP9FwfqMBT7JHM/m4bwOGw0Sula8vfjg9NYGPWuSYjdObWKnrIq/
YORoTxqak9c98Co06EQbAfWn3Pj0ifQkX+FIyzcNGhu4856FWdBsMuyq52VLi36q
Z8ruSOmclzluoIB3mVYY/s5J7ED2A3K0h39frKLE9FGsKObX10KWj+MZyDHi9oGZ
ucTHai12OXgNghjlrwI0BqJziih4NxfIWs0JovSo3cN0at7m57G5JChjR38zTMNT
2Q46tDKoIXesY1GUmVuIgJ5F1Uoshc8Pz5qBSQ5mUbZUQMpivhFrEB666wsYmPd1
M/YwnZ+PFhWjem7p28fKmnmkeATvE0S+vMDifTVZ880nmAbyUm1vFKfqV6r2mBrT
p4iXfh/9easFfJWHueU4fBwyMndDGRaCRJnP8KQ5I9yb0WZbt+/0k/y8CQD8Oxr7
dNFFFoY3KnIO9DCRO5Wr+3OqUgtSAQyhBDf5V2wSMCFrwPHKsvWKSbdiWR3Qe4ck
41InWgawB3xx57+vXraDUA10+nBZ1VrM92ObqfLPTFqjLCom6Fm85cG4YFRLIvRt
rdlOC+ScpeVpec7MwcHrScGL0HmUgPnShDAo07pRy4oKK+c89sXzdAFf2nYJTAWS
WCuChrn7VFM=
=D+Yw
-----END PGP SIGNATURE-----
Merge tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu into staging
* s390x header clean-ups from Philippe
* Rework and improvements of the EINTR handling by Nikita
* Deprecate the -no-hpet command line option
* Disable the qtests in the 32-bit Windows CI job again
* Some other misc fixes here and there
# gpg: Signature made Mon 09 Jan 2023 14:21:19 GMT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu:
.gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job
error handling: Use RETRY_ON_EINTR() macro where applicable
Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
docs/interop: Change the vnc-ledstate-Pseudo-encoding doc into .rst
i386: Deprecate the -no-hpet QEMU command line option
tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter
tests/readconfig: spice doesn't support unix socket on windows yet
target/s390x: Restrict sysemu/reset.h to system emulation
target/s390x/tcg/excp_helper: Restrict system headers to sysemu
target/s390x/tcg/misc_helper: Remove unused "memory.h" include
hw/s390x/pv: Restrict Protected Virtualization to sysemu
exec/memory: Expose memory_region_access_valid()
MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section
tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts
qemu-iotests/stream-under-throttle: do not shutdown QEMU
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The HPET setting has been turned into a machine property a while ago
already, so we should finally do the next step and deprecate the
legacy CLI option, too.
Message-Id: <20221229114913.260400-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
mostly vhost-vdpa:
guest announce feature emulation when using shadow virtqueue
support for configure interrupt
startup speed ups
an acpi change to only generate cluster node in PPTT when specified for arm
misc fixes, cleanups
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmO6eGMPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpoUIIALqC3UtJcK3AuAMbeqVokxl5CPwoeXMyi+rT
0QuN8m8dpBtJFpy3Vyq0afixOFmlwvORW5ye4QI97OyIhtLJq00buzQsgHjNoPo3
zN2L0BDyofDmfFHgCxcEbv2aAO8TaqRSHmKffEFmf8JDMDL9Ev1QvPTWHhfm2eJf
VKPHOtCA/3WXBD9JNfYJ0YuzCrrJaMhIO6/5tqv9yjMxWTfEFa1J2Sr2tWkRLuDk
FPfApy7afjI705Guv6PllZ3JdOMwf7iZaoBK6mSdCDSyi1xciYM0VeWi8SLD4qbM
N+9NkUQOIYS5ZC4BXrULy6HDUsECJ71I0pvHveX7nwbK6xPD4RQ=
=0tPe
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, cleanups, fixes
mostly vhost-vdpa:
guest announce feature emulation when using shadow virtqueue
support for configure interrupt
startup speed ups
an acpi change to only generate cluster node in PPTT when specified for arm
misc fixes, cleanups
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 08 Jan 2023 08:01:39 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (50 commits)
vhost-scsi: fix memleak of vsc->inflight
acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
tests: acpi: aarch64: Add *.topology tables
tests: acpi: aarch64: Add topology test for aarch64
tests: acpi: Add and whitelist *.topology blobs
tests: virt: Update expected ACPI tables for virt test
hw/acpi/aml-build: Only generate cluster node in PPTT when specified
tests: virt: Allow changes to PPTT test table
virtio-pci: fix proxy->vector_irqfd leak in virtio_pci_set_guest_notifiers
vdpa: commit all host notifier MRs in a single MR transaction
vhost: configure all host notifiers in a single MR transaction
vhost: simplify vhost_dev_enable_notifiers
vdpa: harden the error path if get_iova_range failed
vdpa-dev: get iova range explicitly
docs/devel: Rules on #include in headers
include: Include headers where needed
include/hw/virtio: Break inclusion loop
include/hw/cxl: Break inclusion loop cxl_pci.h and cxl_cdat_h
include/hw/pci: Include hw/pci/pci.h where needed
include/hw/pci: Split pci_device.h off pci.h
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently we'll always generate a cluster node no matter user has
specified '-smp clusters=X' or not. Cluster is an optional level
and will participant the building of Linux scheduling domains and
only appears on a few platforms. It's unncessary to always build
it when it cannot reflect the real topology on platforms having no
cluster implementation and to avoid affecting the linux scheduling
domains in the VM. So only generate the cluster topology in ACPI
PPTT when the user has specified it explicitly in -smp.
Tested qemu-system-aarch64 with `-smp 8` and linux 6.1-rc1, without
this patch:
estuary:/sys/devices/system/cpu/cpu0/topology$ cat cluster_*
ff # cluster_cpus
0-7 # cluster_cpus_list
56 # cluster_id
with this patch:
estuary:/sys/devices/system/cpu/cpu0/topology$ cat cluster_*
ff # cluster_cpus
0-7 # cluster_cpus_list
36 # cluster_id, with no cluster node kernel will make it to
physical package id
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Message-Id: <20221229065513.55652-3-yangyicong@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Those typos are in files which are used to generate the QEMU manual.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20221110190825.879620-1-sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
[thuth: update sentence in can.rst as suggested by Peter]
Signed-off-by: Thomas Huth <thuth@redhat.com>
This reverts commit 59d1ce4439.
The "zpcii-disable" machine property is redundant with the "interpret"
zPCI device property. Remove it for clarification.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20221107161349.1032730-2-clg@kaod.org>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Most of them were found and fixed using codespell.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221030105944.311940-1-sw@weilnetz.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch adds support for asynchronously tearing down a VM on Linux.
When qemu terminates, either naturally or because of a fatal signal,
the VM is torn down. If the VM is huge, it can take a considerable
amount of time for it to be cleaned up. In case of a protected VM, it
might take even longer than a non-protected VM (this is the case on
s390x, for example).
Some users might want to shut down a VM and restart it immediately,
without having to wait. This is especially true if management
infrastructure like libvirt is used.
This patch implements a simple trick on Linux to allow qemu to return
immediately, with the teardown of the VM being performed
asynchronously.
If the new commandline option -async-teardown is used, a new process is
spawned from qemu at startup, using the clone syscall, in such way that
it will share its address space with qemu.The new process will have the
name "cleanup/<QEMU_PID>". It will wait until qemu terminates
completely, and then it will exit itself.
This allows qemu to terminate quickly, without having to wait for the
whole address space to be torn down. The cleanup process will exit
after qemu, so it will be the last user of the address space, and
therefore it will take care of the actual teardown. The cleanup
process will share the same cgroups as qemu, so both memory usage and
cpu time will be accounted properly.
If possible, close_range will be used in the cleanup process to close
all open file descriptors. If it is not available or if it fails, /proc
will be used to determine which file descriptors to close.
If the cleanup process is forcefully killed with SIGKILL before the
main qemu process has terminated completely, the mechanism is defeated
and the teardown will not be asynchronous.
This feature can already be used with libvirt by adding the following
to the XML domain definition to pass the parameter to qemu directly:
<commandline xmlns="http://libvirt.org/schemas/domain/qemu/1.0">
<arg value='-async-teardown'/>
</commandline>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Tested-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Message-Id: <20220812133453.82671-1-imbrenda@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use QIOChannel, QIOChannelSocket and QIONetListener.
This allows net/stream to use all the available parameters provided by
SocketAddress.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema)
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Copied from socket netdev file and modified to use SocketAddress
to be able to introduce new features like unix socket.
"udp" and "mcast" are squashed into dgram netdev, multicast is detected
according to the IP address type.
"listen" and "connect" modes are managed by stream netdev. An optional
parameter "server" defines the mode (off by default)
The two new types need to be parsed the modern way with -netdev, because
with the traditional way, the "type" field of netdev structure collides with
the "type" field of SocketAddress and prevents the correct evaluation of the
command line option. Moreover the traditional way doesn't allow to use
the same type (SocketAddress) several times with the -netdev option
(needed to specify "local" and "remote" addresses).
The previous commit paved the way for parsing the modern way, but
omitted one detail: how to pick modern vs. traditional, in
netdev_is_modern().
We want to pick based on the value of parameter "type". But how to
extract it from the option argument?
Parsing the option argument, either the modern or the traditional way,
extracts it for us, but only if parsing succeeds.
If parsing fails, there is no good option. No matter which parser we
pick, it'll be the wrong one for some arguments, and the error
reporting will be confusing.
Fortunately, the traditional parser accepts *anything* when called in
a certain way. This maximizes our chance to extract the value of
"type", and in turn minimizes the risk of confusing error reporting.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Similar to other vhost backends, vhostfd can be passed to vhost-vdpa
backend as another parameter to instantiate vhost-vdpa net client.
This would benefit the use case where only open file descriptors, as
opposed to raw vhost-vdpa device paths, are accessible from the QEMU
process.
(qemu) netdev_add type=vhost-vdpa,vhostfd=61,id=vhost-vdpa1
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The patch adds "show_menubar" command line option for GTK UI similar to
"show_tabs". This option allows to hide menu bar initially, it still can
be toggled by shortcut and other shortcuts still work.
Signed-off-by: Bryce Mills <brycemills@proton.me>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <NWO_zx1CT5Aj9vAXsRlqBppXd63gcKwL9V1qM1Meh36M_9tCw-EsCnfpvONXhHjmtKIUoSuCy9OO6cHS7M8b0oHBOCZG6f1jZ4Q2tqgI2Qo=@proton.me>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
There are cases that malicious virtual machine can cause CPU stuck (due
to event windows don't open up), e.g., infinite loop in microcode when
nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and
IRQ) can be delivered. It leads the CPU to be unavailable to host or
other VMs. Notify VM exit is introduced to mitigate such kind of
attacks, which will generate a VM exit if no event window occurs in VM
non-root mode for a specified amount of time (notify window).
A new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT is exposed to user space
so that the user can query the capability and set the expected notify
window when creating VMs. The format of the argument when enabling this
capability is as follows:
Bit 63:32 - notify window specified in qemu command
Bit 31:0 - some flags (e.g. KVM_X86_NOTIFY_VMEXIT_ENABLED is set to
enable the feature.)
Users can configure the feature by a new (x86 only) accel property:
qemu -accel kvm,notify-vmexit=run|internal-error|disable,notify-window=n
The default option of notify-vmexit is run, which will enable the
capability and do nothing if the exit happens. The internal-error option
raises a KVM internal error if it happens. The disable option does not
enable the capability. The default value of notify-window is 0. It is valid
only when notify-vmexit is not disabled. The valid range of notify-window
is non-negative. It is even safe to set it to zero since there's an
internal hardware threshold to be added to ensure no false positive.
Because a notify VM exit may happen with VM_CONTEXT_INVALID set in exit
qualification (no cases are anticipated that would set this bit), which
means VM context is corrupted. It would be reflected in the flags of
KVM_EXIT_NOTIFY exit. If KVM_NOTIFY_CONTEXT_INVALID bit is set, raise a KVM
internal error unconditionally.
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20220929072014.20705-5-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
PATCH v1: add support for SMBIOS type 8 to qemu
PATCH v2: incorporate patch v1 feedback and add smbios type=8 to qemu-options
internal_reference: internal reference designator
external_reference: external reference designator
connector_type: hex value for port connector type (see SMBIOS 7.9.2)
port_type: hex value for port type (see SMBIOS 7.9.3)
After studying various vendor implementationsi (Dell, Lenovo, MSI),
the value of internal connector type was hard-coded to 0x0 (None).
Example usage:
-smbios type=8,internal_reference=JUSB1,external_reference=USB1,connector_type=0x12,port_type=0x10 \
-smbios type=8,internal_reference=JAUD1,external_reference="Audio Jack",connector_type=0x1f,port_type=0x1d \
-smbios type=8,internal_reference=LAN,external_reference=Ethernet,connector_type=0x0b,port_type=0x1f \
-smbios type=8,internal_reference=PS2,external_reference=Mouse,connector_type=0x0f,port_type=0x0e \
-smbios type=8,internal_reference=PS2,external_reference=Keyboard,connector_type=0x0f,port_type=0x0d
Signed-off-by: Hal Martin <hal.martin@gmail.com>
Message-Id: <20220812135153.17859-1-hal.martin@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This was deprecated in 6.2 and is ready to go. It removes quite a bit
of code that handled the registration of watchdog models.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
sndio is the native API used by OpenBSD, although it has been ported to
other *BSD's and Linux (packages for Ubuntu, Debian, Void, Arch, etc.).
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Alexandre Ratchov <alex@caoua.org>
Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Tested-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <YxibXrWsrS3XYQM3@vm1.arverb.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The zpcii-disable machine property can be used to force-disable the use
of zPCI interpretation facilities for a VM. By default, this setting
will be off for machine 7.2 and newer.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Message-Id: <20220902172737.170349-9-mjrosato@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Fix contextual conflict in ccw_machine_7_1_instance_options()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
add a simple help option for -audio and -audiodev
to show the list of available drivers, and document them.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20220908081441.7111-1-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently our semihosting implementations generally prohibit use of
semihosting calls in system emulation from the guest userspace. This
is a very long standing behaviour justified originally "to provide
some semblance of security" (since code with access to the
semihosting ABI can do things like read and write arbitrary files on
the host system). However, it is sometimes useful to be able to run
trusted guest code which performs semihosting calls from guest
userspace, notably for test code. Add a command line suboption to
the existing semihosting-config option group so that you can
explicitly opt in to semihosting from guest userspace with
-semihosting-config userspace=on
(There is no equivalent option for the user-mode emulator, because
there by definition all code runs in userspace and has access to
semihosting already.)
This commit adds the infrastructure for the command line option and
adds a bool 'is_user' parameter to the function
semihosting_userspace_enabled() that target code can use to check
whether it should be permitting the semihosting call for userspace.
It mechanically makes all the callsites pass 'false', so they
continue checking "is semihosting enabled in general". Subsequent
commits will make each target that implements semihosting honour the
userspace=on option by passing the correct value and removing
whatever "don't do this for userspace" checking they were doing by
hand.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220822141230.3658237-2-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Try to correct any confusion about QEMU's Byzantine disk options by
laying out the preferred "modern" options as-per:
"<danpb> (best: -device + -blockdev, 2nd obsolete syntax: -device +
-drive, 3rd obsolete syntax: -drive, 4th obsolete syntax: -hdNN)"
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Cc: qemu-block@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Message-Id: <20220822165608.2980552-7-alex.bennee@linaro.org>
How to control the booting of QEMU is often a source of confusion for
users. Bring the options that control this together in the manual
pages and add some verbiage to describe when each option is
appropriate. This attempts to codify some of the knowledge expressed
in:
https://stackoverflow.com/questions/58420670/qemu-bios-vs-kernel-vs-device-loader-file/58434837#58434837
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220725140520.515340-14-alex.bennee@linaro.org>
Currently QEMU exits with code 0 on both panic an shutdown. For tests
it is useful to return 1 on panic, so that it counts as a test
failure.
Introduce a new exit-failure PanicAction that makes main() return
EXIT_FAILURE. Tests can use -action panic=exit-failure option to
activate this behavior.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220725223746.227063-2-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The patch adds "show_tabs" command line option for GTK ui similar to
"grab_on_hover". This option allows tabbed view mode to not have to be
enabled by hand at each start of the VM.
Signed-off-by: Felix "xq" Queißner <xq@random-projects.net>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220712133753.18937-1-xq@random-projects.net>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Paolo Bonzini requested this change to simplify the ongoing
effort to allow machine setup entirely via RPC.
Includes shortening the command line form cxl-fixed-memory-window
to cxl-fmw as the command lines are extremely long even with this
change.
The json change is needed to ensure that there is
a CXLFixedMemoryWindowOptionsList even though the actual
element in the json is never used. Similar to existing
SgxEpcProperties.
Update qemu-options.hx to reflect that this is now a -machine
parameter. The bulk of -M / -machine parameters are documented
under machine, so use that in preference to M.
Update cxl-test and bios-tables-test to reflect new parameters.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Message-Id: <20220608145440.26106-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We have "-sdl" and "-curses", but no "-gtk" and no "-cocoa" ...
these old-style options are rather confusing than helpful nowadays.
Now that the deprecation period is over, let's remove them, so we
get a cleaner interface (where "-display" is the only way to select
the user interface).
Message-Id: <20220519155625.1414365-4-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Dropping these deprecated parameters simplifies further refactoring
(e.g. QAPIfication is easier without underscores in the name).
Message-Id: <20220519155625.1414365-2-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
most of CXL support
fixes, cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKCuLIPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpdDUH/12SmWaAo+0+SdIHgWFFxsmg3t/EdcO38fgi
MV+GpYdbp6TlU3jdQhrMZYmFdkVVydBdxk93ujCLbFS0ixTsKj31j0IbZMfdcGgv
SLqnV+E3JdHqnGP39q9a9rdwYWyqhkgHoldxilIFW76ngOSapaZVvnwnOMAMkf77
1LieL4/Xq7N9Ho86Zrs3IczQcf0czdJRDaFaSIu8GaHl8ELyuPhlSm6CSqqrEEWR
PA/COQsLDbLOMxbfCi5v88r5aaxmGNZcGbXQbiH9qVHw65nlHyLH9UkNTdJn1du1
f2GYwwa7eekfw/LCvvVwxO1znJrj02sfFai7aAtQYbXPvjvQiqA=
=xdSk
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: fixes,cleanups,features
most of CXL support
fixes, cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKCuLIPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpdDUH/12SmWaAo+0+SdIHgWFFxsmg3t/EdcO38fgi
# MV+GpYdbp6TlU3jdQhrMZYmFdkVVydBdxk93ujCLbFS0ixTsKj31j0IbZMfdcGgv
# SLqnV+E3JdHqnGP39q9a9rdwYWyqhkgHoldxilIFW76ngOSapaZVvnwnOMAMkf77
# 1LieL4/Xq7N9Ho86Zrs3IczQcf0czdJRDaFaSIu8GaHl8ELyuPhlSm6CSqqrEEWR
# PA/COQsLDbLOMxbfCi5v88r5aaxmGNZcGbXQbiH9qVHw65nlHyLH9UkNTdJn1du1
# f2GYwwa7eekfw/LCvvVwxO1znJrj02sfFai7aAtQYbXPvjvQiqA=
# =xdSk
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 16 May 2022 01:48:50 PM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (86 commits)
vhost-user-scsi: avoid unlink(NULL) with fd passing
virtio-net: don't handle mq request in userspace handler for vhost-vdpa
vhost-vdpa: change name and polarity for vhost_vdpa_one_time_request()
vhost-vdpa: backend feature should set only once
vhost-net: fix improper cleanup in vhost_net_start
vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa
virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa
virtio-net: setup vhost_dev and notifiers for cvq only when feature is negotiated
hw/i386/amd_iommu: Fix IOMMU event log encoding errors
hw/i386: Make pic a property of common x86 base machine type
hw/i386: Make pit a property of common x86 base machine type
include/hw/pci/pcie_host: Correct PCIE_MMCFG_SIZE_MAX
include/hw/pci/pcie_host: Correct PCIE_MMCFG_BUS_MASK
docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REG
vhost-user: more master/slave things
virtio: add vhost support for virtio devices
virtio: drop name parameter for virtio_init()
virtio/vhost-user: dynamically assign VhostUserHostNotifiers
hw/virtio/vhost-user: don't suppress F_CONFIG when supported
include/hw: start documenting the vhost API
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-audio is used like "-audio pa,model=sb16". It is almost as simple as
-soundhw, but it reuses the -audiodev parsing machinery and attaches an
audiodev to the newly-created device. The main 'feature' is that
it knows about adding the codec device for model=intel-hda, and adding
the audiodev to the codec device.
In the future, it could be extended to support default models or
builtin devices, just like -nic, or even a default backend. For now,
keep it simple.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The concept of these is introduced in [1] in terms of the
description the CEDT ACPI table. The principal is more general.
Unlike once traffic hits the CXL root bridges, the host system
memory address routing is implementation defined and effectively
static once observable by standard / generic system software.
Each CXL Fixed Memory Windows (CFMW) is a region of PA space
which has fixed system dependent routing configured so that
accesses can be routed to the CXL devices below a set of target
root bridges. The accesses may be interleaved across multiple
root bridges.
For QEMU we could have fully specified these regions in terms
of a base PA + size, but as the absolute address does not matter
it is simpler to let individual platforms place the memory regions.
ExampleS:
-cxl-fixed-memory-window targets.0=cxl.0,size=128G
-cxl-fixed-memory-window targets.0=cxl.1,size=128G
-cxl-fixed-memory-window targets.0=cxl0,targets.1=cxl.1,size=256G,interleave-granularity=2k
Specifies
* 2x 128G regions not interleaved across root bridges, one for each of
the root bridges with ids cxl.0 and cxl.1
* 256G region interleaved across root bridges with ids cxl.0 and cxl.1
with a 2k interleave granularity.
When system software enumerates the devices below a given root bridge
it can then decide which CFMW to use. If non interleave is desired
(or possible) it can use the appropriate CFMW for the root bridge in
question. If there are suitable devices to interleave across the
two root bridges then it may use the 3rd CFMS.
A number of other designs were considered but the following constraints
made it hard to adapt existing QEMU approaches to this particular problem.
1) The size must be known before a specific architecture / board brings
up it's PA memory map. We need to set up an appropriate region.
2) Using links to the host bridges provides a clean command line interface
but these links cannot be established until command line devices have
been added.
Hence the two step process used here of first establishing the size,
interleave-ways and granularity + caching the ids of the host bridges
and then, once available finding the actual host bridges so they can
be used later to support interleave decoding.
[1] CXL 2.0 ECN: CEDT CFMWS & QTG DSM (computeexpresslink.org / specifications)
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com> # QAPI Schema
Message-Id: <20220429144110.25167-28-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The Xen hypervisor is only available on x86 and arm - thus let's
limit the related options to these targets.
Message-Id: <20220427133156.344418-1-thuth@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
There is no need to present the user with -enable-kvm if there
is no support for KVM on the corresponding target.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20220427134906.348118-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Like -set and -readconfig, it would not really be too hard to
extend -writeconfig to parsing mechanisms other than QemuOpts.
However, the uses of -writeconfig are substantially more
limited, as it is generally easier to write the configuration
by hand in the first place. In addition, -writeconfig does
not even try to detect cases where it prints incorrect
syntax (for example if values have a quote in them, since
qemu_config_parse does not support any kind of escaping.
Just remove it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220414145721.326866-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Users requiring FIPS support must build QEMU with either the libgcrypt
or gnutls libraries as the crytography backend.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
On Mac OS X the Option key maps to Alt and Command to Super/Meta. This change
swaps them around so that Alt is the key closer to the space bar and Meta/Super
is between Control and Alt, like on non-Mac keyboards.
It is a cocoa display option, disabled by default.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gustavo Noronha Silva <gustavo@noronha.dev.br>
Message-Id: <20210713213200.2547-3-gustavo@noronha.dev.br>
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-Id: <20220306121119.45631-3-akihiko.odaki@gmail.com>
Reviewed-by: Will Cohen <wwcohen@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Applications such as Gnome may use Alt-Tab and Super-Tab for different
purposes, some use Ctrl-arrows so we want to allow qemu to handle
everything when it captures the mouse/keyboard.
However, Mac OS handles some combos like Command-Tab and Ctrl-arrows
at an earlier part of the event handling chain, not letting qemu see it.
We add a global Event Tap that allows qemu to see all events when the
mouse is grabbed. Note that this requires additional permissions.
See:
https://developer.apple.com/documentation/coregraphics/1454426-cgeventtapcreate?language=objc#discussionhttps://support.apple.com/en-in/guide/mac-help/mh32356/mac
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gustavo Noronha Silva <gustavo@noronha.dev.br>
Message-Id: <20210713213200.2547-2-gustavo@noronha.dev.br>
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-Id: <20220306121119.45631-2-akihiko.odaki@gmail.com>
Reviewed-by: Will Cohen <wwcohen@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
When switching between guest and host on a Mac using command-tab the
command key is sent to the guest which can trigger functionality in the
guest OS. Specifying left-command-key=off disables forwarding this key
to the guest. Defaults to enabled.
Also updated the cocoa display documentation to reference the new
left-command-key option along with the existing show-cursor option.
Signed-off-by: Carwyn Ellis <carwynellis@gmail.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
[PMD: Set QAPI structure @since tag to 7.0]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This parameter is to be used in the processor_id entry in the type 4
table.
This parameter is set as optional and if left will use the values from
the CPU model.
This enables hiding the host information from the guest and allowing AMD
VMs to run pretending to be Intel for some userspace software concerns.
Reviewed-by: Peter Foley <pefoley@google.com>
Reviewed-by: Titus Rwantare <titusr@google.com>
Signed-off-by: Patrick Venture <venture@google.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20220125163118.1011809-1-venture@google.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qemu-options.hx contains grammar that a native English-speaking
person would never use.
Replace "This option defines where is connected the drive" by
"This option defines where the drive is connected".
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/853
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20220202143422.912070-1-lvivier@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
ARM64 machines like Kunpeng Family Server Chips have a level
of hardware topology in which a group of CPU cores share L3
cache tag or L2 cache. For example, Kunpeng 920 typically
has 6 or 8 clusters in each NUMA node (also represent range
of CPU die), and each cluster has 4 CPU cores. All clusters
share L3 cache data, but CPU cores in each cluster share a
local L3 tag.
Running a guest kernel with Cluster-Aware Scheduling on the
Hosts which have physical clusters, if we can design a vCPU
topology with cluster level for guest kernel and then have
a dedicated vCPU pinning, the guest will gain scheduling
performance improvement from cache affinity of CPU cluster.
So let's enable the support for this new parameter on ARM
virt machines. After this patch, we can define a 4-level
CPU hierarchy like: cpus=*,maxcpus=*,sockets=*,clusters=*,
cores=*,threads=*.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20220107083232.16256-2-wangyanan55@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This option was just a wrapper around the -display ...,window-close=off
parameter, and the name "no-quit" is rather confusing compared to
"window-close" (since there are still other means to quit the emulator),
so let's remove this now.
Message-Id: <20211215082417.180735-1-thuth@redhat.com>
Acked-by: Michal Prívozník <mprivozn@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The new Cluster-Aware Scheduling support has landed in Linux 5.16,
which has been proved to benefit the scheduling performance (e.g.
load balance and wake_affine strategy) on both x86_64 and AArch64.
So now in Linux 5.16 we have four-level arch-neutral CPU topology
definition like below and a new scheduler level for clusters.
struct cpu_topology {
int thread_id;
int core_id;
int cluster_id;
int package_id;
int llc_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
cpumask_t cluster_sibling;
cpumask_t llc_sibling;
}
A cluster generally means a group of CPU cores which share L2 cache
or other mid-level resources, and it is the shared resources that
is used to improve scheduler's behavior. From the point of view of
the size range, it's between CPU die and CPU core. For example, on
some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
and 4 CPU cores in each cluster. The 4 CPU cores share a separate
L2 cache and a L3 cache tag, which brings cache affinity advantage.
In virtualization, on the Hosts which have pClusters (physical
clusters), if we can design a vCPU topology with cluster level for
guest kernel and have a dedicated vCPU pinning. A Cluster-Aware
Guest kernel can also make use of the cache affinity of CPU clusters
to gain similar scheduling performance.
This patch adds infrastructure for CPU cluster level topology
configuration and parsing, so that the user can specify cluster
parameter if their machines support it.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Message-Id: <20211228092221.21068-3-wangyanan55@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[PMD: Added '(since 7.0)' to @clusters in qapi/machine.json]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
We have a description in qemu-options.hx for each CPU topology
parameter to explain what it exactly means, and also an extra
declaration for the target-specific one, e.g. "for PC only"
when describing "dies", and "for PC, it's on one die" when
describing "cores".
Now we are going to introduce one more non-generic parameter
"clusters", it will make the Doc less readable and if we still
continue to use the legacy way to describe it.
So let's at first make two tweaks of the Docs to improve the
readability and also scalability:
1) In the -help text: Delete the extra specific declaration and
describe each topology parameter level by level. Then add a
note to declare that different machines may support different
subsets and the actual meaning of the supported parameters
will vary accordingly.
2) In the rST text: List all the sub-hierarchies currently
supported in QEMU, and correspondingly give an example of
-smp configuration for each of them.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211228092221.21068-2-wangyanan55@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Add a new -audio backend that accepts D-Bus clients/listeners to handle
playback & recording, to be exported via the -display dbus.
Example usage:
-audiodev dbus,in.mixing-engine=off,out.mixing-engine=off,id=dbus
-display dbus,audiodev=dbus
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Add an option to use direct connections instead of via the bus. Clients
are accepted with QMP add_client.
This allows to provide the D-Bus display without a bus. It also
simplifies the testing setup (some CI have issues to setup a D-Bus bus
in a container).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
The "dbus" display backend exports the QEMU consoles and other
UI-related interfaces over D-Bus.
By default, the connection is established on the session bus, but you
can specify a different bus with the "addr" option.
The backend takes the "org.qemu" service name, while still allowing
further instances to queue on the same name (so you can lookup all the
available instances too). It accepts any number of clients at this
point, although this is expected to evolve with options to restrict
clients, or only accept p2p via fd passing.
The interface is intentionally very close to the internal QEMU API,
and can be introspected or interacted with busctl/dfeet etc:
$ ./qemu-system-x86_64 -name MyVM -display dbus
$ busctl --user introspect org.qemu /org/qemu/Display1/Console_0
org.qemu.Display1.Console interface - - -
.RegisterListener method h - -
.SetUIInfo method qqiiuu - -
.DeviceAddress property s "pci/0000/01.0" emits-change
.Head property u 0 emits-change
.Height property u 480 emits-change
.Label property s "VGA" emits-change
.Type property s "Graphic" emits-change
.Width property u 640 emits-change
[...]
See the interfaces XML source file and Sphinx docs for the generated API
documentations.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
The basic SGX did not enable numa for SGX EPC sections, which
result in all EPC sections located in numa node 0. This patch
enable SGX numa function in the guest and the EPC section can
work with RAM as one numa node.
The Guest kernel related log:
[ 0.009981] ACPI: SRAT: Node 0 PXM 0 [mem 0x180000000-0x183ffffff]
[ 0.009982] ACPI: SRAT: Node 1 PXM 1 [mem 0x184000000-0x185bfffff]
The SRAT table can normally show SGX EPC sections menory info in different
numa nodes.
The SGX EPC numa related command:
......
-m 4G,maxmem=20G \
-smp sockets=2,cores=2 \
-cpu host,+sgx-provisionkey \
-object memory-backend-ram,size=2G,host-nodes=0,policy=bind,id=node0 \
-object memory-backend-epc,id=mem0,size=64M,prealloc=on,host-nodes=0,policy=bind \
-numa node,nodeid=0,cpus=0-1,memdev=node0 \
-object memory-backend-ram,size=2G,host-nodes=1,policy=bind,id=node1 \
-object memory-backend-epc,id=mem1,size=28M,prealloc=on,host-nodes=1,policy=bind \
-numa node,nodeid=1,cpus=2-3,memdev=node1 \
-M sgx-epc.0.memdev=mem0,sgx-epc.0.node=0,sgx-epc.1.memdev=mem1,sgx-epc.1.node=1 \
......
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20211101162009.62161-2-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce new boolean 'kernel-hashes' option on the sev-guest object.
It will be used to to decide whether to add the hashes of
kernel/initrd/cmdline to SEV guest memory when booting with -kernel.
The default value is 'off'.
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Acked-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The single backtick markup in ReST is the "default role". Currently,
Sphinx's default role is called "content". Sphinx suggests you can use
the "Any" role instead to turn any single-backtick enclosed item into a
cross-reference.
This is useful for things like autodoc for Python docstrings, where it's
often nicer to reference other types with `foo` instead of the more
laborious :py:meth:`foo`. It's also useful in multi-domain cases to
easily reference definitions from other Sphinx domains, such as
referencing C code definitions from outside of kerneldoc comments.
Before we do that, though, we'll need to turn all existing usages of the
"content" role to inline verbatim markup wherever it does not correctly
resolve into a cross-refernece by using double backticks instead.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20211004215238.1523082-2-jsnow@redhat.com>
New option parameters unstable-input and unstable-output set policy
for unstable interfaces just like deprecated-input and
deprecated-output set policy for deprecated interfaces (see commit
6dd75472d5 "qemu-options: New -compat to set policy for deprecated
interfaces"). This is intended for testing users of the management
interfaces. It is experimental.
For now, this covers only syntactic aspects of QMP, i.e. stuff tagged
with feature 'unstable'. We may want to extend it to cover semantic
aspects, or the command line.
Note that there is no good way for management application to detect
presence of these new option parameters: they are not visible output
of query-qmp-schema or query-command-line-options. Tolerable, because
it's meant for testing. If running with -compat fails, skip the test.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-Id: <20211028102520.747396-10-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Doc comments fixed up]
There is one numa config example in qemu-options.hx currently
using "-smp 2" and assuming that there will be 2 sockets and
2 cpus totally. However now the actual calculation logic of
missing sockets and cores is not immutable and is considered
liable to change. Although we will get maxcpus=2 finally based
on current parser, it's always stable to specify it explicitly.
So "-smp 2,sockets=2,maxcpus=2" will be optimal when we expect
multiple sockets and 2 cpus totally.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210928121134.21064-3-wangyanan55@huawei.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
In qemu-option.hx, there is "-smp [[cpus=]n][,maxcpus=cpus]..." in the
DEF part, and "-smp [[cpus=]n][,maxcpus=maxcpus]..." in the RST part.
Obviously the later is right, let's fix the previous one.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210928121134.21064-2-wangyanan55@huawei.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
In the real SMP hardware topology world, it's much more likely that
we have high cores-per-socket counts and few sockets totally. While
the current preference of sockets over cores in smp parsing results
in a virtual cpu topology with low cores-per-sockets counts and a
large number of sockets, which is just contrary to the real world.
Given that it is better to make the virtual cpu topology be more
reflective of the real world and also for the sake of compatibility,
we start to prefer cores over sockets over threads in smp parsing
since machine type 6.2 for different arches.
In this patch, a boolean "smp_prefer_sockets" is added, and we only
enable the old preference on older machines and enable the new one
since type 6.2 for all arches by using the machine compat mechanism.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-10-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently we directly calculate the omitted cpus based on the given
incomplete collection of parameters. This makes some cmdlines like:
-smp maxcpus=16
-smp sockets=2,maxcpus=16
-smp sockets=2,dies=2,maxcpus=16
-smp sockets=2,cores=4,maxcpus=16
not work. We should probably set the value of cpus to match maxcpus
if it's omitted, which will make above configs start to work.
So the calculation logic of cpus/maxcpus after this patch will be:
When both maxcpus and cpus are omitted, maxcpus will be calculated
from the given parameters and cpus will be set equal to maxcpus.
When only one of maxcpus and cpus is given then the omitted one
will be set to its counterpart's value. Both maxcpus and cpus may
be specified, but maxcpus must be equal to or greater than cpus.
Note: change in this patch won't affect any existing working cmdlines
but allows more incomplete configs to be valid.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-6-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the SMP configuration, we should either provide a topology
parameter with a reasonable value (greater than zero) or just
omit it and QEMU will compute the missing value.
The users shouldn't provide a configuration with any parameter
of it specified as zero (e.g. -smp 8,sockets=0) which could
possibly cause unexpected results in the -smp parsing. So we
deprecate this kind of configurations since 6.2 by adding the
explicit sanity check.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-3-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because SGX EPC is enumerated through CPUID, EPC "devices" need to be
realized prior to realizing the vCPUs themselves, i.e. long before
generic devices are parsed and realized. From a virtualization
perspective, the CPUID aspect also means that EPC sections cannot be
hotplugged without paravirtualizing the guest kernel (hardware does
not support hotplugging as EPC sections must be locked down during
pre-boot to provide EPC's security properties).
So even though EPC sections could be realized through the generic
-devices command, they need to be created much earlier for them to
actually be usable by the guest. Place all EPC sections in a
contiguous block, somewhat arbitrarily starting after RAM above 4g.
Ensuring EPC is in a contiguous region simplifies calculations, e.g.
device memory base, PCI hole, etc..., allows dynamic calculation of the
total EPC size, e.g. exposing EPC to guests does not require -maxmem,
and last but not least allows all of EPC to be enumerated in a single
ACPI entry, which is expected by some kernels, e.g. Windows 7 and 8.
The new compound properties command for sgx like below:
......
-object memory-backend-epc,id=mem1,size=28M,prealloc=on \
-object memory-backend-epc,id=mem2,size=10M \
-M sgx-epc.0.memdev=mem1,sgx-epc.1.memdev=mem2
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-6-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>