Commit Graph

795 Commits

Author SHA1 Message Date
Philippe Mathieu-Daudé
bf964322d6 meson: Allow building binary with no target-specific files in hw/
Allow  building a qemu-system-foo binary with target-agnostic
only HW models.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231121203129.67999-1-philmd@linaro.org>
2024-01-05 16:20:14 +01:00
Alex Bennée
f705c1f25d meson.build: report graphics backends separately
To enable accelerated VirtIO GPUs for the guest we need the rendering
support on the host, which currently it's reported in the configuration
summary under the "dependencies" section. Add a graphics backend section
and report the status of the VirGL and Rutabaga support libraries.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231222114846.2850741-1-alex.bennee@linaro.org>
[Remove from dependencies as suggested by Philippe. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:29 +01:00
Paolo Bonzini
d0cda6f461 configure, meson: rename targetos to host_os
This variable is about the host OS, not the target.  It is used a lot
more since the Meson conversion, but the original sin dates back to 2003.
Time to fix it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:29 +01:00
Paolo Bonzini
cfc1a889e5 meson: rename config_all
config_all now lists only accelerators, rename it to indicate its actual
content.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
0d66549cf5 meson: remove CONFIG_ALL
CONFIG_ALL is tricky to use and was ported over to Meson from the
recursive processing of Makefile variables.  Meson sourcesets
however have all_sources() and all_dependencies() methods that
remove the need for it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
1220f5813a meson: remove config_targetos
config_targetos is now empty and can be removed; its use in sourcesets
that do not involve target-specific files can be replaced with an empty
dictionary.

In fact, at this point *all* sourcesets that do not involve
target-specific files are just glorified mutable arrays.  Enforce that
they never test for symbols in "when:" by computing the set of files
without "strict: false".

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
dc4954943d meson: remove CONFIG_POSIX and CONFIG_WIN32 from config_targetos
For consistency with other OSes, use if...endif for rules that are
target-independent.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
53e8868d69 meson: remove OS definitions from config_targetos
CONFIG_DARWIN, CONFIG_LINUX and CONFIG_BSD are used in some rules, but
only CONFIG_LINUX has substantial use.  Convert them all to if...endif.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
e7c22ff87a meson: always probe u2f and canokey if the option is enabled
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
9cd76c9111 meson: move subdirs to "Collect sources" section
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
95933f13a0 meson: move config-host.h definitions together
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
82761296d2 meson: move CFI detection code with other compiler flags
Keep it together with the other compiler modes, and before dependencies.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
a775c7130c meson: keep subprojects together
And move away dependencies that are not subprojects anymore.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
0f63d96311 meson: move accelerator dependency checks together
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
a9cba054cf meson: move option validation together
Check options before compiler flags, because some compiler flags are
incompatible with modules.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
44458a6158 meson: move program checks together
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
ea444d91e5 meson: add more sections to main meson.build
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Stefan Hajnoczi
eaae59af40 Fix for building with Xen 4.18
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmV4M4AUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOPgwgAhRYBI8Q7FO4LWZTi+ubYXfS1ZEVC
 uy5eiyQNlymmAFFqutXLokvN1qsGhRlSeX5/uo5Tn6vWjkXPLlGikrecWHFSPmLS
 0s+4NOOfrM6gMm5CCqMzjQuogr4+xxiw/g+rxhWGhNqlL1jVG1+I6AU5EobMNlDA
 gqd33OL509xkLVN6pCcmFwBInDHQl63YwOwVIR3cd2cfUW28M8DzGd9KULWJkZva
 I51COEwo0EpLNC2ile7pnA8+8F79WBMgUdrhBzl/a8RHv7AvxAPQB/0TsZQknFo0
 PS3Y+yXdn2CT3KInu+QeW3kHkVoAdK06/cSOqIbEKuKgnZjEz0qFHq4K3A==
 =SKW6
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

Fix for building with Xen 4.18

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmV4M4AUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOPgwgAhRYBI8Q7FO4LWZTi+ubYXfS1ZEVC
# uy5eiyQNlymmAFFqutXLokvN1qsGhRlSeX5/uo5Tn6vWjkXPLlGikrecWHFSPmLS
# 0s+4NOOfrM6gMm5CCqMzjQuogr4+xxiw/g+rxhWGhNqlL1jVG1+I6AU5EobMNlDA
# gqd33OL509xkLVN6pCcmFwBInDHQl63YwOwVIR3cd2cfUW28M8DzGd9KULWJkZva
# I51COEwo0EpLNC2ile7pnA8+8F79WBMgUdrhBzl/a8RHv7AvxAPQB/0TsZQknFo0
# PS3Y+yXdn2CT3KInu+QeW3kHkVoAdK06/cSOqIbEKuKgnZjEz0qFHq4K3A==
# =SKW6
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Dec 2023 05:18:40 EST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
  xen: fix condition for skipping virtio-mmio defines
  meson, xen: fix condition for enabling the Xen accelerator

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-12-12 07:42:02 -05:00
Paolo Bonzini
16b6273b4b meson, xen: fix condition for enabling the Xen accelerator
A misspelled condition in xen_native.h is hiding a bug in the enablement of
Xen for qemu-system-aarch64.  The bug becomes apparent when building for
Xen 4.18.

While the i386 emulator provides the xenpv machine type for multiple architectures,
and therefore can be compiled with Xen enabled even when the host is Arm, the
opposite is not true: qemu-system-aarch64 can only be compiled with Xen support
enabled when the host is Arm.

Expand the computation of accelerator_targets['CONFIG_XEN'] similar to what is
already there for KVM.

Cc: Stefano Stabellini <stefano.stabellini@amd.com>
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Reported-by: Michael Young <m.a.young@durham.ac.uk>
Fixes: 0c8ab1cddd ("xen_arm: Create virtio-mmio devices during initialization", 2023-08-30)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-12 11:18:21 +01:00
Michael S. Tsirkin
dc864d3a37 osdep: add getloadavg
getloadavg is supported on Linux, BSDs, Solaris.

Following man page:
RETURN VALUE
       If the load average was unobtainable, -1 is returned; otherwise,
       the number of samples actually retrieved is returned.

accordingly, make stub for systems which don't support this function return -1
for consistency.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-12-01 08:53:04 -05:00
Markus Armbruster
569205e4e9 meson: Enable -Wshadow=local
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand.  Bugs love to hide in such code.
Evidence: commit bbde656263 (migration/rdma: Fix save_page method to
fail on polling error).

Enable -Wshadow=local to prevent such issues.  Possible thanks to
recent cleanups.  Enabling -Wshadow would prevent more issues, but
we're not yet ready for that.

As usual, the warning is only enabled when the compiler recognizes it.
GCC does, Clang doesn't.

Some shadowed locals remain in bsd-user.  Since BSD prefers Clang,
let's not wait for its cleanup.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231026053115.2066744-2-armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-11-13 10:32:57 +01:00
Greg Manning
36fa077394 plugins: allow plugins to be enabled on windows
allow plugins to be enabled in the configure script on windows. Also,
add the qemu_plugin_api.lib to the installer.

Signed-off-by: Greg Manning <gmanning@rapitasystems.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231102172053.17692-5-gmanning@rapitasystems.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[AJB: add check for dlltool to configure]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231106185112.2755262-17-alex.bennee@linaro.org>
2023-11-08 15:15:23 +00:00
Marc-André Lureau
d017f28a2e build-sys: make pixman actually optional
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2023-11-07 14:04:25 +04:00
Marc-André Lureau
da554e1616 ui/gtk: -display gtk requires PIXMAN
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2023-11-07 14:04:25 +04:00
Marc-André Lureau
c98791eb63 ui/spice: SPICE/QXL requires PIXMAN
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2023-11-07 14:04:25 +04:00
Marc-André Lureau
89fd3eab52 ui/vnc: VNC requires PIXMAN
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2023-11-07 14:04:25 +04:00
Marc-André Lureau
cca1575686 build-sys: add a "pixman" feature
For now, pixman is mandatory, but we set config_host.h and Kconfig.
Once compilation is fixed, "pixman" will become actually optional.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2023-11-07 14:04:24 +04:00
Maciej S. Szmigiero
0d9e8c0b67 Add Hyper-V Dynamic Memory Protocol driver (hv-balloon) base
This driver is like virtio-balloon on steroids: it allows both changing the
guest memory allocation via ballooning and (in the next patch) inserting
pieces of extra RAM into it on demand from a provided memory backend.

The actual resizing is done via ballooning interface (for example, via
the "balloon" HMP command).
This includes resizing the guest past its boot size - that is, hot-adding
additional memory in granularity limited only by the guest alignment
requirements, as provided by the next patch.

In contrast with ACPI DIMM hotplug where one can only request to unplug a
whole DIMM stick this driver allows removing memory from guest in single
page (4k) units via ballooning.

After a VM reboot the guest is back to its original (boot) size.

In the future, the guest boot memory size might be changed on reboot
instead, taking into account the effective size that VM had before that
reboot (much like Hyper-V does).

For performance reasons, the guest-released memory is tracked in a few
range trees, as a series of (start, count) ranges.
Each time a new page range is inserted into such tree its neighbors are
checked as candidates for possible merging with it.

Besides performance reasons, the Dynamic Memory protocol itself uses page
ranges as the data structure in its messages, so relevant pages need to be
merged into such ranges anyway.

One has to be careful when tracking the guest-released pages, since the
guest can maliciously report returning pages outside its current address
space, which later clash with the address range of newly added memory.
Similarly, the guest can report freeing the same page twice.

The above design results in much better ballooning performance than when
using virtio-balloon with the same guest: 230 GB / minute with this driver
versus 70 GB / minute with virtio-balloon.

During a ballooning operation most of time is spent waiting for the guest
to come up with newly freed page ranges, processing the received ranges on
the host side (in QEMU and KVM) is nearly instantaneous.

The unballoon operation is also pretty much instantaneous:
thanks to the merging of the ballooned out page ranges 200 GB of memory can
be returned to the guest in about 1 second.
With virtio-balloon this operation takes about 2.5 minutes.

These tests were done against a Windows Server 2019 guest running on a
Xeon E5-2699, after dirtying the whole memory inside guest before each
balloon operation.

Using a range tree instead of a bitmap to track the removed memory also
means that the solution scales well with the guest size: even a 1 TB range
takes just a few bytes of such metadata.

Since the required GTree operations aren't present in every Glib version
a check for them was added to the meson build script, together with new
"--enable-hv-balloon" and "--disable-hv-balloon" configure arguments.
If these GTree operations are missing in the system's Glib version this
driver will be skipped during QEMU build.

An optional "status-report=on" device parameter requests memory status
events from the guest (typically sent every second), which allow the host
to learn both the guest memory available and the guest memory in use
counts.

Following commits will add support for their external emission as
"HV_BALLOON_STATUS_REPORT" QMP events.

The driver is named hv-balloon since the Linux kernel client driver for
the Dynamic Memory Protocol is named as such and to follow the naming
pattern established by the virtio-balloon driver.
The whole protocol runs over Hyper-V VMBus.

The driver was tested against Windows Server 2012 R2, Windows Server 2016
and Windows Server 2019 guests and obeys the guest alignment requirements
reported to the host via DM_CAPABILITIES_REPORT message.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
2023-11-06 14:08:10 +01:00
Stefan Hajnoczi
1b4a5a20da virtio,pc,pci: features, cleanups
infrastructure for vhost-vdpa shadow work
 piix south bridge rework
 reconnect for vhost-user-scsi
 dummy ACPI QTG DSM for cxl
 
 tests, cleanups, fixes all over the place
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmU06PMPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpNIsH/0DlKti86VZLJ6PbNqsnKxoK2gg05TbEhPZU
 pQ+RPDaCHpFBsLC5qsoMJwvaEQFe0e49ZFemw7bXRzBxgmbbNnZ9ArCIPqT+rvQd
 7UBmyC+kacVyybZatq69aK2BHKFtiIRlT78d9Izgtjmp8V7oyKoz14Esh8wkE+FT
 ypHUa70Addi6alNm6BVkm7bxZxi0Wrmf3THqF8ViYvufzHKl7JR5e17fKWEG0BqV
 9W7AeHMnzJ7jkTvBGUw7g5EbzFn7hPLTbO4G/VW97k0puS4WRX5aIMkVhUazsRIa
 zDOuXCCskUWuRapiCwY0E4g7cCaT8/JR6JjjBaTgkjJgvo5Y8Eg=
 =ILek
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, cleanups

infrastructure for vhost-vdpa shadow work
piix south bridge rework
reconnect for vhost-user-scsi
dummy ACPI QTG DSM for cxl

tests, cleanups, fixes all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmU06PMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpNIsH/0DlKti86VZLJ6PbNqsnKxoK2gg05TbEhPZU
# pQ+RPDaCHpFBsLC5qsoMJwvaEQFe0e49ZFemw7bXRzBxgmbbNnZ9ArCIPqT+rvQd
# 7UBmyC+kacVyybZatq69aK2BHKFtiIRlT78d9Izgtjmp8V7oyKoz14Esh8wkE+FT
# ypHUa70Addi6alNm6BVkm7bxZxi0Wrmf3THqF8ViYvufzHKl7JR5e17fKWEG0BqV
# 9W7AeHMnzJ7jkTvBGUw7g5EbzFn7hPLTbO4G/VW97k0puS4WRX5aIMkVhUazsRIa
# zDOuXCCskUWuRapiCwY0E4g7cCaT8/JR6JjjBaTgkjJgvo5Y8Eg=
# =ILek
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 22 Oct 2023 02:18:43 PDT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (62 commits)
  intel-iommu: Report interrupt remapping faults, fix return value
  MAINTAINERS: Add include/hw/intc/i8259.h to the PC chip section
  vhost-user: Fix protocol feature bit conflict
  tests/acpi: Update DSDT.cxl with QTG DSM
  hw/cxl: Add QTG _DSM support for ACPI0017 device
  tests/acpi: Allow update of DSDT.cxl
  hw/i386/cxl: ensure maxram is greater than ram size for calculating cxl range
  vhost-user: fix lost reconnect
  vhost-user-scsi: start vhost when guest kicks
  vhost-user-scsi: support reconnect to backend
  vhost: move and rename the conn retry times
  vhost-user-common: send get_inflight_fd once
  hw/i386/pc_piix: Make PIIX4 south bridge usable in PC machine
  hw/isa/piix: Implement multi-process QEMU support also for PIIX4
  hw/isa/piix: Resolve duplicate code regarding PCI interrupt wiring
  hw/isa/piix: Reuse PIIX3's PCI interrupt triggering in PIIX4
  hw/isa/piix: Rename functions to be shared for PCI interrupt triggering
  hw/isa/piix: Reuse PIIX3 base class' realize method in PIIX4
  hw/isa/piix: Share PIIX3's base class with PIIX4
  hw/isa/piix: Harmonize names of reset control memory regions
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-23 14:45:29 -07:00
Stefan Hajnoczi
c0c4f14729 virtio: call ->vhost_reset_device() during reset
vhost-user-scsi has a VirtioDeviceClass->reset() function that calls
->vhost_reset_device(). The other vhost devices don't notify the vhost
device upon reset.

Stateful vhost devices may need to handle device reset in order to free
resources or prevent stale device state from interfering after reset.

Call ->vhost_device_reset() from virtio_reset() so that that vhost
devices are notified of device reset.

This patch affects behavior as follows:
- vhost-kernel: No change in behavior since ->vhost_reset_device() is
  not implemented.
- vhost-user: back-ends that negotiate
  VHOST_USER_PROTOCOL_F_RESET_DEVICE now receive a
  VHOST_USER_DEVICE_RESET message upon device reset. Otherwise there is
  no change in behavior. DPDK, SPDK, libvhost-user, and the
  vhost-user-backend crate do not negotiate
  VHOST_USER_PROTOCOL_F_RESET_DEVICE automatically.
- vhost-vdpa: an extra SET_STATUS 0 call is made during device reset.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231004014532.1228637-4-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
2023-10-22 05:18:16 -04:00
Philippe Mathieu-Daudé
da6f5544ce buildsys: Only display Objective-C information when Objective-C is used
When configuring with '--disable-cocoa --disable-coreaudio'
on Darwin, we get:

 meson.build:4081:58: ERROR: Tried to access compiler for language "objc", not specified for host machine.
 meson.build:4097:47: ERROR: Tried to access unknown option 'objc_args'.

Instead of unconditionally display Objective-C informations
on Darwin, display them when Objective-C is discovered.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20231009093812.52915-1-philmd@linaro.org>
2023-10-19 23:13:27 +02:00
Stefan Hajnoczi
604b70f6a4 * build system and Python cleanups
* fix netbsd VM build
 * allow non-relocatable installs
 * allow using command line options to configure qemu-ga
 * target/i386: check intercept for XSETBV
 * target/i386: fix CPUID_HT exposure
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmUvkQQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM3pQgArXCsmnsjlng1chjCvKnIuVmaTYZ5
 aC9pcx7TlyM0+XWtTN0NQhFt71Te+3ioReXIQRvy5O68RNbEkiu8LXfOJhWAHbWk
 vZVtzHQuOZVizeZtUruKlDaw0nZ8bg+NI4aGLs6rs3WphEAM+tiLnZJ0BouiedKS
 e/COB/Hqjok+Ntksbfv5q7XpWjwQB0y2073vM1Mcf0ToOWFLFdL7x0SZ3hxyYlYl
 eoefp/8kbWeUWA7HuoOKmpiLIxmKnY7eXp+UCvdnEhnSce9sCxpn2nzqqLuPItTK
 V3GrJ2//+lrekPHyQvb8IjUMUrPOmzf8GadIE0tkfdHjEP72IsHk0VX81A==
 =rPte
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* build system and Python cleanups
* fix netbsd VM build
* allow non-relocatable installs
* allow using command line options to configure qemu-ga
* target/i386: check intercept for XSETBV
* target/i386: fix CPUID_HT exposure

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmUvkQQUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroM3pQgArXCsmnsjlng1chjCvKnIuVmaTYZ5
# aC9pcx7TlyM0+XWtTN0NQhFt71Te+3ioReXIQRvy5O68RNbEkiu8LXfOJhWAHbWk
# vZVtzHQuOZVizeZtUruKlDaw0nZ8bg+NI4aGLs6rs3WphEAM+tiLnZJ0BouiedKS
# e/COB/Hqjok+Ntksbfv5q7XpWjwQB0y2073vM1Mcf0ToOWFLFdL7x0SZ3hxyYlYl
# eoefp/8kbWeUWA7HuoOKmpiLIxmKnY7eXp+UCvdnEhnSce9sCxpn2nzqqLuPItTK
# V3GrJ2//+lrekPHyQvb8IjUMUrPOmzf8GadIE0tkfdHjEP72IsHk0VX81A==
# =rPte
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 18 Oct 2023 04:02:12 EDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (32 commits)
  configure: define "pkg-config" in addition to "pkgconfig"
  meson: add a note on why we use config_host for program paths
  meson-buildoptions: document the data at the top
  configure, meson: use command line options to configure qemu-ga
  configure: unify handling of several Debian cross containers
  configure: move environment-specific defaults to config-meson.cross
  configure: move target-specific defaults to an external machine file
  configure: remove some dead cruft
  configure: clean up PIE option handling
  configure: clean up plugin option handling
  configure, tests/tcg: simplify GDB conditionals
  tests/tcg/arm: move non-SVE tests out of conditional
  hw/remote: move stub vfu_object_set_bus_irq out of stubs/
  hw/xen: cleanup sourcesets
  configure: clean up handling of CFI option
  meson, cutils: allow non-relocatable installs
  meson: do not use set10
  meson: do not build shaders by default
  tracetool: avoid invalid escape in Python string
  tests/vm: avoid invalid escape in Python string
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-18 06:20:41 -04:00
Paolo Bonzini
b4d61d3d81 meson: add a note on why we use config_host for program paths
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-18 10:01:02 +02:00
Paolo Bonzini
a47dd5c516 configure, tests/tcg: simplify GDB conditionals
Unify HAVE_GDB_BIN (currently in config-host.mak) and
HOST_GDB_SUPPORTS_ARCH into a single GDB variable in
config-target.mak.

Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-18 10:01:01 +02:00
Paolo Bonzini
655e2a778d meson, cutils: allow non-relocatable installs
Say QEMU is configured with bindir = "/usr/bin" and a firmware path
that starts with "/usr/share/qemu".  Ever since QEMU 5.2, QEMU's
install has been relocatable: if you move qemu-system-x86_64 from
/usr/bin to /home/username/bin, it will start looking for firmware in
/home/username/share/qemu.  Previously, you would get a non-relocatable
install where the moved QEMU will keep looking for firmware in
/usr/share/qemu.

Windows almost always wants relocatable installs, and in fact that
is why QEMU 5.2 introduced relocatability in the first place.
However, newfangled distribution mechanisms such as AppImage
(https://docs.appimage.org/reference/best-practices.html), and
possibly NixOS, also dislike using at runtime the absolute paths
that were established at build time.

On POSIX systems you almost never care; if you do, your usecase
dictates which one is desirable, so there's no single answer.
Obviously relocatability works fine most of the time, because not many
people have complained about QEMU's switch to relocatable install,
and that's why until now there was no way to disable relocatability.

But a non-relocatable, non-modular binary can help if you want to do
experiments with old firmware and new QEMU or vice versa (because you
can just upgrade/downgrade the firmware package, and use rpm2cpio or
similar to extract the QEMU binaries outside /usr), so allow both.
This patch allows one to build a non-relocatable install using a new
option to configure.  Why?  Because it's not too hard, and because
it helps the user double check the relocatability of their install.

Note that the same code that handles relocation also lets you run QEMU
from the build tree and pick e.g. firmware files from the source tree
transparently.  Therefore that part remains active with this patch,
even if you configure with --disable-relocatable.

Suggested-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-18 10:00:57 +02:00
Paolo Bonzini
230f6e06b8 meson: do not use set10
Make all items of config-host.h consistent.  To keep the --disable-coroutine-pool
code visible to the compiler, mutuate the IS_ENABLED() macro from Linux.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-17 15:20:53 +02:00
Gurchetan Singh
cd9adbefcc gfxstream + rutabaga: meson support
- Add meson detection of rutabaga_gfx
- Build virtio-gpu-rutabaga.c + associated vga/pci files when
  present

Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Alyssa Ross <hi@alyssa.is>
Tested-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Antonio Caggiano <quic_acaggian@quicinc.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
2023-10-16 11:29:56 +04:00
Akihiko Odaki
956af7daad gdbstub: Introduce GDBFeature structure
Before this change, the information from a XML file was stored in an
array that is not descriptive. Introduce a dedicated structure type to
make it easier to understand and to extend with more fields.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230912224107.29669-6-akihiko.odaki@daynix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-13-alex.bennee@linaro.org>
2023-10-11 08:46:33 +01:00
Philippe Mathieu-Daudé
8d7f2e767d system: Rename softmmu/ directory as system/
The softmmu/ directory contains files specific to system
emulation. Rename it as system/. Update meson rules, the
MAINTAINERS file and all the documentation and comments.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-14-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-08 21:08:08 +02:00
Philippe Mathieu-Daudé
01c85e60a4 meson: Rename target_softmmu_arch -> target_system_arch
Finish the convertion started with commit de6cd7599b
("meson: Replace softmmu_ss -> system_ss"). If the
$target_type is 'system', then use the target_system_arch[]
source set :)

Mechanical change doing:

  $ sed -i -e s/target_softmmu_arch/target_system_arch/g \
      $(git grep -l target_softmmu_arch)

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-13-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-07 19:03:07 +02:00
Philippe Mathieu-Daudé
21d2140dd8 meson: Rename softmmu_mods -> system_mods
See commit de6cd7599b ("meson: Replace softmmu_ss -> system_ss")
for rationale.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-12-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-07 19:03:07 +02:00
Stefan Hajnoczi
800af0aae1 accel: Introduce AccelClass::cpu_common_[un]realize
accel: Target agnostic code movement
 accel/tcg: Cleanups to use CPUState instead of CPUArchState
 accel/tcg: Move CPUNegativeOffsetState into CPUState
 tcg: Split out tcg init functions to tcg/startup.h
 linux-user/hppa: Fix struct target_sigcontext layout
 build: Remove --enable-gprof
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUdsL4dHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/iYggAvDJEyMCAXSSH97BA
 wZT/2D/MFIhOMk6xrQRnrXfrG70N0iVKz44jl9j7k1D+9BOHcso//DDJH3c96k9A
 MgDb6W2bsWvC15/Qw6BALf5bb/II0MJuCcQvj3CNX5lNkXAWhwIOBhsZx7V9ST1+
 rihN4nowpRWdV5GeCjDGaJW455Y1gc96hICYHy6Eqw1cUgUFt9vm5aYU3FHlat29
 sYRaVYKUL2hRUPPNcPiPq0AaJ8wN6/s8gT+V1UvTzkhHqskoM4ZU89RchuXVoq1h
 SvhKElyULMRzM7thWtpW8qYJPj4mxZsKArESvHjsunGD6KEz3Fh1sy6EKRcdmpG/
 II1vkg==
 =k2Io
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20231004' of https://gitlab.com/rth7680/qemu into staging

accel: Introduce AccelClass::cpu_common_[un]realize
accel: Target agnostic code movement
accel/tcg: Cleanups to use CPUState instead of CPUArchState
accel/tcg: Move CPUNegativeOffsetState into CPUState
tcg: Split out tcg init functions to tcg/startup.h
linux-user/hppa: Fix struct target_sigcontext layout
build: Remove --enable-gprof

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUdsL4dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/iYggAvDJEyMCAXSSH97BA
# wZT/2D/MFIhOMk6xrQRnrXfrG70N0iVKz44jl9j7k1D+9BOHcso//DDJH3c96k9A
# MgDb6W2bsWvC15/Qw6BALf5bb/II0MJuCcQvj3CNX5lNkXAWhwIOBhsZx7V9ST1+
# rihN4nowpRWdV5GeCjDGaJW455Y1gc96hICYHy6Eqw1cUgUFt9vm5aYU3FHlat29
# sYRaVYKUL2hRUPPNcPiPq0AaJ8wN6/s8gT+V1UvTzkhHqskoM4ZU89RchuXVoq1h
# SvhKElyULMRzM7thWtpW8qYJPj4mxZsKArESvHjsunGD6KEz3Fh1sy6EKRcdmpG/
# II1vkg==
# =k2Io
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 04 Oct 2023 14:36:46 EDT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20231004' of https://gitlab.com/rth7680/qemu: (47 commits)
  tcg/loongarch64: Fix buid error
  tests/avocado: Re-enable MIPS Malta tests (GitLab issue #1884 fixed)
  build: Remove --enable-gprof
  linux-user/hppa: Fix struct target_sigcontext layout
  tcg: Split out tcg init functions to tcg/startup.h
  tcg: Remove argument to tcg_prologue_init
  accel/tcg: Make cpu-exec-common.c a target agnostic unit
  accel/tcg: Make icount.o a target agnostic unit
  accel/tcg: Make monitor.c a target-agnostic unit
  accel/tcg: Rename target-specific 'internal.h' -> 'internal-target.h'
  exec: Rename target specific page-vary.c -> page-vary-target.c
  exec: Rename cpu.c -> cpu-target.c
  accel: Rename accel-common.c -> accel-target.c
  accel: Make accel-blocker.o target agnostic
  accel/tcg: Restrict dump_exec_info() declaration
  exec: Move cpu_loop_foo() target agnostic functions to 'cpu-common.h'
  exec: Make EXCP_FOO definitions target agnostic
  accel/tcg: move ld/st helpers to ldst_common.c.inc
  accel/tcg: Unify user and softmmu do_[st|ld]*_mmu()
  accel/tcg: Remove env_tlb()
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-05 08:55:34 -04:00
Richard Henderson
a0bc599726 build: Remove --enable-gprof
This build option has been deprecated since 8.0.
Remove all CONFIG_GPROF code that depends on that,
including one errant check using TARGET_GPROF.

Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-04 11:03:54 -07:00
Philippe Mathieu-Daudé
8c7907a180 exec: Rename target specific page-vary.c -> page-vary-target.c
This matches the target agnostic 'page-vary-common.c' counterpart.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230914185718.76241-8-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-04 11:03:54 -07:00
Philippe Mathieu-Daudé
fe0007f3c1 exec: Rename cpu.c -> cpu-target.c
We have exec/cpu code split in 2 files for target agnostic
("common") and specific. Rename 'cpu.c' which is target
specific using the '-target' suffix. Update MAINTAINERS.
Remove the 's from 'cpus-common.c' to match the API cpu_foo()
functions.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230914185718.76241-7-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-04 11:03:54 -07:00
Daniel P. Berrangé
9afa888ce0 osdep: set _FORTIFY_SOURCE=2 when optimization is enabled
Currently we set _FORTIFY_SOURCE=2 as a compiler argument when the
meson 'optimization' setting is non-zero, the compiler is GCC and
the target is Linux.

While the default QEMU optimization level is 2, user could override
this by setting CFLAGS="-O0" or --extra-cflags="-O0" when running
configure and this won't be reflected in the meson 'optimization'
setting. As a result we try to enable _FORTIFY_SOURCE=2 and then the
user gets compile errors as it only works with optimization.

Rather than trying to improve detection in meson, it is simpler to
just check the __OPTIMIZE__ define from osdep.h.

The comment about being incompatible with clang appears to be
outdated, as compilation works fine without excluding clang.

In the coroutine code we must set _FORTIFY_SOURCE=0 to stop the
logic in osdep.h then enabling it.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20231003091549.223020-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-04 09:52:06 -04:00
Paolo Bonzini
4c545a05ab meson: clean up static_library keyword arguments
These are either built because they are dependencies of other targets,
or not needed at all because they are used via extract_objects().
Mark them as "build_by_default: false"; if applicable, mark them
as "fa" so that -Wl,--whole-archive does not interact with the
linker script used for fuzzing.

(The "fa" hack is brittle; updating to Meson 1.1 would allow using
declare_dependency(objects: ...) instead).

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1044
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-09-29 09:33:10 +02:00
Paolo Bonzini
f0df613b98 make-release: do not ship dtc sources
A new enough libfdt is included in all of Debian 11, Ubuntu 20.04
and MSYS2.  It has also been included for several minor releases
in Fedora and openSUSE Leap, as well as in CentOS.  Therefore
there is no need anymore to ship the sources together with the QEMU
tarballs.

Keep the wrap file so that it can be used with --enable-download,
but do not ship the sources anymore with either archive-source.sh
or make-release.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-09-29 09:33:10 +02:00
Thomas Huth
c64023b0ba meson.build: Make keyutils independent from keyring
Commit 0db0fbb5cf ("Add conditional dependency for libkeyutils")
tried to provide a possibility for the user to disable keyutils
if not required by makeing it depend on the keyring feature. This
looked reasonable at a first glance (the unit test in tests/unit/
needs both), but the condition in meson.build fails if the feature
is meant to be detected automatically, and there is also another
spot in backends/meson.build where keyutils is used independently
from keyring. So let's remove the dependency on keyring again and
introduce a proper meson build option instead.

Cc: qemu-stable@nongnu.org
Fixes: 0db0fbb5cf ("Add conditional dependency for libkeyutils")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1842
Message-ID: <20230824094208.255279-1-thuth@redhat.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-09-25 07:53:40 +02:00
Ilya Maximets
cb039ef3d9 net: add initial support for AF_XDP network backend
AF_XDP is a network socket family that allows communication directly
with the network device driver in the kernel, bypassing most or all
of the kernel networking stack.  In the essence, the technology is
pretty similar to netmap.  But, unlike netmap, AF_XDP is Linux-native
and works with any network interfaces without driver modifications.
Unlike vhost-based backends (kernel, user, vdpa), AF_XDP doesn't
require access to character devices or unix sockets.  Only access to
the network interface itself is necessary.

This patch implements a network backend that communicates with the
kernel by creating an AF_XDP socket.  A chunk of userspace memory
is shared between QEMU and the host kernel.  4 ring buffers (Tx, Rx,
Fill and Completion) are placed in that memory along with a pool of
memory buffers for the packet data.  Data transmission is done by
allocating one of the buffers, copying packet data into it and
placing the pointer into Tx ring.  After transmission, device will
return the buffer via Completion ring.  On Rx, device will take
a buffer form a pre-populated Fill ring, write the packet data into
it and place the buffer into Rx ring.

AF_XDP network backend takes on the communication with the host
kernel and the network interface and forwards packets to/from the
peer device in QEMU.

Usage example:

  -device virtio-net-pci,netdev=guest1,mac=00:16:35:AF:AA:5C
  -netdev af-xdp,ifname=ens6f1np1,id=guest1,mode=native,queues=1

XDP program bridges the socket with a network interface.  It can be
attached to the interface in 2 different modes:

1. skb - this mode should work for any interface and doesn't require
         driver support.  With a caveat of lower performance.

2. native - this does require support from the driver and allows to
            bypass skb allocation in the kernel and potentially use
            zero-copy while getting packets in/out userspace.

By default, QEMU will try to use native mode and fall back to skb.
Mode can be forced via 'mode' option.  To force 'copy' even in native
mode, use 'force-copy=on' option.  This might be useful if there is
some issue with the driver.

Option 'queues=N' allows to specify how many device queues should
be open.  Note that all the queues that are not open are still
functional and can receive traffic, but it will not be delivered to
QEMU.  So, the number of device queues should generally match the
QEMU configuration, unless the device is shared with something
else and the traffic re-direction to appropriate queues is correctly
configured on a device level (e.g. with ethtool -N).
'start-queue=M' option can be used to specify from which queue id
QEMU should start configuring 'N' queues.  It might also be necessary
to use this option with certain NICs, e.g. MLX5 NICs.  See the docs
for examples.

In a general case QEMU will need CAP_NET_ADMIN and CAP_SYS_ADMIN
or CAP_BPF capabilities in order to load default XSK/XDP programs to
the network interface and configure BPF maps.  It is possible, however,
to run with no capabilities.  For that to work, an external process
with enough capabilities will need to pre-load default XSK program,
create AF_XDP sockets and pass their file descriptors to QEMU process
on startup via 'sock-fds' option.  Network backend will need to be
configured with 'inhibit=on' to avoid loading of the program.
QEMU will need 32 MB of locked memory (RLIMIT_MEMLOCK) per queue
or CAP_IPC_LOCK.

There are few performance challenges with the current network backends.

First is that they do not support IO threads.  This means that data
path is handled by the main thread in QEMU and may slow down other
work or may be slowed down by some other work.  This also means that
taking advantage of multi-queue is generally not possible today.

Another thing is that data path is going through the device emulation
code, which is not really optimized for performance.  The fastest
"frontend" device is virtio-net.  But it's not optimized for heavy
traffic either, because it expects such use-cases to be handled via
some implementation of vhost (user, kernel, vdpa).  In practice, we
have virtio notifications and rcu lock/unlock on a per-packet basis
and not very efficient accesses to the guest memory.  Communication
channels between backend and frontend devices do not allow passing
more than one packet at a time as well.

Some of these challenges can be avoided in the future by adding better
batching into device emulation or by implementing vhost-af-xdp variant.

There are also a few kernel limitations.  AF_XDP sockets do not
support any kinds of checksum or segmentation offloading.  Buffers
are limited to a page size (4K), i.e. MTU is limited.  Multi-buffer
support implementation for AF_XDP is in progress, but not ready yet.
Also, transmission in all non-zero-copy modes is synchronous, i.e.
done in a syscall.  That doesn't allow high packet rates on virtual
interfaces.

However, keeping in mind all of these challenges, current implementation
of the AF_XDP backend shows a decent performance while running on top
of a physical NIC with zero-copy support.

Test setup:

2 VMs running on 2 physical hosts connected via ConnectX6-Dx card.
Network backend is configured to open the NIC directly in native mode.
The driver supports zero-copy.  NIC is configured to use 1 queue.

Inside a VM - iperf3 for basic TCP performance testing and dpdk-testpmd
for PPS testing.

iperf3 result:
 TCP stream      : 19.1 Gbps

dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
 Tx only         : 3.4 Mpps
 Rx only         : 2.0 Mpps
 L2 FWD Loopback : 1.5 Mpps

In skb mode the same setup shows much lower performance, similar to
the setup where pair of physical NICs is replaced with veth pair:

iperf3 result:
  TCP stream      : 9 Gbps

dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
  Tx only         : 1.2 Mpps
  Rx only         : 1.0 Mpps
  L2 FWD Loopback : 0.7 Mpps

Results in skb mode or over the veth are close to results of a tap
backend with vhost=on and disabled segmentation offloading bridged
with a NIC.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> (docker/lcitool)
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00