Commit Graph

115061 Commits

Author SHA1 Message Date
Philippe Mathieu-Daudé
c9e0b9a59c hw/char/goldfish: Use DMA memory API
Rather than using address_space_rw(..., 0 or 1),
use the simpler DMA memory API which expand to
the same code. This allows removing a cast on
the 'buf' variable which is really const. Since
'buf' is only used in the CMD_READ_BUFFER case,
we can reduce its scope.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240723181850.46000-1-philmd@linaro.org>
2024-07-23 22:34:33 +02:00
Zhao Liu
2f28f28e74 hw/nubus/virtio-mmio: Fix missing ERRP_GUARD() in realize handler
According to the comment in qapi/error.h, dereferencing @errp requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.

In nubus_virtio_mmio_realize(), @errp is dereferenced without
ERRP_GUARD().

Although nubus_virtio_mmio_realize() - as a DeviceClass.realize()
method - is never passed a null @errp argument, it should follow the
rules on @errp usage.  Add the ERRP_GUARD() there.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240723161802.1377985-1-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 22:34:09 +02:00
Yao Xingtao
1aef44805d dump: make range overlap check more readable
use ranges_overlap() instead of open-coding the overlap check to improve
the readability of the code.

Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20240722040742.11513-13-yaoxt.fnst@fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Yao Xingtao
7cd9b9d476 crypto/block-luks: make range overlap check more readable
use ranges_overlap() instead of open-coding the overlap check to improve
the readability of the code.

Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240722040742.11513-12-yaoxt.fnst@fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Yao Xingtao
13c59a0e9e system/memory_mapping: make range overlap check more readable
use ranges_overlap() instead of open-coding the overlap check to improve
the readability of the code.

Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240722040742.11513-10-yaoxt.fnst@fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Yao Xingtao
2a48b590f7 sparc/ldst_helper: make range overlap check more readable
use ranges_overlap() instead of open-coding the overlap check to improve
the readability of the code.

Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240722040742.11513-9-yaoxt.fnst@fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Yao Xingtao
7b3e371526 cxl/mailbox: make range overlap check more readable
use ranges_overlap() instead of open-coding the overlap check to improve
the readability of the code.

Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240722040742.11513-5-yaoxt.fnst@fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Yao Xingtao
007643bddc util/range: Make ranges_overlap() return bool
Just like range_overlaps_range(), use the returned bool value
to check whether 2 given ranges overlap.

Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240722040742.11513-2-yaoxt.fnst@fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Yao Xingtao
c8f1a322d1 hw/mips/loongson3_virt: remove useless type cast
The type of kernel_entry, kernel_low and kernel_high is uint64_t, cast
the pointer of this type to uint64_t* is useless.

Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240722091728.4334-2-yaoxt.fnst@fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
BALATON Zoltan
53858a6a30 hw/i2c/mpc_i2c: Fix mmio region size
The last register of this device is at offset 0x14 occupying 8 bits so
to cover it the mmio region needs to be 0x15 bytes long. Also correct
the name of the field storing this register value to match the
register name.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Fixes: 7abb479c7a ("PPC: E500: Add FSL I2C controller")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240721225506.B32704E6039@zero.eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Thomas Weißschuh
16c84ec1be docs/interop/firmware.json: convert "Example" section
Since commit 3c5f6114d9 ("qapi: remove "Example" doc section")
the "Example" section is not valid anymore.
It has been replaced by the "qmp-example" directive.

This was not detected earlier as firmware.json was not validated.
As this validation is about to be added, adapt firmware.json.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Message-ID: <20240719-qapi-firmware-json-v6-3-c2e3de390b58@linutronix.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Thomas Weißschuh
d5d0070bfe docs/interop/firmware.json: add new enum FirmwareArchitecture
Only a small subset of all architectures supported by qemu make use of
firmware files. Introduce and use a new enum to represent this.

This also removes the dependency to machine.json from the global qapi
definitions.

Claim "Since: 3.0" for the new enum, because that's correct for most of
its members, and the members are what matters in the interface.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240719-qapi-firmware-json-v6-2-c2e3de390b58@linutronix.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Thomas Weißschuh
220434f099 docs/interop/firmware.json: add new enum FirmwareFormat
Only a small subset of all blockdev drivers make sense for firmware
images. Introduce and use a new enum to represent this.

This also reduces the dependency on firmware.json from the global qapi
definitions.

Claim "Since: 3.0" for the new enum, because that's correct for its
members, and the members are what matters in the interface.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240719-qapi-firmware-json-v6-1-c2e3de390b58@linutronix.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Philippe Mathieu-Daudé
9ea0f206b7 docs: Correct Loongarch -> LoongArch
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Tested-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20240718133312.10324-20-philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Philippe Mathieu-Daudé
13e8ec6cf3 hw/intc/loongson_ipi: Declare QOM types using DEFINE_TYPES() macro
When multiple QOM types are registered in the same file,
it is simpler to use the the DEFINE_TYPES() macro. Replace
the type_init() / type_register_static() combination.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Tested-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20240718133312.10324-2-philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Philippe Mathieu-Daudé
0c2086bc73 hw/intc/loongson_ipi: Fix resource leak
Once initialised, QOM objects can be realized and
unrealized multiple times before being finalized.
Resources allocated in REALIZE must be deallocated
in an equivalent UNREALIZE handler.

Free the CPU array in loongson_ipi_unrealize()
instead of loongson_ipi_finalize().

Cc: qemu-stable@nongnu.org
Fixes: 5e90b8db38 ("hw/loongarch: Set iocsr address space per-board rather than percpu")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240723111405.14208-3-philmd@linaro.org>
2024-07-23 20:30:36 +02:00
Bibo Mao
2465c89fb9 hw/intc/loongson_ipi: Access memory in little endian
Loongson IPI is only available in little-endian,
so use that to access the guest memory (in case
we run on a big-endian host).

Cc: qemu-stable@nongnu.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Fixes: f6783e3438 ("hw/loongarch: Add LoongArch ipi interrupt support")
[PMD: Extracted from bigger commit, added commit description]
Co-Developed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Tested-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20240718133312.10324-3-philmd@linaro.org>
2024-07-23 20:30:30 +02:00
Warner Losh
afdb6be1bd bsd-user: Add aarch64 build to tree
Add the aarch64 bsd-user fragments needed to build the new aarch64 code.

Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23 10:56:30 -06:00
Warner Losh
1c687f65b4 bsd-user: Make compile for non-linux user-mode stuff
We include the files that define PR_MTE_TCF_SHIFT only on Linux, but use
them unconditionally. Restrict its use to Linux-only.

"It's ugly, but it's not actually wrong."

Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23 10:56:30 -06:00
Warner Losh
5fa2a10ba6 bsd-user: Define TARGET_SIGSTACK_ALIGN and use it to round stack
Most (all?) targets require stacks to be properly aligned. Rather than a
series of ifdefs in bsd-user/signal.h, instead use a manditory #define
for all architectures.

Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23 10:56:30 -06:00
Jessica Clarke
5b6828d194 bsd-user: Sync fork_start/fork_end with linux-user
This reorders some of the calls, deduplicates code between branches and,
most importantly, fixes a double end_exclusive call in the parent that
will cause exclusive_context_count to go negative.

Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Pull-Request: https://github.com/qemu-bsd-user/qemu-bsd-user/pull/52
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23 10:56:25 -06:00
Warner Losh
b314fd06cf bsd-user: Hard wire aarch64 to be 4k pages only
Only support 4k pages for aarch64 binaries. The variable page size stuff
isn't working just yet, so put in this lessor-of-evils kludge until that
is complete.

Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23 10:50:55 -06:00
Doug Rabson
e6e102b972 bsd-user: Simplify the implementation of execve
This removes the logic which prepends the emulator to each call to
execve and fexecve. This is not necessary with the existing
imgact_binmisc support and it avoids the need to install the emulator
binary into jail environments when using 'binmiscctl --pre-open'.

Signed-off-by: Doug Rabson <dfr@rabson.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23 10:50:54 -06:00
Stacey Son
ce6c541dcb bsd-user:Add AArch64 improvements and signal handling functions
Added get_ucontext_sigreturn function to check processor state ensuring current execution mode is EL0 and no flags
indicating interrupts or exceptions are set.
Updated AArch64 code to use CF directly without reading/writing the entire processor state, improving efficiency.
Changed FP data structures to use Int128 instead of __uint128_t, leveraging QEMU's generic mechanism for referencing this type.

Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-9-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-07-23 10:50:54 -06:00
Stacey Son
dadfc6d5df bsd-user:Add set_mcontext function for ARM AArch64
The function copies register values from the provided target_mcontext_t
structure to the CPUARMState registers.
Note:FP is unfinished upstream but will be a separate commit coming soon.

Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-8-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-07-23 10:50:54 -06:00
Warner Losh
c88f44d85a bsd-user:Add setup_sigframe_arch function for ARM AArch64
The function utilizes the `get_mcontext` function to retrieve the machine
context for the current CPUARMState

Signed-off-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-7-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-07-23 10:50:54 -06:00
Stacey Son
9959fae592 bsd-user:Add get_mcontext function for ARM AArch64
function to retrieve machine context,it populates the provided
target_mcontext_t structure with information from the CPUARMState
registers.

Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Kyle Evans <kevans@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-6-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-07-23 10:50:54 -06:00
Stacey Son
7dba5e10a6 bsd-user:Add ARM AArch64 signal handling support
Added sigcode setup function for signal trampoline which initializes a sequence of instructions
to handle signal returns and exits, copying this code to the target offset.
Defined ARM AArch64 specific signal definitions including register indices and sizes,
and introduced structures to represent general purpose registers, floating point registers, and machine context.
Added function to set up signal handler arguments, populating register values in `CPUARMState`
based on the provided signal, signal frame, signal action, and frame address.

Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-5-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-07-23 10:50:54 -06:00
Warner Losh
1541d87db2 bsd-user:Add ARM AArch64 support and capabilities
Added function to access rval2 by accessing the x1 register.
Defined ARM AArch64 ELF parameters including mmap and dynamic load addresses.
Introduced extensive hardware capability definitions and macros for retrieving hardware capability (hwcap) flags.
Implemented function to retrieve ARM AArch64 hardware capabilities using the `GET_FEATURE_ID` macro.
Added function to retrieve extended ARM AArch64 hardware capability flags.

Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
Co-authored-by: Kyle Evans <kevans@FreeBSD.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-4-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-07-23 10:50:54 -06:00
Stacey Son
1acce7718b bsd-user:Add AArch64 register handling and related functions
Added header file for managing CPU register states in FreeBSD user mode.
Introduced prototypes for setting and getting thread-local storage (TLS).
Implemented AArch64 sysarch() system call emulation and a printing function.
Added function for setting up thread upcall to add thread support to BSD-USER.
Initialized thread's register state during thread setup.
Updated ARM AArch64 VM parameter definitions for bsd-user, including address spaces for FreeBSD/arm64 and
a function for getting the stack pointer from CPU and setting a return value.

Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
Co-authored-by: Sean Bruno <sbruno@freebsd.org>
Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-3-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-07-23 10:50:54 -06:00
Stacey Son
8cbb4fc12e bsd-user:Add CPU initialization and management functions
Added function to initialize ARM CPU and check if it supports 64-bit mode.
Implemented CPU loop function to handle exceptions and emulate execution of instructions.
Added function to clone CPU state to create a new thread.
Included AArch64 specific CPU functions for bsd-user to set and receive thread-local-storage
value from the tpidr_el0 register.
Introduced structure for storing CPU register states for BSD-USER.

Signed-off-by: Stacey Son <sson@FreeBSD.org>
Signed-off-by: Ajeet Singh <itachis@FreeBSD.org>
Co-authored-by: Kyle Evans <kevans@freebsd.org>
Co-authored-by: Sean Bruno <sbruno@freebsd.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240707191128.10509-2-itachis@FreeBSD.org>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-07-23 10:50:54 -06:00
Clément Mathieu--Drif
35422553bc hw/i386/intel_iommu: Extract device IOTLB invalidation logic
This piece of code can be shared by both IOTLB invalidation and
PASID-based IOTLB invalidation

No functional changes intended.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-ID: <20240718081636.879544-12-zhenzhong.duan@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-23 18:08:44 +02:00
Philippe Mathieu-Daudé
99481a0988 accel: Restrict probe_access*() functions to TCG
This API is specific to TCG (already handled by hardware
accelerators), so restrict it with #ifdef'ry. Remove
unnecessary stubs.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240529155918.6221-1-philmd@linaro.org>
2024-07-23 18:08:44 +02:00
Joao Martins
30b9167785 vfio/common: Allow disabling device dirty page tracking
The property 'x-pre-copy-dirty-page-tracking' allows disabling the whole
tracking of VF pre-copy phase of dirty page tracking, though it means
that it will only be used at the start of the switchover phase.

Add an option that disables the VF dirty page tracking, and fall
back into container-based dirty page tracking. This also allows to
use IOMMU dirty tracking even on VFs with their own dirty
tracker scheme.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
2024-07-23 17:14:53 +02:00
Joao Martins
f48b472450 vfio/migration: Don't block migration device dirty tracking is unsupported
By default VFIO migration is set to auto, which will support live
migration if the migration capability is set *and* also dirty page
tracking is supported.

For testing purposes one can force enable without dirty page tracking
via enable-migration=on, but that option is generally left for testing
purposes.

So starting with IOMMU dirty tracking it can use to accommodate the lack of
VF dirty page tracking allowing us to minimize the VF requirements for
migration and thus enabling migration by default for those too.

While at it change the error messages to mention IOMMU dirty tracking as
well.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
[ clg: - spelling in commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-07-23 17:14:53 +02:00
Joao Martins
7c30710bd9 vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support
ioctl(iommufd, IOMMU_HWPT_GET_DIRTY_BITMAP, arg) is the UAPI
that fetches the bitmap that tells what was dirty in an IOVA
range.

A single bitmap is allocated and used across all the hwpts
sharing an IOAS which is then used in log_sync() to set Qemu
global bitmaps.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
2024-07-23 17:14:52 +02:00
Joao Martins
52ce88229c vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support
ioctl(iommufd, IOMMU_HWPT_SET_DIRTY_TRACKING, arg) is the UAPI that
enables or disables dirty page tracking. The ioctl is used if the hwpt
has been created with dirty tracking supported domain (stored in
hwpt::flags) and it is called on the whole list of iommu domains.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-07-23 17:14:52 +02:00
Joao Martins
dddfd8d667 vfio/iommufd: Probe and request hwpt dirty tracking capability
In preparation to using the dirty tracking UAPI, probe whether the IOMMU
supports dirty tracking. This is done via the data stored in
hiod::caps::hw_caps initialized from GET_HW_INFO.

Qemu doesn't know if VF dirty tracking is supported when allocating
hardware pagetable in iommufd_cdev_autodomains_get(). This is because
VFIODevice migration state hasn't been initialized *yet* hence it can't pick
between VF dirty tracking vs IOMMU dirty tracking. So, if IOMMU supports
dirty tracking it always creates HWPTs with IOMMU_HWPT_ALLOC_DIRTY_TRACKING
even if later on VFIOMigration decides to use VF dirty tracking instead.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[ clg: - Fixed vbasedev->iommu_dirty_tracking assignment in
         iommufd_cdev_autodomains_get()
       - Added warning for heterogeneous dirty page tracking support
	 in iommufd_cdev_autodomains_get() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
2024-07-23 17:14:52 +02:00
Joao Martins
83a4d596a9 vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device()
Move the HostIOMMUDevice::realize() to be invoked during the attach of the device
before we allocate IOMMUFD hardware pagetable objects (HWPT). This allows the use
of the hw_caps obtained by IOMMU_GET_HW_INFO that essentially tell if the IOMMU
behind the device supports dirty tracking.

Note: The HostIOMMUDevice data from legacy backend is static and doesn't
need any information from the (type1-iommu) backend to be initialized.
In contrast however, the IOMMUFD HostIOMMUDevice data requires the
iommufd FD to be connected and having a devid to be able to successfully
GET_HW_INFO. This means vfio_device_hiod_realize() is called in
different places within the backend .attach_device() implementation.

Suggested-by: Cédric Le Goater <clg@redhat.cm>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
[ clg: Fixed error handling in iommufd_cdev_attach() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-07-23 17:14:52 +02:00
Joao Martins
21e8d3a3aa vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps
Store the value of @caps returned by iommufd_backend_get_device_info()
in a new field HostIOMMUDeviceCaps::hw_caps. Right now the only value is
whether device IOMMU supports dirty tracking (IOMMU_HW_CAP_DIRTY_TRACKING).

This is in preparation for HostIOMMUDevice::realize() being called early
during attach_device().

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-07-23 17:14:52 +02:00
Joao Martins
6c63532642 vfio/{iommufd,container}: Remove caps::aw_bits
Remove caps::aw_bits which requires the bcontainer::iova_ranges being
initialized after device is actually attached. Instead defer that to
.get_cap() and call vfio_device_get_aw_bits() directly.

This is in preparation for HostIOMMUDevice::realize() being called early
during attach_device().

Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-07-23 17:14:52 +02:00
Joao Martins
5b1e96e654 vfio/iommufd: Introduce auto domain creation
There's generally two modes of operation for IOMMUFD:

1) The simple user API which intends to perform relatively simple things
with IOMMUs e.g. DPDK. The process generally creates an IOAS and attaches
to VFIO and mainly performs IOAS_MAP and UNMAP.

2) The native IOMMUFD API where you have fine grained control of the
IOMMU domain and model it accordingly. This is where most new feature
are being steered to.

For dirty tracking 2) is required, as it needs to ensure that
the stage-2/parent IOMMU domain will only attach devices
that support dirty tracking (so far it is all homogeneous in x86, likely
not the case for smmuv3). Such invariant on dirty tracking provides a
useful guarantee to VMMs that will refuse incompatible device
attachments for IOMMU domains.

Dirty tracking insurance is enforced via HWPT_ALLOC, which is
responsible for creating an IOMMU domain. This is contrast to the
'simple API' where the IOMMU domain is created by IOMMUFD automatically
when it attaches to VFIO (usually referred as autodomains) but it has
the needed handling for mdevs.

To support dirty tracking with the advanced IOMMUFD API, it needs
similar logic, where IOMMU domains are created and devices attached to
compatible domains. Essentially mimicking kernel
iommufd_device_auto_get_domain(). With mdevs given there's no IOMMU domain
it falls back to IOAS attach.

The auto domain logic allows different IOMMU domains to be created when
DMA dirty tracking is not desired (and VF can provide it), and others where
it is. Here it is not used in this way given how VFIODevice migration
state is initialized after the device attachment. But such mixed mode of
IOMMU dirty tracking + device dirty tracking is an improvement that can
be added on. Keep the 'all of nothing' of type1 approach that we have
been using so far between container vs device dirty tracking.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
[ clg: Added ERRP_GUARD() in iommufd_cdev_autodomains_get() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-07-23 17:14:52 +02:00
Zhenzhong Duan
8b8705e7f2 vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev
mdevs aren't "physical" devices and when asking for backing IOMMU info,
it fails the entire provisioning of the guest. Fix that by setting
vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
presence of mdevs.

Fixes: 9305895201 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-07-23 17:14:52 +02:00
Zhenzhong Duan
c598d65aef vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev
mdevs aren't "physical" devices and when asking for backing IOMMU info,
it fails the entire provisioning of the guest. Fix that by setting
vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
presence of mdevs.

Fixes: 9305895201 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-07-23 17:14:52 +02:00
Joao Martins
b07dcb7d4f vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt()
In preparation to implement auto domains have the attach function
return the errno it got during domain attach instead of a bool.

-EINVAL is tracked to track domain incompatibilities, and decide whether
to create a new IOMMU domain.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
2024-07-23 17:14:52 +02:00
Joao Martins
2d1bf25897 backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities
The helper will be able to fetch vendor agnostic IOMMU capabilities
supported both by hardware and software. Right now it is only iommu dirty
tracking.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-07-23 17:14:52 +02:00
Joao Martins
9f17604195 vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev
mdevs aren't "physical" devices and when asking for backing IOMMU info, it
fails the entire provisioning of the guest. Fix that by skipping
HostIOMMUDevice initialization in the presence of mdevs, and skip setting
an iommu device when it is known to be an mdev.

Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>
Fixes: 9305895201 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
2024-07-23 17:14:52 +02:00
Joao Martins
13e522f644 vfio/pci: Extract mdev check into an helper
In preparation to skip initialization of the HostIOMMUDevice for mdev,
extract the checks that validate if a device is an mdev into helpers.

A vfio_device_is_mdev() is created, and subsystems consult VFIODevice::mdev
to check if it's mdev or not.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
2024-07-23 17:14:52 +02:00
Eric Auger
07321a6d08 hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize()
In vfio_connect_container's error path, the base container is
removed twice form the VFIOAddressSpace QLIST: first on the
listener_release_exit label and second, on free_container_exit
label, through object_unref(container), which calls
vfio_container_instance_finalize().

Let's remove the first instance.

Fixes: 938026053f ("vfio/container: Switch to QOM")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
2024-07-23 17:14:52 +02:00
Richard Henderson
3b5efc553e qga-pull-2024-07-23
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmafUs0ACgkQ711egWG6
 hOffwQ/+PMFMOq3jwV11Na0GnrFHT0SLlcxNWYGQjE0Q/nwuYWMTKdo2iB9rVC7T
 qxaT6PLtTZPgRsJudJ5kkvLFw88Nr6BuWl31tCVeALUO7C0oTg/oRDfYVeH4/jfG
 PS5TiM6ie27SvI5lhGZhd9sRAy8N6NGgT6Fh+pS2tVVfftcfVYKVmnzgtvk314A+
 MpeW8ukVruSW+9G+suXaE750g/drZJAoepC5pW1HXdHE+IuzXNdMWZqwMqBZSM5T
 X8VcLvMjFrFrfLOP2el6mloriw67aJyKe9Uwsp548HdXfZKrLCmaR7cZK5zKVQDK
 Rzolyuw19wNNi0TZAwmP+MBioDiIHcM4nNhVDCHIVCbXzQHa4BhAr/cr8uucyfM5
 hdCWmaTl4Tksk4q4ooHurDWshV26QNRbLRD1Vx1Rhrwz42MmU2VG13PsSWqLj00I
 fj1LzhQOmr26cewgayIL7ODwHDXiwKi+6lKS1OyTjXXubucScgxSyTNC785T6Rvk
 T58KAnBRD3vDhE7Dn/4KdRClRFY+7R2/jcHdFnA4vfvOVV8ZXp/m0O0wfLEikH6/
 dGDDVBLNG5gqV477++0wdqkYFq6MmON3PH/EA6rgZYc4At5kS+HFNASBvnFRYMGf
 dgtyj8jV5uoffqYOqyXxClP6eTgV1EZ0/wKZ8uJipivB7azjnkE=
 =xzjT
 -----END PGP SIGNATURE-----

Merge tag 'qga-pull-2024-07-23' of https://github.com/kostyanf14/qemu into staging

qga-pull-2024-07-23

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmafUs0ACgkQ711egWG6
# hOffwQ/+PMFMOq3jwV11Na0GnrFHT0SLlcxNWYGQjE0Q/nwuYWMTKdo2iB9rVC7T
# qxaT6PLtTZPgRsJudJ5kkvLFw88Nr6BuWl31tCVeALUO7C0oTg/oRDfYVeH4/jfG
# PS5TiM6ie27SvI5lhGZhd9sRAy8N6NGgT6Fh+pS2tVVfftcfVYKVmnzgtvk314A+
# MpeW8ukVruSW+9G+suXaE750g/drZJAoepC5pW1HXdHE+IuzXNdMWZqwMqBZSM5T
# X8VcLvMjFrFrfLOP2el6mloriw67aJyKe9Uwsp548HdXfZKrLCmaR7cZK5zKVQDK
# Rzolyuw19wNNi0TZAwmP+MBioDiIHcM4nNhVDCHIVCbXzQHa4BhAr/cr8uucyfM5
# hdCWmaTl4Tksk4q4ooHurDWshV26QNRbLRD1Vx1Rhrwz42MmU2VG13PsSWqLj00I
# fj1LzhQOmr26cewgayIL7ODwHDXiwKi+6lKS1OyTjXXubucScgxSyTNC785T6Rvk
# T58KAnBRD3vDhE7Dn/4KdRClRFY+7R2/jcHdFnA4vfvOVV8ZXp/m0O0wfLEikH6/
# dGDDVBLNG5gqV477++0wdqkYFq6MmON3PH/EA6rgZYc4At5kS+HFNASBvnFRYMGf
# dgtyj8jV5uoffqYOqyXxClP6eTgV1EZ0/wKZ8uJipivB7azjnkE=
# =xzjT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 04:50:53 PM AEST
# gpg:                using RSA key C2C2C109EA43C63C1423EB84EF5D5E8161BA84E7
# gpg: Good signature from "Kostiantyn Kostiuk (Upstream PR sign) <kkostiuk@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: C2C2 C109 EA43 C63C 1423  EB84 EF5D 5E81 61BA 84E7

* tag 'qga-pull-2024-07-23' of https://github.com/kostyanf14/qemu: (25 commits)
  qga/linux: Add new api 'guest-network-get-route'
  guest-agent: document allow-rpcs in config file section
  qga/commands-posix: Make ga_wait_child() return boolean
  qga: centralize logic for disabling/enabling commands
  qga: allow configuration file path via the cli
  qga: remove pointless 'blockrpcs_key' variable
  qga: move declare of QGAConfig struct to top of file
  qga: don't disable fsfreeze commands if vss_init fails
  qga: conditionalize schema for commands not supported on other UNIX
  qga: conditionalize schema for commands requiring utmpx
  qga: conditionalize schema for commands requiring libudev
  qga: conditionalize schema for commands requiring fstrim
  qga: conditionalize schema for commands requiring fsfreeze
  qga: conditionalize schema for commands only supported on Windows
  qga: conditionalize schema for commands requiring linux/win32
  qga: conditionalize schema for commands requiring getifaddrs
  qga: conditionalize schema for commands unsupported on non-Linux POSIX
  qga: conditionalize schema for commands unsupported on Windows
  qga: move CONFIG_FSFREEZE/TRIM to be meson defined options
  qga: move linux memory block command impls to commands-linux.c
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-24 01:00:22 +10:00