The code for powering on a CPU in arm-powerctl.c has two separate
use cases:
* emulation of a real hardware power controller
* emulation of firmware interfaces (primarily PSCI) with
CPU on/off APIs
For the first case, we only need to reset the CPU and set its
starting PC and X0. For the second case, because we're emulating the
firmware we need to ensure that it's in the state that the firmware
provides. In particular, when we reset to a lower EL than the
highest one we are emulating, we need to put the CPU into a state
that permits correct running at that lower EL. We already do a
little of this in arm-powerctl.c (for instance we set SCR_HCE to
enable the HVC insn) but we don't do enough of it. This means that
in the case where we are emulating EL3 but also providing emulated
PSCI the guest will crash when a secondary core tries to use a
feature that needs an SCR_EL3 bit to be set, such as MTE or PAuth.
The hw/arm/boot.c code also has to support this "start guest code in
an EL that's lower than the highest emulated EL" case in order to do
direct guest kernel booting; it has all the necessary initialization
code to set the SCR_EL3 bits. Pull the relevant boot.c code out into
a separate function so we can share it between there and
arm-powerctl.c.
This refactoring has a few code changes that look like they
might be behaviour changes but aren't:
* if info->secure_boot is false and info->secure_board_setup is
true, then the old code would start the first CPU in Hyp
mode but without changing SCR.NS and NSACR.{CP11,CP10}.
This was wrong behaviour because there's no such thing
as Secure Hyp mode. The new code will leave the CPU in SVC.
(There is no board which sets secure_boot to false and
secure_board_setup to true, so this isn't a behaviour
change for any of our boards.)
* we don't explicitly clear SCR.NS when arm-powerctl.c
does a CPU-on to EL3. This was a no-op because CPU reset
will reset to NS == 0.
And some real behaviour changes:
* we no longer set HCR_EL2.RW when booting into EL2: the guest
can and should do that themselves before dropping into their
EL1 code. (arm-powerctl and boot did this differently; I
opted to use the logic from arm-powerctl, which only sets
HCR_EL2.RW when it's directly starting the guest in EL1,
because it's more correct, and I don't expect guests to be
accidentally depending on our having set the RW bit for them.)
* if we are booting a CPU into AArch32 Secure SVC then we won't
set SCR.HCE any more. This affects only the vexpress-a15 and
raspi2b machine types. Guests booting in this case will either:
- be able to set SCR.HCE themselves as part of moving from
Secure SVC into NS Hyp mode
- will move from Secure SVC to NS SVC, and won't care about
behaviour of the HVC insn
- will stay in Secure SVC, and won't care about HVC
* on an arm-powerctl CPU-on we will now set the SCR bits for
pauth/mte/sve/sme/hcx/fgt features
The first two of these are very minor and I don't expect guest
code to trip over them, so I didn't judge it worth convoluting
the code in an attempt to keep exactly the same boot.c behaviour.
The third change fixes issue 1899.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1899
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230926155619.4028618-1-peter.maydell@linaro.org
The hw/arm/boot.h include in common-semi-target.h is not actually
needed, and it's a bit odd because it pulls a hw/arm header into a
target/arm file.
This include was originally needed because the semihosting code used
the arm_boot_info struct to get the base address of the RAM in system
emulation, to use in a (bad) heuristic for the return values for the
SYS_HEAPINFO semihosting call. We've since overhauled how we
calculate the HEAPINFO values in system emulation, and the code no
longer uses the arm_boot_info struct.
Remove the now-redundant include line, and instead directly include
the cpu-qom.h header that we were previously getting via boot.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230925112219.3919261-1-peter.maydell@linaro.org
The include of hw/arm/virt.h in kvm64.c is unnecessary and also a
layering violation since the generic KVM code shouldn't need to know
anything about board-specifics. The include line is an accidental
leftover from commit 15613357ba, where we cleaned up the code
to not depend on virt board internals but forgot to also remove the
now-redundant include line.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230925110429.3917202-1-peter.maydell@linaro.org
FEAT_HPMN0 is a small feature which defines that it is valid for
MDCR_EL2.HPMN to be set to 0, meaning "no PMU event counters provided
to an EL1 guest" (previously this setting was reserved). QEMU's
implementation almost gets HPMN == 0 right, but we need to fix
one check in pmevcntr_is_64_bit(). That is enough for us to
advertise the feature in the 'max' CPU.
(We don't need to make the behaviour conditional on feature
presence, because the FEAT_HPMN0 behaviour is within the range
of permitted UNPREDICTABLE behaviour for a non-FEAT_HPMN0
implementation.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230921185445.3339214-1-peter.maydell@linaro.org
The SMMUv3.1-XNX feature is mandatory for an SMMUv3.1 if S2P is
supported, so we should theoretically have implemented it as part of
the recent S2P work. Fortunately, for us the implementation is a
no-op.
This feature is about interpretation of the stage 2 page table
descriptor XN bits, which control execute permissions.
For QEMU, the permission bits passed to an IOMMU (via MemTxAttrs and
IOMMUAccessFlags) only indicate read and write; we do not distinguish
data reads from instruction reads outside the CPU proper. In the
SMMU architecture's terms, our interconnect between the client device
and the SMMU doesn't have the ability to convey the INST attribute,
and we therefore use the default value of "data" for this attribute.
We also do not support the bits in the Stream Table Entry that can
override the on-the-bus transaction attribute permissions (we do not
set SMMU_IDR1.ATTR_PERMS_OVR=1).
These two things together mean that for our implementation, it never
has to deal with transactions with the INST attribute, and so it can
correctly ignore the XN bits entirely. So we already implement
FEAT_XNX's "XN field is now 2 bits, not 1" behaviour to the extent
that we need to.
Advertise the presence of the feature in SMMU_IDR3.XNX.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230914145705.1648377-4-peter.maydell@linaro.org
In smmuv3_init_regs() when we set the various bits in the ID
registers, we do this almost in order of the fields in the
registers, but not quite. Move the initialization of
SMMU_IDR3.RIL and SMMU_IDR5.OAS into their correct places.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230914145705.1648377-3-peter.maydell@linaro.org
Update the SMMUv3 ID register bit field definitions to the
set in the most recent specification (IHI0700 F.a).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230914145705.1648377-2-peter.maydell@linaro.org
For the Thumb T32 encoding of LDM, if only a single register is
specified in the register list this instruction is UNPREDICTABLE,
with the following choices:
* instruction UNDEFs
* instruction is a NOP
* instruction loads a single register
* instruction loads an unspecified set of registers
Currently we choose to UNDEF (a behaviour chosen in commit
4b222545db in 2019; previously we treated it as "load the
specified single register").
Unfortunately there is real world code out there (which shipped in at
least Android 11, 12 and 13) which incorrectly uses this
UNPREDICTABLE insn on the assumption that it does a single register
load, which is (presumably) what it happens to do on real hardware,
and is also what it does on the equivalent A32 encoding.
Revert to the pre-4b222545dbf30 behaviour of not UNDEFing
for this T32 encoding.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1799
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230927101853.39288-1-peter.maydell@linaro.org
We can neaten the code by switching the callers that work on a
CPUstate to the kvm_get_one_reg function.
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231010142453.224369-3-cohuck@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We can neaten the code by switching to the kvm_set_one_reg function.
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231010142453.224369-2-cohuck@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the private peripheral interrupt definitions from bsa.h instead of
defining them locally. Refactor to use the INTIDs defined there instead
of the PPI# used previously.
Signed-off-by: Leif Lindholm <quic_llindhol@quicinc.com>
Message-id: 20230919090229.188092-4-quic_llindhol@quicinc.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
virt.h defines a number of IRQs that are ultimately described by Arm's
Base System Architecture specification. Move these to a dedicated header
so that they can be reused by other platforms that do the same.
Include that header from virt.h to minimise churn.
While we're moving the definitions, sort them into numerical order,
and add the ARCH_TIMER_NS_EL2_VIRT_IRQ definition used by sbsa-ref
and which will eventually be needed by virt also.
Signed-off-by: Leif Lindholm <quic_llindhol@quicinc.com>
Message-id: 20230919090229.188092-3-quic_llindhol@quicinc.com
[PMM: Remove unused PPI_TO_INTID macro; sort numerically;
add ARCH_TIMER_NS_EL2_VIRT_IRQ]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
GIC Private Peripheral Interrupts (PPI) are defined as GIC INTID 16-31.
As in, PPI0 is INTID16 .. PPI15 is INTID31.
Arm's Base System Architecture specification (BSA) lists the mandated and
recommended private interrupt IDs by INTID, not by PPI index. But current
definitions in virt define them by PPI index, complicating cross
referencing.
Meanwhile, the PPI(x) macro counterintuitively adds 16 to the input value,
converting a PPI index to an INTID.
Resolve this by redefining the BSA-allocated PPIs by their INTIDs,
and replacing the PPI(x) macro with an INTID_TO_PPI(x) one where required.
Signed-off-by: Leif Lindholm <quic_llindhol@quicinc.com>
Message-id: 20230919090229.188092-2-quic_llindhol@quicinc.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On an attempt to access CNTPCT_EL0 from EL0 using a guest running on top
of Xen, a trap from EL2 was observed which is something not reproducible
on HW (also, Xen does not trap accesses to physical counter).
This is because gt_counter_access() checks for an incorrect bit (1
instead of 0) of CNTHCTL_EL2 if HCR_EL2.E2H is 0 and access is made to
physical counter. Refer ARM ARM DDI 0487J.a, D19.12.2:
When HCR_EL2.E2H is 0:
- EL1PCTEN, bit [0]: refers to physical counter
- EL1PCEN, bit [1]: refers to physical timer registers
Drop entire block "if (hcr & HCR_E2H) {...} else {...}" from EL0 case
and fall through to EL1 case, given that after fixing checking for the
correct bit, the handling is the same.
Fixes: 5bc8437136 ("target/arm: Update timer access for VHE")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Message-id: 20230928094404.20802-1-michal.orzel@amd.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Index in file_size array must be checked against num_files, because the
entries we are looking for may be absent in the PDB.
Fixes: Coverity CID 1521597
Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230930235317.11469-3-viktor@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
String sign_rsds isn't terminated, so the print length must be limited.
Fixes: Coverity CID 1521598
Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230930235317.11469-2-viktor@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This replaces the comma (,) to dot (.) in the device type name
so the name can be used with the 'driver=' command line option.
Signed-off-by: Tong Ho <tong.ho@amd.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-id: 20231003052139.199665-1-tong.ho@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This change implements the ResettableClass interface for the device.
Signed-off-by: Tong Ho <tong.ho@amd.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-id: 20231004055339.323833-1-tong.ho@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This change implements the ResettableClass interface for the device.
Signed-off-by: Tong Ho <tong.ho@amd.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-id: 20231004055713.324009-1-tong.ho@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This change implements the ResettableClass interface for the device.
Signed-off-by: Tong Ho <tong.ho@amd.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231003052345.199725-1-tong.ho@amd.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
struct arm_boot_info is declared in "hw/arm/boot.h".
By including the correct header we don't need to declare
it again in "target/arm/cpu-qom.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20231013130214.95742-1-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The file is obviously related to the raspberrypi machine, so
it should reside in hw/arm/ instead of hw/misc/. And while we're
at it, also adjust the wildcard in MAINTAINERS so that it covers
this file, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20231012073458.860187-1-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* Support for VFIODisplay migration with ramfb
* Preliminary work for IOMMUFD support
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmUvlEYACgkQUaNDx8/7
7KFlaw//X2053de2eTdo38/UMSzi5ACWWn2j1iGQZf/3+J2LcdlixZarZr/2DN56
4axmwF6+GKozt5+EnvWtgodDn6U9iyMNaAB3CGBHFHsH8uqKeZd/Ii754q4Rcmy9
ZufBOPWm9Ff7s2MMFiAZvso75jP2wuwVEe1YPRjeJnsNSNIJ6WZfemh3Sl96yRBb
r38uqzqetKwl7HziMMWP3yb8v+dU8A9bqI1hf1FZGttfFz3XA+pmjXKA6XxdfiZF
AAotu5x9w86a08sAlr/qVsZFLR37oQykkXM0D840DafJDyr5fbJiq8cwfOjMw9+D
w6+udRm5KoBWPsvb/T3dR88GRMO22PChjH9Vjl51TstMNhdTxuKJTKhhSoUFZbXV
8CMjwfALk5ggIOyCk1LRd04ed+9qkqgcbw1Guy5pYnyPnY/X6XurxxaxS6Gemgtn
UvgRYhSjio+LgHLO77IVkWJMooTEPzUTty2Zxa7ldbbE+utPUtsmac9+1m2pnpqk
5VQmB074QnsJuvf+7HPU6vYCzQWoXHsH1UY/A0fF7MPedNUAbVYzKrdGPyqEMqHy
xbilAIaS3oO0pMT6kUpRv5c5vjbwkx94Nf/ii8fQVjWzPfCcaF3yEfaam62jMUku
stySaRpavKIx2oYLlucBqeKaBGaUofk13gGTQlsFs8pKCOAV7r4=
=s0fN
-----END PGP SIGNATURE-----
Merge tag 'pull-vfio-20231018' of https://github.com/legoater/qemu into staging
vfio queue:
* Support for VFIODisplay migration with ramfb
* Preliminary work for IOMMUFD support
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmUvlEYACgkQUaNDx8/7
# 7KFlaw//X2053de2eTdo38/UMSzi5ACWWn2j1iGQZf/3+J2LcdlixZarZr/2DN56
# 4axmwF6+GKozt5+EnvWtgodDn6U9iyMNaAB3CGBHFHsH8uqKeZd/Ii754q4Rcmy9
# ZufBOPWm9Ff7s2MMFiAZvso75jP2wuwVEe1YPRjeJnsNSNIJ6WZfemh3Sl96yRBb
# r38uqzqetKwl7HziMMWP3yb8v+dU8A9bqI1hf1FZGttfFz3XA+pmjXKA6XxdfiZF
# AAotu5x9w86a08sAlr/qVsZFLR37oQykkXM0D840DafJDyr5fbJiq8cwfOjMw9+D
# w6+udRm5KoBWPsvb/T3dR88GRMO22PChjH9Vjl51TstMNhdTxuKJTKhhSoUFZbXV
# 8CMjwfALk5ggIOyCk1LRd04ed+9qkqgcbw1Guy5pYnyPnY/X6XurxxaxS6Gemgtn
# UvgRYhSjio+LgHLO77IVkWJMooTEPzUTty2Zxa7ldbbE+utPUtsmac9+1m2pnpqk
# 5VQmB074QnsJuvf+7HPU6vYCzQWoXHsH1UY/A0fF7MPedNUAbVYzKrdGPyqEMqHy
# xbilAIaS3oO0pMT6kUpRv5c5vjbwkx94Nf/ii8fQVjWzPfCcaF3yEfaam62jMUku
# stySaRpavKIx2oYLlucBqeKaBGaUofk13gGTQlsFs8pKCOAV7r4=
# =s0fN
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 18 Oct 2023 04:16:06 EDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown]
# gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-20231018' of https://github.com/legoater/qemu: (22 commits)
hw/vfio: add ramfb migration support
ramfb-standalone: add migration support
ramfb: add migration support
vfio/pci: Remove vfio_detach_device from vfio_realize error path
vfio/ccw: Remove redundant definition of TYPE_VFIO_CCW
vfio/ap: Remove pointless apdev variable
vfio/pci: Fix a potential memory leak in vfio_listener_region_add
vfio/common: Move legacy VFIO backend code into separate container.c
vfio/common: Introduce a global VFIODevice list
vfio/common: Store the parent container in VFIODevice
vfio/common: Introduce a per container device list
vfio/common: Move VFIO reset handler registration to a group agnostic function
vfio/ccw: Use vfio_[attach/detach]_device
vfio/ap: Use vfio_[attach/detach]_device
vfio/platform: Use vfio_[attach/detach]_device
vfio/pci: Introduce vfio_[attach/detach]_device
vfio/common: Extract out vfio_kvm_device_[add/del]_fd
vfio/common: Introduce vfio_container_add|del_section_window()
vfio/common: Propagate KVM_SET_DEVICE_ATTR error if any
vfio/common: Move IOMMU agnostic helpers to a separate file
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a "VFIODisplay" subsection whenever "x-ramfb-migrate" is turned on.
Turn it off by default on machines <= 8.1 for compatibility reasons.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
[ clg: - checkpatch fixes
- improved warn_report() in vfio_realize() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add a "ramfb-dev" section whenever "x-migrate" is turned on. Turn it off
by default on machines <= 8.1 for compatibility reasons.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Implementing RAMFB migration is quite straightforward. One caveat is to
treat the whole RAMFBCfg as a blob, since that's what is exposed to the
guest directly. This avoid having to fiddle with endianness issues if we
were to migrate fields individually as integers.
The devices using RAMFB will have to include ramfb_vmstate in their
migration description.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
In vfio_realize, on the error path, we currently call
vfio_detach_device() after a successful vfio_attach_device.
While this looks natural, vfio_instance_finalize also induces
a vfio_detach_device(), and it seems to be the right place
instead as other resources are released there which happen
to be a prerequisite to a successful UNSET_CONTAINER.
So let's rely on the finalize vfio_detach_device call to free
all the relevant resources.
Fixes: a28e06621170 ("vfio/pci: Introduce vfio_[attach/detach]_device")
Reported-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
No functional changes.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
No need to double-cast, call VFIO_AP_DEVICE() on DeviceState.
No functional changes.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
When there is an failure in vfio_listener_region_add() and the section
belongs to a ram device, there is an inaccurate error report which should
never be related to vfio_dma_map failure. The memory holding err is also
incrementally leaked in each failure.
Fix it by reporting the real error and free it.
Fixes: 567b5b309a ("vfio/pci: Relax DMA map errors for MMIO regions")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move all the code really dependent on the legacy VFIO container/group
into a separate file: container.c. What does remain in common.c is
the code related to VFIOAddressSpace, MemoryListeners, migration and
all other general operations.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Some functions iterate over all the VFIODevices. This is currently
achieved by iterating over all groups/devices. Let's
introduce a global list of VFIODevices simplifying that scan.
This will also be useful while migrating to IOMMUFD by hiding the
group specificity.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
let's store the parent contaienr within the VFIODevice.
This simplifies the logic in vfio_viommu_preset() and
brings the benefice to hide the group specificity which
is useful for IOMMUFD migration.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Several functions need to iterate over the VFIO devices attached to
a given container. This is currently achieved by iterating over the
groups attached to the container and then over the devices in the group.
Let's introduce a per container device list that simplifies this
search.
Per container list is used in below functions:
vfio_devices_all_dirty_tracking
vfio_devices_all_device_dirty_tracking
vfio_devices_all_running_and_mig_active
vfio_devices_dma_logging_stop
vfio_devices_dma_logging_start
vfio_devices_query_dirty_bitmap
This will also ease the migration of IOMMUFD by hiding the group
specificity.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move the reset handler registration/unregistration to a place that is not
group specific. vfio_[get/put]_address_space are the best places for that
purpose.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Let the vfio-ccw device use vfio_attach_device() and
vfio_detach_device(), hence hiding the details of the used
IOMMU backend.
Note that the migration reduces the following trace
"vfio: subchannel %s has already been attached" (featuring
cssid.ssid.devid) into "device is already attached"
Also now all the devices have been migrated to use the new
vfio_attach_device/vfio_detach_device API, let's turn the
legacy functions into static functions, local to container.c.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Let the vfio-ap device use vfio_attach_device() and
vfio_detach_device(), hence hiding the details of the used
IOMMU backend.
We take the opportunity to use g_path_get_basename() which
is prefered, as suggested by
3e015d815b ("use g_path_get_basename instead of basename")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Let the vfio-platform device use vfio_attach_device() and
vfio_detach_device(), hence hiding the details of the used
IOMMU backend.
Drop the trace event for vfio-platform as we have similar
one in vfio_attach_device.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
We want the VFIO devices to be able to use two different
IOMMU backends, the legacy VFIO one and the new iommufd one.
Introduce vfio_[attach/detach]_device which aim at hiding the
underlying IOMMU backend (IOCTLs, datatypes, ...).
Once vfio_attach_device completes, the device is attached
to a security context and its fd can be used. Conversely
When vfio_detach_device completes, the device has been
detached from the security context.
At the moment only the implementation based on the legacy
container/group exists. Let's use it from the vfio-pci device.
Subsequent patches will handle other devices.
We also take benefit of this patch to properly free
vbasedev->name on failure.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Introduce two new helpers, vfio_kvm_device_[add/del]_fd
which take as input a file descriptor which can be either a group fd or
a cdev fd. This uses the new KVM_DEV_VFIO_FILE VFIO KVM device group,
which aliases to the legacy KVM_DEV_VFIO_GROUP.
vfio_kvm_device_[add/del]_group then call those new helpers.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Introduce helper functions that isolate the code used for
VFIO_SPAPR_TCE_v2_IOMMU.
Those helpers hide implementation details beneath the container object
and make the vfio_listener_region_add/del() implementations more
readable. No code change intended.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
In the VFIO_SPAPR_TCE_v2_IOMMU container case, when
KVM_SET_DEVICE_ATTR fails, we currently don't propagate the
error as we do on the vfio_spapr_create_window() failure
case. Let's align the code. Take the opportunity to
reword the error message and make it more explicit.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move low-level iommu agnostic helpers to a separate helpers.c
file. They relate to regions, interrupts, device/region
capabilities and etc.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since commit da3c22c74a ("linux-headers: Update to Linux v6.6-rc1"),
linux-headers has been updated to v6.6-rc1.
As previous patch added iommufd.h to update-linux-headers.sh,
run the script again against TAG v6.6-rc1 to have iommufd.h included.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Update the script to import iommufd.h
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Meson used to allow both "pkgconfig" and "pkg-config" entries in machine
files; the former was used for dependency lookup and the latter
was used as return value for "find_program('pkg-config')", which is a less
common use-case and one that QEMU does not need.
This inconsistency is going to be fixed by Meson 1.3, which will deprecate
"pkgconfig" in favor of "pkg-config" (the less common one, but it makes
sense because it matches the name of the binary). For backward
compatibility it is still allowed to define both, so do that in the
configure-generated machine file.
Related: https://github.com/mesonbuild/meson/pull/12385
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>