Cache topology needs to be defined based on CPU topology levels. Thus,
define CPU topology enumeration in qapi/machine.json to make it generic
for all architectures.
To match the general topology naming style, rename CPU_TOPO_LEVEL_* to
CPU_TOPOLOGY_LEVEL_*, and rename SMT and package levels to thread and
socket.
Also, enumerate additional topology levels for non-i386 arches, and add
a CPU_TOPOLOGY_LEVEL_DEFAULT to help future smp-cache object to work
with compatibility requirement of arch-specific cache topology models.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241101083331.340178-3-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
CXL now can use Generic Port Affinity Structures.
CXL now allows control of link speed and width
vhost-user-blk now supports live resize, by means of
a new device-sync-config command
amd iommu now supports interrupt remapping
pcie devices now report extended tag field support
intel_iommu dropped support for Transient Mapping, to match VTD spec
arch agnostic ACPI infrastructure for vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmcpNqUPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRp/2oH/0qO33prhDa48J5mqT9NuJzzYwp5QHKF9Zjv
fDAplMUEmfxZIEgJchcyDWPYTGX2geT4pCFhRWioZMIR/0JyzrFgSwsk1kL88cMh
46gzhNVD6ybyPJ7O0Zq3GLy5jo7rlw/n+fFxKAuRCzcbK/fmH8gNC+RwW1IP64Na
HDczYilHUhnO7yKZFQzQNQVbK4BckrG1bu0Fcx0EMUQBf4V6x7GLOrT+3hkKYcr6
+DG5DmUmv20or/FXnu2Ye+MzR8Ebx6JVK3A3sXEE4Ns2CCzK9QLzeeyc2aU13jWN
OpZ6WcKF8HqYprIwnSsMTxhPcq0/c7TvrGrazVwna5RUBMyjjvc=
=zSX4
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, fixes, cleanups
CXL now can use Generic Port Affinity Structures.
CXL now allows control of link speed and width
vhost-user-blk now supports live resize, by means of
a new device-sync-config command
amd iommu now supports interrupt remapping
pcie devices now report extended tag field support
intel_iommu dropped support for Transient Mapping, to match VTD spec
arch agnostic ACPI infrastructure for vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmcpNqUPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp/2oH/0qO33prhDa48J5mqT9NuJzzYwp5QHKF9Zjv
# fDAplMUEmfxZIEgJchcyDWPYTGX2geT4pCFhRWioZMIR/0JyzrFgSwsk1kL88cMh
# 46gzhNVD6ybyPJ7O0Zq3GLy5jo7rlw/n+fFxKAuRCzcbK/fmH8gNC+RwW1IP64Na
# HDczYilHUhnO7yKZFQzQNQVbK4BckrG1bu0Fcx0EMUQBf4V6x7GLOrT+3hkKYcr6
# +DG5DmUmv20or/FXnu2Ye+MzR8Ebx6JVK3A3sXEE4Ns2CCzK9QLzeeyc2aU13jWN
# OpZ6WcKF8HqYprIwnSsMTxhPcq0/c7TvrGrazVwna5RUBMyjjvc=
# =zSX4
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Nov 2024 21:03:33 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (65 commits)
intel_iommu: Add missed reserved bit check for IEC descriptor
intel_iommu: Add missed sanity check for 256-bit invalidation queue
intel_iommu: Send IQE event when setting reserved bit in IQT_TAIL
hw/acpi: Update GED with vCPU Hotplug VMSD for migration
tests/qtest/bios-tables-test: Update DSDT golden masters for x86/{pc,q35}
hw/acpi: Update ACPI `_STA` method with QOM vCPU ACPI Hotplug states
qtest: allow ACPI DSDT Table changes
hw/acpi: Make CPUs ACPI `presence` conditional during vCPU hot-unplug
hw/pci: Add parenthesis to PCI_BUILD_BDF macro
hw/cxl: Ensure there is enough data to read the input header in cmd_get_physical_port_state()
hw/cxl: Ensure there is enough data for the header in cmd_ccls_set_lsa()
hw/cxl: Check that writes do not go beyond end of target attributes
hw/cxl: Ensuring enough data to read parameters in cmd_tunnel_management_cmd()
hw/cxl: Avoid accesses beyond the end of cel_log.
hw/cxl: Check the length of data requested fits in get_log()
hw/cxl: Check enough data in cmd_firmware_update_transfer()
hw/cxl: Check input length is large enough in cmd_events_clear_records()
hw/cxl: Check input includes at least the header in cmd_features_set_feature()
hw/cxl: Check size of input data to dynamic capacity mailbox commands
hw/cxl/cxl-mailbox-util: Fix output buffer index update when retrieving DC extents
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
IEC descriptor is 128-bit invalidation descriptor, must be padded with
128-bits of 0s in the upper bytes to create a 256-bit descriptor when
the invalidation queue is configured for 256-bit descriptors (IQA_REG.DW=1).
Fixes: 02a2cbc872 ("x86-iommu: introduce IEC notifiers")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20241104125536.1236118-4-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to VTD spec, a 256-bit descriptor will result in an invalid
descriptor error if submitted in an IQ that is setup to provide hardware
with 128-bit descriptors (IQA_REG.DW=0). Meanwhile, there are old inv desc
types (e.g. iotlb_inv_desc) that can be either 128bits or 256bits. If a
128-bit version of this descriptor is submitted into an IQ that is setup
to provide hardware with 256-bit descriptors will also result in an invalid
descriptor error.
The 2nd will be captured by the tail register update. So we only need to
focus on the 1st.
Because the reserved bit check between different types of invalidation desc
are common, so introduce a common function vtd_inv_desc_reserved_check()
to do all the checks and pass the differences as parameters.
With this change, need to replace error_report_once() call with error_report()
to catch different call sites. This isn't an issue as error_report_once()
here is mainly used to help debug guest error, but it only dumps once in
qemu life cycle and doesn't help much, we need error_report() instead.
Fixes: c0c1d35184 ("intel_iommu: add 256 bits qi_desc support")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20241104125536.1236118-3-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to VTD spec, Figure 11-22, Invalidation Queue Tail Register,
"When Descriptor Width (DW) field in Invalidation Queue Address Register
(IQA_REG) is Set (256-bit descriptors), hardware treats bit-4 as reserved
and a value of 1 in the bit will result in invalidation queue error."
Current code missed to send IQE event to guest, fix it.
Fixes: c0c1d35184 ("intel_iommu: add 256 bits qi_desc support")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20241104125536.1236118-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
VT-d spec removed Transient Mapping (TM) field from second-level page-tables
and treat the field as Reserved(0) since revision 3.2.
Changing the field as reserved(0) will break backward compatibility, so
introduce a property "stale-tm" to allow user to control the setting.
Use pc_compat_9_1 to handle the compatibility for machines before 9.2 which
allow guest to set the field. Starting from 9.2, this field is reserved(0)
by default to match spec. Of course, user can force it on command line.
This doesn't impact function of vIOMMU as there was no logic to emulate
Transient Mapping.
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241028022514.806657-1-zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The XTSup mode enables x2APIC support for AMD IOMMU, which is needed
to support vcpu w/ APIC ID > 255.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-6-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In order to support AMD IOMMU interrupt remapping emulation with PCI
pass-through devices, QEMU needs to notify VFIO when guest IOMMU driver
updates and invalidate the guest interrupt remapping table (IRT), and
communicate information so that the host IOMMU driver can update
the shadowed interrupt remapping table in the host IOMMU.
Therefore, send notification when guest IOMMU emulates the IRT
invalidation commands.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-5-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use shared memory region for interrupt remapping which can be
aliased by all devices.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-4-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce 'nodma' shared memory region to support PT mode
so that for each device, we only create an alias to shared memory
region when DMA-remapping is disabled.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-3-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rename the MMIO memory region variable 'mmio' to 'mr_mmio'
so to correctly name align with struct AMDVIState::variable type.
No functional change intended.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-2-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
These are very similar to the recently added Generic Initiators
but instead of representing an initiator of memory traffic they
represent an edge point beyond which may lie either targets or
initiators. Here we add these ports such that they may
be targets of hmat_lb records to describe the latency and
bandwidth from host side initiators to the port. A discoverable
mechanism such as UEFI CDAT read from CXL devices and switches
is used to discover the remainder of the path, and the OS can build
up full latency and bandwidth numbers as need for work and data
placement decisions.
Acked-by: Markus Armbruster <armbru@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240916174122.1843197-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rather than relying on PCI internals, use the new acpi_property
to obtain the ACPI _UID values. These are still the same
as the PCI Bus numbers so no functional change.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240916171017.1841767-9-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Whilst ACPI SRAT Generic Initiator Afinity Structures are able to refer to
both PCI and ACPI Device Handles, the QEMU implementation only implements
the PCI Device Handle case. For now move the code into the existing
hw/acpi/pci.c file and header. If support for ACPI Device Handles is
added in the future, perhaps this will be moved again.
Also push the struct AcpiGenericInitiator down into the c file as not
used outside pci.c.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240916171017.1841767-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
AWS nitro enclaves[1] is an Amazon EC2[2] feature that allows creating
isolated execution environments, called enclaves, from Amazon EC2
instances which are used for processing highly sensitive data. Enclaves
have no persistent storage and no external networking. The enclave VMs
are based on the Firecracker microvm with a vhost-vsock device for
communication with the parent EC2 instance that spawned it and a Nitro
Secure Module (NSM) device for cryptographic attestation. The parent
instance VM always has CID 3 while the enclave VM gets a dynamic CID.
An EIF (Enclave Image Format)[3] file is used to boot an AWS nitro enclave
virtual machine. This commit adds support for AWS nitro enclave emulation
using a new machine type option '-M nitro-enclave'. This new machine type
is based on the 'microvm' machine type, similar to how real nitro enclave
VMs are based on Firecracker microvm. For nitro-enclave to boot from an
EIF file, the kernel and ramdisk(s) are extracted into a temporary kernel
and a temporary initrd file which are then hooked into the regular x86
boot mechanism along with the extracted cmdline. The EIF file path should
be provided using the '-kernel' QEMU option.
In QEMU, the vsock emulation for nitro enclave is added using vhost-user-
vsock as opposed to vhost-vsock. vhost-vsock doesn't support sibling VM
communication which is needed for nitro enclaves. So for the vsock
communication to CID 3 to work, another process that does the vsock
emulation in userspace must be run, for example, vhost-device-vsock[4]
from rust-vmm, with necessary vsock communication support in another
guest VM with CID 3. Using vhost-user-vsock also enables the possibility
to implement some proxying support in the vhost-user-vsock daemon that
will forward all the packets to the host machine instead of CID 3 so
that users of nitro-enclave can run the necessary applications in their
host machine instead of running another whole VM with CID 3. The following
mandatory nitro-enclave machine option has been added related to the
vhost-user-vsock device.
- 'vsock': The chardev id from the '-chardev' option for the
vhost-user-vsock device.
AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
has been added using the virtio-nsm device added in a previous commit.
In Nitro Enclaves, all the PCRs start in a known zero state and the first
16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
contain the SHA384 hashes related to the EIF file used to boot the VM
for validation. The following optional nitro-enclave machine options
have been added related to the NSM device.
- 'id': Enclave identifier, reflected in the module-id of the NSM
device. If not provided, a default id will be set.
- 'parent-role': Parent instance IAM role ARN, reflected in PCR3
of the NSM device.
- 'parent-id': Parent instance identifier, reflected in PCR4 of the
NSM device.
[1] https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html
[2] https://aws.amazon.com/ec2/
[3] https://github.com/aws/aws-nitro-enclaves-image-format
[4] https://github.com/rust-vmm/vhost-device/tree/main/vhost-device-vsock
Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-6-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The x86 architecture uses little endianness. Directly use
the little-endian LD/ST API.
Mechanical change using:
$ end=le; \
for acc in uw w l q tul; do \
sed -i -e "s/ld${acc}_p(/ld${acc}_${end}_p(/" \
-e "s/st${acc}_p(/st${acc}_${end}_p(/" \
$(git grep -wlE '(ld|st)t?u?[wlq]_p' hw/i386/); \
done
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20241004163042.85922-9-philmd@linaro.org>
* kvm: support for nested FRED
* tests/unit: fix warning when compiling test-nested-aio-poll with LTO
* kvm: refactoring of VM creation
* target/i386: expose IBPB-BRTYPE and SBPB CPUID bits to the guest
* hw/char: clean up serial
* remove virtfs-proxy-helper
* target/i386/kvm: Report which action failed in kvm_arch_put/get_registers
* qom: improvements to object_resolve_path*()
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmb++MsUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPVnwf/cdvfxvDm22tEdlh8vHlV17HtVdcC
Hw334M/3PDvbTmGzPBg26lzo4nFS6SLrZ8ETCeqvuJrtKzqVk9bI8ssZW5KA4ijM
nkxguRPHO8E6U33ZSucc+Hn56+bAx4I2X80dLKXJ87OsbMffIeJ6aHGSEI1+fKVh
pK7q53+Y3lQWuRBGhDIyKNuzqU4g+irpQwXOhux63bV3ADadmsqzExP6Gmtl8OKM
DylPu1oK7EPZumlSiJa7Gy1xBqL4Rc4wGPNYx2RVRjp+i7W2/Y1uehm3wSBw+SXC
a6b7SvLoYfWYS14/qCF4cBL3sJH/0f/4g8ZAhDDxi2i5kBr0/5oioDyE/A==
=/zo4
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* pc: Add a description for the i8042 property
* kvm: support for nested FRED
* tests/unit: fix warning when compiling test-nested-aio-poll with LTO
* kvm: refactoring of VM creation
* target/i386: expose IBPB-BRTYPE and SBPB CPUID bits to the guest
* hw/char: clean up serial
* remove virtfs-proxy-helper
* target/i386/kvm: Report which action failed in kvm_arch_put/get_registers
* qom: improvements to object_resolve_path*()
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmb++MsUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPVnwf/cdvfxvDm22tEdlh8vHlV17HtVdcC
# Hw334M/3PDvbTmGzPBg26lzo4nFS6SLrZ8ETCeqvuJrtKzqVk9bI8ssZW5KA4ijM
# nkxguRPHO8E6U33ZSucc+Hn56+bAx4I2X80dLKXJ87OsbMffIeJ6aHGSEI1+fKVh
# pK7q53+Y3lQWuRBGhDIyKNuzqU4g+irpQwXOhux63bV3ADadmsqzExP6Gmtl8OKM
# DylPu1oK7EPZumlSiJa7Gy1xBqL4Rc4wGPNYx2RVRjp+i7W2/Y1uehm3wSBw+SXC
# a6b7SvLoYfWYS14/qCF4cBL3sJH/0f/4g8ZAhDDxi2i5kBr0/5oioDyE/A==
# =/zo4
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 03 Oct 2024 21:04:27 BST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (23 commits)
qom: update object_resolve_path*() documentation
qom: set *ambiguous on all paths
qom: rename object_resolve_path_type() "ambiguousp"
target/i386/kvm: Report which action failed in kvm_arch_put/get_registers
kvm: Allow kvm_arch_get/put_registers to accept Error**
accel/kvm: refactor dirty ring setup
minikconf: print error entirely on stderr
9p: remove 'proxy' filesystem backend driver
hw/char: Extract serial-mm
hw/char/serial.h: Extract serial-isa.h
hw: Remove unused inclusion of hw/char/serial.h
target/i386: Expose IBPB-BRTYPE and SBPB CPUID bits to the guest
kvm: refactor core virtual machine creation into its own function
kvm/i386: replace identity_base variable with a constant
kvm/i386: refactor kvm_arch_init and split it into smaller functions
kvm: replace fprintf with error_report()/printf() in kvm_init()
kvm/i386: fix return values of is_host_cpu_intel()
kvm/i386: make kvm_filter_msr() and related definitions private to kvm module
hw/i386/pc: Add a description for the i8042 property
tests/unit: remove block layer code from test-nested-aio-poll
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# hw/arm/Kconfig
# hw/arm/pxa2xx.c
Expose handle_bufioreq in xen_register_ioreq().
This is to allow machines to enable or disable buffered ioreqs.
No functional change since all callers still set it to
HVM_IOREQSRV_BUFIOREQ_ATOMIC.
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
The includes where updated based on compile errors. Now, the inclusion of the
header roughly matches Kconfig dependencies:
# grep -r -e "select SERIAL_ISA"
hw/ppc/Kconfig: select SERIAL_ISA
hw/isa/Kconfig: select SERIAL_ISA
hw/sparc64/Kconfig: select SERIAL_ISA
hw/i386/Kconfig: select SERIAL_ISA
hw/i386/Kconfig: select SERIAL_ISA # for serial_hds_isa_init()
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Link: https://lore.kernel.org/r/20240905073832.16222-3-shentey@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some devices need to distinguish cold start reset from waking up from a
suspended state. This patch adds new value to the enum, and updates the
i386 wakeup method to use this new reset type.
Message-ID: <20240904103722.946194-3-jmarcin@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Currently, both qemu_devices_reset() and MachineClass::reset() use
ShutdownCause for the reason of the reset. However, the Resettable
interface uses ResetState, so ShutdownCause needs to be translated to
ResetType somewhere. Translating it qemu_devices_reset() makes adding
new reset types harder, as they cannot always be matched to a single
ShutdownCause here, and devices may need to check the ResetType to
determine what to reset and if to reset at all.
This patch moves this translation up in the call stack to
qemu_system_reset() and updates all MachineClass children to use the
ResetType instead.
Message-ID: <20240904103722.946194-2-jmarcin@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Use device_class_set_legacy_reset() instead of opencoding an
assignment to DeviceClass::reset. This change was produced
with:
spatch --macro-file scripts/cocci-macro-file.h \
--sp-file scripts/coccinelle/device-reset.cocci \
--keep-comments --smpl-spacing --in-place --dir hw
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240830145812.1967042-8-peter.maydell@linaro.org
This patch implements the periodic and the swsmi ICH9 chipset timers. They are
especially useful when prototyping UEFI firmware (e.g. with EDK2's OVMF)
using QEMU.
For backwards compatibility, the compat properties "x-smi-swsmi-timer",
and "x-smi-periodic-timer" are introduced.
Additionally, writes to the SMI_STS register are enabled for the
corresponding two bits using a write mask to make future work easier.
Signed-off-by: Dominic Prinz <git@dprinz.de>
Message-Id: <1d90ea69e01ab71a0f2ced116801dc78e04f4448.1725991505.git.git@dprinz.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When qemu runs without kvm acceleration the ACPI executions take a great
amount of time. If they take more than the default time (30sec), the
ACPI calls fail and the system might not behave correctly.
Now the _PRT table is computed on the fly. We can drastically reduce the
execution of the _PRT method if we return a pre-computed table.
Without this patch:
[ 51.343484] ACPI Error: Aborting method \_SB.PCI0._PRT due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
[ 51.527032] ACPI Error: Method execution failed \_SB.PCI0._PRT due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/uteval-68)
[ 51.530049] virtio-pci 0000:00:02.0: can't derive routing for PCI INT A
[ 51.530797] virtio-pci 0000:00:02.0: PCI INT A: no GSI
[ 81.922901] ACPI Error: Aborting method \_SB.PCI0._PRT due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
[ 82.103534] ACPI Error: Method execution failed \_SB.PCI0._PRT due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/uteval-68)
[ 82.106088] virtio-pci 0000:00:04.0: can't derive routing for PCI INT A
[ 82.106761] virtio-pci 0000:00:04.0: PCI INT A: no GSI
[ 112.192568] ACPI Error: Aborting method \_SB.PCI0._PRT due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
[ 112.486687] ACPI Error: Method execution failed \_SB.PCI0._PRT due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/uteval-68)
[ 112.489554] virtio-pci 0000:00:05.0: can't derive routing for PCI INT A
[ 112.490027] virtio-pci 0000:00:05.0: PCI INT A: no GSI
[ 142.559448] ACPI Error: Aborting method \_SB.PCI0._PRT due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
[ 142.718596] ACPI Error: Method execution failed \_SB.PCI0._PRT due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/uteval-68)
[ 142.722889] virtio-pci 0000:00:06.0: can't derive routing for PCI INT A
[ 142.724578] virtio-pci 0000:00:06.0: PCI INT A: no GSI
With this patch:
[ 22.938076] ACPI: \_SB_.LNKB: Enabled at IRQ 10
[ 24.214002] ACPI: \_SB_.LNKD: Enabled at IRQ 11
[ 25.465170] ACPI: \_SB_.LNKA: Enabled at IRQ 10
[ 27.944920] ACPI: \_SB_.LNKC: Enabled at IRQ 11
ACPI disassembly:
Scope (PCI0)
{
Method (_PRT, 0, NotSerialized) // _PRT: PCI Routing Table
{
Return (Package (0x80)
{
Package (0x04)
{
0xFFFF,
Zero,
LNKD,
Zero
},
Package (0x04)
{
0xFFFF,
One,
LNKA,
Zero
},
Package (0x04)
{
0xFFFF,
0x02,
LNKB,
Zero
},
Package (0x04)
{
0xFFFF,
0x03,
LNKC,
Zero
},
Package (0x04)
{
0x0001FFFF,
Zero,
LNKS,
Zero
},
Context: https://lore.kernel.org/virtualization/20240417145544.38d7b482@imammedo.users.ipa.redhat.com/T/#t
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240814115736.1580337-3-ribalda@chromium.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In vtd_process_inv_desc(), VTD_INV_DESC_PC and VTD_INV_DESC_PIOTLB are
bypassed without scalable mode check. These two types are not valid
in legacy mode and we should report error.
Fixes: 4a4f219e8a ("intel_iommu: add scalable-mode option to make scalable mode work")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20240814071321.2621384-3-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to spec, invalidation descriptor type is 7bits which is
concatenation of bits[11:9] and bits[3:0] of invalidation descriptor.
Currently we only pick bits[3:0] as the invalidation type and treat
bits[11:9] as reserved zero. This is not a problem for now as bits[11:9]
is zero for all current invalidation types. But it will break if newer
type occupies bits[11:9].
Fix it by taking bits[11:9] into type and make reserved bits check accurate.
Suggested-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Message-Id: <20240814071321.2621384-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a Xen PVH x86 machine based on the abstract PVH Machine.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Cosmetic: add comments in x86_load_linux() pointing to the kernel documentation
so that users can better understand the code.
CC: qemu-trivial@nongnu.org
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since commit 4ccd5fe22f ('pc: add option
to disable PS/2 mouse/keyboard'), the vmport will not be created unless
the i8042 PS/2 controller is enabled. To avoid confusion, let's fail if
vmport was explicitly requested, but the i8042 controller is disabled.
This also changes the behavior of vmport=auto to take i8042 controller
availability into account.
Signed-off-by: Kamil Szczęk <kamil@szczek.dev>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <0MS3y5E-hHqODIhiuFxmCnIrXd612JIGq31UuMsz4KGCKZ_wWuF-PHGKTRSGS0nWaPEddOdF4YOczHdgorulECPo792OhWov7O9BBF6UMX4=@szczek.dev>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The code which translates vmport=auto to on/off is currently separate
for each PC machine variant, while being functionally equivalent.
This moves the translation into a shared initialization function, while
also tightening the enum assertion.
Signed-off-by: Kamil Szczęk <kamil@szczek.dev>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <v8pz1uwgIYWkidgZK-o8H-qJvnSyl0641XVmNO43Qls307AA3QRPuad_py6xGe0JAxB6yDEe76oZ8tau_n-2Y6sJBCKzCujNbEUUFhd-ahI=@szczek.dev>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
If VT-D hardware supports scalable mode, Linux will set the IQA DW field
(bit11). In qemu, the vtd_mem_write and vtd_update_iq_dw set DW field well.
However, vtd_mem_read the DW field wrong because "& VTD_IQA_QS" dropped the
value of DW.
Replace "&VTD_IQA_QS" with "& (VTD_IQA_QS | VTD_IQA_DW_MASK)" could save
the DW field.
Test patch as below:
config the "x-scalable-mode" option:
"-device intel-iommu,caching-mode=on,x-scalable-mode=on,aw-bits=48"
After Linux OS boot, check the IQA_REG DW Field by usage 1 or 2:
1. IOMMU_DEBUGFS:
Before fix:
cat /sys/kernel/debug/iommu/intel/iommu_regset |grep IQA
IQA 0x90 0x00000001001da001
After fix:
cat /sys/kernel/debug/iommu/intel/iommu_regset |grep IQA
IQA 0x90 0x00000001001da801
Check DW field(bit11) is 1.
2. devmem2 read the IQA_REG (offset 0x90):
Before fix:
devmem2 0xfed90090
/dev/mem opened.
Memory mapped at address 0x7f72c795b000.
Value at address 0xFED90090 (0x7f72c795b090): 0x1DA001
After fix:
devmem2 0xfed90090
/dev/mem opened.
Memory mapped at address 0x7fc95281c000.
Value at address 0xFED90090 (0x7fc95281c090): 0x1DA801
Check DW field(bit11) is 1.
Signed-off-by: yeeli <seven.yi.lee@gmail.com>
Message-Id: <20240725031858.1529902-1-seven.yi.lee@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In amdvi_update_iotlb() we will only put a new entry in the hash
table if to_cache.perm is not IOMMU_NONE. However we allocate the
memory for the new AMDVIIOTLBEntry and for the hash table key
regardless. This means that in the IOMMU_NONE case we will leak the
memory we alloacted.
Move the allocations into the if() to the point where we know we're
going to add the item to the hash table.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2452
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240731170019.3590563-1-peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
- Restrict probe_access*() functions to TCG (Phil)
- Extract do_invalidate_device_tlb from vtd_process_device_iotlb_desc (Clément)
- Fixes in Loongson IPI model (Bibo & Phil)
- Make docs/interop/firmware.json compatible with qapi-gen.py script (Thomas)
- Correct MPC I2C MMIO region size (Zoltan)
- Remove useless cast in Loongson3 Virt machine (Yao)
- Various uses of range overlap API (Yao)
- Use ERRP_GUARD macro in nubus_virtio_mmio_realize (Zhao)
- Use DMA memory API in Goldfish UART model (Phil)
- Expose fifo8_pop_buf and introduce fifo8_drop (Phil)
- MAINTAINERS updates (Zhao, Phil)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmagFF8ACgkQ4+MsLN6t
wN5bKg//f5TwUhsy2ff0FJpHheDOj/9Gc2nZ1U/Fp0E5N3sz3A7MGp91wye6Xwi3
XG34YN9LK1AVzuCdrEEs5Uaxs1ZS1R2mV+fZaGHwYYxPDdnXxGyp/2Q0eyRxzbcN
zxE2hWscYSZbPVEru4HvZJKfp4XnE1cqA78fJKMAdtq0IPq38tmQNRlJ+gWD9dC6
ZUHXPFf3DnucvVuwqb0JYO/E+uJpcTtgR6pc09Xtv/HFgMiS0vKZ1I/6LChqAUw9
eLMpD/5V2naemVadJe98/dL7gIUnhB8GTjsb4ioblG59AO/uojutwjBSQvFxBUUw
U5lX9OSn20ouwcGiqimsz+5ziwhCG0R6r1zeQJFqUxrpZSscq7NQp9ygbvirm+wS
edLc8yTPf4MtYOihzPP9jLPcXPZjEV64gSnJISDDFYWANCrysX3suaFEOuVYPl+s
ZgQYRVSSYOYHgNqBSRkPKKVUxskSQiqLY3SfGJG4EA9Ktt5lD1cLCXQxhdsqphFm
Ws3zkrVVL0EKl4v/4MtCgITIIctN1ZJE9u3oPJjASqSvK6EebFqAJkc2SidzKHz0
F3iYX2AheWNHCQ3HFu023EvFryjlxYk95fs2f6Uj2a9yVbi813qsvd3gcZ8t0kTT
+dmQwpu1MxjzZnA6838R6OCMnC+UpMPqQh3dPkU/5AF2fc3NnN8=
=J/I2
-----END PGP SIGNATURE-----
Merge tag 'hw-misc-20240723' of https://github.com/philmd/qemu into staging
Misc HW patch queue
- Restrict probe_access*() functions to TCG (Phil)
- Extract do_invalidate_device_tlb from vtd_process_device_iotlb_desc (Clément)
- Fixes in Loongson IPI model (Bibo & Phil)
- Make docs/interop/firmware.json compatible with qapi-gen.py script (Thomas)
- Correct MPC I2C MMIO region size (Zoltan)
- Remove useless cast in Loongson3 Virt machine (Yao)
- Various uses of range overlap API (Yao)
- Use ERRP_GUARD macro in nubus_virtio_mmio_realize (Zhao)
- Use DMA memory API in Goldfish UART model (Phil)
- Expose fifo8_pop_buf and introduce fifo8_drop (Phil)
- MAINTAINERS updates (Zhao, Phil)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmagFF8ACgkQ4+MsLN6t
# wN5bKg//f5TwUhsy2ff0FJpHheDOj/9Gc2nZ1U/Fp0E5N3sz3A7MGp91wye6Xwi3
# XG34YN9LK1AVzuCdrEEs5Uaxs1ZS1R2mV+fZaGHwYYxPDdnXxGyp/2Q0eyRxzbcN
# zxE2hWscYSZbPVEru4HvZJKfp4XnE1cqA78fJKMAdtq0IPq38tmQNRlJ+gWD9dC6
# ZUHXPFf3DnucvVuwqb0JYO/E+uJpcTtgR6pc09Xtv/HFgMiS0vKZ1I/6LChqAUw9
# eLMpD/5V2naemVadJe98/dL7gIUnhB8GTjsb4ioblG59AO/uojutwjBSQvFxBUUw
# U5lX9OSn20ouwcGiqimsz+5ziwhCG0R6r1zeQJFqUxrpZSscq7NQp9ygbvirm+wS
# edLc8yTPf4MtYOihzPP9jLPcXPZjEV64gSnJISDDFYWANCrysX3suaFEOuVYPl+s
# ZgQYRVSSYOYHgNqBSRkPKKVUxskSQiqLY3SfGJG4EA9Ktt5lD1cLCXQxhdsqphFm
# Ws3zkrVVL0EKl4v/4MtCgITIIctN1ZJE9u3oPJjASqSvK6EebFqAJkc2SidzKHz0
# F3iYX2AheWNHCQ3HFu023EvFryjlxYk95fs2f6Uj2a9yVbi813qsvd3gcZ8t0kTT
# +dmQwpu1MxjzZnA6838R6OCMnC+UpMPqQh3dPkU/5AF2fc3NnN8=
# =J/I2
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Jul 2024 06:36:47 AM AEST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
* tag 'hw-misc-20240723' of https://github.com/philmd/qemu: (28 commits)
MAINTAINERS: Add myself as a reviewer of machine core
MAINTAINERS: Cover guest-agent in QAPI schema
util/fifo8: Introduce fifo8_drop()
util/fifo8: Expose fifo8_pop_buf()
util/fifo8: Rename fifo8_pop_buf() -> fifo8_pop_bufptr()
util/fifo8: Rename fifo8_peek_buf() -> fifo8_peek_bufptr()
util/fifo8: Use fifo8_reset() in fifo8_create()
util/fifo8: Fix style
chardev/char-fe: Document returned value on error
hw/char/goldfish: Use DMA memory API
hw/nubus/virtio-mmio: Fix missing ERRP_GUARD() in realize handler
dump: make range overlap check more readable
crypto/block-luks: make range overlap check more readable
system/memory_mapping: make range overlap check more readable
sparc/ldst_helper: make range overlap check more readable
cxl/mailbox: make range overlap check more readable
util/range: Make ranges_overlap() return bool
hw/mips/loongson3_virt: remove useless type cast
hw/i2c/mpc_i2c: Fix mmio region size
docs/interop/firmware.json: convert "Example" section
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* hpet: emulation improvements
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaelL4UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMXoQf+K77lNlHLETSgeeP3dr7yZPOmXjjN
qFY/18jiyLw7MK1rZC09fF+n9SoaTH8JDKupt0z9M1R10HKHLIO04f8zDE+dOxaE
Rou3yKnlTgFPGSoPPFr1n1JJfxtYlLZRoUzaAcHUaa4W7JR/OHJX90n1Rb9MXeDk
jV6P0v1FWtIDdM6ERm9qBGoQdYhj6Ra2T4/NZKJFXwIhKEkxgu4yO7WXv8l0dxQz
jE4fKotqAvrkYW1EsiVZm30lw/19duhvGiYeQXoYhk8KKXXjAbJMblLITSNWsCio
3l6Uud/lOxekkJDAq5nH3H9hCBm0WwvwL+0vRf3Mkr+/xRGvrhtmUdp8NQ==
=00mB
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* target/i386/kvm: support for reading RAPL MSRs using a helper program
* hpet: emulation improvements
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaelL4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMXoQf+K77lNlHLETSgeeP3dr7yZPOmXjjN
# qFY/18jiyLw7MK1rZC09fF+n9SoaTH8JDKupt0z9M1R10HKHLIO04f8zDE+dOxaE
# Rou3yKnlTgFPGSoPPFr1n1JJfxtYlLZRoUzaAcHUaa4W7JR/OHJX90n1Rb9MXeDk
# jV6P0v1FWtIDdM6ERm9qBGoQdYhj6Ra2T4/NZKJFXwIhKEkxgu4yO7WXv8l0dxQz
# jE4fKotqAvrkYW1EsiVZm30lw/19duhvGiYeQXoYhk8KKXXjAbJMblLITSNWsCio
# 3l6Uud/lOxekkJDAq5nH3H9hCBm0WwvwL+0vRf3Mkr+/xRGvrhtmUdp8NQ==
# =00mB
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 03:19:58 AM AEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
hpet: avoid timer storms on periodic timers
hpet: store full 64-bit target value of the counter
hpet: accept 64-bit reads and writes
hpet: place read-only bits directly in "new_val"
hpet: remove unnecessary variable "index"
hpet: ignore high bits of comparator in 32-bit mode
hpet: fix and cleanup persistence of interrupt status
Add support for RAPL MSRs in KVM/Qemu
tools: build qemu-vmsr-helper
qio: add support for SO_PEERCRED for socket channel
target/i386: do not crash if microvm guest uses SGX CPUID leaves
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This piece of code can be shared by both IOTLB invalidation and
PASID-based IOTLB invalidation
No functional changes intended.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-ID: <20240718081636.879544-12-zhenzhong.duan@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
CPUs Control device(\\_SB.PCI0) register interface for the x86 arch is IO port
based and existing CPUs AML code assumes _CRS objects would evaluate to a system
resource which describes IO Port address. But on ARM arch CPUs control
device(\\_SB.PRES) register interface is memory-mapped hence _CRS object should
evaluate to system resource which describes memory-mapped base address. Update
build CPUs AML function to accept both IO/MEMORY region spaces and accordingly
update the _CRS object.
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240716111502.202344-6-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently QEMU describes initial[1] RAM* in SMBIOS as a series of
virtual DIMMs (capped at 16Gb max) using type 17 structure entries.
Which is fine for the most cases. However when starting guest
with terabytes of RAM this leads to too many memory device
structures, which eventually upsets linux kernel as it reserves
only 64K for these entries and when that border is crossed out
it runs out of reserved memory.
Instead of partitioning initial RAM on 16Gb DIMMs, use maximum
possible chunk size that SMBIOS spec allows[2]. Which lets
encode RAM in lower 31 bits of 32bit field (which amounts upto
2047Tb per DIMM).
As result initial RAM will generate only one type 17 structure
until host/guest reach ability to use more RAM in the future.
Compat changes:
We can't unconditionally change chunk size as it will break
QEMU<->guest ABI (and migration). Thus introduce a new machine
class field that would let older versioned machines to use
legacy 16Gb chunks, while new(er) machine type[s] use maximum
possible chunk size.
PS:
While it might seem to be risky to rise max entry size this large
(much beyond of what current physical RAM modules support),
I'd not expect it causing much issues, modulo uncovering bugs
in software running within guest. And those should be fixed
on guest side to handle SMBIOS spec properly, especially if
guest is expected to support so huge RAM configs.
In worst case, QEMU can reduce chunk size later if we would
care enough about introducing a workaround for some 'unfixable'
guest OS, either by fixing up the next machine type or
giving users a CLI option to customize it.
1) Initial RAM - is RAM configured with help '-m SIZE' CLI option/
implicitly defined by machine. It doesn't include memory
configured with help of '-device' option[s] (pcdimm,nvdimm,...)
2) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
PS:
* tested on 8Tb host with RHEL6 guest, which seems to parse
type 17 SMBIOS table entries correctly (according to 'dmidecode').
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240715122417.4059293-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
sgx_epc_get_section assumes a PC platform is in use:
bool sgx_epc_get_section(int section_nr, uint64_t *addr, uint64_t *size)
{
PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
However, sgx_epc_get_section is called by CPUID regardless of whether
SGX state has been initialized or which platform is in use. Check
whether the machine has the right QOM class and if not behave as if
there are no EPC sections.
Fixes: 1dec2e1f19 ("i386: Update SGX CPUID info according to hardware/KVM/user input", 2021-09-30)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2142
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'level' field in vtd_iotlb_key is an unsigned integer.
We don't need to store level as an int in vtd_lookup_iotlb.
This is not an issue by itself, but using unsigned here seems cleaner.
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20240709142557.317271-5-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Per the below code, it can overflow as am can be larger than 8 according
to the CH 6.5.2.3 IOTLB Invalidate. Use uint64_t to avoid overflows.
Fixes: b5a280c008 ("intel-iommu: add IOTLB using hash table")
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20240709142557.317271-4-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
These 2 macros are for high 64-bit of the FRCD registers.
Declarations have to be moved accordingly.
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20240709142557.317271-3-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The constant must be unsigned, otherwise the two's complement
overrides the other fields when a PASID is present.
Fixes: 1b2b12376c ("intel-iommu: PASID support")
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Message-Id: <20240709142557.317271-2-clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>