Commit Graph

378 Commits

Author SHA1 Message Date
Corey Minyard
f5878b0381 i386: Rename bools in PCMachineState to end in _enabled
This makes their function more clear and prevents conflicts when adding
the actual devices to the machine state, if necessary.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20181107152434.22219-1-minyard@acm.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-12-11 15:45:22 -02:00
Singh, Brijesh
35c2450191 x86_iommu: move vtd_generate_msi_message in common file
The vtd_generate_msi_message() in intel-iommu is used to construct a MSI
Message from IRQ. A similar function will be needed when we add interrupt
remapping support in amd-iommu. Moving the function in common file to
avoid the code duplication. Rename it to x86_iommu_irq_to_msi_message().
There is no logic changes in the code flow.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-05 13:24:02 -05:00
Tao Xu
09b9ee643f i386: Add PKU on Skylake-Server CPU model
As the release document ref below link (page 13):
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf

PKU is supported in Skylake Server (Only Server) and later, and
on Intel(R) Xeon(R) Processor Scalable Family. So PKU is supposed
to be in Skylake-Server CPU model. And PKU's CPUID has been
exposed to QEMU. But PKU can't be find in Skylake-Server CPU
model in the code. So this patch will fix this issue in
Skylake-Server CPU model.

Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <5014b57f834dcfa8fd3781504d98dcf063d54fde.1540801392.git.tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-30 21:14:43 -03:00
Roman Kagan
9b4cf107b0 hyperv: only add SynIC in compatible configurations
Certain configurations do not allow SynIC to be used in QEMU.  In
particular,

- when hyperv_vpindex is off, SINT routes can't be used as they refer to
  the destination vCPU by vp_index

- older KVM (which doesn't expose KVM_CAP_HYPERV_SYNIC2) zeroes out
  SynIC message and event pages on every msr load, breaking migration

OTOH in-KVM users of SynIC -- SynIC timers -- do work in those
configurations, and we shouldn't stop the guest from using them.

To cover both scenarios, introduce an X86CPU property that makes CPU
init code to skip creation of the SynIC object (and thus disables any
SynIC use in QEMU) but keeps the KVM part of the SynIC working.
The property is clear by default but is set via compat logic for older
machine types.

As a result, when hv_synic and a modern machine type are specified, QEMU
will refuse to run unless vp_index is on and the kernel is recent
enough.  OTOH with an older machine type QEMU will run fine with
hv_synic=on against an older kernel and/or without vp_index enabled but
will disallow the in-QEMU uses of SynIC (in e.g. VMBus).

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:14 +02:00
Paolo Bonzini
990e0be260 i386: do not migrate MSR_SMI_COUNT on machine types <2.12
MSR_SMI_COUNT started being migrated in QEMU 2.12.  Do not migrate it
on older machine types, or the subsection causes a load failure for
guests that use SMM.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-30 14:00:11 +02:00
Eduardo Habkost
97e50dd013 pc: Fix typo on PC_COMPAT_2_12
I forgot a hyphen when amending the compat code on commit
e0051647 ("i386: Enable TOPOEXT feature on AMD EPYC CPU").

Fixes: e00516475c
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180703011026.18650-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-07-09 14:20:57 -03:00
Peter Xu
0c8465440d hmp: obsolete "info ioapic"
Let's start to use "info pic" just like other platforms.  For now we
keep the command for a while so that old users can know what is the new
command to use.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171229073104.3810-6-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:37 +02:00
Peter Xu
cce5405e0e ioapic: support "info irq"
This include both userspace and in-kernel ioapic.  Note that the numbers
can be inaccurate for kvm-ioapic.  One reason is the same with
kvm-i8259, that when irqfd is used, irqs can be delivered all inside
kernel without our notice.  Meanwhile, kvm-ioapic is specially treated
when irq numbers <ISA_NUM_IRQS, those irqs will be delivered in kernel
too via kvm-i8259 (please refer to kvm_pc_gsi_handler).

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171229073104.3810-5-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:37 +02:00
Babu Moger
e00516475c i386: Enable TOPOEXT feature on AMD EPYC CPU
Enable TOPOEXT feature on EPYC CPU. This is required to support
hyperthreading on VM guests. Also extend xlevel to 0x8000001E.

Disable topoext on PC_COMPAT_2_12 and keep xlevel 0x8000000a.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <1529443919-67509-3-git-send-email-babu.moger@amd.com>
[ehabkost: Added EPYC-IBPB.xlevel to PC_COMPAT_2_12]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-06-22 15:01:15 -03:00
Ross Zwisler
11c39b5cd9 nvdimm: make persistence option symbolic
Replace the "nvdimm-cap" option which took numeric arguments such as "2"
with a more user friendly "nvdimm-persistence" option which takes symbolic
arguments "cpu" or "mem-ctrl".

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-06-11 22:19:57 +03:00
Ross Zwisler
9ab3aad281 nvdimm, acpi: support NFIT platform capabilities
Add a machine command line option to allow the user to control the Platform
Capabilities Structure in the virtualized NFIT.  This Platform Capabilities
Structure was added in ACPI 6.2 Errata A.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-06-01 19:20:38 +03:00
Peter Xu
63b88968f1 intel-iommu: rework the page walk logic
This patch fixes a potential small window that the DMA page table might
be incomplete or invalid when the guest sends domain/context
invalidations to a device.  This can cause random DMA errors for
assigned devices.

This is a major change to the VT-d shadow page walking logic. It
includes but is not limited to:

- For each VTDAddressSpace, now we maintain what IOVA ranges we have
  mapped and what we have not.  With that information, now we only send
  MAP or UNMAP when necessary.  Say, we don't send MAP notifies if we
  know we have already mapped the range, meanwhile we don't send UNMAP
  notifies if we know we never mapped the range at all.

- Introduce vtd_sync_shadow_page_table[_range] APIs so that we can call
  in any places to resync the shadow page table for a device.

- When we receive domain/context invalidation, we should not really run
  the replay logic, instead we use the new sync shadow page table API to
  resync the whole shadow page table without unmapping the whole
  region.  After this change, we'll only do the page walk once for each
  domain invalidations (before this, it can be multiple, depending on
  number of notifiers per address space).

While at it, the page walking logic is also refactored to be simpler.

CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Jintack Lim <jintack@cs.columbia.edu>
Tested-by: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:34:05 +03:00
Peter Xu
4f8a62a933 intel-iommu: only do page walk for MAP notifiers
For UNMAP-only IOMMU notifiers, we don't need to walk the page tables.
Fasten that procedure by skipping the page table walk.  That should
boost performance for UNMAP-only notifiers like vhost.

CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:57 +03:00
Peter Xu
1d9efa73e1 intel-iommu: add iommu lock
SECURITY IMPLICATION: this patch fixes a potential race when multiple
threads access the IOMMU IOTLB cache.

Add a per-iommu big lock to protect IOMMU status.  Currently the only
thing to be protected is the IOTLB/context cache, since that can be
accessed even without BQL, e.g., in IO dataplane.

Note that we don't need to protect device page tables since that's fully
controlled by the guest kernel.  However there is still possibility that
malicious drivers will program the device to not obey the rule.  In that
case QEMU can't really do anything useful, instead the guest itself will
be responsible for all uncertainties.

CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:57 +03:00
Peter Xu
b4a4ba0d68 intel-iommu: remove IntelIOMMUNotifierNode
That is not really necessary.  Removing that node struct and put the
list entry directly into VTDAddressSpace.  It simplfies the code a lot.
Since at it, rename the old notifiers_list into vtd_as_with_notifiers.

CC: QEMU Stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:33:57 +03:00
Babu Moger
ab8f992e3e i386: Add new property to control cache info
The property legacy-cache will be used to control the cache information.
If user passes "-cpu legacy-cache" then older information will
be displayed even if the hardware supports new information. Otherwise
use the statically loaded cache definitions if available.

Renamed the previous cache structures to legacy_*. If there is any change in
the cache information, then it needs to be initialized in builtin_x86_defs.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Message-Id: <20180514164156.27034-3-babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-15 11:33:33 -03:00
Babu Moger
968ee4ad25 pc: add 2.13 machine types
Add pc-q35-2.13 and pc-i440fx-2.13 machine types

Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <20180514164156.27034-2-babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-15 11:33:33 -03:00
Igor Mammedov
38aefb578d pc: simplify MachineClass::get_hotplug_handler handling
By default MachineClass::get_hotplug_handler is NULL and concrete board
should set it to it's own handler.
Considering there isn't any default handler, drop saving empty
MachineClass::get_hotplug_handler in child class and make PC code
consistent with spapr/s390x boards.

We can bring this back when actual usecase surfaces and do it
consistently across boards that use get_hotplug_handler().

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 1525691524-32265-2-git-send-email-imammedo@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-10 18:10:56 +01:00
David Hildenbrand
f2ffbe2b7d pc: rename "hotplug memory" terminology to "device memory"
Let's make it clear that we are dealing with device memory. That it can
be used for memory hotplug is just a special case.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-10-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand
b0c14ec4ef machine: make MemoryHotplugState accessible via the machine
Let's allow to query the MemoryHotplugState directly from the machine.
If the pointer is NULL, the machine does not support memory devices. If
the pointer is !NULL, the machine supports memory devices and the
data structure contains information about the applicable physical
guest address space region.

This allows us to generically detect if a certain machine has support
for memory devices, and to generically manage it (find free address
range, plug/unplug a memory region).

We will rename "MemoryHotplugState" to something more meaningful
("DeviceMemory") after we completed factoring out the pc-dimm code into
MemoryDevice code.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
[ehabkost: rebased series, solved conflicts at spapr.c]
[ehabkost: squashed fix to use g_malloc0()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
Wang Xin
0ab126f165 pc: correct misspelled CPU model-id for pc 2.2
Signed-off-by: Wang Xin <wangxinxin.wang@huawei.com>
Message-Id: <1517367668-25048-1-git-send-email-wangxinxin.wang@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-16 16:29:07 -03:00
Philippe Mathieu-Daudé
47973a2dbf hw/input/i8042: Extract declarations from i386/pc.h into input/i8042.h
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (hw/ppc)
Message-Id: <20180308223946.26784-4-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
bb3d5ea858 hw/isa: Move parallel_hds_isa_init() to hw/char/parallel-isa.c
Again... (after 07dc788054 and 9157eee1b1).

We now extract the ISA bus specific helpers.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-2-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:47 +01:00
Paolo Bonzini
4b9c264bd2 q35: change default NIC to e1000e
The e1000 NIC is getting old and is not a very good default for a
PCIe machine type.  Change it to e1000e, which should be supported
by a good number of guests.

In particular, drivers for 82574 were added first to Linux 2.6.27 (2008)
and Windows 2008 R2.  This does mean that Windows 2008 will not work
anymore with Q35 machine types and a default "-net nic -net xxx" network
configuration; it did work before because it does have an AHCI driver.
However, Windows 2008 has been declared out of main stream support
in 2015.  It will get out of extended support in 2020.  Windows 2008
R2 has the same end of support dates and, since the two are basically
Vista vs. Windows 7, R2 probably is more popular.

Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:45 +01:00
Prasad Singamsetty
37f51384ae intel-iommu: Extend address width to 48 bits
The current implementation of Intel IOMMU code only supports 39 bits
iova address width. This patch provides a new parameter (x-aw-bits)
for intel-iommu to extend its address width to 48 bits but keeping the
default the same (39 bits). The reason for not changing the default
is to avoid potential compatibility problems with live migration of
intel-iommu enabled QEMU guest. The only valid values for 'x-aw-bits'
parameter are 39 and 48.

After enabling larger address width (48), we should be able to map
larger iova addresses in the guest. For example, a QEMU guest that
is configured with large memory ( >=1TB ). To check whether 48 bits
aw is enabled, we can grep in the guest dmesg output with line:
"DMAR: Host address width 48".

Signed-off-by: Prasad Singamsetty <prasad.singamsety@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18 21:52:38 +02:00
Prasad Singamsetty
92e5d85e83 intel-iommu: Redefine macros to enable supporting 48 bit address width
The current implementation of Intel IOMMU code only supports 39 bits
host/iova address width so number of macros use hard coded values based
on that. This patch is to redefine them so they can be used with
variable address widths. This patch doesn't add any new functionality
but enables adding support for 48 bit address width.

Signed-off-by: Prasad Singamsetty <prasad.singamsety@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18 21:52:38 +02:00
Haozhong Zhang
c68bcb3a99 target/i386: add clflushopt to "Skylake-Server" cpu model
CPUID_7_0_EBX_CLFLUSHOPT is missed in current "Skylake-Server" cpu
model. Add it to "Skylake-Server" cpu model on pc-i440fx-2.12 and
pc-q35-2.12. Keep it disabled in "Skylake-Server" cpu model on older
machine types.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171219033730.12748-3-haozhong.zhang@intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:04:31 -02:00
Haozhong Zhang
df47ce8af4 pc: add 2.12 machine types
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171219033730.12748-2-haozhong.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17 23:04:31 -02:00
Michael S. Tsirkin
acc95bc850 Merge remote-tracking branch 'origin/master' into HEAD
Resolve conflicts around apb.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-11 22:03:50 +02:00
Sergio Andres Gomez Del Real
2cb9f06e3d apic: add function to apic that will be used by hvf
This patch adds the function apic_get_highest_priority_irr to
apic.c and exports it through the interface in apic.h for use by hvf.

Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-8-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-22 15:01:19 +01:00
Peter Xu
bf33cc75ad intel_iommu: remove X86_IOMMU_PCI_DEVFN_MAX
We have PCI_DEVFN_MAX now.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Liu, Yi L <yi.l.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-22 01:42:03 +02:00
Philippe Mathieu-Daudé
0d5d8a3a90 hw/misc/pvpanic: extract public API from i386/pc to "hw/misc/pvpanic.h"
and remove the old i386/pc dependency.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Philippe Mathieu-Daudé
489983d6b4 hw/net/ne2000: extract ne2k-isa code from i386/pc to ne2000-isa.c
- add "hw/net/ne2000-isa.h"
- remove the old i386 dependency

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> [PPC]
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Philippe Mathieu-Daudé
866e2b3727 hw/display/vga: extract public API from i386/pc to "hw/display/vga.h"
and remove the old i386/pc dependency.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Philippe Mathieu-Daudé
b1c439d179 hw/acpi/ich9: extract ACPI_PM_PROP_TCO_ENABLED from i386/pc
enable_tco is specific to i386/pc.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Philippe Mathieu-Daudé
9dc047ce8f hw/acpi: ACPI_PM_* defines are not restricted to i386 arch
this allows to remove the old i386/pc dependency on acpi/core.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Chao Gao
861fec459b i386/msi: Correct mask of destination ID in MSI address
According to SDM 10.11.1, only [19:12] bits of MSI address are
Destination ID, change the mask to avoid ambiguity for VT-d spec
has used the bit 4 to indicate a remappable interrupt request.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-01 18:28:15 +02:00
Marcel Apfelbaum
9fa99d2519 hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
Currently there is no MMIO range over 4G
reserved for PCI hotplug. Since the 32bit PCI hole
depends on the number of cold-plugged PCI devices
and other factors, it is very possible is too small
to hotplug PCI devices with large BARs.

Fix it by reserving 2G for I4400FX chipset
in order to comply with older Win32 Guest OSes
and 32G for Q35 chipset.

Even if the new defaults of pci-hole64-size will appear in
"info qtree" also for older machines, the property was
not implemented so no changes will be visible to guests.

Note this is a regression since prev QEMU versions had
some range reserved for 64bit PCI hotplug.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-11-16 17:46:53 +02:00
Gonglei
6c69dfb67e i386/cpu/hyperv: support over 64 vcpus for windows guests
Starting with Windows Server 2012 and Windows 8, if
CPUID.40000005.EAX contains a value of -1, Windows assumes specific
limit to the number of VPs. In this case, Windows Server 2012
guest VMs may use more than 64 VPs, up to the maximum supported
number of processors applicable to the specific Windows
version being used.

https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs

For compatibility, Let's introduce a new property for X86CPU,
named "x-hv-max-vps" as Eduardo's suggestion, and set it
to 0x40 before machine 2.10.

(The "x-" prefix indicates that the property is not supposed to
be a stable user interface.)

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1505143227-14324-1-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 16:20:49 +02:00
Marcel Apfelbaum
a6fd5b0e05 pc: add 2.11 machine types
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-09-08 16:15:17 +03:00
Peter Xu
07f7b73398 intel_iommu: use access_flags for iotlb
It was cached by read/write separately. Let's merge them.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-08-02 00:13:25 +03:00
Daniel P. Berrange
1ce36bfe64 i386: expose "TCGTCGTCGTCG" in the 0x40000000 CPUID leaf
Currently when running KVM, we expose "KVMKVMKVM\0\0\0" in
the 0x40000000 CPUID leaf. Other hypervisors (VMWare,
HyperV, Xen, BHyve) all do the same thing, which leaves
TCG as the odd one out.

The CPUID signature is used by software to detect which
virtual environment they are running in and (potentially)
change behaviour in certain ways. For example, systemd
supports a ConditionVirtualization= setting in unit files.
The virt-what command can also report the virt type it is
running on

Currently both these apps have to resort to custom hacks
like looking for 'fw-cfg' entry in the /proc/device-tree
file to identify TCG.

This change thus proposes a signature "TCGTCGTCGTCG" to be
reported when running under TCG.

To hide this, the -cpu option tcg-cpuid=off can be used.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170509132736.10071-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-07-17 15:41:30 -03:00
Alexey Kardashevskiy
1221a47467 memory/iommu: introduce IOMMUMemoryRegionClass
This finishes QOM'fication of IOMMUMemoryRegion by introducing
a IOMMUMemoryRegionClass. This also provides a fastpath analog for
IOMMU_MEMORY_REGION_GET_CLASS().

This makes IOMMUMemoryRegion an abstract class.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170711035620.4232-3-aik@ozlabs.ru>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
Alexey Kardashevskiy
3df9d74806 memory/iommu: QOM'fy IOMMU MemoryRegion
This defines new QOM object - IOMMUMemoryRegion - with MemoryRegion
as a parent.

This moves IOMMU-related fields from MR to IOMMU MR. However to avoid
dymanic QOM casting in fast path (address_space_translate, etc),
this adds an @is_iommu boolean flag to MR and provides new helper to
do simple cast to IOMMU MR - memory_region_get_iommu. The flag
is set in the instance init callback. This defines
memory_region_is_iommu as memory_region_get_iommu()!=NULL.

This switches MemoryRegion to IOMMUMemoryRegion in most places except
the ones where MemoryRegion may be an alias.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20170711035620.4232-2-aik@ozlabs.ru>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
Thomas Huth
2099935dbf Move CONFIG_KVM related definitions to kvm_i386.h
pc.h and sysemu/kvm.h are also included from common code (where
CONFIG_KVM is not available), so the #defines that depend on CONFIG_KVM
should not be declared here to avoid that anybody is using them in a
wrong way. Since we're also going to poison CONFIG_KVM for common code,
let's move them to kvm_i386.h instead. Most of the dummy definitions
from sysemu/kvm.h are also unused since the code that uses them is
only compiled for CONFIG_KVM (e.g. target/i386/kvm.c), so the unused
defines are also simply dropped here instead of being moved.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1498454578-18709-3-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:30:03 +02:00
Laszlo Ersek
2f295167e0 q35/mch: implement extended TSEG sizes
The q35 machine type currently lets the guest firmware select a 1MB, 2MB
or 8MB TSEG (basically, SMRAM) size. In edk2/OVMF, we use 8MB, but even
that is not enough when a lot of VCPUs (more than approx. 224) are
configured -- SMRAM footprint scales largely proportionally with VCPU
count.

Introduce a new property for "mch" called "extended-tseg-mbytes", which
expresses (in megabytes) the user's choice of TSEG (SMRAM) size.

Invent a new, QEMU-specific register in the config space of the DRAM
Controller, at offset 0x50, in order to allow guest firmware to query the
TSEG (SMRAM) size.

According to Intel Document Number 316966-002, Table 5-1 "DRAM Controller
Register Address Map (D0:F0)":

    Warning: Address locations that are not listed are considered Intel
             Reserved registers locations. Reads to Reserved registers may
             return non-zero values. Writes to reserved locations may
             cause system failures.

             All registers that are defined in the PCI 2.3 specification,
             but are not necessary or implemented in this component are
             simply not included in this document. The
             reserved/unimplemented space in the PCI configuration header
             space is not documented as such in this summary.

Offsets 0x50 and 0x51 are not listed in Table 5-1. They are also not part
of the standard PCI config space header. And they precede the capability
list as well, which starts at 0xe0 for this device.

When the guest writes value 0xffff to this register, the value that can be
read back is that of "mch.extended-tseg-mbytes" -- unless it remains
0xffff. The guest is required to write 0xffff first (as opposed to a
read-only register) because PCI config space is generally not cleared on
QEMU reset, and after S3 resume or reboot, new guest firmware running on
old QEMU could read a guest OS-injected value from this register.

After reading the available "extended" TSEG size, the guest firmware may
actually request that TSEG size by writing pattern 11b to the ESMRAMC
register's TSEG_SZ bit-field. (The Intel spec referenced above defines
only patterns 00b (1MB), 01b (2MB) and 10b (8MB); 11b is reserved.)

On the QEMU command line, the value can be set with

  -global mch.extended-tseg-mbytes=N

The default value for 2.10+ q35 machine types is 16. The value is limited
to 0xfff (4095) at the moment, purely so that the product (4095 MB) can be
stored to the uint32_t variable "tseg_size" in mch_update_smram(). Users
are responsible for choosing sensible TSEG sizes.

On 2.9 and earlier q35 machine types, the default value is 0. This lets
the 11b bit pattern in ESMRAMC.TSEG_SZ, and the register at offset 0x50,
keep their original behavior.

When "extended-tseg-mbytes" is nonzero, the new register at offset 0x50 is
set to that value on reset, for completeness.

PCI config space is migrated automatically, so no VMSD changes are
necessary.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1447027
Ref: https://lists.01.org/pipermail/edk2-devel/2017-May/010456.html
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-16 18:07:08 +03:00
Eduardo Habkost
1f43571604 pc: Use "min-[x]level" on compat_props
Since the automatic cpuid-level code was introduced in commit
c39c0edf9b ("target-i386: Automatically
set level/xlevel/xlevel2 when needed"), the CPU model tables just define
the default CPUID level code (set using "min-level").  Setting
"[x]level" forces CPUID level to a specific value and disable the
automatic-level logic.

But the PC compat code was not updated and the existing "[x]level"
compat properties broke compatibility for people using features that
triggered the auto-level code.  To keep previous behavior, we should set
"min-[x]level" instead of "[x]level" on compat_props.

This was not a problem for most cases, because old machine-types don't
have full-cpuid-auto-level enabled.  The only common use case it broke
was the CPUID[7] auto-level code, that was already enabled since the
first CPUID[7] feature was introduced (in QEMU 1.4.0).

This causes the regression reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1454641

Change the PC compat code to use "min-[x]level" instead of "[x]level" on
compat_props, and add new test cases to ensure we don't break this
again.

Reported-by: "Guo, Zhiyi" <zhguo@redhat.com>
Fixes: c39c0edf9b ("target-i386: Automatically set level/xlevel/xlevel2 when needed")
Cc: qemu-stable@nongnu.org
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:08 -03:00
Peter Xu
dbaabb25f4 intel_iommu: support passthrough (PT)
Hardware support for VT-d device passthrough. Although current Linux can
live with iommu=pt even without this, but this is faster than when using
software passthrough.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Liu, Yi L <yi.l.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2017-05-25 21:25:27 +03:00
Peter Xu
465238d9f8 pc: add 2.10 machine type
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Igor Mammedov
98e753a6e5 pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot
Since 2.7 commit (b2a575a Add optionrom compatible with fw_cfg DMA version)
regressed migration during firmware exection time by
abusing fwcfg.dma_enabled property to decide loading
dma version of option rom AND by mistake disabling DMA
for 2.6 and earlier globally instead of only for option rom.

so 2.6 machine type guest is broken when it already runs
firmware in DMA mode but migrated to qemu-2.7(pc-2.6)
at that time;

a) qemu-2.6:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport)
b) qemu-2.7:pc2.6 (fwcfg.dma=off,firmware=ioport,oprom=ioport)

  to:   a     b
from
a       OK   FAIL
b       OK   OK

So we currently have broken forward migration from
qemu-2.6 to qemu-2.[789] that however could be fixed
for 2.10 by re-enabling DMA for 2.[56] machine types
and allowing dma capable option rom only since 2.7.
As result qemu should end up with:

c) qemu-2.10:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport)

   to:  a     b    c
from
a      OK   FAIL  OK
b      OK   OK    OK
c      OK   FAIL  OK

where forward migration from qemu-2.6 to qemu-2.10 should
work again leaving only qemu-2.[789]:pc-2.6 broken.

Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00