Merge commit 'df84f17' into HEAD

This merge fixes a semantic conflict with the trivial tree.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini 2019-10-26 15:36:22 +02:00
commit 673652a785
46 changed files with 2224 additions and 1034 deletions

.gitmodules

@@ -58,3 +58,6 @@
 [submodule "roms/opensbi"]
 	path = roms/opensbi
 	url = https://git.qemu.org/git/opensbi.git
+[submodule "roms/qboot"]
+	path = roms/qboot
+	url = https://github.com/bonzini/qboot


@@ -1275,6 +1275,15 @@ F: include/hw/timer/hpet.h
 F: include/hw/timer/i8254*
 F: include/hw/rtc/mc146818rtc*
 
+microvm
+M: Sergio Lopez <slp@redhat.com>
+M: Paolo Bonzini <pbonzini@redhat.com>
+S: Maintained
+F: docs/microvm.rst
+F: hw/i386/microvm.c
+F: include/hw/i386/microvm.h
+F: pc-bios/bios-microvm.bin
+
 Machine core
 M: Eduardo Habkost <ehabkost@redhat.com>
 M: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>


@@ -28,3 +28,4 @@
 CONFIG_ISAPC=y
 CONFIG_I440FX=y
 CONFIG_Q35=y
+CONFIG_MICROVM=y


@@ -184,6 +184,19 @@ enabled.
 Requires: hv-vpindex, hv-synic, hv-time, hv-stimer
+3.17. hv-no-nonarch-coresharing=on/off/auto
+===========================================
+This enlightenment tells the guest OS that virtual processors will never share
+a physical core unless they are reported as sibling SMT threads. This
+information is required by Windows and Hyper-V guests to properly mitigate
+SMT-related CPU vulnerabilities.
+When the option is set to 'auto', QEMU will enable the feature only when KVM
+reports that non-architectural coresharing is impossible, which means that
+hyper-threading is either not supported or completely disabled on the host.
+This setting also prevents migration, as SMT settings on the destination may
+differ.
+When the option is set to 'on', QEMU will always enable the feature, regardless
+of host setup. To keep guests secure, this can only be used in conjunction with
+exposing correct vCPU topology and vCPU pinning.
 4. Development features
 ========================
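The on/off/auto behaviour described for hv-no-nonarch-coresharing can be summarised as a small decision function. This is a minimal sketch and not QEMU's implementation: the `TriState` enum, the function name, and the `host_no_coresharing` flag (standing in for whatever KVM reports) are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Hypothetical tri-state mirroring the on/off/auto values in the text. */
typedef enum { OPT_OFF, OPT_ON, OPT_AUTO } TriState;

/*
 * Decide whether the enlightenment is exposed to the guest.
 * host_no_coresharing is an assumed input: true when the host reports
 * that non-architectural core sharing is impossible.
 */
static bool expose_no_nonarch_coresharing(TriState opt, bool host_no_coresharing)
{
    switch (opt) {
    case OPT_ON:
        return true;                 /* always enabled, regardless of host */
    case OPT_AUTO:
        return host_no_coresharing;  /* enabled only when the host allows it */
    case OPT_OFF:
    default:
        return false;
    }
}
```

With 'on' the result ignores the host, which is why the text requires correct vCPU topology and pinning in that mode.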

docs/microvm.rst

@@ -0,0 +1,108 @@
====================
microvm Machine Type
====================

``microvm`` is a machine type inspired by ``Firecracker`` and
constructed after its machine model.

It's a minimalist machine type without ``PCI`` or ``ACPI`` support,
designed for short-lived guests. microvm also establishes a baseline
for benchmarking and optimizing both QEMU and guest operating systems,
since it is optimized for both boot time and footprint.


Supported devices
-----------------

The microvm machine type supports the following devices:

- ISA bus
- i8259 PIC (optional)
- i8254 PIT (optional)
- MC146818 RTC (optional)
- One ISA serial port (optional)
- LAPIC
- IOAPIC (with kernel-irqchip=split by default)
- kvmclock (if using KVM)
- fw_cfg
- Up to eight virtio-mmio devices (configured by the user)


Limitations
-----------

Currently, microvm does *not* support the following features:

- PCI-only devices.
- Hotplug of any kind.
- Live migration across QEMU versions.


Using the microvm machine type
------------------------------

Machine-specific options
~~~~~~~~~~~~~~~~~~~~~~~~

The microvm machine type supports the following machine-specific options:

- microvm.x-option-roms=bool (Set off to disable loading option ROMs)
- microvm.pit=OnOffAuto (Enable i8254 PIT)
- microvm.isa-serial=bool (Set off to disable the instantiation of an ISA serial port)
- microvm.pic=OnOffAuto (Enable i8259 PIC)
- microvm.rtc=OnOffAuto (Enable MC146818 RTC)
- microvm.auto-kernel-cmdline=bool (Set off to disable adding virtio-mmio devices to the kernel cmdline)


Boot options
~~~~~~~~~~~~

By default, microvm uses ``qboot`` as its BIOS, to obtain better boot
times, but it's also compatible with ``SeaBIOS``.

As no current firmware is able to boot from a block device using
``virtio-mmio`` as its transport, a microvm-based VM needs to be run
using a host-side kernel and, optionally, an initrd image.


Running a microvm-based VM
~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, microvm aims for maximum compatibility, enabling both
legacy and non-legacy devices. In this example, a VM is created
without passing any additional machine-specific option, using the
legacy ``ISA serial`` device as console::

  $ qemu-system-x86_64 -M microvm \
     -enable-kvm -cpu host -m 512m -smp 2 \
     -kernel vmlinux -append "earlyprintk=ttyS0 console=ttyS0 root=/dev/vda" \
     -nodefaults -no-user-config -nographic \
     -serial stdio \
     -drive id=test,file=test.img,format=raw,if=none \
     -device virtio-blk-device,drive=test \
     -netdev tap,id=tap0,script=no,downscript=no \
     -device virtio-net-device,netdev=tap0

While the example above works, you might be interested in reducing the
footprint further by disabling some legacy devices. If you're using
``KVM``, you can disable the ``RTC``, making the guest rely on
``kvmclock`` exclusively. Additionally, if your host's CPUs have the
``TSC_DEADLINE`` feature, you can also disable both the i8259 PIC and
the i8254 PIT (make sure you're also emulating a CPU with that feature
in the guest).

This is an example of a VM with all optional legacy features
disabled::

  $ qemu-system-x86_64 \
     -M microvm,x-option-roms=off,pit=off,pic=off,isa-serial=off,rtc=off \
     -enable-kvm -cpu host -m 512m -smp 2 \
     -kernel vmlinux -append "console=hvc0 root=/dev/vda" \
     -nodefaults -no-user-config -nographic \
     -chardev stdio,id=virtiocon0 \
     -device virtio-serial-device \
     -device virtconsole,chardev=virtiocon0 \
     -drive id=test,file=test.img,format=raw,if=none \
     -device virtio-blk-device,drive=test \
     -netdev tap,id=tap0,script=no,downscript=no \
     -device virtio-net-device,netdev=tap0
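The ``auto-kernel-cmdline`` option documented above causes one ``virtio_mmio.device=`` fragment to be appended to the kernel command line for each virtio-mmio transport that has a device attached. The sketch below shows how such a fragment could be formatted, following the ``512@<address>:<irq>`` pattern used in hw/i386/microvm.c. The base address and base IRQ values are assumptions, because the header that defines them is not part of this excerpt, and `VIRTIO_MMIO_SIZE` and `format_mmio_cmdline` are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed values: include/hw/i386/microvm.h is not shown in this diff. */
#define VIRTIO_MMIO_BASE 0xc0000000UL
#define VIRTIO_IRQ_BASE  5
/* Hypothetical name for the 512-byte window used by each transport. */
#define VIRTIO_MMIO_SIZE 512

/*
 * Write " virtio_mmio.device=512@0x<addr>:<irq>" for transport `index`
 * into buf. Returns 0 on success, -1 if formatting fails or the buffer
 * is too small to hold the whole fragment.
 */
static int format_mmio_cmdline(char *buf, size_t len, long index)
{
    int ret = snprintf(buf, len, " virtio_mmio.device=%d@0x%lx:%ld",
                       VIRTIO_MMIO_SIZE,
                       VIRTIO_MMIO_BASE + (unsigned long)index * VIRTIO_MMIO_SIZE,
                       VIRTIO_IRQ_BASE + index);

    return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

Under these assumed constants, transport 1 maps to address 0xc0000200 and IRQ 6. The guest kernel needs virtio-mmio command-line device support for the fragment to have any effect.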


@@ -128,7 +128,7 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
     Aml *one = aml_int(1);
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
-    PCMachineState *pcms = PC_MACHINE(machine);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     /*
      * _MAT method - creates an madt apic buffer
@@ -236,9 +236,9 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
     /* The current AML generator can cover the APIC ID range [0..255],
      * inclusive, for VCPU hotplug. */
     QEMU_BUILD_BUG_ON(ACPI_CPU_HOTPLUG_ID_LIMIT > 256);
-    if (pcms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
+    if (x86ms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
         error_report("max_cpus is too large. APIC ID of last CPU is %u",
-                     pcms->apic_id_limit - 1);
+                     x86ms->apic_id_limit - 1);
         exit(1);
     }
@@ -315,8 +315,8 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
      * ith up to 255 elements. Windows guests up to win2k8 fail when
      * VarPackageOp is used.
      */
-    pkg = pcms->apic_id_limit <= 255 ? aml_package(pcms->apic_id_limit) :
-                                       aml_varpackage(pcms->apic_id_limit);
+    pkg = x86ms->apic_id_limit <= 255 ? aml_package(x86ms->apic_id_limit) :
+                                        aml_varpackage(x86ms->apic_id_limit);
     for (i = 0, apic_idx = 0; i < apic_ids->len; i++) {
         int apic_id = apic_ids->cpus[i].arch_id;


@@ -92,6 +92,16 @@ config Q35
     select SMBIOS
     select FW_CFG_DMA
 
+config MICROVM
+    bool
+    imply SERIAL_ISA
+    select ISA_BUS
+    select APIC
+    select IOAPIC
+    select I8259
+    select MC146818RTC
+    select VIRTIO_MMIO
+
 config VTD
     bool


@@ -1,8 +1,10 @@
 obj-$(CONFIG_KVM) += kvm/
 obj-y += e820_memory_layout.o multiboot.o
+obj-y += x86.o
 obj-y += pc.o
 obj-$(CONFIG_I440FX) += pc_piix.o
 obj-$(CONFIG_Q35) += pc_q35.o
+obj-$(CONFIG_MICROVM) += microvm.o
 obj-y += fw_cfg.o pc_sysfw.o
 obj-y += x86-iommu.o
 obj-$(CONFIG_VTD) += intel_iommu.o


@@ -361,6 +361,7 @@ static void
 build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
 {
     MachineClass *mc = MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(MACHINE(pcms));
     int madt_start = table_data->len;
     AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(pcms->acpi_dev);
@@ -390,7 +391,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
     io_apic->address = cpu_to_le32(IO_APIC_DEFAULT_ADDRESS);
     io_apic->interrupt = cpu_to_le32(0);
-    if (pcms->apic_xrupt_override) {
+    if (x86ms->apic_xrupt_override) {
         intsrcovr = acpi_data_push(table_data, sizeof *intsrcovr);
         intsrcovr->type = ACPI_APIC_XRUPT_OVERRIDE;
         intsrcovr->length = sizeof(*intsrcovr);
@@ -1831,6 +1832,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     CrsRangeSet crs_range_set;
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     AcpiMcfgInfo mcfg;
     uint32_t nr_mem = machine->ram_slots;
     int root_bus_limit = 0xFF;
@@ -2103,7 +2105,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
          * with half of the 16-bit control register. Hence, the total size
          * of the i/o region used is FW_CFG_CTL_SIZE; when using DMA, the
          * DMA control register is located at FW_CFG_DMA_IO_BASE + 4 */
-        uint8_t io_size = object_property_get_bool(OBJECT(pcms->fw_cfg),
+        uint8_t io_size = object_property_get_bool(OBJECT(x86ms->fw_cfg),
                                                    "dma_enabled", NULL) ?
                           ROUND_UP(FW_CFG_CTL_SIZE, 4) + sizeof(dma_addr_t) :
                           FW_CFG_CTL_SIZE;
@@ -2336,6 +2338,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
     int srat_start, numa_start, slots;
     uint64_t mem_len, mem_base, next_base;
     MachineClass *mc = MACHINE_GET_CLASS(machine);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
     PCMachineState *pcms = PC_MACHINE(machine);
     ram_addr_t hotplugabble_address_space_size =
@@ -2406,16 +2409,16 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
         }
         /* Cut out the ACPI_PCI hole */
-        if (mem_base <= pcms->below_4g_mem_size &&
-            next_base > pcms->below_4g_mem_size) {
-            mem_len -= next_base - pcms->below_4g_mem_size;
+        if (mem_base <= x86ms->below_4g_mem_size &&
+            next_base > x86ms->below_4g_mem_size) {
+            mem_len -= next_base - x86ms->below_4g_mem_size;
             if (mem_len > 0) {
                 numamem = acpi_data_push(table_data, sizeof *numamem);
                 build_srat_memory(numamem, mem_base, mem_len, i - 1,
                                   MEM_AFFINITY_ENABLED);
             }
             mem_base = 1ULL << 32;
-            mem_len = next_base - pcms->below_4g_mem_size;
+            mem_len = next_base - x86ms->below_4g_mem_size;
             next_base = mem_base + mem_len;
         }
@@ -2634,6 +2637,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 {
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     GArray *table_offsets;
     unsigned facs, dsdt, rsdt, fadt;
     AcpiPmInfo pm;
@@ -2795,7 +2799,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
          */
         int legacy_aml_len =
             pcmc->legacy_acpi_table_size +
-            ACPI_BUILD_LEGACY_CPU_AML_SIZE * pcms->apic_id_limit;
+            ACPI_BUILD_LEGACY_CPU_AML_SIZE * x86ms->apic_id_limit;
         int legacy_table_size =
             ROUND_UP(tables_blob->len - aml_len + legacy_aml_len,
                      ACPI_BUILD_ALIGN_SIZE);
@@ -2885,13 +2889,14 @@ void acpi_setup(void)
 {
     PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     AcpiBuildTables tables;
     AcpiBuildState *build_state;
     Object *vmgenid_dev;
     TPMIf *tpm;
     static FwCfgTPMConfig tpm_config;
-    if (!pcms->fw_cfg) {
+    if (!x86ms->fw_cfg) {
         ACPI_BUILD_DPRINTF("No fw cfg. Bailing out.\n");
         return;
     }
@@ -2922,7 +2927,7 @@ void acpi_setup(void)
     acpi_add_rom_blob(acpi_build_update, build_state,
                       tables.linker->cmd_blob, "etc/table-loader", 0);
-    fw_cfg_add_file(pcms->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
+    fw_cfg_add_file(x86ms->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
                     tables.tcpalog->data, acpi_data_len(tables.tcpalog));
     tpm = tpm_find();
@@ -2932,13 +2937,13 @@ void acpi_setup(void)
             .tpm_version = tpm_get_version(tpm),
             .tpmppi_version = TPM_PPI_VERSION_1_30
         };
-        fw_cfg_add_file(pcms->fw_cfg, "etc/tpm/config",
+        fw_cfg_add_file(x86ms->fw_cfg, "etc/tpm/config",
                         &tpm_config, sizeof tpm_config);
     }
     vmgenid_dev = find_vmgenid_dev();
     if (vmgenid_dev) {
-        vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), pcms->fw_cfg,
+        vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), x86ms->fw_cfg,
                            tables.vmgenid);
     }
@@ -2951,7 +2956,7 @@ void acpi_setup(void)
         uint32_t rsdp_size = acpi_data_len(tables.rsdp);
         build_state->rsdp = g_memdup(tables.rsdp->data, rsdp_size);
-        fw_cfg_add_file_callback(pcms->fw_cfg, ACPI_BUILD_RSDP_FILE,
+        fw_cfg_add_file_callback(x86ms->fw_cfg, ACPI_BUILD_RSDP_FILE,
                                  acpi_build_update, NULL, build_state,
                                  build_state->rsdp, rsdp_size, true);
         build_state->rsdp_mr = NULL;


@@ -1540,6 +1540,7 @@ static void amdvi_realize(DeviceState *dev, Error **err)
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
     MachineState *ms = MACHINE(qdev_get_machine());
     PCMachineState *pcms = PC_MACHINE(ms);
+    X86MachineState *x86ms = X86_MACHINE(ms);
     PCIBus *bus = pcms->bus;
     s->iotlb = g_hash_table_new_full(amdvi_uint64_hash,
@@ -1568,7 +1569,7 @@ static void amdvi_realize(DeviceState *dev, Error **err)
     }
     /* Pseudo address space under root PCI bus. */
-    pcms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
+    x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
     /* set up MMIO */
     memory_region_init_io(&s->mmio, OBJECT(s), &mmio_mem_ops, s, "amdvi-mmio",


@@ -3733,6 +3733,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
     PCMachineState *pcms = PC_MACHINE(ms);
+    X86MachineState *x86ms = X86_MACHINE(ms);
     PCIBus *bus = pcms->bus;
     IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
@@ -3773,7 +3774,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
     sysbus_mmio_map(SYS_BUS_DEVICE(s), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);
     pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
     /* Pseudo address space under root PCI bus. */
-    pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
+    x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
     qemu_add_machine_init_done_notifier(&vtd_machine_done_notify);
 }
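The hunks above repeatedly replace `pcms->field` with `x86ms->field`, where `x86ms` is obtained with `X86_MACHINE()`. The pattern suggests that fields such as `apic_id_limit`, `fw_cfg` and `below_4g_mem_size` now live in a parent `X86MachineState` shared by the PC machines and microvm. The sketch below illustrates that idea with plain C struct embedding. All type and macro names ending in `Sketch`/`_SKETCH` are hypothetical, and the real QEMU macro performs a checked QOM cast instead of the unchecked pointer cast shown here.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the shared x86 state. */
typedef struct X86MachineStateSketch {
    uint32_t apic_id_limit;
    uint64_t below_4g_mem_size;
} X86MachineStateSketch;

/* A PC-like machine: shared state first, then PC-only fields. */
typedef struct PCMachineStateSketch {
    X86MachineStateSketch parent_obj;   /* must be the first member */
    int pc_only_field;
} PCMachineStateSketch;

/* A microvm-like machine reusing the same parent. */
typedef struct MicrovmMachineStateSketch {
    X86MachineStateSketch parent_obj;   /* must be the first member */
    int isa_serial;
} MicrovmMachineStateSketch;

/* Unchecked upcast; valid because the parent is the first member. */
#define X86_MACHINE_SKETCH(obj) ((X86MachineStateSketch *)(obj))

/* Code written against the parent type works for either machine. */
static uint32_t last_apic_id(void *machine)
{
    X86MachineStateSketch *x86ms = X86_MACHINE_SKETCH(machine);

    return x86ms->apic_id_limit - 1;
}
```

Helpers that only touch the shared fields can then serve both machine types, which is what allows hw/i386/microvm.c to reuse code that previously required a `PCMachineState`.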

hw/i386/microvm.c

@@ -0,0 +1,572 @@
/*
* Copyright (c) 2018 Intel Corporation
* Copyright (c) 2019 Red Hat, Inc.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2 or later, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "qemu/error-report.h"
#include "qemu/cutils.h"
#include "qemu/units.h"
#include "qapi/error.h"
#include "qapi/visitor.h"
#include "qapi/qapi-visit-common.h"
#include "sysemu/sysemu.h"
#include "sysemu/cpus.h"
#include "sysemu/numa.h"
#include "sysemu/reset.h"
#include "hw/loader.h"
#include "hw/irq.h"
#include "hw/kvm/clock.h"
#include "hw/i386/microvm.h"
#include "hw/i386/x86.h"
#include "hw/i386/pc.h"
#include "target/i386/cpu.h"
#include "hw/timer/i8254.h"
#include "hw/rtc/mc146818rtc.h"
#include "hw/char/serial.h"
#include "hw/i386/topology.h"
#include "hw/i386/e820_memory_layout.h"
#include "hw/i386/fw_cfg.h"
#include "hw/virtio/virtio-mmio.h"
#include "cpu.h"
#include "elf.h"
#include "kvm_i386.h"
#include "hw/xen/start_info.h"
#define MICROVM_BIOS_FILENAME "bios-microvm.bin"
static void microvm_set_rtc(MicrovmMachineState *mms, ISADevice *s)
{
X86MachineState *x86ms = X86_MACHINE(mms);
int val;
val = MIN(x86ms->below_4g_mem_size / KiB, 640);
rtc_set_memory(s, 0x15, val);
rtc_set_memory(s, 0x16, val >> 8);
/* extended memory (next 64MiB) */
if (x86ms->below_4g_mem_size > 1 * MiB) {
val = (x86ms->below_4g_mem_size - 1 * MiB) / KiB;
} else {
val = 0;
}
if (val > 65535) {
val = 65535;
}
rtc_set_memory(s, 0x17, val);
rtc_set_memory(s, 0x18, val >> 8);
rtc_set_memory(s, 0x30, val);
rtc_set_memory(s, 0x31, val >> 8);
/* memory between 16MiB and 4GiB */
if (x86ms->below_4g_mem_size > 16 * MiB) {
val = (x86ms->below_4g_mem_size - 16 * MiB) / (64 * KiB);
} else {
val = 0;
}
if (val > 65535) {
val = 65535;
}
rtc_set_memory(s, 0x34, val);
rtc_set_memory(s, 0x35, val >> 8);
/* memory above 4GiB */
val = x86ms->above_4g_mem_size / 65536;
rtc_set_memory(s, 0x5b, val);
rtc_set_memory(s, 0x5c, val >> 8);
rtc_set_memory(s, 0x5d, val >> 16);
}
static void microvm_gsi_handler(void *opaque, int n, int level)
{
GSIState *s = opaque;
qemu_set_irq(s->ioapic_irq[n], level);
}
static void microvm_devices_init(MicrovmMachineState *mms)
{
X86MachineState *x86ms = X86_MACHINE(mms);
ISABus *isa_bus;
ISADevice *rtc_state;
GSIState *gsi_state;
int i;
/* Core components */
gsi_state = g_malloc0(sizeof(*gsi_state));
if (mms->pic == ON_OFF_AUTO_ON || mms->pic == ON_OFF_AUTO_AUTO) {
x86ms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
} else {
x86ms->gsi = qemu_allocate_irqs(microvm_gsi_handler,
gsi_state, GSI_NUM_PINS);
}
isa_bus = isa_bus_new(NULL, get_system_memory(), get_system_io(),
&error_abort);
isa_bus_irqs(isa_bus, x86ms->gsi);
ioapic_init_gsi(gsi_state, "machine");
kvmclock_create();
for (i = 0; i < VIRTIO_NUM_TRANSPORTS; i++) {
sysbus_create_simple("virtio-mmio",
VIRTIO_MMIO_BASE + i * 512,
x86ms->gsi[VIRTIO_IRQ_BASE + i]);
}
/* Optional and legacy devices */
if (mms->pic == ON_OFF_AUTO_ON || mms->pic == ON_OFF_AUTO_AUTO) {
qemu_irq *i8259;
i8259 = i8259_init(isa_bus, pc_allocate_cpu_irq());
for (i = 0; i < ISA_NUM_IRQS; i++) {
gsi_state->i8259_irq[i] = i8259[i];
}
g_free(i8259);
}
if (mms->pit == ON_OFF_AUTO_ON || mms->pit == ON_OFF_AUTO_AUTO) {
if (kvm_pit_in_kernel()) {
kvm_pit_init(isa_bus, 0x40);
} else {
i8254_pit_init(isa_bus, 0x40, 0, NULL);
}
}
if (mms->rtc == ON_OFF_AUTO_ON ||
(mms->rtc == ON_OFF_AUTO_AUTO && !kvm_enabled())) {
rtc_state = mc146818_rtc_init(isa_bus, 2000, NULL);
microvm_set_rtc(mms, rtc_state);
}
if (mms->isa_serial) {
serial_hds_isa_init(isa_bus, 0, 1);
}
if (bios_name == NULL) {
bios_name = MICROVM_BIOS_FILENAME;
}
x86_bios_rom_init(get_system_memory(), true);
}
static void microvm_memory_init(MicrovmMachineState *mms)
{
MachineState *machine = MACHINE(mms);
X86MachineState *x86ms = X86_MACHINE(mms);
MemoryRegion *ram, *ram_below_4g, *ram_above_4g;
MemoryRegion *system_memory = get_system_memory();
FWCfgState *fw_cfg;
ram_addr_t lowmem;
int i;
/*
* Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory
* and 256 Mbytes for PCI Express Enhanced Configuration Access Mapping
* also known as MMCFG).
* If it doesn't, we need to split it in chunks below and above 4G.
* In any case, try to make sure that guest addresses aligned at
* 1G boundaries get mapped to host addresses aligned at 1G boundaries.
*/
if (machine->ram_size >= 0xb0000000) {
lowmem = 0x80000000;
} else {
lowmem = 0xb0000000;
}
/*
* Handle the machine opt max-ram-below-4g. It is basically doing
* min(qemu limit, user limit).
*/
if (!x86ms->max_ram_below_4g) {
x86ms->max_ram_below_4g = 4 * GiB;
}
if (lowmem > x86ms->max_ram_below_4g) {
lowmem = x86ms->max_ram_below_4g;
if (machine->ram_size - lowmem > lowmem &&
lowmem & (1 * GiB - 1)) {
warn_report("There is possibly poor performance as the ram size "
" (0x%" PRIx64 ") is more than twice the size of"
" max-ram-below-4g (%"PRIu64") and"
" max-ram-below-4g is not a multiple of 1G.",
(uint64_t)machine->ram_size, x86ms->max_ram_below_4g);
}
}
if (machine->ram_size > lowmem) {
x86ms->above_4g_mem_size = machine->ram_size - lowmem;
x86ms->below_4g_mem_size = lowmem;
} else {
x86ms->above_4g_mem_size = 0;
x86ms->below_4g_mem_size = machine->ram_size;
}
ram = g_malloc(sizeof(*ram));
memory_region_allocate_system_memory(ram, NULL, "microvm.ram",
machine->ram_size);
ram_below_4g = g_malloc(sizeof(*ram_below_4g));
memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
0, x86ms->below_4g_mem_size);
memory_region_add_subregion(system_memory, 0, ram_below_4g);
e820_add_entry(0, x86ms->below_4g_mem_size, E820_RAM);
if (x86ms->above_4g_mem_size > 0) {
ram_above_4g = g_malloc(sizeof(*ram_above_4g));
memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
x86ms->below_4g_mem_size,
x86ms->above_4g_mem_size);
memory_region_add_subregion(system_memory, 0x100000000ULL,
ram_above_4g);
e820_add_entry(0x100000000ULL, x86ms->above_4g_mem_size, E820_RAM);
}
fw_cfg = fw_cfg_init_io_dma(FW_CFG_IO_BASE, FW_CFG_IO_BASE + 4,
&address_space_memory);
fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, machine->smp.cpus);
fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, machine->smp.max_cpus);
fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)machine->ram_size);
fw_cfg_add_i32(fw_cfg, FW_CFG_IRQ0_OVERRIDE, kvm_allows_irq0_override());
fw_cfg_add_bytes(fw_cfg, FW_CFG_E820_TABLE,
&e820_reserve, sizeof(e820_reserve));
fw_cfg_add_file(fw_cfg, "etc/e820", e820_table,
sizeof(struct e820_entry) * e820_get_num_entries());
rom_set_fw(fw_cfg);
if (machine->kernel_filename != NULL) {
x86_load_linux(x86ms, fw_cfg, 0, true, true);
}
if (mms->option_roms) {
for (i = 0; i < nb_option_roms; i++) {
rom_add_option(option_rom[i].name, option_rom[i].bootindex);
}
}
x86ms->fw_cfg = fw_cfg;
x86ms->ioapic_as = &address_space_memory;
}
static gchar *microvm_get_mmio_cmdline(gchar *name)
{
gchar *cmdline;
gchar *separator;
long int index;
int ret;
separator = g_strrstr(name, ".");
if (!separator) {
return NULL;
}
if (qemu_strtol(separator + 1, NULL, 10, &index) != 0) {
return NULL;
}
cmdline = g_malloc0(VIRTIO_CMDLINE_MAXLEN);
ret = g_snprintf(cmdline, VIRTIO_CMDLINE_MAXLEN,
" virtio_mmio.device=512@0x%lx:%ld",
VIRTIO_MMIO_BASE + index * 512,
VIRTIO_IRQ_BASE + index);
if (ret < 0 || ret >= VIRTIO_CMDLINE_MAXLEN) {
g_free(cmdline);
return NULL;
}
return cmdline;
}
static void microvm_fix_kernel_cmdline(MachineState *machine)
{
X86MachineState *x86ms = X86_MACHINE(machine);
BusState *bus;
BusChild *kid;
char *cmdline;
/*
* Find MMIO transports with attached devices, and add them to the kernel
* command line.
*
* Yes, this is a hack, but one that heavily improves the UX without
* introducing any significant issues.
*/
cmdline = g_strdup(machine->kernel_cmdline);
bus = sysbus_get_default();
QTAILQ_FOREACH(kid, &bus->children, sibling) {
DeviceState *dev = kid->child;
ObjectClass *class = object_get_class(OBJECT(dev));
if (class == object_class_by_name(TYPE_VIRTIO_MMIO)) {
VirtIOMMIOProxy *mmio = VIRTIO_MMIO(OBJECT(dev));
VirtioBusState *mmio_virtio_bus = &mmio->bus;
BusState *mmio_bus = &mmio_virtio_bus->parent_obj;
if (!QTAILQ_EMPTY(&mmio_bus->children)) {
gchar *mmio_cmdline = microvm_get_mmio_cmdline(mmio_bus->name);
if (mmio_cmdline) {
char *newcmd = g_strjoin(NULL, cmdline, mmio_cmdline, NULL);
g_free(mmio_cmdline);
g_free(cmdline);
cmdline = newcmd;
}
}
}
}
fw_cfg_modify_i32(x86ms->fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(cmdline) + 1);
fw_cfg_modify_string(x86ms->fw_cfg, FW_CFG_CMDLINE_DATA, cmdline);
}
static void microvm_machine_state_init(MachineState *machine)
{
MicrovmMachineState *mms = MICROVM_MACHINE(machine);
X86MachineState *x86ms = X86_MACHINE(machine);
Error *local_err = NULL;
microvm_memory_init(mms);
x86_cpus_init(x86ms, CPU_VERSION_LATEST);
if (local_err) {
error_report_err(local_err);
exit(1);
}
microvm_devices_init(mms);
}
static void microvm_machine_reset(MachineState *machine)
{
MicrovmMachineState *mms = MICROVM_MACHINE(machine);
CPUState *cs;
X86CPU *cpu;
if (machine->kernel_filename != NULL &&
mms->auto_kernel_cmdline && !mms->kernel_cmdline_fixed) {
microvm_fix_kernel_cmdline(machine);
mms->kernel_cmdline_fixed = true;
}
qemu_devices_reset();
CPU_FOREACH(cs) {
cpu = X86_CPU(cs);
if (cpu->apic_state) {
device_reset(cpu->apic_state);
}
}
}
static void microvm_machine_get_pic(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
OnOffAuto pic = mms->pic;
visit_type_OnOffAuto(v, name, &pic, errp);
}
static void microvm_machine_set_pic(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
visit_type_OnOffAuto(v, name, &mms->pic, errp);
}
static void microvm_machine_get_pit(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
OnOffAuto pit = mms->pit;
visit_type_OnOffAuto(v, name, &pit, errp);
}
static void microvm_machine_set_pit(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
visit_type_OnOffAuto(v, name, &mms->pit, errp);
}
static void microvm_machine_get_rtc(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
OnOffAuto rtc = mms->rtc;
visit_type_OnOffAuto(v, name, &rtc, errp);
}
static void microvm_machine_set_rtc(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
visit_type_OnOffAuto(v, name, &mms->rtc, errp);
}
static bool microvm_machine_get_isa_serial(Object *obj, Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
return mms->isa_serial;
}
static void microvm_machine_set_isa_serial(Object *obj, bool value,
Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
mms->isa_serial = value;
}
static bool microvm_machine_get_option_roms(Object *obj, Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
return mms->option_roms;
}
static void microvm_machine_set_option_roms(Object *obj, bool value,
Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
mms->option_roms = value;
}
static bool microvm_machine_get_auto_kernel_cmdline(Object *obj, Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
return mms->auto_kernel_cmdline;
}
static void microvm_machine_set_auto_kernel_cmdline(Object *obj, bool value,
Error **errp)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
mms->auto_kernel_cmdline = value;
}
static void microvm_machine_initfn(Object *obj)
{
MicrovmMachineState *mms = MICROVM_MACHINE(obj);
/* Configuration */
mms->pic = ON_OFF_AUTO_AUTO;
mms->pit = ON_OFF_AUTO_AUTO;
mms->rtc = ON_OFF_AUTO_AUTO;
mms->isa_serial = true;
mms->option_roms = true;
mms->auto_kernel_cmdline = true;
/* State */
mms->kernel_cmdline_fixed = false;
}
static void microvm_class_init(ObjectClass *oc, void *data)
{
MachineClass *mc = MACHINE_CLASS(oc);
mc->init = microvm_machine_state_init;
mc->family = "microvm_i386";
mc->desc = "microvm (i386)";
mc->units_per_default_bus = 1;
mc->no_floppy = 1;
mc->max_cpus = 288;
mc->has_hotpluggable_cpus = false;
mc->auto_enable_numa_with_memhp = false;
mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
mc->nvdimm_supported = false;
/* Avoid relying too much on kernel components */
mc->default_kernel_irqchip_split = true;
/* Machine class handlers */
mc->reset = microvm_machine_reset;
object_class_property_add(oc, MICROVM_MACHINE_PIC, "OnOffAuto",
microvm_machine_get_pic,
microvm_machine_set_pic,
NULL, NULL, &error_abort);
object_class_property_set_description(oc, MICROVM_MACHINE_PIC,
"Enable i8259 PIC", &error_abort);
object_class_property_add(oc, MICROVM_MACHINE_PIT, "OnOffAuto",
microvm_machine_get_pit,
microvm_machine_set_pit,
NULL, NULL, &error_abort);
object_class_property_set_description(oc, MICROVM_MACHINE_PIT,
"Enable i8254 PIT", &error_abort);
object_class_property_add(oc, MICROVM_MACHINE_RTC, "OnOffAuto",
microvm_machine_get_rtc,
microvm_machine_set_rtc,
NULL, NULL, &error_abort);
object_class_property_set_description(oc, MICROVM_MACHINE_RTC,
"Enable MC146818 RTC", &error_abort);
object_class_property_add_bool(oc, MICROVM_MACHINE_ISA_SERIAL,
microvm_machine_get_isa_serial,
microvm_machine_set_isa_serial,
&error_abort);
object_class_property_set_description(oc, MICROVM_MACHINE_ISA_SERIAL,
"Set off to disable the instantiation of an ISA serial port",
&error_abort);
object_class_property_add_bool(oc, MICROVM_MACHINE_OPTION_ROMS,
microvm_machine_get_option_roms,
microvm_machine_set_option_roms,
&error_abort);
object_class_property_set_description(oc, MICROVM_MACHINE_OPTION_ROMS,
"Set off to disable loading option ROMs", &error_abort);
object_class_property_add_bool(oc, MICROVM_MACHINE_AUTO_KERNEL_CMDLINE,
microvm_machine_get_auto_kernel_cmdline,
microvm_machine_set_auto_kernel_cmdline,
&error_abort);
object_class_property_set_description(oc,
MICROVM_MACHINE_AUTO_KERNEL_CMDLINE,
"Set off to disable adding virtio-mmio devices to the kernel cmdline",
&error_abort);
}
static const TypeInfo microvm_machine_info = {
.name = TYPE_MICROVM_MACHINE,
.parent = TYPE_X86_MACHINE,
.instance_size = sizeof(MicrovmMachineState),
.instance_init = microvm_machine_initfn,
.class_size = sizeof(MicrovmMachineClass),
.class_init = microvm_class_init,
.interfaces = (InterfaceInfo[]) {
{ }
},
};
static void microvm_machine_init(void)
{
type_register_static(&microvm_machine_info);
}
type_init(microvm_machine_init);
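
For orientation, the defaults set in microvm_machine_initfn() and the on/off/auto properties registered in microvm_class_init() can be modelled outside QEMU roughly as follows. This is a standalone sketch, not QEMU code: every name ending in `_sketch` or `Sketch` is hypothetical, and the rule used to resolve `auto` is an assumption, since the code that consumes these values is not part of the lines shown here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for QEMU's OnOffAuto enum. */
typedef enum {
    TRISTATE_AUTO,
    TRISTATE_ON,
    TRISTATE_OFF,
} TriStateSketch;

/* Mirrors the configuration fields initialised in microvm_machine_initfn(). */
typedef struct {
    TriStateSketch pic, pit, rtc;
    bool isa_serial, option_roms, auto_kernel_cmdline;
} MicrovmOptsSketch;

/* Same defaults as the initfn: tri-states on auto, booleans on. */
static void microvm_opts_init_sketch(MicrovmOptsSketch *o)
{
    o->pic = TRISTATE_AUTO;
    o->pit = TRISTATE_AUTO;
    o->rtc = TRISTATE_AUTO;
    o->isa_serial = true;
    o->option_roms = true;
    o->auto_kernel_cmdline = true;
}

/*
 * Assumed resolution rule: "on" always enables, "off" always disables,
 * and "auto" falls back to whatever the caller considers the default.
 */
static bool tristate_enabled_sketch(TriStateSketch v, bool auto_default)
{
    return v == TRISTATE_ON || (v == TRISTATE_AUTO && auto_default);
}
```

A caller would initialise the struct once, let user-supplied properties overwrite individual fields, and only then resolve each tri-state when deciding whether to create the device.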

File diff suppressed because it is too large.


@@ -27,6 +27,7 @@
 #include "qemu/units.h"
 #include "hw/loader.h"
+#include "hw/i386/x86.h"
 #include "hw/i386/pc.h"
 #include "hw/i386/apic.h"
 #include "hw/display/ramfb.h"
@@ -56,7 +57,6 @@
 #endif
 #include "migration/global_state.h"
 #include "migration/misc.h"
-#include "kvm_i386.h"
 #include "sysemu/numa.h"
 #define MAX_IDE_BUS 2
@@ -73,6 +73,7 @@ static void pc_init1(MachineState *machine,
 {
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     MemoryRegion *system_memory = get_system_memory();
     MemoryRegion *system_io = get_system_io();
     int i;
@@ -80,7 +81,6 @@ static void pc_init1(MachineState *machine,
     ISABus *isa_bus;
     PCII440FXState *i440fx_state;
     int piix3_devfn = -1;
-    qemu_irq *i8259;
     qemu_irq smi_irq;
     GSIState *gsi_state;
     DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
@@ -125,11 +125,11 @@ static void pc_init1(MachineState *machine,
     if (xen_enabled()) {
         xen_hvm_init(pcms, &ram_memory);
     } else {
-        if (!pcms->max_ram_below_4g) {
-            pcms->max_ram_below_4g = 0xe0000000; /* default: 3.5G */
+        if (!x86ms->max_ram_below_4g) {
+            x86ms->max_ram_below_4g = 0xe0000000; /* default: 3.5G */
         }
-        lowmem = pcms->max_ram_below_4g;
-        if (machine->ram_size >= pcms->max_ram_below_4g) {
+        lowmem = x86ms->max_ram_below_4g;
+        if (machine->ram_size >= x86ms->max_ram_below_4g) {
             if (pcmc->gigabyte_align) {
                 if (lowmem > 0xc0000000) {
                     lowmem = 0xc0000000;
@@ -138,21 +138,21 @@ static void pc_init1(MachineState *machine,
                     warn_report("Large machine and max_ram_below_4g "
                                 "(%" PRIu64 ") not a multiple of 1G; "
                                 "possible bad performance.",
-                                pcms->max_ram_below_4g);
+                                x86ms->max_ram_below_4g);
                 }
             }
         }
         if (machine->ram_size >= lowmem) {
-            pcms->above_4g_mem_size = machine->ram_size - lowmem;
-            pcms->below_4g_mem_size = lowmem;
+            x86ms->above_4g_mem_size = machine->ram_size - lowmem;
+            x86ms->below_4g_mem_size = lowmem;
         } else {
-            pcms->above_4g_mem_size = 0;
-            pcms->below_4g_mem_size = machine->ram_size;
+            x86ms->above_4g_mem_size = 0;
+            x86ms->below_4g_mem_size = machine->ram_size;
         }
     }
-    pc_cpus_init(pcms);
+    x86_cpus_init(x86ms, pcmc->default_cpu_version);
     if (kvm_enabled() && pcmc->kvmclock_enabled) {
         kvmclock_create();
@@ -187,22 +187,15 @@ static void pc_init1(MachineState *machine,
         xen_load_linux(pcms);
     }
-    gsi_state = g_malloc0(sizeof(*gsi_state));
-    if (kvm_ioapic_in_kernel()) {
-        kvm_pc_setup_irq_routing(pcmc->pci_enabled);
-        pcms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
-                                       GSI_NUM_PINS);
-    } else {
-        pcms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
-    }
+    gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
     if (pcmc->pci_enabled) {
         pci_bus = i440fx_init(host_type,
                               pci_type,
-                              &i440fx_state, &piix3_devfn, &isa_bus, pcms->gsi,
+                              &i440fx_state, &piix3_devfn, &isa_bus, x86ms->gsi,
                               system_memory, system_io, machine->ram_size,
-                              pcms->below_4g_mem_size,
-                              pcms->above_4g_mem_size,
+                              x86ms->below_4g_mem_size,
+                              x86ms->above_4g_mem_size,
                               pci_memory, ram_memory);
         pcms->bus = pci_bus;
     } else {
@@ -212,25 +205,15 @@ static void pc_init1(MachineState *machine,
                               &error_abort);
         no_hpet = 1;
     }
-    isa_bus_irqs(isa_bus, pcms->gsi);
+    isa_bus_irqs(isa_bus, x86ms->gsi);
-    if (kvm_pic_in_kernel()) {
-        i8259 = kvm_i8259_init(isa_bus);
-    } else if (xen_enabled()) {
-        i8259 = xen_interrupt_controller_init();
-    } else {
-        i8259 = i8259_init(isa_bus, pc_allocate_cpu_irq());
-    }
-    for (i = 0; i < ISA_NUM_IRQS; i++) {
-        gsi_state->i8259_irq[i] = i8259[i];
-    }
-    g_free(i8259);
+    pc_i8259_create(isa_bus, gsi_state->i8259_irq);
     if (pcmc->pci_enabled) {
         ioapic_init_gsi(gsi_state, "i440fx");
     }
-    pc_register_ferr_irq(pcms->gsi[13]);
+    pc_register_ferr_irq(x86ms->gsi[13]);
     pc_vga_init(isa_bus, pcmc->pci_enabled ? pci_bus : NULL);
@@ -240,7 +223,7 @@ static void pc_init1(MachineState *machine,
     }
     /* init basic PC hardware */
-    pc_basic_device_init(isa_bus, pcms->gsi, &rtc_state, true,
+    pc_basic_device_init(isa_bus, x86ms->gsi, &rtc_state, true,
                          (pcms->vmport != ON_OFF_AUTO_ON), pcms->pit_enabled,
                          0x4);
@@ -287,7 +270,7 @@ else {
         smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
         /* TODO: Populate SPD eeprom data. */
         pcms->smbus = piix4_pm_init(pci_bus, piix3_devfn + 3, 0xb100,
-                                    pcms->gsi[9], smi_irq,
+                                    x86ms->gsi[9], smi_irq,
                                     pc_machine_is_smm_enabled(pcms),
                                     &piix4_pm);
         smbus_eeprom_init(pcms->smbus, 8, NULL, 0);
@@ -303,7 +286,7 @@ else {
     if (machine->nvdimms_state->is_enabled) {
         nvdimm_init_acpi_state(machine->nvdimms_state, system_io,
-                               pcms->fw_cfg, OBJECT(pcms));
+                               x86ms->fw_cfg, OBJECT(pcms));
     }
 }
@@ -728,7 +711,7 @@ DEFINE_I440FX_MACHINE(v1_4, "pc-i440fx-1.4", pc_compat_1_4_fn,
 static void pc_i440fx_1_3_machine_options(MachineClass *m)
 {
-    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+    X86MachineClass *x86mc = X86_MACHINE_CLASS(m);
     static GlobalProperty compat[] = {
         PC_CPU_MODEL_IDS("1.3.0")
         { "usb-tablet", "usb_version", "1" },
@@ -739,7 +722,7 @@ static void pc_i440fx_1_3_machine_options(MachineClass *m)
     pc_i440fx_1_4_machine_options(m);
     m->hw_version = "1.3.0";
-    pcmc->compat_apic_id_mode = true;
+    x86mc->compat_apic_id_mode = true;
     compat_props_add(m->compat_props, compat, G_N_ELEMENTS(compat));
 }
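
The below/above-4G split that pc_init1() now stores in X86MachineState can be illustrated in isolation. The following is a minimal standalone sketch of that arithmetic only; `RamSplitSketch` and `split_ram_sketch` are hypothetical names, not QEMU APIs, and the gigabyte-alignment adjustment of `lowmem` is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical result type: the two sizes the machine state records. */
typedef struct {
    uint64_t below_4g;
    uint64_t above_4g;
} RamSplitSketch;

/*
 * Split guest RAM at 'lowmem': everything up to lowmem stays below 4G,
 * the remainder (if any) is placed above 4G.
 */
static RamSplitSketch split_ram_sketch(uint64_t ram_size, uint64_t lowmem)
{
    RamSplitSketch s;

    if (ram_size >= lowmem) {
        s.above_4g = ram_size - lowmem;
        s.below_4g = lowmem;
    } else {
        s.above_4g = 0;
        s.below_4g = ram_size;
    }
    return s;
}
```

With the 3.5G default (0xe0000000), an 8 GiB guest ends up with 0xe0000000 bytes below 4G and 0x120000000 bytes above it, while a 1 GiB guest keeps all of its RAM below 4G.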


@@ -36,11 +36,11 @@
 #include "hw/rtc/mc146818rtc.h"
 #include "hw/xen/xen.h"
 #include "sysemu/kvm.h"
-#include "kvm_i386.h"
 #include "hw/kvm/clock.h"
 #include "hw/pci-host/q35.h"
 #include "hw/qdev-properties.h"
 #include "exec/address-spaces.h"
+#include "hw/i386/x86.h"
 #include "hw/i386/pc.h"
 #include "hw/i386/ich9.h"
 #include "hw/i386/amd_iommu.h"
@@ -115,6 +115,7 @@ static void pc_q35_init(MachineState *machine)
 {
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     Q35PCIHost *q35_host;
     PCIHostState *phb;
     PCIBus *host_bus;
@@ -128,7 +129,6 @@ static void pc_q35_init(MachineState *machine)
     MemoryRegion *ram_memory;
     GSIState *gsi_state;
     ISABus *isa_bus;
-    qemu_irq *i8259;
     int i;
     ICH9LPCState *ich9_lpc;
     PCIDevice *ahci;
@@ -152,34 +152,34 @@ static void pc_q35_init(MachineState *machine)
     /* Handle the machine opt max-ram-below-4g. It is basically doing
      * min(qemu limit, user limit).
      */
-    if (!pcms->max_ram_below_4g) {
-        pcms->max_ram_below_4g = 1ULL << 32; /* default: 4G */;
+    if (!x86ms->max_ram_below_4g) {
+        x86ms->max_ram_below_4g = 4 * GiB;
     }
-    if (lowmem > pcms->max_ram_below_4g) {
-        lowmem = pcms->max_ram_below_4g;
+    if (lowmem > x86ms->max_ram_below_4g) {
+        lowmem = x86ms->max_ram_below_4g;
         if (machine->ram_size - lowmem > lowmem &&
             lowmem & (1 * GiB - 1)) {
             warn_report("There is possibly poor performance as the ram size "
                         " (0x%" PRIx64 ") is more then twice the size of"
                         " max-ram-below-4g (%"PRIu64") and"
                         " max-ram-below-4g is not a multiple of 1G.",
-                        (uint64_t)machine->ram_size, pcms->max_ram_below_4g);
+                        (uint64_t)machine->ram_size, x86ms->max_ram_below_4g);
         }
     }
     if (machine->ram_size >= lowmem) {
-        pcms->above_4g_mem_size = machine->ram_size - lowmem;
-        pcms->below_4g_mem_size = lowmem;
+        x86ms->above_4g_mem_size = machine->ram_size - lowmem;
+        x86ms->below_4g_mem_size = lowmem;
     } else {
-        pcms->above_4g_mem_size = 0;
-        pcms->below_4g_mem_size = machine->ram_size;
+        x86ms->above_4g_mem_size = 0;
+        x86ms->below_4g_mem_size = machine->ram_size;
     }
     if (xen_enabled()) {
         xen_hvm_init(pcms, &ram_memory);
     }
-    pc_cpus_init(pcms);
+    x86_cpus_init(x86ms, pcmc->default_cpu_version);
     kvmclock_create();
@@ -209,16 +209,6 @@ static void pc_q35_init(MachineState *machine)
                        rom_memory, &ram_memory);
     }
-    /* irq lines */
-    gsi_state = g_malloc0(sizeof(*gsi_state));
-    if (kvm_ioapic_in_kernel()) {
-        kvm_pc_setup_irq_routing(pcmc->pci_enabled);
-        pcms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
-                                       GSI_NUM_PINS);
-    } else {
-        pcms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
-    }
     /* create pci host bus */
     q35_host = Q35_HOST_DEVICE(qdev_create(NULL, TYPE_Q35_HOST_DEVICE));
@@ -231,9 +221,9 @@ static void pc_q35_init(MachineState *machine)
                              MCH_HOST_PROP_SYSTEM_MEM, NULL);
     object_property_set_link(OBJECT(q35_host), OBJECT(system_io),
                              MCH_HOST_PROP_IO_MEM, NULL);
-    object_property_set_int(OBJECT(q35_host), pcms->below_4g_mem_size,
+    object_property_set_int(OBJECT(q35_host), x86ms->below_4g_mem_size,
                             PCI_HOST_BELOW_4G_MEM_SIZE, NULL);
-    object_property_set_int(OBJECT(q35_host), pcms->above_4g_mem_size,
+    object_property_set_int(OBJECT(q35_host), x86ms->above_4g_mem_size,
                             PCI_HOST_ABOVE_4G_MEM_SIZE, NULL);
     /* pci */
     qdev_init_nofail(DEVICE(q35_host));
@@ -252,34 +242,26 @@ static void pc_q35_init(MachineState *machine)
     object_property_set_link(OBJECT(machine), OBJECT(lpc),
                              PC_MACHINE_ACPI_DEVICE_PROP, &error_abort);
+    /* irq lines */
+    gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
     ich9_lpc = ICH9_LPC_DEVICE(lpc);
     lpc_dev = DEVICE(lpc);
     for (i = 0; i < GSI_NUM_PINS; i++) {
-        qdev_connect_gpio_out_named(lpc_dev, ICH9_GPIO_GSI, i, pcms->gsi[i]);
+        qdev_connect_gpio_out_named(lpc_dev, ICH9_GPIO_GSI, i, x86ms->gsi[i]);
     }
     pci_bus_irqs(host_bus, ich9_lpc_set_irq, ich9_lpc_map_irq, ich9_lpc,
                  ICH9_LPC_NB_PIRQS);
     pci_bus_set_route_irq_fn(host_bus, ich9_route_intx_pin_to_irq);
     isa_bus = ich9_lpc->isa_bus;
-    if (kvm_pic_in_kernel()) {
-        i8259 = kvm_i8259_init(isa_bus);
-    } else if (xen_enabled()) {
-        i8259 = xen_interrupt_controller_init();
-    } else {
-        i8259 = i8259_init(isa_bus, pc_allocate_cpu_irq());
-    }
-    for (i = 0; i < ISA_NUM_IRQS; i++) {
-        gsi_state->i8259_irq[i] = i8259[i];
-    }
-    g_free(i8259);
+    pc_i8259_create(isa_bus, gsi_state->i8259_irq);
     if (pcmc->pci_enabled) {
         ioapic_init_gsi(gsi_state, "q35");
     }
-    pc_register_ferr_irq(pcms->gsi[13]);
+    pc_register_ferr_irq(x86ms->gsi[13]);
     assert(pcms->vmport != ON_OFF_AUTO__MAX);
     if (pcms->vmport == ON_OFF_AUTO_AUTO) {
@@ -287,7 +269,7 @@ static void pc_q35_init(MachineState *machine)
     }
     /* init basic PC hardware */
-    pc_basic_device_init(isa_bus, pcms->gsi, &rtc_state, !mc->no_floppy,
+    pc_basic_device_init(isa_bus, x86ms->gsi, &rtc_state, !mc->no_floppy,
                          (pcms->vmport != ON_OFF_AUTO_ON), pcms->pit_enabled,
                          0xff0104);
@@ -330,7 +312,7 @@ static void pc_q35_init(MachineState *machine)
     if (machine->nvdimms_state->is_enabled) {
         nvdimm_init_acpi_state(machine->nvdimms_state, system_io,
-                               pcms->fw_cfg, OBJECT(pcms));
+                               x86ms->fw_cfg, OBJECT(pcms));
     }
 }
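
The condition guarding the warn_report() in pc_q35_init() combines two tests: more RAM ends up above the split than below it, and the split is not 1 GiB aligned. A minimal standalone sketch of that predicate follows; `lowmem_warning_sketch` and `GIB_SKETCH` are hypothetical names, and the early return for `ram_size < lowmem` is an addition of this sketch (the hunk above performs the unsigned subtraction directly).

```c
#include <stdbool.h>
#include <stdint.h>

#define GIB_SKETCH (1ULL << 30)

/*
 * Returns true when the q35 code above would emit its performance warning:
 * the RAM above 'lowmem' exceeds 'lowmem' itself, and 'lowmem' is not a
 * multiple of 1 GiB.
 */
static bool lowmem_warning_sketch(uint64_t ram_size, uint64_t lowmem)
{
    if (ram_size < lowmem) {
        return false; /* sketch-only guard against unsigned wrap-around */
    }
    return (ram_size - lowmem > lowmem) &&
           (lowmem & (GIB_SKETCH - 1)) != 0;
}
```

For example, an 8 GiB guest with max-ram-below-4g set to 1.5 GiB would trigger the warning, whereas the same guest with a 2 GiB limit would not, because the limit is then a multiple of 1 GiB.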


@@ -31,6 +31,7 @@
 #include "qemu/option.h"
 #include "qemu/units.h"
 #include "hw/sysbus.h"
+#include "hw/i386/x86.h"
 #include "hw/i386/pc.h"
 #include "hw/loader.h"
 #include "hw/qdev-properties.h"
@@ -38,8 +39,6 @@
 #include "hw/block/flash.h"
 #include "sysemu/kvm.h"
-#define BIOS_FILENAME "bios.bin"
 /*
  * We don't have a theoretically justifiable exact lower bound on the base
  * address of any flash mapping. In practice, the IO-APIC MMIO range is
@@ -211,59 +210,6 @@ static void pc_system_flash_map(PCMachineState *pcms,
     }
 }
-static void old_pc_system_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
-{
-    char *filename;
-    MemoryRegion *bios, *isa_bios;
-    int bios_size, isa_bios_size;
-    int ret;
-    /* BIOS load */
-    if (bios_name == NULL) {
-        bios_name = BIOS_FILENAME;
-    }
-    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
-    if (filename) {
-        bios_size = get_image_size(filename);
-    } else {
-        bios_size = -1;
-    }
-    if (bios_size <= 0 ||
-        (bios_size % 65536) != 0) {
-        goto bios_error;
-    }
-    bios = g_malloc(sizeof(*bios));
-    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
-    if (!isapc_ram_fw) {
-        memory_region_set_readonly(bios, true);
-    }
-    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
-    if (ret != 0) {
-    bios_error:
-        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
-        exit(1);
-    }
-    g_free(filename);
-    /* map the last 128KB of the BIOS in ISA space */
-    isa_bios_size = MIN(bios_size, 128 * KiB);
-    isa_bios = g_malloc(sizeof(*isa_bios));
-    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
-                             bios_size - isa_bios_size, isa_bios_size);
-    memory_region_add_subregion_overlap(rom_memory,
-                                        0x100000 - isa_bios_size,
-                                        isa_bios,
-                                        1);
-    if (!isapc_ram_fw) {
-        memory_region_set_readonly(isa_bios, true);
-    }
-    /* map all the bios at the top of memory */
-    memory_region_add_subregion(rom_memory,
-                                (uint32_t)(-bios_size),
-                                bios);
-}
 void pc_system_firmware_init(PCMachineState *pcms,
                              MemoryRegion *rom_memory)
 {
@@ -272,7 +218,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
     BlockBackend *pflash_blk[ARRAY_SIZE(pcms->flash)];
     if (!pcmc->pci_enabled) {
-        old_pc_system_rom_init(rom_memory, true);
+        x86_bios_rom_init(rom_memory, true);
         return;
     }
@@ -293,7 +239,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
     if (!pflash_blk[0]) {
         /* Machine property pflash0 not set, use ROM mode */
-        old_pc_system_rom_init(rom_memory, false);
+        x86_bios_rom_init(rom_memory, false);
     } else {
         if (kvm_enabled() && !kvm_readonly_mem_enabled()) {
             /*
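
The address arithmetic in the removed old_pc_system_rom_init() (which callers now reach through x86_bios_rom_init()) can be checked in isolation. The sketch below reproduces only the size validation and the three derived values from that function; `BiosLayoutSketch` and `bios_layout_sketch` are hypothetical names, and no memory regions are created.

```c
#include <stdbool.h>
#include <stdint.h>

#define KIB_SKETCH 1024u

/* Hypothetical container for the values the ROM init code derives. */
typedef struct {
    uint32_t bios_base;     /* full image, mapped so that it ends at 4 GiB */
    uint32_t isa_bios_size; /* at most the last 128 KiB of the image */
    uint32_t isa_bios_base; /* alias mapped so that it ends at 1 MiB */
    uint32_t alias_offset;  /* offset of the alias inside the image */
} BiosLayoutSketch;

/*
 * Returns false for sizes the original code rejects (non-positive or not
 * a multiple of 64 KiB); otherwise fills in the layout.
 */
static bool bios_layout_sketch(int bios_size, BiosLayoutSketch *out)
{
    uint32_t size;

    if (bios_size <= 0 || (bios_size % 65536) != 0) {
        return false;
    }
    size = (uint32_t)bios_size;

    out->bios_base = (uint32_t)(0u - size);
    out->isa_bios_size = size < 128 * KIB_SKETCH ? size : 128 * KIB_SKETCH;
    out->isa_bios_base = 0x100000u - out->isa_bios_size;
    out->alias_offset = size - out->isa_bios_size;
    return true;
}
```

For a 256 KiB image this yields a base of 0xFFFC0000 for the full mapping and a 128 KiB ISA alias at 0xE0000 that starts 128 KiB into the image.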

795
hw/i386/x86.c Normal file

@ -0,0 +1,795 @@
/*
* Copyright (c) 2003-2004 Fabrice Bellard
* Copyright (c) 2019 Red Hat, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu/error-report.h"
#include "qemu/option.h"
#include "qemu/cutils.h"
#include "qemu/units.h"
#include "qemu-common.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qapi-visit-common.h"
#include "qapi/visitor.h"
#include "sysemu/qtest.h"
#include "sysemu/numa.h"
#include "sysemu/replay.h"
#include "sysemu/sysemu.h"
#include "hw/i386/x86.h"
#include "target/i386/cpu.h"
#include "hw/i386/topology.h"
#include "hw/i386/fw_cfg.h"
#include "hw/acpi/cpu_hotplug.h"
#include "hw/nmi.h"
#include "hw/loader.h"
#include "multiboot.h"
#include "elf.h"
#include "standard-headers/asm-x86/bootparam.h"
#define BIOS_FILENAME "bios.bin"
/* Physical Address of PVH entry point read from kernel ELF NOTE */
static size_t pvh_start_addr;
/*
* Calculates initial APIC ID for a specific CPU index
*
* Currently we need to be able to calculate the APIC ID from the CPU index
* alone (without requiring a CPU object), as the QEMU<->Seabios interfaces have
* no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
* all CPUs up to max_cpus.
*/
uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
unsigned int cpu_index)
{
MachineState *ms = MACHINE(x86ms);
X86MachineClass *x86mc = X86_MACHINE_GET_CLASS(x86ms);
uint32_t correct_id;
static bool warned;
correct_id = x86_apicid_from_cpu_idx(x86ms->smp_dies, ms->smp.cores,
ms->smp.threads, cpu_index);
if (x86mc->compat_apic_id_mode) {
if (cpu_index != correct_id && !warned && !qtest_enabled()) {
error_report("APIC IDs set in compatibility mode, "
"CPU topology won't match the configuration");
warned = true;
}
return cpu_index;
} else {
return correct_id;
}
}
void x86_cpu_new(X86MachineState *x86ms, int64_t apic_id, Error **errp)
{
Object *cpu = NULL;
Error *local_err = NULL;
CPUX86State *env = NULL;
cpu = object_new(MACHINE(x86ms)->cpu_type);
env = &X86_CPU(cpu)->env;
env->nr_dies = x86ms->smp_dies;
object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
object_property_set_bool(cpu, true, "realized", &local_err);
object_unref(cpu);
error_propagate(errp, local_err);
}
void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
{
int i;
const CPUArchIdList *possible_cpus;
MachineState *ms = MACHINE(x86ms);
MachineClass *mc = MACHINE_GET_CLASS(x86ms);
x86_cpu_set_default_version(default_cpu_version);
/*
* Calculates the limit to CPU APIC ID values
*
* Limit for the APIC ID value, so that all
* CPU APIC IDs are < x86ms->apic_id_limit.
*
* This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
*/
x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
ms->smp.max_cpus - 1) + 1;
possible_cpus = mc->possible_cpu_arch_ids(ms);
for (i = 0; i < ms->smp.cpus; i++) {
x86_cpu_new(x86ms, possible_cpus->cpus[i].arch_id, &error_fatal);
}
}
CpuInstanceProperties
x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
{
MachineClass *mc = MACHINE_GET_CLASS(ms);
const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
assert(cpu_index < possible_cpus->len);
return possible_cpus->cpus[cpu_index].props;
}
int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
{
X86CPUTopoInfo topo;
X86MachineState *x86ms = X86_MACHINE(ms);
assert(idx < ms->possible_cpus->len);
x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
x86ms->smp_dies, ms->smp.cores,
ms->smp.threads, &topo);
return topo.pkg_id % ms->numa_state->num_nodes;
}
const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
{
X86MachineState *x86ms = X86_MACHINE(ms);
int i;
unsigned int max_cpus = ms->smp.max_cpus;
if (ms->possible_cpus) {
/*
* make sure that max_cpus hasn't changed since the first use, i.e.
* -smp hasn't been parsed after it
*/
assert(ms->possible_cpus->len == max_cpus);
return ms->possible_cpus;
}
ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
sizeof(CPUArchId) * max_cpus);
ms->possible_cpus->len = max_cpus;
for (i = 0; i < ms->possible_cpus->len; i++) {
X86CPUTopoInfo topo;
ms->possible_cpus->cpus[i].type = ms->cpu_type;
ms->possible_cpus->cpus[i].vcpus_count = 1;
ms->possible_cpus->cpus[i].arch_id =
x86_cpu_apic_id_from_index(x86ms, i);
x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
x86ms->smp_dies, ms->smp.cores,
ms->smp.threads, &topo);
ms->possible_cpus->cpus[i].props.has_socket_id = true;
ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
if (x86ms->smp_dies > 1) {
ms->possible_cpus->cpus[i].props.has_die_id = true;
ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
}
ms->possible_cpus->cpus[i].props.has_core_id = true;
ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
ms->possible_cpus->cpus[i].props.has_thread_id = true;
ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
}
return ms->possible_cpus;
}
static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
{
/* cpu index isn't used */
CPUState *cs;
CPU_FOREACH(cs) {
X86CPU *cpu = X86_CPU(cs);
if (!cpu->apic_state) {
cpu_interrupt(cs, CPU_INTERRUPT_NMI);
} else {
apic_deliver_nmi(cpu->apic_state);
}
}
}
static long get_file_size(FILE *f)
{
long where, size;
/* XXX: on Unix systems, using fstat() probably makes more sense */
where = ftell(f);
fseek(f, 0, SEEK_END);
size = ftell(f);
fseek(f, where, SEEK_SET);
return size;
}
struct setup_data {
uint64_t next;
uint32_t type;
uint32_t len;
uint8_t data[0];
} __attribute__((packed));
/*
* The entry point into the kernel for PVH boot is different from
* the native entry point. The PVH entry is defined by the x86/HVM
* direct boot ABI and is available in an ELFNOTE in the kernel binary.
*
* This function is passed to load_elf() when it is called from
* load_elfboot() which then additionally checks for an ELF Note of
* type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
* parse the PVH entry address from the ELF Note.
*
* Due to trickery in elf_opts.h, load_elf() is actually available as
* load_elf32() or load_elf64() and this routine needs to be able
* to deal with being called as 32 or 64 bit.
*
* The address of the PVH entry point is saved to the 'pvh_start_addr'
* global variable (although the entry point is 32-bit, the kernel
* binary can be either 32-bit or 64-bit).
*/
static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
{
size_t *elf_note_data_addr;
/* Check if ELF Note header passed in is valid */
if (arg1 == NULL) {
return 0;
}
if (is64) {
struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
uint64_t nhdr_size64 = sizeof(struct elf64_note);
uint64_t phdr_align = *(uint64_t *)arg2;
uint64_t nhdr_namesz = nhdr64->n_namesz;
elf_note_data_addr =
((void *)nhdr64) + nhdr_size64 +
QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
} else {
struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
uint32_t nhdr_size32 = sizeof(struct elf32_note);
uint32_t phdr_align = *(uint32_t *)arg2;
uint32_t nhdr_namesz = nhdr32->n_namesz;
elf_note_data_addr =
((void *)nhdr32) + nhdr_size32 +
QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
}
pvh_start_addr = *elf_note_data_addr;
return pvh_start_addr;
}
static bool load_elfboot(const char *kernel_filename,
int kernel_file_size,
uint8_t *header,
size_t pvh_xen_start_addr,
FWCfgState *fw_cfg)
{
uint32_t flags = 0;
uint32_t mh_load_addr = 0;
uint32_t elf_kernel_size = 0;
uint64_t elf_entry;
uint64_t elf_low, elf_high;
int kernel_size;
if (ldl_p(header) != 0x464c457f) {
return false; /* no elfboot */
}
bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
flags = elf_is64 ?
((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
error_report("elfboot unsupported flags = %x", flags);
exit(1);
}
uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
NULL, &elf_note_type, &elf_entry,
&elf_low, &elf_high, 0, I386_ELF_MACHINE,
0, 0);
if (kernel_size < 0) {
error_report("Error while loading elf kernel");
exit(1);
}
mh_load_addr = elf_low;
elf_kernel_size = elf_high - elf_low;
if (pvh_start_addr == 0) {
error_report("Error loading uncompressed kernel without PVH ELF Note");
exit(1);
}
fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
return true;
}
void x86_load_linux(X86MachineState *x86ms,
FWCfgState *fw_cfg,
int acpi_data_size,
bool pvh_enabled,
bool linuxboot_dma_enabled)
{
uint16_t protocol;
int setup_size, kernel_size, cmdline_size;
int dtb_size, setup_data_offset;
uint32_t initrd_max;
uint8_t header[8192], *setup, *kernel;
hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
FILE *f;
char *vmode;
MachineState *machine = MACHINE(x86ms);
struct setup_data *setup_data;
const char *kernel_filename = machine->kernel_filename;
const char *initrd_filename = machine->initrd_filename;
const char *dtb_filename = machine->dtb;
const char *kernel_cmdline = machine->kernel_cmdline;
/* Align to 16 bytes as a paranoia measure */
cmdline_size = (strlen(kernel_cmdline) + 16) & ~15;
/* load the kernel header */
f = fopen(kernel_filename, "rb");
if (!f) {
fprintf(stderr, "qemu: could not open kernel file '%s': %s\n",
kernel_filename, strerror(errno));
exit(1);
}
kernel_size = get_file_size(f);
if (!kernel_size ||
fread(header, 1, MIN(ARRAY_SIZE(header), kernel_size), f) !=
MIN(ARRAY_SIZE(header), kernel_size)) {
fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
kernel_filename, strerror(errno));
exit(1);
}
/* kernel protocol version */
if (ldl_p(header + 0x202) == 0x53726448) {
protocol = lduw_p(header + 0x206);
} else {
/*
* This could be a multiboot kernel. If it is, let's stop treating it
* like a Linux kernel.
* Note: some multiboot images could be in the ELF format (the same as
* PVH), so we try multiboot first, since the multiboot magic header is
* checked before the image is loaded.
*/
if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
kernel_cmdline, kernel_size, header)) {
return;
}
/*
* Check if the file is an uncompressed kernel file (ELF) and load it,
* saving the PVH entry point used by the x86/HVM direct boot ABI.
* If load_elfboot() is successful, populate the fw_cfg info.
*/
if (pvh_enabled &&
load_elfboot(kernel_filename, kernel_size,
header, pvh_start_addr, fw_cfg)) {
fclose(f);
fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
strlen(kernel_cmdline) + 1);
fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
header, sizeof(header));
/* load initrd */
if (initrd_filename) {
GMappedFile *mapped_file;
gsize initrd_size;
gchar *initrd_data;
GError *gerr = NULL;
mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
if (!mapped_file) {
fprintf(stderr, "qemu: error reading initrd %s: %s\n",
initrd_filename, gerr->message);
exit(1);
}
x86ms->initrd_mapped_file = mapped_file;
initrd_data = g_mapped_file_get_contents(mapped_file);
initrd_size = g_mapped_file_get_length(mapped_file);
initrd_max = x86ms->below_4g_mem_size - acpi_data_size - 1;
if (initrd_size >= initrd_max) {
fprintf(stderr, "qemu: initrd is too large, cannot support. "
"(max: %"PRIu32", need %"PRIu64")\n",
initrd_max, (uint64_t)initrd_size);
exit(1);
}
initrd_addr = (initrd_max - initrd_size) & ~4095;
fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
initrd_size);
}
option_rom[nb_option_roms].bootindex = 0;
option_rom[nb_option_roms].name = "pvh.bin";
nb_option_roms++;
return;
}
protocol = 0;
}
if (protocol < 0x200 || !(header[0x211] & 0x01)) {
/* Low kernel */
real_addr = 0x90000;
cmdline_addr = 0x9a000 - cmdline_size;
prot_addr = 0x10000;
} else if (protocol < 0x202) {
/* High but ancient kernel */
real_addr = 0x90000;
cmdline_addr = 0x9a000 - cmdline_size;
prot_addr = 0x100000;
} else {
/* High and recent kernel */
real_addr = 0x10000;
cmdline_addr = 0x20000;
prot_addr = 0x100000;
}
/* highest address for loading the initrd */
if (protocol >= 0x20c &&
lduw_p(header + 0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
/*
* Linux has supported loading an initrd anywhere below 4 GB for a very
* long time (since 2007, long before XLF_CAN_BE_LOADED_ABOVE_4G was added
* in 2013), though it only sets initrd_max to 2 GB to "work around
* bootloader bugs". Luckily, the QEMU firmware (which acts much like a
* bootloader here) supports this.
*
* It is believed that if XLF_CAN_BE_LOADED_ABOVE_4G is set, the initrd
* can be loaded at any address.
*
* In addition, initrd_max is a uint32_t simply because QEMU doesn't
* support the 64-bit boot protocol (specifically the ext_ramdisk_image
* field).
*
* Therefore, simply limit initrd_max to UINT32_MAX here.
*/
initrd_max = UINT32_MAX;
} else if (protocol >= 0x203) {
initrd_max = ldl_p(header + 0x22c);
} else {
initrd_max = 0x37ffffff;
}
if (initrd_max >= x86ms->below_4g_mem_size - acpi_data_size) {
initrd_max = x86ms->below_4g_mem_size - acpi_data_size - 1;
}
fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline) + 1);
fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
if (protocol >= 0x202) {
stl_p(header + 0x228, cmdline_addr);
} else {
stw_p(header + 0x20, 0xA33F);
stw_p(header + 0x22, cmdline_addr - real_addr);
}
/* handle vga= parameter */
vmode = strstr(kernel_cmdline, "vga=");
if (vmode) {
unsigned int video_mode;
int ret;
/* skip "vga=" */
vmode += 4;
if (!strncmp(vmode, "normal", 6)) {
video_mode = 0xffff;
} else if (!strncmp(vmode, "ext", 3)) {
video_mode = 0xfffe;
} else if (!strncmp(vmode, "ask", 3)) {
video_mode = 0xfffd;
} else {
ret = qemu_strtoui(vmode, NULL, 0, &video_mode);
if (ret != 0) {
fprintf(stderr, "qemu: can't parse 'vga' parameter: %s\n",
strerror(-ret));
exit(1);
}
}
stw_p(header + 0x1fa, video_mode);
}
/* loader type */
/*
* High nybble = B reserved for QEMU; low nybble is revision number.
* If this code is substantially changed, you may want to consider
* incrementing the revision.
*/
if (protocol >= 0x200) {
header[0x210] = 0xB0;
}
/* heap */
if (protocol >= 0x201) {
header[0x211] |= 0x80; /* CAN_USE_HEAP */
stw_p(header + 0x224, cmdline_addr - real_addr - 0x200);
}
/* load initrd */
if (initrd_filename) {
GMappedFile *mapped_file;
gsize initrd_size;
gchar *initrd_data;
GError *gerr = NULL;
if (protocol < 0x200) {
fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
exit(1);
}
mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
if (!mapped_file) {
fprintf(stderr, "qemu: error reading initrd %s: %s\n",
initrd_filename, gerr->message);
exit(1);
}
x86ms->initrd_mapped_file = mapped_file;
initrd_data = g_mapped_file_get_contents(mapped_file);
initrd_size = g_mapped_file_get_length(mapped_file);
if (initrd_size >= initrd_max) {
fprintf(stderr, "qemu: initrd is too large, cannot support. "
"(max: %"PRIu32", need %"PRIu64")\n",
initrd_max, (uint64_t)initrd_size);
exit(1);
}
initrd_addr = (initrd_max - initrd_size) & ~4095;
fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
stl_p(header + 0x218, initrd_addr);
stl_p(header + 0x21c, initrd_size);
}
/* load kernel and setup */
setup_size = header[0x1f1];
if (setup_size == 0) {
setup_size = 4;
}
setup_size = (setup_size + 1) * 512;
if (setup_size > kernel_size) {
fprintf(stderr, "qemu: invalid kernel header\n");
exit(1);
}
kernel_size -= setup_size;
setup = g_malloc(setup_size);
kernel = g_malloc(kernel_size);
fseek(f, 0, SEEK_SET);
if (fread(setup, 1, setup_size, f) != setup_size) {
fprintf(stderr, "fread() failed\n");
exit(1);
}
if (fread(kernel, 1, kernel_size, f) != kernel_size) {
fprintf(stderr, "fread() failed\n");
exit(1);
}
fclose(f);
/* append dtb to kernel */
if (dtb_filename) {
if (protocol < 0x209) {
fprintf(stderr, "qemu: Linux kernel too old to load a dtb\n");
exit(1);
}
dtb_size = get_image_size(dtb_filename);
if (dtb_size <= 0) {
fprintf(stderr, "qemu: error reading dtb %s: %s\n",
dtb_filename, strerror(errno));
exit(1);
}
setup_data_offset = QEMU_ALIGN_UP(kernel_size, 16);
kernel_size = setup_data_offset + sizeof(struct setup_data) + dtb_size;
kernel = g_realloc(kernel, kernel_size);
stq_p(header + 0x250, prot_addr + setup_data_offset);
setup_data = (struct setup_data *)(kernel + setup_data_offset);
setup_data->next = 0;
setup_data->type = cpu_to_le32(SETUP_DTB);
setup_data->len = cpu_to_le32(dtb_size);
load_image_size(dtb_filename, setup_data->data, dtb_size);
}
memcpy(setup, header, MIN(sizeof(header), setup_size));
fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, prot_addr);
fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, kernel_size);
fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, kernel, kernel_size);
fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_ADDR, real_addr);
fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
option_rom[nb_option_roms].bootindex = 0;
option_rom[nb_option_roms].name = "linuxboot.bin";
if (linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
option_rom[nb_option_roms].name = "linuxboot_dma.bin";
}
nb_option_roms++;
}
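
The initrd placement and setup-size arithmetic in x86_load_linux() can be checked in isolation. The block below is a minimal sketch and not QEMU code: the helper names `initrd_load_addr` and `setup_bytes` are hypothetical, and the sketch assumes `initrd_max` has already been clamped as in the function above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): place the initrd as high as
 * possible below initrd_max, rounded down to a 4 KiB page boundary.
 * Returns false when the image does not fit.
 */
static bool initrd_load_addr(uint32_t initrd_max, uint64_t initrd_size,
                             uint32_t *addr)
{
    if (initrd_size >= initrd_max) {
        return false;
    }
    *addr = (initrd_max - (uint32_t)initrd_size) & ~(uint32_t)4095;
    return true;
}

/*
 * Hypothetical helper: size in bytes of the real-mode setup code, derived
 * from the byte at header offset 0x1f1. A value of 0 is treated as 4
 * sectors, and one more 512-byte sector is added on top, as in the
 * loader code above.
 */
static uint32_t setup_bytes(uint8_t setup_sects)
{
    uint32_t sects = setup_sects ? setup_sects : 4;

    return (sects + 1) * 512;
}
```

With the legacy limit of 0x37ffffff and a 1 MiB initrd, the helper yields a page-aligned address just below the limit. A header byte of 0 gives a setup size of 5 sectors.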
void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
{
char *filename;
MemoryRegion *bios, *isa_bios;
int bios_size, isa_bios_size;
int ret;
/* BIOS load */
if (bios_name == NULL) {
bios_name = BIOS_FILENAME;
}
filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
if (filename) {
bios_size = get_image_size(filename);
} else {
bios_size = -1;
}
if (bios_size <= 0 ||
(bios_size % 65536) != 0) {
goto bios_error;
}
bios = g_malloc(sizeof(*bios));
memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
if (!isapc_ram_fw) {
memory_region_set_readonly(bios, true);
}
ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
if (ret != 0) {
bios_error:
fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
exit(1);
}
g_free(filename);
/* map the last 128KB of the BIOS in ISA space */
isa_bios_size = MIN(bios_size, 128 * KiB);
isa_bios = g_malloc(sizeof(*isa_bios));
memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
bios_size - isa_bios_size, isa_bios_size);
memory_region_add_subregion_overlap(rom_memory,
0x100000 - isa_bios_size,
isa_bios,
1);
if (!isapc_ram_fw) {
memory_region_set_readonly(isa_bios, true);
}
/* map all the bios at the top of memory */
memory_region_add_subregion(rom_memory,
(uint32_t)(-bios_size),
bios);
}
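
The address arithmetic in x86_bios_rom_init() relies on unsigned wrap-around: `(uint32_t)(-bios_size)` is the address at which an image of that size ends exactly at 4 GiB. The following sketch works through the same numbers outside QEMU. The `bios_layout` structure and the `bios_layout_for` function are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define KIB 1024u

/* Hypothetical result type, for illustration only. */
struct bios_layout {
    uint32_t bios_base;   /* guest address where the full image starts */
    uint32_t isa_size;    /* size of the legacy window below 1 MiB */
    uint32_t isa_base;    /* guest address of that window */
    uint32_t isa_offset;  /* offset of the window inside the image */
};

static struct bios_layout bios_layout_for(uint32_t bios_size)
{
    struct bios_layout l;

    /* 0 - size wraps modulo 2^32, so the image ends at the 4 GiB mark. */
    l.bios_base = 0u - bios_size;
    /* At most the last 128 KiB of the image is aliased into ISA space. */
    l.isa_size = bios_size < 128 * KIB ? bios_size : 128 * KIB;
    l.isa_base = 0x100000u - l.isa_size;
    l.isa_offset = bios_size - l.isa_size;
    return l;
}
```

For a 256 KiB image, this places the full image at 0xfffc0000 and aliases its upper half at 0xe0000.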
static void x86_machine_get_max_ram_below_4g(Object *obj, Visitor *v,
const char *name, void *opaque,
Error **errp)
{
X86MachineState *x86ms = X86_MACHINE(obj);
uint64_t value = x86ms->max_ram_below_4g;
visit_type_size(v, name, &value, errp);
}
static void x86_machine_set_max_ram_below_4g(Object *obj, Visitor *v,
const char *name, void *opaque,
Error **errp)
{
X86MachineState *x86ms = X86_MACHINE(obj);
Error *error = NULL;
uint64_t value;
visit_type_size(v, name, &value, &error);
if (error) {
error_propagate(errp, error);
return;
}
if (value > 4 * GiB) {
error_setg(&error,
"Machine option 'max-ram-below-4g=%"PRIu64
"' expects size less than or equal to 4G", value);
error_propagate(errp, error);
return;
}
if (value < 1 * MiB) {
warn_report("Only %" PRIu64 " bytes of RAM below the 4GiB boundary, "
"BIOS may not work with less than 1MiB", value);
}
x86ms->max_ram_below_4g = value;
}
static void x86_machine_initfn(Object *obj)
{
X86MachineState *x86ms = X86_MACHINE(obj);
x86ms->max_ram_below_4g = 0; /* use default */
x86ms->smp_dies = 1;
}
static void x86_machine_class_init(ObjectClass *oc, void *data)
{
MachineClass *mc = MACHINE_CLASS(oc);
X86MachineClass *x86mc = X86_MACHINE_CLASS(oc);
NMIClass *nc = NMI_CLASS(oc);
mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
x86mc->compat_apic_id_mode = false;
nc->nmi_monitor_handler = x86_nmi;
object_class_property_add(oc, X86_MACHINE_MAX_RAM_BELOW_4G, "size",
x86_machine_get_max_ram_below_4g, x86_machine_set_max_ram_below_4g,
NULL, NULL, &error_abort);
object_class_property_set_description(oc, X86_MACHINE_MAX_RAM_BELOW_4G,
"Maximum ram below the 4G boundary (32bit boundary)", &error_abort);
}
static const TypeInfo x86_machine_info = {
.name = TYPE_X86_MACHINE,
.parent = TYPE_MACHINE,
.abstract = true,
.instance_size = sizeof(X86MachineState),
.instance_init = x86_machine_initfn,
.class_size = sizeof(X86MachineClass),
.class_init = x86_machine_class_init,
.interfaces = (InterfaceInfo[]) {
{ TYPE_NMI },
{ }
},
};
static void x86_machine_register_types(void)
{
type_register_static(&x86_machine_info);
}
type_init(x86_machine_register_types)


@@ -197,10 +197,12 @@ qemu_irq *xen_interrupt_controller_init(void)
 static void xen_ram_init(PCMachineState *pcms,
                          ram_addr_t ram_size, MemoryRegion **ram_memory_p)
 {
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     MemoryRegion *sysmem = get_system_memory();
     ram_addr_t block_len;
-    uint64_t user_lowmem = object_property_get_uint(qdev_get_machine(),
-                                                    PC_MACHINE_MAX_RAM_BELOW_4G,
-                                                    &error_abort);
+    uint64_t user_lowmem =
+        object_property_get_uint(qdev_get_machine(),
+                                 X86_MACHINE_MAX_RAM_BELOW_4G,
+                                 &error_abort);
     /* Handle the machine opt max-ram-below-4g. It is basically doing
@@ -214,20 +216,20 @@ static void xen_ram_init(PCMachineState *pcms,
     }
     if (ram_size >= user_lowmem) {
-        pcms->above_4g_mem_size = ram_size - user_lowmem;
-        pcms->below_4g_mem_size = user_lowmem;
+        x86ms->above_4g_mem_size = ram_size - user_lowmem;
+        x86ms->below_4g_mem_size = user_lowmem;
     } else {
-        pcms->above_4g_mem_size = 0;
-        pcms->below_4g_mem_size = ram_size;
+        x86ms->above_4g_mem_size = 0;
+        x86ms->below_4g_mem_size = ram_size;
     }
-    if (!pcms->above_4g_mem_size) {
+    if (!x86ms->above_4g_mem_size) {
         block_len = ram_size;
     } else {
         /*
          * Xen does not allocate the memory continuously, it keeps a
          * hole of the size computed above or passed in.
          */
-        block_len = (1ULL << 32) + pcms->above_4g_mem_size;
+        block_len = (1ULL << 32) + x86ms->above_4g_mem_size;
     }
     memory_region_init_ram(&ram_memory, NULL, "xen.ram", block_len,
                            &error_fatal);
@@ -244,12 +246,12 @@ static void xen_ram_init(PCMachineState *pcms,
      */
     memory_region_init_alias(&ram_lo, NULL, "xen.ram.lo",
                              &ram_memory, 0xc0000,
-                             pcms->below_4g_mem_size - 0xc0000);
+                             x86ms->below_4g_mem_size - 0xc0000);
     memory_region_add_subregion(sysmem, 0xc0000, &ram_lo);
-    if (pcms->above_4g_mem_size > 0) {
+    if (x86ms->above_4g_mem_size > 0) {
         memory_region_init_alias(&ram_hi, NULL, "xen.ram.hi",
                                  &ram_memory, 0x100000000ULL,
-                                 pcms->above_4g_mem_size);
+                                 x86ms->above_4g_mem_size);
         memory_region_add_subregion(sysmem, 0x100000000ULL, &ram_hi);
     }
 }
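
The low/high RAM split that xen_ram_init() stores in the machine state can be reproduced with a few lines of standalone C. This is only a sketch: `ram_split` and `split_ram` are hypothetical names, and `user_lowmem` is assumed to be the value already resolved by the code that the diff elides between hunks.

```c
#include <stdint.h>

/* Hypothetical result type mirroring the three values computed above. */
struct ram_split {
    uint64_t below_4g;   /* bytes of RAM mapped below the 4 GiB boundary */
    uint64_t above_4g;   /* bytes of RAM mapped at and above 4 GiB */
    uint64_t block_len;  /* length of the single backing RAM block */
};

static struct ram_split split_ram(uint64_t ram_size, uint64_t user_lowmem)
{
    struct ram_split s;

    if (ram_size >= user_lowmem) {
        s.above_4g = ram_size - user_lowmem;
        s.below_4g = user_lowmem;
    } else {
        s.above_4g = 0;
        s.below_4g = ram_size;
    }
    /*
     * With RAM above 4 GiB, the block spans the hole below 4 GiB as
     * well, so its length is 4 GiB plus the high part.
     */
    s.block_len = s.above_4g ? (1ULL << 32) + s.above_4g : ram_size;
    return s;
}
```

For example, 6 GiB of RAM with a 3 GiB low-memory limit gives 3 GiB below, 3 GiB above, and a 7 GiB backing block.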


@@ -610,7 +610,7 @@ int apic_accept_pic_intr(DeviceState *dev)
     if ((s->apicbase & MSR_IA32_APICBASE_ENABLE) == 0 ||
         (lvt0 & APIC_LVT_MASKED) == 0)
-        return 1;
+        return isa_pic != NULL;
     return 0;
 }


@@ -89,7 +89,7 @@ static void ioapic_entry_parse(uint64_t entry, struct ioapic_entry_info *info)
 static void ioapic_service(IOAPICCommonState *s)
 {
-    AddressSpace *ioapic_as = PC_MACHINE(qdev_get_machine())->ioapic_as;
+    AddressSpace *ioapic_as = X86_MACHINE(qdev_get_machine())->ioapic_as;
     struct ioapic_entry_info info;
     uint8_t i;
     uint32_t mask;


@@ -1,3 +1,3 @@
 common-obj-$(CONFIG_DIMM) += pc-dimm.o
-common-obj-$(CONFIG_MEM_DEVICE) += memory-device.o
+common-obj-y += memory-device.o
 common-obj-$(CONFIG_NVDIMM) += nvdimm.o


@@ -120,7 +120,7 @@ static void tmp421_get_temperature(Object *obj, Visitor *v, const char *name,
     int tempid;
     if (sscanf(name, "temperature%d", &tempid) != 1) {
-        error_setg(errp, "error reading %s: %m", name);
+        error_setg(errp, "error reading %s: %s", name, g_strerror(errno));
         return;
     }
@@ -160,7 +160,7 @@ static void tmp421_set_temperature(Object *obj, Visitor *v, const char *name,
     }
     if (sscanf(name, "temperature%d", &tempid) != 1) {
-        error_setg(errp, "error reading %s: %m", name);
+        error_setg(errp, "error reading %s: %s", name, g_strerror(errno));
         return;
     }


@ -690,6 +690,15 @@ void fw_cfg_add_string(FWCfgState *s, uint16_t key, const char *value)
fw_cfg_add_bytes(s, key, g_memdup(value, sz), sz); fw_cfg_add_bytes(s, key, g_memdup(value, sz), sz);
} }
void fw_cfg_modify_string(FWCfgState *s, uint16_t key, const char *value)
{
size_t sz = strlen(value) + 1;
char *old;
old = fw_cfg_modify_bytes_read(s, key, g_memdup(value, sz), sz);
g_free(old);
}
void fw_cfg_add_i16(FWCfgState *s, uint16_t key, uint16_t value) void fw_cfg_add_i16(FWCfgState *s, uint16_t key, uint16_t value)
{ {
uint16_t *copy; uint16_t *copy;
@ -720,6 +729,16 @@ void fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t value)
fw_cfg_add_bytes(s, key, copy, sizeof(value)); fw_cfg_add_bytes(s, key, copy, sizeof(value));
} }
void fw_cfg_modify_i32(FWCfgState *s, uint16_t key, uint32_t value)
{
uint32_t *copy, *old;
copy = g_malloc(sizeof(value));
*copy = cpu_to_le32(value);
old = fw_cfg_modify_bytes_read(s, key, copy, sizeof(value));
g_free(old);
}
void fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value) void fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value)
{ {
uint64_t *copy; uint64_t *copy;
@ -730,6 +749,16 @@ void fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value)
fw_cfg_add_bytes(s, key, copy, sizeof(value)); fw_cfg_add_bytes(s, key, copy, sizeof(value));
} }
void fw_cfg_modify_i64(FWCfgState *s, uint16_t key, uint64_t value)
{
uint64_t *copy, *old;
copy = g_malloc(sizeof(value));
*copy = cpu_to_le64(value);
old = fw_cfg_modify_bytes_read(s, key, copy, sizeof(value));
g_free(old);
}
void fw_cfg_set_order_override(FWCfgState *s, int order) void fw_cfg_set_order_override(FWCfgState *s, int order)
{ {
assert(s->fw_cfg_order_override == 0); assert(s->fw_cfg_order_override == 0);
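
The new fw_cfg_modify_*() helpers share one pattern: allocate a fresh copy of the value, swap it in under the key, and free whatever buffer was stored there before. The sketch below imitates that pattern with a toy table so it can be run outside QEMU. Every `toy_*` name is hypothetical, the real FWCfgState is laid out differently, and plain `malloc`/`free` stand in for the GLib allocators.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_KEYS 16

/* Toy stand-in for the fw_cfg device state; not the real structure. */
struct toy_cfg {
    void *data[TOY_MAX_KEYS];
    size_t len[TOY_MAX_KEYS];
};

/* Install new data under key and hand the previous buffer back. */
static void *toy_modify_bytes_read(struct toy_cfg *s, uint16_t key,
                                   void *data, size_t len)
{
    void *old = s->data[key];

    s->data[key] = data;
    s->len[key] = len;
    return old;
}

/* Store value as four little-endian bytes, whatever the host byte order. */
static void toy_modify_i32(struct toy_cfg *s, uint16_t key, uint32_t value)
{
    uint8_t *copy = malloc(4);

    if (!copy) {
        abort();
    }
    copy[0] = value & 0xff;
    copy[1] = (value >> 8) & 0xff;
    copy[2] = (value >> 16) & 0xff;
    copy[3] = (value >> 24) & 0xff;
    free(toy_modify_bytes_read(s, key, copy, 4));
}

/* Store a private copy of the string, including its NUL terminator. */
static void toy_modify_string(struct toy_cfg *s, uint16_t key,
                              const char *value)
{
    size_t sz = strlen(value) + 1;
    char *copy = malloc(sz);

    if (!copy) {
        abort();
    }
    memcpy(copy, value, sz);
    free(toy_modify_bytes_read(s, key, copy, sz));
}
```

Because the old buffer is freed unconditionally, the pattern is only safe when the previous value was heap-allocated by a matching add or modify call, which is the assumption the header comments for these functions state.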


@ -38,12 +38,13 @@
#include "hw/rtc/mc146818rtc_regs.h" #include "hw/rtc/mc146818rtc_regs.h"
#include "migration/vmstate.h" #include "migration/vmstate.h"
#include "qapi/error.h" #include "qapi/error.h"
#include "qapi/qapi-commands-misc-target.h"
#include "qapi/qapi-events-misc-target.h" #include "qapi/qapi-events-misc-target.h"
#include "qapi/visitor.h" #include "qapi/visitor.h"
#include "exec/address-spaces.h" #include "exec/address-spaces.h"
#include "hw/rtc/mc146818rtc_regs.h"
#ifdef TARGET_I386 #ifdef TARGET_I386
#include "qapi/qapi-commands-misc-target.h"
#include "hw/i386/apic.h" #include "hw/i386/apic.h"
#endif #endif
@ -72,36 +73,6 @@
#define RTC_CLOCK_RATE 32768 #define RTC_CLOCK_RATE 32768
#define UIP_HOLD_LENGTH (8 * NANOSECONDS_PER_SECOND / 32768) #define UIP_HOLD_LENGTH (8 * NANOSECONDS_PER_SECOND / 32768)
#define MC146818_RTC(obj) OBJECT_CHECK(RTCState, (obj), TYPE_MC146818_RTC)
typedef struct RTCState {
ISADevice parent_obj;
MemoryRegion io;
MemoryRegion coalesced_io;
uint8_t cmos_data[128];
uint8_t cmos_index;
int32_t base_year;
uint64_t base_rtc;
uint64_t last_update;
int64_t offset;
qemu_irq irq;
int it_shift;
/* periodic timer */
QEMUTimer *periodic_timer;
int64_t next_periodic_time;
/* update-ended timer */
QEMUTimer *update_timer;
uint64_t next_alarm_time;
uint16_t irq_reinject_on_ack_count;
uint32_t irq_coalesced;
uint32_t period;
QEMUTimer *coalesced_timer;
LostTickPolicy lost_tick_policy;
Notifier suspend_notifier;
QLIST_ENTRY(RTCState) link;
} RTCState;
static void rtc_set_time(RTCState *s); static void rtc_set_time(RTCState *s);
static void rtc_update_time(RTCState *s); static void rtc_update_time(RTCState *s);
static void rtc_set_cmos(RTCState *s, const struct tm *tm); static void rtc_set_cmos(RTCState *s, const struct tm *tm);
@ -204,7 +175,12 @@ periodic_timer_update(RTCState *s, int64_t current_time, uint32_t old_period)
period = rtc_periodic_clock_ticks(s); period = rtc_periodic_clock_ticks(s);
if (period) { if (!period) {
s->irq_coalesced = 0;
timer_del(s->periodic_timer);
return;
}
/* compute 32 khz clock */ /* compute 32 khz clock */
cur_clock = cur_clock =
muldiv64(current_time, RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND); muldiv64(current_time, RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND);
@ -221,7 +197,6 @@ periodic_timer_update(RTCState *s, int64_t current_time, uint32_t old_period)
last_periodic_clock = next_periodic_clock - old_period; last_periodic_clock = next_periodic_clock - old_period;
lost_clock = cur_clock - last_periodic_clock; lost_clock = cur_clock - last_periodic_clock;
assert(lost_clock >= 0); assert(lost_clock >= 0);
}
/* /*
* s->irq_coalesced can change for two reasons: * s->irq_coalesced can change for two reasons:
@ -258,16 +233,13 @@ periodic_timer_update(RTCState *s, int64_t current_time, uint32_t old_period)
*/ */
lost_clock = MIN(lost_clock, period); lost_clock = MIN(lost_clock, period);
} }
}
assert(lost_clock >= 0 && lost_clock <= period); assert(lost_clock >= 0 && lost_clock <= period);
next_irq_clock = cur_clock + period - lost_clock; next_irq_clock = cur_clock + period - lost_clock;
s->next_periodic_time = periodic_clock_to_ns(next_irq_clock) + 1; s->next_periodic_time = periodic_clock_to_ns(next_irq_clock) + 1;
timer_mod(s->periodic_timer, s->next_periodic_time); timer_mod(s->periodic_timer, s->next_periodic_time);
} else {
s->irq_coalesced = 0;
timer_del(s->periodic_timer);
}
} }
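
The deadline computed at the end of periodic_timer_update() is easier to follow with concrete numbers. The sketch below redoes the arithmetic in standalone C. It assumes that periodic_clock_to_ns(), which is not shown in this diff, is the inverse scaling from 32.768 kHz ticks back to nanoseconds. The function names are hypothetical, and the split division replaces QEMU's muldiv64() only to keep intermediate products within 64 bits for realistic timestamps.

```c
#include <stdint.h>

#define RTC_CLOCK_RATE 32768u
#define NS_PER_SEC 1000000000ull

/* Nanoseconds to 32.768 kHz ticks, rounding down. */
static uint64_t ns_to_rtc_ticks(uint64_t ns)
{
    return (ns / NS_PER_SEC) * RTC_CLOCK_RATE +
           (ns % NS_PER_SEC) * RTC_CLOCK_RATE / NS_PER_SEC;
}

/* 32.768 kHz ticks to nanoseconds, rounding down. */
static uint64_t rtc_ticks_to_ns(uint64_t ticks)
{
    return (ticks / RTC_CLOCK_RATE) * NS_PER_SEC +
           (ticks % RTC_CLOCK_RATE) * NS_PER_SEC / RTC_CLOCK_RATE;
}

/*
 * Deadline of the next periodic interrupt: one period from now, minus
 * the ticks of the current period that have already elapsed. The extra
 * nanosecond mirrors the "+ 1" in the code above.
 */
static uint64_t next_periodic_ns(uint64_t now_ns, uint32_t period,
                                 uint32_t lost_clock)
{
    uint64_t cur_clock = ns_to_rtc_ticks(now_ns);
    uint64_t next_irq_clock = cur_clock + period - lost_clock;

    return rtc_ticks_to_ns(next_irq_clock) + 1;
}
```

When `lost_clock` equals `period`, the deadline collapses to the current tick, so the interrupt fires almost immediately.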
static void rtc_periodic_timer(void *opaque) static void rtc_periodic_timer(void *opaque)
@ -993,17 +965,16 @@ static void rtc_realizefn(DeviceState *dev, Error **errp)
object_property_add_tm(OBJECT(s), "date", rtc_get_date, NULL); object_property_add_tm(OBJECT(s), "date", rtc_get_date, NULL);
qdev_init_gpio_out(dev, &s->irq, 1); qdev_init_gpio_out(dev, &s->irq, 1);
QLIST_INSERT_HEAD(&rtc_devices, s, link);
} }
ISADevice *mc146818_rtc_init(ISABus *bus, int base_year, qemu_irq intercept_irq) ISADevice *mc146818_rtc_init(ISABus *bus, int base_year, qemu_irq intercept_irq)
{ {
DeviceState *dev; DeviceState *dev;
ISADevice *isadev; ISADevice *isadev;
RTCState *s;
isadev = isa_create(bus, TYPE_MC146818_RTC); isadev = isa_create(bus, TYPE_MC146818_RTC);
dev = DEVICE(isadev); dev = DEVICE(isadev);
s = MC146818_RTC(isadev);
qdev_prop_set_int32(dev, "base_year", base_year); qdev_prop_set_int32(dev, "base_year", base_year);
qdev_init_nofail(dev); qdev_init_nofail(dev);
if (intercept_irq) { if (intercept_irq) {
@ -1011,9 +982,8 @@ ISADevice *mc146818_rtc_init(ISABus *bus, int base_year, qemu_irq intercept_irq)
} else { } else {
isa_connect_gpio_out(isadev, 0, RTC_ISA_IRQ); isa_connect_gpio_out(isadev, 0, RTC_ISA_IRQ);
} }
QLIST_INSERT_HEAD(&rtc_devices, s, link);
object_property_add_alias(qdev_get_machine(), "rtc-time", OBJECT(s), object_property_add_alias(qdev_get_machine(), "rtc-time", OBJECT(isadev),
"date", NULL); "date", NULL);
return isadev; return isadev;
@ -1045,8 +1015,6 @@ static void rtc_class_initfn(ObjectClass *klass, void *data)
dc->reset = rtc_resetdev; dc->reset = rtc_resetdev;
dc->vmsd = &vmstate_rtc; dc->vmsd = &vmstate_rtc;
dc->props = mc146818rtc_properties; dc->props = mc146818rtc_properties;
/* Reason: needs to be wired up by rtc_init() */
dc->user_creatable = false;
} }
static const TypeInfo mc146818rtc_info = { static const TypeInfo mc146818rtc_info = {


@ -29,57 +29,11 @@
#include "qemu/host-utils.h" #include "qemu/host-utils.h"
#include "qemu/module.h" #include "qemu/module.h"
#include "sysemu/kvm.h" #include "sysemu/kvm.h"
#include "hw/virtio/virtio-bus.h" #include "hw/virtio/virtio-mmio.h"
#include "qemu/error-report.h" #include "qemu/error-report.h"
#include "qemu/log.h" #include "qemu/log.h"
#include "trace.h" #include "trace.h"
/* QOM macros */
/* virtio-mmio-bus */
#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus"
#define VIRTIO_MMIO_BUS(obj) \
OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS)
#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \
OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS)
#define VIRTIO_MMIO_BUS_CLASS(klass) \
OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS)
/* virtio-mmio */
#define TYPE_VIRTIO_MMIO "virtio-mmio"
#define VIRTIO_MMIO(obj) \
OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO)
#define VIRT_MAGIC 0x74726976 /* 'virt' */
#define VIRT_VERSION 2
#define VIRT_VERSION_LEGACY 1
#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
typedef struct VirtIOMMIOQueue {
uint16_t num;
bool enabled;
uint32_t desc[2];
uint32_t avail[2];
uint32_t used[2];
} VirtIOMMIOQueue;
typedef struct {
/* Generic */
SysBusDevice parent_obj;
MemoryRegion iomem;
qemu_irq irq;
bool legacy;
/* Guest accessible state needing migration and reset */
uint32_t host_features_sel;
uint32_t guest_features_sel;
uint32_t guest_page_shift;
/* virtio-bus */
VirtioBusState bus;
bool format_transport_address;
/* Fields only used for non-legacy (v2) devices */
uint32_t guest_features[2];
VirtIOMMIOQueue vqs[VIRTIO_QUEUE_MAX];
} VirtIOMMIOProxy;
static bool virtio_mmio_ioeventfd_enabled(DeviceState *d) static bool virtio_mmio_ioeventfd_enabled(DeviceState *d)
{ {
return kvm_eventfds_enabled(); return kvm_eventfds_enabled();

71
include/hw/i386/microvm.h Normal file

@ -0,0 +1,71 @@
/*
* Copyright (c) 2018 Intel Corporation
* Copyright (c) 2019 Red Hat, Inc.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2 or later, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef HW_I386_MICROVM_H
#define HW_I386_MICROVM_H
#include "qemu-common.h"
#include "exec/hwaddr.h"
#include "qemu/notify.h"
#include "hw/boards.h"
#include "hw/i386/x86.h"
/* Platform virtio definitions */
#define VIRTIO_MMIO_BASE 0xc0000000
#define VIRTIO_IRQ_BASE 5
#define VIRTIO_NUM_TRANSPORTS 8
#define VIRTIO_CMDLINE_MAXLEN 64
/* Machine type options */
#define MICROVM_MACHINE_PIT "pit"
#define MICROVM_MACHINE_PIC "pic"
#define MICROVM_MACHINE_RTC "rtc"
#define MICROVM_MACHINE_ISA_SERIAL "isa-serial"
#define MICROVM_MACHINE_OPTION_ROMS "x-option-roms"
#define MICROVM_MACHINE_AUTO_KERNEL_CMDLINE "auto-kernel-cmdline"
typedef struct {
X86MachineClass parent;
HotplugHandler *(*orig_hotplug_handler)(MachineState *machine,
DeviceState *dev);
} MicrovmMachineClass;
typedef struct {
X86MachineState parent;
/* Machine type options */
OnOffAuto pic;
OnOffAuto pit;
OnOffAuto rtc;
bool isa_serial;
bool option_roms;
bool auto_kernel_cmdline;
/* Machine state */
bool kernel_cmdline_fixed;
} MicrovmMachineState;
#define TYPE_MICROVM_MACHINE MACHINE_TYPE_NAME("microvm")
#define MICROVM_MACHINE(obj) \
OBJECT_CHECK(MicrovmMachineState, (obj), TYPE_MICROVM_MACHINE)
#define MICROVM_MACHINE_GET_CLASS(obj) \
OBJECT_GET_CLASS(MicrovmMachineClass, obj, TYPE_MICROVM_MACHINE)
#define MICROVM_MACHINE_CLASS(class) \
OBJECT_CLASS_CHECK(MicrovmMachineClass, class, TYPE_MICROVM_MACHINE)
#endif
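
The platform virtio constants in this header fix where each virtio-mmio transport sits and which IRQ it uses. As an illustration, the sketch below derives the base address and IRQ for transport `idx` and formats them as a Linux `virtio_mmio.device=<size>@<base>:<irq>` kernel parameter. The 512-byte window per transport is an assumption that this header does not state, and `toy_mmio_cmdline` is a hypothetical helper, not the function the machine code uses.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the header above. */
#define VIRTIO_MMIO_BASE      0xc0000000
#define VIRTIO_IRQ_BASE       5
#define VIRTIO_NUM_TRANSPORTS 8
#define VIRTIO_CMDLINE_MAXLEN 64

/* Assumed size of each transport's MMIO window; not from this header. */
#define TOY_TRANSPORT_SIZE 512

/*
 * Hypothetical helper: write the kernel parameter describing transport
 * idx into buf. Returns 0 on success, -1 if idx is out of range or the
 * text does not fit.
 */
static int toy_mmio_cmdline(unsigned idx, char *buf, size_t len)
{
    uint64_t base;
    int n;

    if (idx >= VIRTIO_NUM_TRANSPORTS) {
        return -1;
    }
    base = (uint64_t)VIRTIO_MMIO_BASE + (uint64_t)idx * TOY_TRANSPORT_SIZE;
    n = snprintf(buf, len, "virtio_mmio.device=%d@0x%" PRIx64 ":%u",
                 TOY_TRANSPORT_SIZE, base, VIRTIO_IRQ_BASE + idx);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Under these assumptions, each such parameter fits comfortably in a buffer of VIRTIO_CMDLINE_MAXLEN bytes.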


@ -8,6 +8,7 @@
#include "hw/block/flash.h" #include "hw/block/flash.h"
#include "net/net.h" #include "net/net.h"
#include "hw/i386/ioapic.h" #include "hw/i386/ioapic.h"
#include "hw/i386/x86.h"
#include "qemu/range.h" #include "qemu/range.h"
#include "qemu/bitmap.h" #include "qemu/bitmap.h"
@ -27,7 +28,7 @@
*/ */
struct PCMachineState { struct PCMachineState {
/*< private >*/ /*< private >*/
MachineState parent_obj; X86MachineState parent_obj;
/* <public> */ /* <public> */
@ -36,16 +37,11 @@ struct PCMachineState {
/* Pointers to devices and objects: */ /* Pointers to devices and objects: */
HotplugHandler *acpi_dev; HotplugHandler *acpi_dev;
ISADevice *rtc;
PCIBus *bus; PCIBus *bus;
I2CBus *smbus; I2CBus *smbus;
FWCfgState *fw_cfg;
qemu_irq *gsi;
PFlashCFI01 *flash[2]; PFlashCFI01 *flash[2];
GMappedFile *initrd_mapped_file;
/* Configuration options: */ /* Configuration options: */
uint64_t max_ram_below_4g;
OnOffAuto vmport; OnOffAuto vmport;
OnOffAuto smm; OnOffAuto smm;
@ -54,30 +50,16 @@ struct PCMachineState {
bool sata_enabled; bool sata_enabled;
bool pit_enabled; bool pit_enabled;
/* RAM information (sizes, addresses, configuration): */
ram_addr_t below_4g_mem_size, above_4g_mem_size;
/* CPU and apic information: */
bool apic_xrupt_override;
unsigned apic_id_limit;
uint16_t boot_cpus;
unsigned smp_dies;
/* NUMA information: */ /* NUMA information: */
uint64_t numa_nodes; uint64_t numa_nodes;
uint64_t *node_mem; uint64_t *node_mem;
/* Address space used by IOAPIC device. All IOAPIC interrupts
* will be translated to MSI messages in the address space. */
AddressSpace *ioapic_as;
/* ACPI Memory hotplug IO base address */ /* ACPI Memory hotplug IO base address */
hwaddr memhp_io_base; hwaddr memhp_io_base;
}; };
#define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device" #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
#define PC_MACHINE_DEVMEM_REGION_SIZE "device-memory-region-size" #define PC_MACHINE_DEVMEM_REGION_SIZE "device-memory-region-size"
#define PC_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
#define PC_MACHINE_VMPORT "vmport" #define PC_MACHINE_VMPORT "vmport"
#define PC_MACHINE_SMM "smm" #define PC_MACHINE_SMM "smm"
#define PC_MACHINE_SMBUS "smbus" #define PC_MACHINE_SMBUS "smbus"
@ -102,7 +84,7 @@ struct PCMachineState {
*/ */
typedef struct PCMachineClass { typedef struct PCMachineClass {
/*< private >*/ /*< private >*/
MachineClass parent_class; X86MachineClass parent_class;
/*< public >*/ /*< public >*/
@ -144,9 +126,6 @@ typedef struct PCMachineClass {
/* use PVH to load kernels that support this feature */ /* use PVH to load kernels that support this feature */
bool pvh_enabled; bool pvh_enabled;
/* Enables contiguous-apic-ID mode */
bool compat_apic_id_mode;
} PCMachineClass; } PCMachineClass;
#define TYPE_PC_MACHINE "generic-pc-machine" #define TYPE_PC_MACHINE "generic-pc-machine"
@ -178,6 +157,8 @@ typedef struct GSIState {
void gsi_handler(void *opaque, int n, int level); void gsi_handler(void *opaque, int n, int level);
GSIState *pc_gsi_create(qemu_irq **irqs, bool pci_enabled);
/* vmport.c */ /* vmport.c */
#define TYPE_VMPORT "vmport" #define TYPE_VMPORT "vmport"
typedef uint32_t (VMPortReadFunc)(void *opaque, uint32_t address); typedef uint32_t (VMPortReadFunc)(void *opaque, uint32_t address);
@ -198,7 +179,6 @@ bool pc_machine_is_smm_enabled(PCMachineState *pcms);
void pc_register_ferr_irq(qemu_irq irq); void pc_register_ferr_irq(qemu_irq irq);
void pc_acpi_smi_interrupt(void *opaque, int irq, int level); void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
void pc_cpus_init(PCMachineState *pcms);
void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp); void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp);
void pc_smp_parse(MachineState *ms, QemuOpts *opts); void pc_smp_parse(MachineState *ms, QemuOpts *opts);
@ -239,6 +219,7 @@ void pc_pci_device_init(PCIBus *pci_bus);
typedef void (*cpu_set_smm_t)(int smm, void *arg); typedef void (*cpu_set_smm_t)(int smm, void *arg);
void pc_i8259_create(ISABus *isa_bus, qemu_irq *i8259_irqs);
void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name); void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name);
ISADevice *pc_find_fdc0(void); ISADevice *pc_find_fdc0(void);

96
include/hw/i386/x86.h Normal file
View File

@ -0,0 +1,96 @@
/*
* Copyright (c) 2019 Red Hat, Inc.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2 or later, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef HW_I386_X86_H
#define HW_I386_X86_H
#include "qemu-common.h"
#include "exec/hwaddr.h"
#include "qemu/notify.h"
#include "hw/boards.h"
#include "hw/nmi.h"
typedef struct {
/*< private >*/
MachineClass parent;
/*< public >*/
/* Enables contiguous-apic-ID mode */
bool compat_apic_id_mode;
} X86MachineClass;
typedef struct {
/*< private >*/
MachineState parent;
/*< public >*/
/* Pointers to devices and objects: */
ISADevice *rtc;
FWCfgState *fw_cfg;
qemu_irq *gsi;
GMappedFile *initrd_mapped_file;
/* Configuration options: */
uint64_t max_ram_below_4g;
/* RAM information (sizes, addresses, configuration): */
ram_addr_t below_4g_mem_size, above_4g_mem_size;
/* CPU and apic information: */
bool apic_xrupt_override;
unsigned apic_id_limit;
uint16_t boot_cpus;
unsigned smp_dies;
/*
* Address space used by IOAPIC device. All IOAPIC interrupts
* will be translated to MSI messages in the address space.
*/
AddressSpace *ioapic_as;
} X86MachineState;
#define X86_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
#define TYPE_X86_MACHINE MACHINE_TYPE_NAME("x86")
#define X86_MACHINE(obj) \
OBJECT_CHECK(X86MachineState, (obj), TYPE_X86_MACHINE)
#define X86_MACHINE_GET_CLASS(obj) \
OBJECT_GET_CLASS(X86MachineClass, obj, TYPE_X86_MACHINE)
#define X86_MACHINE_CLASS(class) \
OBJECT_CLASS_CHECK(X86MachineClass, class, TYPE_X86_MACHINE)
uint32_t x86_cpu_apic_id_from_index(X86MachineState *pcms,
unsigned int cpu_index);
void x86_cpu_new(X86MachineState *pcms, int64_t apic_id, Error **errp);
void x86_cpus_init(X86MachineState *pcms, int default_cpu_version);
CpuInstanceProperties x86_cpu_index_to_props(MachineState *ms,
unsigned cpu_index);
int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx);
const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
void x86_load_linux(X86MachineState *x86ms,
FWCfgState *fw_cfg,
int acpi_data_size,
bool pvh_enabled,
bool linuxboot_dma_enabled);
#endif


@@ -98,6 +98,20 @@ void fw_cfg_add_bytes(FWCfgState *s, uint16_t key, void *data, size_t len);
 */
 void fw_cfg_add_string(FWCfgState *s, uint16_t key, const char *value);
+/**
+* fw_cfg_modify_string:
+* @s: fw_cfg device being modified
+* @key: selector key value for new fw_cfg item
+* @value: NUL-terminated ascii string
+*
+* Replace the fw_cfg item available by selecting the given key. The new
+* data will consist of a dynamically allocated copy of the provided string,
+* including its NUL terminator. The data being replaced, assumed to have
+* been dynamically allocated during an earlier call to either
+* fw_cfg_add_string() or fw_cfg_modify_string(), is freed before returning.
+*/
+void fw_cfg_modify_string(FWCfgState *s, uint16_t key, const char *value);
 /**
 * fw_cfg_add_i16:
 * @s: fw_cfg device being modified
@@ -136,6 +150,20 @@ void fw_cfg_modify_i16(FWCfgState *s, uint16_t key, uint16_t value);
 */
 void fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t value);
+/**
+* fw_cfg_modify_i32:
+* @s: fw_cfg device being modified
+* @key: selector key value for new fw_cfg item
+* @value: 32-bit integer
+*
+* Replace the fw_cfg item available by selecting the given key. The new
+* data will consist of a dynamically allocated copy of the given 32-bit
+* value, converted to little-endian representation. The data being replaced,
+* assumed to have been dynamically allocated during an earlier call to
+* either fw_cfg_add_i32() or fw_cfg_modify_i32(), is freed before returning.
+*/
+void fw_cfg_modify_i32(FWCfgState *s, uint16_t key, uint32_t value);
 /**
 * fw_cfg_add_i64:
 * @s: fw_cfg device being modified
@@ -148,6 +176,20 @@ void fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t value);
 */
 void fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value);
+/**
+* fw_cfg_modify_i64:
+* @s: fw_cfg device being modified
+* @key: selector key value for new fw_cfg item
+* @value: 64-bit integer
+*
+* Replace the fw_cfg item available by selecting the given key. The new
+* data will consist of a dynamically allocated copy of the given 64-bit
+* value, converted to little-endian representation. The data being replaced,
+* assumed to have been dynamically allocated during an earlier call to
+* either fw_cfg_add_i64() or fw_cfg_modify_i64(), is freed before returning.
+*/
+void fw_cfg_modify_i64(FWCfgState *s, uint16_t key, uint64_t value);
 /**
 * fw_cfg_add_file:
 * @s: fw_cfg device being modified
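
The modify helpers documented above share one ownership rule: the slot receives a fresh heap copy (little-endian for integers, NUL-terminated for strings) and the previous payload is freed. A minimal sketch of that rule follows. `DemoEntry`, `demo_modify_i32` and `demo_modify_string` are hypothetical names invented for illustration and are not part of QEMU's fw_cfg implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one fw_cfg slot; not QEMU's real entry type. */
typedef struct {
    void *data;   /* heap-allocated payload owned by the slot */
    size_t len;   /* payload length in bytes */
} DemoEntry;

/*
 * Store a 32-bit value as a freshly allocated little-endian copy and
 * free whatever payload the slot held before.
 */
static void demo_modify_i32(DemoEntry *e, uint32_t value)
{
    uint8_t *copy = malloc(sizeof(value));

    if (!copy) {
        abort();
    }
    /* Write the least significant byte first, independent of host order. */
    copy[0] = value & 0xff;
    copy[1] = (value >> 8) & 0xff;
    copy[2] = (value >> 16) & 0xff;
    copy[3] = (value >> 24) & 0xff;

    free(e->data);   /* free(NULL) is a no-op for an empty slot */
    e->data = copy;
    e->len = sizeof(value);
}

/* Replace the payload with a copy of the string, NUL terminator included. */
static void demo_modify_string(DemoEntry *e, const char *value)
{
    size_t len = strlen(value) + 1;
    char *copy = malloc(len);

    if (!copy) {
        abort();
    }
    memcpy(copy, value, len);

    free(e->data);
    e->data = copy;
    e->len = len;
}
```

Because the old payload is always passed to `free()`, a slot in this sketch must only ever hold heap memory, which matches the assumption spelled out in the doc comments.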


@@ -9,9 +9,44 @@
 #ifndef HW_RTC_MC146818RTC_H
 #define HW_RTC_MC146818RTC_H
+#include "qapi/qapi-types-misc.h"
+#include "qemu/queue.h"
+#include "qemu/timer.h"
 #include "hw/isa/isa.h"
 #define TYPE_MC146818_RTC "mc146818rtc"
+#define MC146818_RTC(obj) OBJECT_CHECK(RTCState, (obj), TYPE_MC146818_RTC)
+typedef struct RTCState {
+ISADevice parent_obj;
+MemoryRegion io;
+MemoryRegion coalesced_io;
+uint8_t cmos_data[128];
+uint8_t cmos_index;
+int32_t base_year;
+uint64_t base_rtc;
+uint64_t last_update;
+int64_t offset;
+qemu_irq irq;
+int it_shift;
+/* periodic timer */
+QEMUTimer *periodic_timer;
+int64_t next_periodic_time;
+/* update-ended timer */
+QEMUTimer *update_timer;
+uint64_t next_alarm_time;
+uint16_t irq_reinject_on_ack_count;
+uint32_t irq_coalesced;
+uint32_t period;
+QEMUTimer *coalesced_timer;
+Notifier clock_reset_notifier;
+LostTickPolicy lost_tick_policy;
+Notifier suspend_notifier;
+QLIST_ENTRY(RTCState) link;
+} RTCState;
+#define RTC_ISA_IRQ 8
 ISADevice *mc146818_rtc_init(ISABus *bus, int base_year,
 qemu_irq intercept_irq);


@@ -28,8 +28,6 @@
 #include "qemu/timer.h"
 #include "qemu/host-utils.h"
-#define RTC_ISA_IRQ 8
 #define RTC_SECONDS 0
 #define RTC_SECONDS_ALARM 1
 #define RTC_MINUTES 2


@ -0,0 +1,73 @@
/*
* Virtio MMIO bindings
*
* Copyright (c) 2011 Linaro Limited
*
* Author:
* Peter Maydell <peter.maydell@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#ifndef HW_VIRTIO_MMIO_H
#define HW_VIRTIO_MMIO_H
#include "hw/virtio/virtio-bus.h"
/* QOM macros */
/* virtio-mmio-bus */
#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus"
#define VIRTIO_MMIO_BUS(obj) \
OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS)
#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \
OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS)
#define VIRTIO_MMIO_BUS_CLASS(klass) \
OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS)
/* virtio-mmio */
#define TYPE_VIRTIO_MMIO "virtio-mmio"
#define VIRTIO_MMIO(obj) \
OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO)
#define VIRT_MAGIC 0x74726976 /* 'virt' */
#define VIRT_VERSION 2
#define VIRT_VERSION_LEGACY 1
#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
typedef struct VirtIOMMIOQueue {
uint16_t num;
bool enabled;
uint32_t desc[2];
uint32_t avail[2];
uint32_t used[2];
} VirtIOMMIOQueue;
typedef struct {
/* Generic */
SysBusDevice parent_obj;
MemoryRegion iomem;
qemu_irq irq;
bool legacy;
/* Guest accessible state needing migration and reset */
uint32_t host_features_sel;
uint32_t guest_features_sel;
uint32_t guest_page_shift;
/* virtio-bus */
VirtioBusState bus;
bool format_transport_address;
/* Fields only used for non-legacy (v2) devices */
uint32_t guest_features[2];
VirtIOMMIOQueue vqs[VIRTIO_QUEUE_MAX];
} VirtIOMMIOProxy;
#endif
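
The comments next to `VIRT_MAGIC` and `VIRT_VENDOR` above give the constants as four-character tags. Read as little-endian bytes, 0x74726976 spells "virt" and 0x554D4551 spells "QEMU". A small sketch that makes the byte order explicit follows; `demo_le32_to_tag` is a hypothetical helper written for illustration, not something this header provides.

```c
#include <stdint.h>

/*
 * Unpack a 32-bit register value into the four bytes a guest sees when
 * it reads the register as little-endian memory. The caller supplies a
 * buffer of at least five chars; the result is NUL-terminated.
 */
static void demo_le32_to_tag(uint32_t value, char out[5])
{
    for (int i = 0; i < 4; i++) {
        /* Byte i of a little-endian word sits at bit offset 8 * i. */
        out[i] = (char)((value >> (8 * i)) & 0xff);
    }
    out[4] = '\0';
}
```

Calling `demo_le32_to_tag(0x74726976, buf)` leaves `"virt"` in `buf`, which is why the constant looks reversed when written as a hexadecimal number.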

BIN
pc-bios/bios-microvm.bin Normal file

Binary file not shown.


@@ -1,14 +1,14 @@
 # Bulgarian translation of qemu po-file.
-# Copyright (C) 2016 Alexander Shopov <ash@kambanaria.org>
+# Copyright (C) 2016, 2019 Alexander Shopov <ash@kambanaria.org>
 # This file is distributed under the same license as the qemu package.
-# Alexander Shopov <ash@kambanaria.org>, 2016.
+# Alexander Shopov <ash@kambanaria.org>, 2016, 2019.
 #
 msgid ""
 msgstr ""
-"Project-Id-Version: QEMU 2.6.50\n"
+"Project-Id-Version: QEMU 4.1.0\n"
 "Report-Msgid-Bugs-To: qemu-devel@nongnu.org\n"
 "POT-Creation-Date: 2018-07-18 07:56+0200\n"
-"PO-Revision-Date: 2016-06-09 15:54+0300\n"
+"PO-Revision-Date: 2019-10-19 13:14+0200\n"
 "Last-Translator: Alexander Shopov <ash@kambanaria.org>\n"
 "Language-Team: Bulgarian <dict@ludost.net>\n"
 "Language: bg\n"
@@ -66,7 +66,7 @@ msgid "Detach Tab"
 msgstr "Към самостоятелен подпрозорец"
 msgid "Show Menubar"
-msgstr ""
+msgstr "Лента за менюто"
 msgid "_Machine"
 msgstr "_Машина"


@@ -67,6 +67,7 @@ default help:
 @echo " opensbi32-virt -- update OpenSBI for 32-bit virt machine"
 @echo " opensbi64-virt -- update OpenSBI for 64-bit virt machine"
 @echo " opensbi64-sifive_u -- update OpenSBI for 64-bit sifive_u machine"
+@echo " bios-microvm -- update bios-microvm.bin (qboot)"
 @echo " clean -- delete the files generated by the previous" \
 "build targets"
@@ -186,6 +187,10 @@ opensbi64-sifive_u:
 PLATFORM="sifive/fu540"
 cp opensbi/build/platform/sifive/fu540/firmware/fw_jump.bin ../pc-bios/opensbi-riscv64-sifive_u-fw_jump.bin
+bios-microvm:
+$(MAKE) -C qboot
+cp qboot/bios.bin ../pc-bios/bios-microvm.bin
 clean:
 rm -rf seabios/.config seabios/out seabios/builds
 $(MAKE) -C sgabios clean
@@ -198,3 +203,4 @@ clean:
 $(MAKE) -C skiboot clean
 $(MAKE) -f Makefile.edk2 clean
 $(MAKE) -C opensbi clean
+$(MAKE) -C qboot clean

1
roms/qboot Submodule

@ -0,0 +1 @@
Subproject commit cb1c49e0cfac99b9961d136ac0194da62c28cf64


@@ -2915,6 +2915,12 @@ sub process {
 if ($line =~ /\bbzero\(/) {
 ERROR("use memset() instead of bzero()\n" . $herecurr);
 }
+if ($line =~ /\bgetpagesize\(\)/) {
+ERROR("use qemu_real_host_page_size instead of getpagesize()\n" . $herecurr);
+}
+if ($line =~ /\bsysconf\(_SC_PAGESIZE\)/) {
+ERROR("use qemu_real_host_page_size instead of sysconf(_SC_PAGESIZE)\n" . $herecurr);
+}
 my $non_exit_glib_asserts = qr{g_assert_cmpstr|
 g_assert_cmpint|
 g_assert_cmpuint|


@@ -1058,7 +1058,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 .type = CPUID_FEATURE_WORD,
 .feat_names = {
 NULL, "avx512vbmi", "umip", "pku",
-NULL /* ospke */, NULL, "avx512vbmi2", NULL,
+NULL /* ospke */, "waitpkg", "avx512vbmi2", NULL,
 "gfni", "vaes", "vpclmulqdq", "avx512vnni",
 "avx512bitalg", NULL, "avx512-vpopcntdq", NULL,
 "la57", NULL, NULL, NULL,
@@ -6221,6 +6221,8 @@ static Property x86_cpu_properties[] = {
 HYPERV_FEAT_IPI, 0),
 DEFINE_PROP_BIT64("hv-stimer-direct", X86CPU, hyperv_features,
 HYPERV_FEAT_STIMER_DIRECT, 0),
+DEFINE_PROP_ON_OFF_AUTO("hv-no-nonarch-coresharing", X86CPU,
+hyperv_no_nonarch_cs, ON_OFF_AUTO_OFF),
 DEFINE_PROP_BOOL("hv-passthrough", X86CPU, hyperv_passthrough, false),
 DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),


@@ -24,6 +24,7 @@
 #include "cpu-qom.h"
 #include "hyperv-proto.h"
 #include "exec/cpu-defs.h"
+#include "qapi/qapi-types-common.h"
 /* The x86 has a strong memory model with some store-after-load re-ordering */
 #define TCG_GUEST_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
@@ -451,6 +452,7 @@ typedef enum X86Seg {
 #define MSR_IA32_BNDCFGS 0x00000d90
 #define MSR_IA32_XSS 0x00000da0
+#define MSR_IA32_UMWAIT_CONTROL 0xe1
 #define MSR_IA32_VMX_BASIC 0x00000480
 #define MSR_IA32_VMX_PINBASED_CTLS 0x00000481
@@ -730,6 +732,8 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_ECX_PKU (1U << 3)
 /* OS Enable Protection Keys */
 #define CPUID_7_0_ECX_OSPKE (1U << 4)
+/* UMONITOR/UMWAIT/TPAUSE Instructions */
+#define CPUID_7_0_ECX_WAITPKG (1U << 5)
 /* Additional AVX-512 Vector Byte Manipulation Instruction */
 #define CPUID_7_0_ECX_AVX512_VBMI2 (1U << 6)
 /* Galois Field New Instructions */
@@ -1584,6 +1588,7 @@ typedef struct CPUX86State {
 uint16_t fpregs_format_vmstate;
 uint64_t xss;
+uint32_t umwait;
 TPRAccess tpr_access_type;
@@ -1614,6 +1619,7 @@ struct X86CPU {
 bool hyperv_synic_kvm_only;
 uint64_t hyperv_features;
 bool hyperv_passthrough;
+OnOffAuto hyperv_no_nonarch_cs;
 bool check_cpuid;
 bool enforce_cpuid;
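
`CPUID_7_0_ECX_WAITPKG` above is a single-bit mask (bit 5) in a 32-bit feature word. A minimal sketch of forcing such a bit on or off in a reported register value follows. It has the same shape as the `enable_cpu_pm` branch this commit adds on the KVM side, but `demo_adjust_cpuid7_ecx` is a hypothetical helper written for illustration, and the macro is redefined locally only to keep the block self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Redefined here for a standalone example; same value as in the header. */
#define CPUID_7_0_ECX_WAITPKG (1U << 5)

/*
 * Force the WAITPKG bit on when guest CPU power management was requested
 * and force it off otherwise, leaving every other bit untouched.
 */
static uint32_t demo_adjust_cpuid7_ecx(uint32_t ecx, bool enable_cpu_pm)
{
    if (enable_cpu_pm) {
        ecx |= CPUID_7_0_ECX_WAITPKG;
    } else {
        ecx &= ~CPUID_7_0_ECX_WAITPKG;
    }
    return ecx;
}
```

In this sketch the result depends only on the flag, never on the incoming value of bit 5, so the bit is a policy decision rather than a pass-through of what was reported.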


@@ -63,6 +63,7 @@
 #define HV_CLUSTER_IPI_RECOMMENDED (1u << 10)
 #define HV_EX_PROCESSOR_MASKS_RECOMMENDED (1u << 11)
 #define HV_ENLIGHTENED_VMCS_RECOMMENDED (1u << 14)
+#define HV_NO_NONARCH_CORESHARING (1u << 18)
 /*
 * Basic virtualized MSRs


@@ -95,6 +95,7 @@ static bool has_msr_hv_stimer;
 static bool has_msr_hv_frequencies;
 static bool has_msr_hv_reenlightenment;
 static bool has_msr_xss;
+static bool has_msr_umwait;
 static bool has_msr_spec_ctrl;
 static bool has_msr_virt_ssbd;
 static bool has_msr_smi_count;
@@ -401,6 +402,12 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
 if (host_tsx_blacklisted()) {
 ret &= ~(CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_HLE);
 }
+} else if (function == 7 && index == 0 && reg == R_ECX) {
+if (enable_cpu_pm) {
+ret |= CPUID_7_0_ECX_WAITPKG;
+} else {
+ret &= ~CPUID_7_0_ECX_WAITPKG;
+}
 } else if (function == 7 && index == 0 && reg == R_EDX) {
 /*
 * Linux v4.17-v4.20 incorrectly return ARCH_CAPABILITIES on SVM hosts.
@@ -592,9 +599,9 @@ static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr, int code)
 (MCM_ADDR_PHYS << 6) | 0xc, flags);
 }
-static void hardware_memory_error(void)
+static void hardware_memory_error(void *host_addr)
 {
-fprintf(stderr, "Hardware memory error!\n");
+error_report("QEMU got Hardware memory error at addr %p", host_addr);
 exit(1);
 }
@@ -618,15 +625,34 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
 kvm_hwpoison_page_add(ram_addr);
 kvm_mce_inject(cpu, paddr, code);
+/*
+* Use different logging severity based on error type.
+* If there is additional MCE reporting on the hypervisor, QEMU VA
+* could be another source to identify the PA and MCE details.
+*/
+if (code == BUS_MCEERR_AR) {
+error_report("Guest MCE Memory Error at QEMU addr %p and "
+"GUEST addr 0x%" HWADDR_PRIx " of type %s injected",
+addr, paddr, "BUS_MCEERR_AR");
+} else {
+warn_report("Guest MCE Memory Error at QEMU addr %p and "
+"GUEST addr 0x%" HWADDR_PRIx " of type %s injected",
+addr, paddr, "BUS_MCEERR_AO");
+}
 return;
 }
-fprintf(stderr, "Hardware memory error for memory used by "
-"QEMU itself instead of guest system!\n");
+if (code == BUS_MCEERR_AO) {
+warn_report("Hardware memory error at addr %p of type %s "
+"for memory used by QEMU itself instead of guest system!",
+addr, "BUS_MCEERR_AO");
+}
 }
 if (code == BUS_MCEERR_AR) {
-hardware_memory_error();
+hardware_memory_error(addr);
 }
 /* Hope we are lucky for AO MCE */
@@ -1208,6 +1234,16 @@ static int hyperv_handle_properties(CPUState *cs,
 }
 }
+if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_ON) {
+env->features[FEAT_HV_RECOMM_EAX] |= HV_NO_NONARCH_CORESHARING;
+} else if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO) {
+c = cpuid_find_entry(cpuid, HV_CPUID_ENLIGHTMENT_INFO, 0);
+if (c) {
+env->features[FEAT_HV_RECOMM_EAX] |=
+c->eax & HV_NO_NONARCH_CORESHARING;
+}
+}
 /* Features */
 r = hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_RELAXED);
 r |= hv_cpuid_check_and_set(cs, cpuid, HYPERV_FEAT_VAPIC);
@@ -1321,6 +1357,7 @@ free:
 }
 static Error *hv_passthrough_mig_blocker;
+static Error *hv_no_nonarch_cs_mig_blocker;
 static int hyperv_init_vcpu(X86CPU *cpu)
 {
@@ -1340,6 +1377,21 @@ static int hyperv_init_vcpu(X86CPU *cpu)
 }
 }
+if (cpu->hyperv_no_nonarch_cs == ON_OFF_AUTO_AUTO &&
+hv_no_nonarch_cs_mig_blocker == NULL) {
+error_setg(&hv_no_nonarch_cs_mig_blocker,
+"'hv-no-nonarch-coresharing=auto' CPU flag prevents migration"
+" use explicit 'hv-no-nonarch-coresharing=on' instead (but"
+" make sure SMT is disabled and/or that vCPUs are properly"
+" pinned)");
+ret = migrate_add_blocker(hv_no_nonarch_cs_mig_blocker, &local_err);
+if (local_err) {
+error_report_err(local_err);
+error_free(hv_no_nonarch_cs_mig_blocker);
+return ret;
+}
+}
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VPINDEX) && !hv_vpindex_settable) {
 /*
 * the kernel doesn't support setting vp_index; assert that its value
@@ -1954,6 +2006,9 @@ static int kvm_get_supported_msrs(KVMState *s)
 case MSR_IA32_XSS:
 has_msr_xss = true;
 break;
+case MSR_IA32_UMWAIT_CONTROL:
+has_msr_umwait = true;
+break;
 case HV_X64_MSR_CRASH_CTL:
 has_msr_hv_crash = true;
 break;
@@ -2633,6 +2688,9 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
 if (has_msr_xss) {
 kvm_msr_entry_add(cpu, MSR_IA32_XSS, env->xss);
 }
+if (has_msr_umwait) {
+kvm_msr_entry_add(cpu, MSR_IA32_UMWAIT_CONTROL, env->umwait);
+}
 if (has_msr_spec_ctrl) {
 kvm_msr_entry_add(cpu, MSR_IA32_SPEC_CTRL, env->spec_ctrl);
 }
@@ -3046,6 +3104,9 @@ static int kvm_get_msrs(X86CPU *cpu)
 if (has_msr_xss) {
 kvm_msr_entry_add(cpu, MSR_IA32_XSS, 0);
 }
+if (has_msr_umwait) {
+kvm_msr_entry_add(cpu, MSR_IA32_UMWAIT_CONTROL, 0);
+}
 if (has_msr_spec_ctrl) {
 kvm_msr_entry_add(cpu, MSR_IA32_SPEC_CTRL, 0);
 }
@@ -3298,6 +3359,9 @@ static int kvm_get_msrs(X86CPU *cpu)
 case MSR_IA32_XSS:
 env->xss = msrs[i].data;
 break;
+case MSR_IA32_UMWAIT_CONTROL:
+env->umwait = msrs[i].data;
+break;
 default:
 if (msrs[i].index >= MSR_MC0_CTL &&
 msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
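
The `hv-no-nonarch-coresharing` handling above resolves a three-state property into one bit of the Hyper-V recommendations word: `on` sets the bit unconditionally, `auto` copies whatever the host reports, and `off` leaves it clear. A minimal sketch of that resolution follows. `DemoOnOffAuto`, `demo_resolve_no_nonarch_cs` and the `host_eax` parameter are hypothetical names used only for illustration; the macro is redefined locally to keep the block self-contained.

```c
#include <stdint.h>

/* Redefined here for a standalone example; same value as in the header. */
#define HV_NO_NONARCH_CORESHARING (1u << 18)

/* Hypothetical stand-in for QEMU's OnOffAuto enumeration. */
typedef enum {
    DEMO_AUTO,
    DEMO_ON,
    DEMO_OFF
} DemoOnOffAuto;

/*
 * Return the bits to OR into the guest-visible recommendations word.
 * host_eax is assumed to hold the value the host reports for the same
 * CPUID leaf.
 */
static uint32_t demo_resolve_no_nonarch_cs(DemoOnOffAuto mode,
                                           uint32_t host_eax)
{
    switch (mode) {
    case DEMO_ON:
        /* The user vouches for the topology; advertise unconditionally. */
        return HV_NO_NONARCH_CORESHARING;
    case DEMO_AUTO:
        /* Advertise only if the host reports the same guarantee. */
        return host_eax & HV_NO_NONARCH_CORESHARING;
    case DEMO_OFF:
    default:
        return 0;
    }
}
```

Since the `auto` result depends on the host, two hosts can disagree on the guest-visible value, which is consistent with the commit registering a migration blocker for `auto` and suggesting an explicit `on` instead.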


@@ -943,6 +943,25 @@ static const VMStateDescription vmstate_xss = {
 }
 };
+static bool umwait_needed(void *opaque)
+{
+X86CPU *cpu = opaque;
+CPUX86State *env = &cpu->env;
+return env->umwait != 0;
+}
+static const VMStateDescription vmstate_umwait = {
+.name = "cpu/umwait",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = umwait_needed,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(env.umwait, X86CPU),
+VMSTATE_END_OF_LIST()
+}
+};
 #ifdef TARGET_X86_64
 static bool pkru_needed(void *opaque)
 {
@@ -1391,6 +1410,7 @@ VMStateDescription vmstate_x86_cpu = {
 &vmstate_msr_hyperv_reenlightenment,
 &vmstate_avx512,
 &vmstate_xss,
+&vmstate_umwait,
 &vmstate_tsc_khz,
 &vmstate_msr_smi_count,
 #ifdef TARGET_X86_64
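
The `vmstate_umwait` subsection above is gated by a `.needed` callback, so it only matters when `env.umwait` is nonzero. A minimal sketch of that gating pattern follows, under the assumption that a subsection whose predicate returns false is simply skipped when state is saved. Every `Demo*` and `demo_*` name is hypothetical and none of this is QEMU's vmstate machinery.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the CPU state that carries the MSR value. */
typedef struct {
    uint32_t umwait;
} DemoCPUState;

/* Hypothetical miniature of a subsection descriptor. */
typedef struct {
    const char *name;
    bool (*needed)(const DemoCPUState *s);
} DemoSubsection;

/* Mirrors the predicate in the hunk: only a nonzero value is worth saving. */
static bool demo_umwait_needed(const DemoCPUState *s)
{
    return s->umwait != 0;
}

static const DemoSubsection demo_subsections[] = {
    { "cpu/umwait", demo_umwait_needed },
};

/* Count the subsections that would be emitted for the given state. */
static size_t demo_count_sent(const DemoCPUState *s)
{
    size_t n = 0;
    size_t total = sizeof(demo_subsections) / sizeof(demo_subsections[0]);

    for (size_t i = 0; i < total; i++) {
        if (demo_subsections[i].needed(s)) {
            n++;
        }
    }
    return n;
}
```

With this shape, a guest that never wrote the MSR produces the same stream as before the field existed, which is the usual reason for making new state optional.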


@@ -15,6 +15,7 @@
 #include "libqtest-single.h"
 #include "qemu/timer.h"
+#include "hw/rtc/mc146818rtc.h"
 #include "hw/rtc/mc146818rtc_regs.h"
 #define UIP_HOLD_LENGTH (8 * NANOSECONDS_PER_SECOND / 32768)


@@ -61,7 +61,8 @@ static void sigfd_handler(void *opaque)
 }
 if (len != sizeof(info)) {
-printf("read from sigfd returned %zd: %m\n", len);
+error_report("read from sigfd returned %zd: %s", len,
+g_strerror(errno));
 return;
 }


@@ -60,8 +60,8 @@ unsigned int check_socket_activation(void)
 * and we should exit.
 */
 error_report("Socket activation failed: "
-"invalid file descriptor fd = %d: %m",
+"invalid file descriptor fd = %d: %s",
-fd);
+fd, g_strerror(errno));
 exit(EXIT_FAILURE);
 }
 }
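
Both hunks above replace the `%m` conversion, a glibc extension, with an explicit `%s` plus an error string. A minimal sketch of the same substitution in plain C follows. It uses the standard `strerror()` as a stand-in for GLib's `g_strerror()` so the block has no external dependency, and `demo_format_errno` is a hypothetical helper written for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<what>: <error text>" into buf without relying on %m.
 * err is passed explicitly so the caller can capture errno right after
 * the failing call, before anything else can overwrite it.
 */
static int demo_format_errno(char *buf, size_t size, const char *what,
                             int err)
{
    return snprintf(buf, size, "%s: %s", what, strerror(err));
}
```

A caller would typically save `errno` into a local immediately after the failing call and pass that local in, for example `demo_format_errno(msg, sizeof(msg), "read from sigfd", saved_errno)`.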

3
vl.c

@@ -1744,6 +1744,9 @@ static bool main_loop_should_exit(void)
 RunState r;
 ShutdownCause request;
+if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
+return false;
+}
 if (preconfig_exit_requested) {
 if (runstate_check(RUN_STATE_PRECONFIG)) {
 runstate_set(RUN_STATE_PRELAUNCH);