Potentially tick-generating timer devices will gain a common property:
lock_tick_policy. It allows to encode 4 different ways how to deal with
tick events the guest did not process in time:
discard - ignore lost ticks (e.g. if the guest compensates for them
already)
delay - replay all lost ticks in a row once the guest accepts them
again
merge - if multiple ticks are lost, all of them are merged into one
which is replayed once the guest accepts it again
slew - lost ticks are gradually replayed at a higher frequency than
the original tick
Not all timer device will need to support all modes. However, all need
to accept the configuration via this common property.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This lets the RTC get adjustments from the host NTP client.
The watchdog still uses the vm_clock. The previous behavior is
available with "-rtc clock=vm".
Cc: Andreas Färber <afaerber@suse.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This patch implements the RX channel of GRLIB UART with a FIFO to
improve data rate.
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When 2c74c2cb4b added support for
the 'readonly' flag against 9p filesystems, it also made QEMU
add the O_NOATIME flag as a side-effect.
The O_NOATIME flag, however, may only be set by the file owner,
or a user with CAP_FOWNER capability. QEMU cannot assume that
this is the case for filesytems exported to QEMU.
eg, run QEMU as non-root, and attempt to pass the host OS
filesystem through to the guest OS with readonly enable.
The result is that the guest OS cannot open any files at
all.
If O_NOATIME is really required, it should be optionally
enabled via a separate QEMU command line flag.
* hw/9pfs/virtio-9p.c: Remove O_NOATIME
Acked-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
In passthrough security model in local fs driver, after a file creation
chown and chmod are done to set the file credentials and mode as requested
by 9p client. But if there was a request to create a file with S_ISGID
bit, doing chown on that file resets the S_ISGID bit. So first call
chown and then invoking chmod with proper mode bit retains the S_ISGID
(if present/requested)
This resulted in LTP mknod02, mknod03, mknod05, open10 test case
failures. This patch fixes this issue.
man 2 chown
When the owner or group of an executable file are changed by an unprivileged
user the S_ISUID and S_ISGID mode bits are cleared. POSIX does not specify
whether this also should happen when root does the chown(); the Linux behavior
depends on the kernel version.
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Commit 999e12bbe8 (sysbus: apic: ioapic:
convert to QEMU Object Model) introduced two typos, one of which broke
the mac99 machine.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This converts three devices because apic and ioapic are subclasses of sysbus.
Converting subclasses independently of their base class is prohibitively hard.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add -pcihost to SysBus devices to resolve name conflicts,
and clarify PCI vs. Internal PCI.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
This converts two types because smbus is implemented as a subclass of i2c. It's
extremely difficult to convert these two independently.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This converts two devices at once because PIC subclasses ISA and converting
subclasses independently is extremely hard.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
These are various small stylistic changes which help make things more
consistent such that the automated conversion script can be simpler.
It's not necessary to agree or disagree with these style changes because all
of this code is going to be rewritten by the patch monkey script anyway.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Since we are still dynamically creating TypeInfo, we need to chain the
class_init function in order to be able to make use of it within subclasses of
TYPE_DEVICE.
This will disappear once we register TypeInfos directly.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In order to introduce inheritance while still using the qdev registration
interfaces, we need to be able to use a parent other than TYPE_DEVICE. Add a
new interface that allows this.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Right now, DeviceInfo acts as the class for qdev. In order to switch to a
proper ObjectClass derivative, we need to ween all of the callers off of
interacting directly with the info pointer.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This is a very shallow integration. We register a TYPE_DEVICE but only use
QOM as basically a memory allocator. This will make all devices show up as
QOM objects but they will all carry the TYPE_DEVICE.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
---
v1 -> v2
- update for new location of object.h
* pmaydell/arm-devs.for-upstream:
arm: SoC model for Calxeda Highbank
arm_boot: support board IDs more than 16 bits wide
arm: add secondary cpu boot callbacks to arm_boot.c
ahci: add support for non-PCI based controllers
Add xgmac ethernet model
A device reset does not affect the link state, only set_link does.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
A device reset does not affect the link state, only set_link does.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
By using strncasecmp, we allow for arbitrary characters after the
"on"/"off" string. Fix this by switching to strcasecmp.
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Limit the return value (corresponding to the length of the buffer to be
DMAed back to the intiator) to the value in req->cmd.xfer, which is the
amount of data that the initiator expects. Eliminate now-duplicate code
that does this guarding in the functions for individual commands.
Without this, the SCRIPTS code in the emulated LSI device eventually
raises a DMA interrupt for a data overrun when an INQUIRY command whose
buflen exceeds req->cmd.xfer is processed. It's the responsibility of
the client to provide a request buffer and allocation length that are
large enough for the result of the command.
Signed-off-by: Thomas Higdon <thigdon@akamai.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There already exists a virtio_blk_handle_write trace event as well as
completion events. Add the virtio_blk_handle_read event so it's easy to
trace virtio-blk requests for both read and write operations.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Adds support for Calxeda's Highbank SoC.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Support passing a board ID value to the kernel in r1
that is more than 16 bits wide. This is needed to pass
the '-1 == invalid' value for boards which only support
device tree booting.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
Create two functions, write_secondary_boot() and secondary_cpu_reset_hook(),
to allow platforms more control of how secondary CPUs are brought up. The
new functions default to NULL and aren't called unless they are populated
so there are no changes to existing platform models.
Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add support for ahci on sysbus.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This adds very basic support for the xgmac ethernet core. Missing things
include:
- statistics counters
- WoL support
- rx checksum offload
- chained descriptors (only linear descriptor ring)
- broadcast and multicast handling
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Remove target dependencies and compile Cirrus VGA in hwlib.
Address masking can be removed since memory API handles that now.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Instead of each target knowing or guessing the guest page size,
just pass the desired size of dirtied memory area.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* qemu-kvm/uq/master:
kvm: Activate in-kernel irqchip support
kvm: x86: Add user space part for in-kernel IOAPIC
kvm: x86: Add user space part for in-kernel i8259
kvm: x86: Add user space part for in-kernel APIC
kvm: x86: Establish IRQ0 override control
kvm: Introduce core services for in-kernel irqchip support
memory: Introduce memory_region_init_reservation
ioapic: Factor out base class for KVM reuse
ioapic: Drop post-load irr initialization
i8259: Factor out base class for KVM reuse
i8259: Completely privatize PicState
apic: Open-code timer save/restore
apic: Factor out base class for KVM reuse
apic: Introduce apic_report_irq_delivered
apic: Inject external NMI events via LINT1
apic: Stop timer on reset
kvm: Move kvmclock into hw/kvm folder
msi: Generalize msix_supported to msi_supported
hyper-v: initialize Hyper-V CPUID leaves.
hyper-v: introduce Hyper-V support infrastructure.
Conflicts:
Makefile.target
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Improve VGA selection logic, push check for device availabilty to vl.c.
Create the devices at board level unconditionally.
Remove now unused pci_try_create*() functions.
Make PCI VGA devices optional.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Rename SysBus device from 'grackle' to 'grackle-pcihost' to resolve a
name conflict.
Also mark both devices as no_user.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Alexander Graf <agraf@suse.de>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
We call pci_host_config_{read,write}_common() which perform PCI config
accesses. However they don't do all limit checking the way we expect
it to.
So let's introduce a small wrapper around them, making them behave the
way we would without touching generic code.
This patch is based on a patch by David Gibson which put this logic into
the generic code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently on the pseries machine the SLOF firmware is used normally,
but we bypass it when -kernel is specified. Having these two
different boot paths can cause some confusion.
In particular at present we need to "probe" the (emulated) PCI bus and
produce device tree nodes for the PCI devices in qemu, for the -kernel
case. In the SLOF case, it takes the device tree from qemu adds some
stuff to it then passes it on to the kernel.
It's been decided that a better approach is to always boot through
SLOF, even when using -kernel. WIth this approach we can leave PCI
probing and device node creation to SLOF in all cases which removes a
bunch of code in qemu, and avoids iterating the PCI devices from the
machine specific init code which we're not supposed to do.
This patch changes qemu to always boot through SLOF, and not to create
PCI nodes. Simultaneously it updates the included version of SLOF
(submodule and binary image) to one which supports (and requires) the
new approach.
The new SLOF version also includes a number of unrelated enhancements:
support for booting from virtio-pci devices and e1000, greatly
improved FCode support and many bugfixes. It also makes SLOF ready to
be used even when specifying a kernel on the qemu command line.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The pseries machine expects a para-virtualized guest and so supplies RTAS
functions (via a hypercall) for performing PCI config space access.
Currently the implementation of these calls into
pci_default_{read,write}_config(). However this would be incorrect for
any PCI device which overrides the default config read/write functions.
AFAICT there's only one such device today, but we should still get it
right. In addition the pci_host_config_{read,write}_common() functions
which do correctly do this dispatch, perform bounds checking on the config
space address, lack of which currently leads to an exploitable bug.
This patch corrects the problem.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
On the pseries machine (which expexts a paravirtualized guest), guest
access to PCI config space is via host-provided RTAS functions. This
patch extends these RTAS functions to permit access to PCI extended
config space, as specified in PAPR.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Back when I made patches introducing dma_addr_t and various PCI DMA
wrapper functions, I made a mistake. The bmdma_addr_{read,write} functions
need to take target_phys_addr_t not dma_addr_t, since they are assigned
to MemoryRegionOps callbacks.
This patch corrects my error.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
load_image_targphys() gets passed a max size for the file, but doesn't
enforce it at all. Add a check and return -1 (error) if the file is
too big, without loading it. Fix the bracing style in the function
while we're at it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
When accessing the device specific virtio config space, we memcpy
the data into a variable in QEMU. At that point we're basically
pulling host endianness into the game which is a really bad idea.
So instead, let's use the target specific load/store helpers for
memory pointers which fetch things in target endianness. The whole
array is already populated in target endianness anyways
(see virtio-blk).
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
The virtio config area in PIO space is a bit special. The initial
header is little endian but the rest (device specific) is guest
native endian.
The PIO accessors for PCI on machines that don't have native IO ports
assume that all PIO is little endian, which works fine for everything
except the above.
A complicated way to fix it would be to split the BAR into two memory
regions with different endianess settings, but this isn't practical
to do, besides, the PIO code doesn't honor region endianness anyway
(I have a patch for that too but it isn't necessary at this stage).
So I decided to go for the quick fix instead which consists of
reverting the swap in virtio-pci in selected places, hoping that when
we eventually do a "v2" of the virtio protocols, we sort that out once
and for all using a fixed endian setting for everything.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf: keep virtio in libhw and determine endianness through a
helper function in exec.c]
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Now that we have the SoC init function in the same file, let's integrate
it with the board initialization.
While at it, also make use of the newly qdev'ified PCI host controller.
Signed-off-by: Alexander Graf <agraf@suse.de>
The separation of ppc440 and ppc440_bamboo makes some sense, since ppc440
is the SoC while ppc440_bamboo is the actual board. But the separation
makes things harder for us for no good reason, so let's just fold them
in together with each other.
Signed-off-by: Alexander Graf <agraf@suse.de>
Due to popular demand, this qdevifies the PCI host controller of 4xx SoCs
the same way as e500.
We have to introduce a small stub function for pci init that will be
removed in a later patch, once we qdev'ified the board, to keep the build
working.
Signed-off-by: Alexander Graf <agraf@suse.de>
Today we're exposing a Virtex 440 CPU to the guest despite the fact
that we're telling the guest that we're running on a 440EP one in the
device tree.
So let's better default to a real 440EP to make things synced again.
Signed-off-by: Alexander Graf <agraf@suse.de>
When running a 440 target, we currently get invalid irq_num values (-1)
which completely confuse the IRQ setting code.
This is most likely due to the missing qdev conversion.
While this shouldn't happen in the first place and should really rather
be fixed by converting the target, I dislike segfaults. So for now, let's
just print a warning and ignore invalid irq_num values.
Signed-off-by: Alexander Graf <agraf@suse.de>
Back in the day when the bamboo target got introduced, the initial TLB was
dictated by KVM. TCG has been missing initial TLB values ever since, rendering
the target unusable for TCG usage.
This patch adds linear TLB maps the way Linux expects them, making the target
work.
Signed-off-by: Alexander Graf <agraf@suse.de>
To be able to support CPU reset, we need to put all register initialization
and initial state into a CPU reset hook instead of a function that is only
called once on bootup.
This is a preparation step for the initial TLB setting code and brings bamboo
more in line with what e500 and virtex already do.
Signed-off-by: Alexander Graf <agraf@suse.de>
When using TCG with a BookE PowerPC core, we need to explicitly initialize
the BookE timers with the correct frequencies.
This was missing for 440EP, since that code came from KVM and was never used
with TCG.
Signed-off-by: Alexander Graf <agraf@suse.de>
Speaker I/O, ISA bus, i8259 PIC, RTC and DMA are no longer set up
individually by the machine. Effectively, no-op speaker I/O is replaced
by pcspk; PIT and i82374 DMA are introduced.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Remove related dead, alternative code.
Wire up PCI host bridge IRQs via GPIO-in IRQs of PCI->ISA bridge.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Alexander Graf <agraf@suse.de>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Prepare Intel 82378 emulation for use by PReP platforms.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Create ISA bus in this device (suggested by Markus).
Rebase onto Memory API, mark memory ops as Little Endian.
Add VMState. Provide access to i8259 IRQs via qdev GPIOs.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Prepare Intel 82374 emulation for use by Intel 82378 PCI->ISA bridge.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Confine to CONFIG_I82374. Add VMState.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Drop pci_prep_init() in favor of extended device state. Inspired by
patches from Hervé and Alex.
Assign the 4 IRQs from the board after device instantiation. This moves
the knowledge out of prep_pci and allows for future machines with
different IRQ wiring (IBM 40P). Suggested by Alex.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Cc: Hervé Poussineau <hpoussin@reactos.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Convert to new-style read/write callbacks.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Alexander Graf <agraf@suse.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Benoît Canet <benoit.canet@gmail.com>
The prep PowerPC CPU is Big Endian. An explicit byte swap therefore
effectively becomes Little Endian.
Remove explicit byte swaps and mark as Little Endian.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Move initialization of vendor ID, etc. to PCIDeviceInfo.
Introduce VMState.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Cc: Hervé Poussineau <hpoussin@reactos.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
This simplifies the code later when the i8259 moves to the i82378
PCI->ISA bridge and happens to fix a SysBus m48t59 io_base issue
introduced by commit 0fb56ffc5e (m48t59:
drop obsolete address base arithmetic). Suggested by Hervé and Jan.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Hervé Poussineau <hpoussin@reactos.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Blue Swirl <blauwirbel@gmail.com>
Since 0c90c52fab (ppc_prep: convert to memory
API) OHW was "Trying to execute code outside RAM or ROM at 0xfff00700".
The BIOS MemoryRegion is created with a fixed size of 1 MiB.
Ensure that the full size can be accessed since the exception
vectors are located at 0xfff00000 and the BIOS may want to use them.
It thereby no longer depends on the actual BIOS binary size.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Avi Kivity <avi@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
* stefanha/trivial-patches:
Makefile: Remove generated headers on clean
Makefile: Exclude tests/Makefile in unconfigured tree
lm32: Fix mixup of uint32 and uint32_t
tests: Silence gtester in Makefile
qemu-tool: Fix mixup of int64 and int64_t
* pmaydell/arm-devs.for-upstream:
arm: make the number of GIC interrupts configurable
hw/lan9118: Add save/load support
arm: Remove incorrect comment in arm_timer
vexpress, realview: Add (dummy) L2 cache controller
* kraxel/usb.37:
usb-redir: Improve some debugging messages
usb-redir: Try to keep our buffer size near the target size
usb-redir: Pre-fill our isoc input buffer before sending pkts to the host
usb-redir: Dynamically adjust iso buffering size based on ep interval
usb-redir: Clear iso / irq error when stopping the stream
usb: link packets to endpoints not devices
usb: add max_packet_size to USBEndpoint
usb/debug: add usb_ep_dump
usb-desc: USBEndpoint support
usb: add ifnum to USBEndpoint
usb: add USBEndpoint
xhci: Initial xHCI implementation
usb: add audio device model
usb-desc: audio endpoint support
usb: track altsetting in USBDevice
usb: track configuration and interface count in USBDevice.
usb-host: rip out legacy procfs support
This introduces the KVM-accelerated IOAPIC model 'kvm-ioapic' and
extends the IRQ routing setup by the 0->2 redirection when needed.
The kvm-ioapic model has a property that allows to define its GSI base
for injecting interrupts into the kernel model. This will allow to
disentangle PIC and IOAPIC pins for chipsets that support more
sophisticated IRQ routes than the PIIX3. So far the base is kept at 0,
i.e. PIC and IOAPIC share pins 0..15.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Introduce the alternative 'kvm-i8259' device model that exploits KVM
in-kernel acceleration.
The PIIX3 initialization code is furthermore extended by KVM specific
IRQ route setup. GSI injection differs in KVM mode from the user space
model. As we can dispatch ISA-range IRQs to both IOAPIC and PIC inside
the kernel, we do not need to inject them separately. This is reflected
by a KVM-specific GSI handler.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This introduces the alternative APIC device which makes use of KVM's
in-kernel device model. External NMI injection via LINT1 is emulated by
checking the current state of the in-kernel APIC, only injecting a NMI
into the VCPU if LINT1 is unmasked and configured to DM_NMI.
MSI is not yet supported, so we disable this when the in-kernel model is
in use.
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
KVM is forced to disable the IRQ0 override when we run with in-kernel
irqchip but without IRQ routing support of the kernel. Set the fwcfg
value correspondingly. This aligns us with qemu-kvm.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Split up the IOAPIC analogously to APIC and i8259. KVM will share the
IOAPICCommonState, the vmstate, reset logic and certain init parts with
the user space model.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
As all devices undergo a reset prior to vmloa, and the reset value of
irr is 0, we do not need to do this clearing for older vmstates
explicitly. Dropping this redundant code will also make KVM integration
a bit simpler.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Analogously to the APIC, we will reuse some parts of the user space
i8259 model for KVM. The base class provides a common device state, the
vmstate, the property list, a reset core and some shared init bits.
This also introduces a common helper to instantiate a single i8259 chip
from the cascade-creating i8259_init function.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Use DeviceState instead of PicState in the public i8259 API. This is
cleaner and allows to reorganize the PIC data structures for KVM reuse.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
To enable migration between accelerated and non-accelerated APIC models,
we will need to handle the timer saving and restoring specially and can
no longer rely on the automatics of VMSTATE_TIMER. Specifically,
accelerated model will not start any QEMUTimer.
This patch therefore factors out the generic bits into apic_next_timer
and use a post-load callback to implemented model-specific logic.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
The KVM in-kernel APIC model will reuse parts of the user space model
while providing the same frontend view to guest and most management
interfaces.
Factor out an APIC base class to encapsulate those parts that will be
shared by user space and KVM model. This class offers callback hooks for
init, base/tpr setting, and the external NMI delivery that will be
set via APICCommonInfo structure and implemented specifically in the
subclasses.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
On real hardware, NMI button events are injected via the LINT1 line of
the APICs. E.g. kdump expect this wiring and gets upset if the per-APIC
LINT1 mask is not respected, i.e. if NMIs are injected to VCPUs that
should not receive them. Change the APIC emulation code to reflect this.
Based on qemu-kvm patch by Lai Jiangshan.
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
All LVTs are masked on reset, so the timer becomes ineffective. Letting
it tick nevertheless is harmless, but will at least create a spurious
trace event.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>