g_strdup_printf already handles OOM errors, so some error handling in
QEMU code can be removed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Since 39bffca203 (qdev: register all
types natively through QEMU Object Model), TypeInfo as used in
the common, non-iterative pattern is no longer amended with information
and should therefore be const.
Fix the documented QOM examples:
sed -i 's/static TypeInfo/static const TypeInfo/g' include/qom/object.h
Since frequently the wrong examples are being copied by contributors of
new devices, fix all types in the tree:
sed -i 's/^static TypeInfo/static const TypeInfo/g' */*.c
sed -i 's/^static TypeInfo/static const TypeInfo/g' */*/*.c
This also avoids to piggy-back these changes onto real functional
changes or other refactorings.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
On the pseries machine the IOMMU (aka TCE tables) is always active for all
PCI and VIO devices. Mostly to simplify the SLOF firmware, we implement an
extension which allows the IOMMU to be temporarily disabled for certain
devices.
Currently this is implemented by setting the device's DMAContext pointer to
NULL (thus reverting to qemu's default no-IOMMU DMA behaviour), then
replacing it when bypass mode is disabled.
This approach causes a bunch of complications though. It complexifies the
management of the DMAContext lifetimes, it's problematic for savevm/loadvm,
and it means that while bypass is active we have nowhere to store the
device's LIOBN (Logical IO Bus Number, used to identify DMA address
spaces). At present we regenerate the LIOBN from other address information
but this restricts how we can allocate LIOBNs.
This patch gives up on this approach, replacing it with the much simpler
one of having a 'bypass' boolean flag in the TCE state structure.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
When we reset the system, the reset method for VIO bus devices resets
the state of their request queue (if present) as it should. However
it was not resetting the state of their TCE table (DMA translation) if
present. It was also not resetting the state of the per-device signal
mask set with H_VIO_SIGNAL. This patch corrects both bugs, and also
removes some small code duplication in the reset paths.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Previously the only PCI bus supported was the emulated PCI bus with
fixed DMA window with start at 0 and size 1GB. As we are going to support
PCI pass through which DMA window properties are set by the host
kernel, we have to support DMA windows with parameters other than default.
This patch adds:
1. DMA window properties to sPAPRPHBState: LIOBN (bus id), start,
size of the window.
2. An additional function spapr_dma_dt() to populate DMA window
properties in the device tree which simply accepts all the parameters
and does not try to guess what kind of IOMMU is given to it.
The original spapr_dma_dt() is renamed to spapr_tcet_dma_dt().
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, the interfaces in the pseries machine code for assignment
and setup of interrupts pass around qemu_irq objects. That was done
in an attempt not to be too closely linked to the specific XICS
interrupt controller. However interactions with the device tree setup
made that attempt rather futile, and XICS is part of the PAPR spec
anyway, so this really just meant we had to carry both the qemu_irq
pointers and the XICS irq numbers around.
This mess will just get worse when we add upcoming PCI MSI support,
since that will require tracking a bunch more interrupt. Therefore,
this patch reworks the spapr code to just use XICS irq numbers
(roughly equivalent to GSIs on x86) and only retrieve the qemu_irq
pointers from the XICS code when we need them (a trivial lookup).
This is a reworked and generalized version of an earlier spapr_pci
specific patch from Alexey Kardashevskiy.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: fix checkpath warning]
Signed-off-by: Alexander Graf <agraf@suse.de>
The pseries platform already contains an IOMMU implementation, since it is
essential for the platform's paravirtualized VIO devices. This IOMMU
support is currently built into the implementation of the VIO "bus" and
the various VIO devices.
This patch converts this code to make use of the new common IOMMU
infrastructure.
We don't yet handle synchronization of map/unmap callbacks vs. invalidations,
this will require some complex interaction with the kernel and is not a
major concern at this stage.
Cc: Alex Graf <agraf@suse.de>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Make qbus children show up as link<> properties. There is no stable
addressing for qbus children so we use an unstable naming convention.
This is okay in QOM though because the composition name is expected to
be what's stable.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
This is far less interesting than it sounds. We simply add an Object to each
BusState and then register the types appropriately. Most of the interesting
refactoring will follow in the next patches.
Since we're changing fundamental type names (BusInfo -> BusClass), it all needs
to convert at once. Fortunately, not a lot of code is affected.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Made all new bus TypeInfos static const.]
[AF: Made qbus_free() call object_delete(), required {qom,glib}_allocated]
Signed-off-by: Andreas Färber <afaerber@suse.de>
In qdev, each bus in practice identified an abstract superclass, but
this was mostly hidden. In QOM, instead, these abstract classes are
explicit so we can move bus properties there.
All bus property walks are removed, and all device property walks
are changed to look along the class hierarchy instead.
We would have duplicates if class A defines some properties and its
subclass B does not define any, because class_b->props will be
left equal to class_a->props.
The solution here is to reintroduce the class_base_init TypeInfo
callback, that was present in one of the early QOM versions but
removed (on my request...) before committing.
This breaks global bus properties, an obscure feature when used
with the command-line which is actually useful and used when used by
backwards-compatible machine types. So this patch also adjusts the
global bus properties in hw/pc_piix.c to refer to the abstract class.
Globals and other properties must be modified in the same patch to
avoid complications related to initialization ordering.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Simple code movement in order to simplify future refactoring.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
PAPR virtual IO (VIO) devices require a unique, but otherwise arbitrary,
"address" used as a token to the hypercalls which manipulate them.
Currently the pseries machine code does an ok job of allocating these
addresses when the legacy -net nic / -serial and so forth options are used
but will fail to allocate them properly when using -device.
Specifically, you can use -device if all addresses are explicitly assigned.
Without explicit assignment, only one VIO device of each type (network,
console, SCSI) will be assigned properly, any further ones will attempt
to take the same address leading to a fatal error.
This patch fixes the situation by adding a proper address allocator to the
VIO "bus" code. This is used both by -device and the legacy options and
default devices. Addresses can still be explicitly assigned with -device
options if desired.
This patch changes the (guest visible) numbering of VIO devices, but since
their addresses are discovered using the device tree and already differ
from the numbering found on existing PowerVM systems, this does not break
compatibility.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Recently we added code to properly clean away VIO CRQs on reset However,
this directly uses qemu_register, rather than the existing device model
reset callbacks. This patch cleans this up by adding proper use of the
reset hook to the VIO bus model. The existing CRQ reset code is converted
to the new method.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andreas Färber <afaerber@suse.de>
PAPR specifies a Command Response Queue (CRQ) mechanism used for virtual
IO, which we implement. However, we don't correctly clean up registered
CRQs when we reset the system.
This patch adds a reset handler to fix this bug. While we're at it, add
in some of the extra debug messages that were used to track the problem
down.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[AF: Updated hcall_dprintf()s to not duplicate the function name]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The pseries machine code has a number of debug messages for debugging PAPR
hypercalls, dependent on DEBUG_SPAPR_HCALLS. This patch cleans these
messages up a bit, by adding __func__ to the hcall_dprintf() macro and
simplifying up a number of the individual messages accordingly.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andreas Färber <afaerber@suse.de>
The pseries "xics" interrupt controller, like most interrupt
controllers can support both message (i.e. edge sensitive) interrupts
and level sensitive interrupts, but it needs to know which are which.
When I implemented the xics emulation for qemu, the only devices we
supported were the PAPR virtual IO devices. These devices only use
message interrupts, so they were the only ones I implemented in xics.
Since then, however, we have added support for PCI devices, which use
level sensitive interrupts. It turns out the message interrupt logic
still actually works most of the time for these, but there are
circumstances where we can lost interrupts due to the incorrect
interrupt logic.
This patch, therefore, implements the correct xics level-sensitive
interrupt logic. The type of the interrupt is set when a device
allocates a new xics interrupt.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Scripted conversion:
for file in hw/ppc*.[hc] hw/mpc8544_guts.c hw/spapr*.[hc] hw/virtex_ml507.c hw/xics.c; do
sed -i "s/CPUState/CPUPPCState/g" $file
done
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Replace device_init() with generalized type_init().
While at it, unify naming convention: type_init([$prefix_]register_types)
Also, type_init() is a function, so add preceding blank line where
necessary and don't put a semicolon after the closing brace.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Anthony Liguori <anthony@codemonkey.ws>
Cc: malc <av1474@comtv.ru>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This was done in a mostly automated fashion. I did it in three steps and then
rebased it into a single step which avoids repeatedly touching every file in
the tree.
The first step was a sed-based addition of the parent type to the subclass
registration functions.
The second step was another sed-based removal of subclass registration functions
while also adding virtual functions from the base class into a class_init
function as appropriate.
Finally, a python script was used to convert the DeviceInfo structures and
qdev_register_subclass functions to TypeInfo structures, class_init functions,
and type_register_static calls.
We are almost fully converted to QOM after this commit.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This converts three devices because apic and ioapic are subclasses of sysbus.
Converting subclasses independently of their base class is prohibitively hard.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Right now, DeviceInfo acts as the class for qdev. In order to switch to a
proper ObjectClass derivative, we need to ween all of the callers off of
interacting directly with the info pointer.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Check that devices on the spapr vio bus aren't given duplicate
addresses. Currently we will not run with duplicate devices, the
fdt code will spot it, but the error reporting is not great. With
this patch we can report the error nicely in terms of the device
names given by the user.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
There is a device tree property "/chosen/linux,stdout-path" which indicates
which device should be used as stdout - ie. "the console".
Currently we don't specify anything, which means both firmware and Linux
choose something arbitrarily. Use the routine we added in the last patch
to pick a default vty and specify it as stdout.
Currently SLOF doesn't use the property, but we are hoping to update it
to do so.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Although in theory the device tree has no inherent ordering, in practice
the order of nodes in the device tree does effect the order that devices
are detected by software.
Currently the ordering is determined by the order the devices appear on
the QEMU command line. Although that does give the user control over the
ordering, it is fragile, especially when the user does not generate the
command line manually - eg. when using libvirt etc.
So order the device tree based on the reg value, ie. the address of on
the VIO bus of the devices. This gives us a sane and stable ordering.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf] add braces
For forgotten historical reasons, PAPR hypercalls for specific virtual IO
devices (oh which there are quite a number) are registered via a callback
in the VIOsPAPRDeviceInfo structure.
This is kind of ugly, so this patch instead registers hypercalls from
device_init() functions for each device type. This works just as well,
and is cleaner.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
When the user creates a device on the command line with -device, they
can specify the id, using id=foo. Currently the VIO bus code overwrites
this id with it's own value. We should only set qdev.id if it is not
already set by the user.
The device tree code uses qdev.id for the device tree node name, however
we can't rely on the user specifiying the id using proper device tree
syntax, ie. device@reg. So separate the device tree node name from the
qdev.id, but use the same syntax, so they will match by default.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The spapr_vio_find_by_reg() function in hw/spapr_vio.c is supposed to find
the device structure for a PAPR virtual IO device with the given reg value,
and return NULL if none exists.
It does the first ok, but if no device with that reg exists, it just
returns the last device traversed in the list. This patch fixes it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
* 'ppc-next' of git://repo.or.cz/qemu/agraf: (24 commits)
pseries: Add partial support for PCI
ppc: Alter CPU state to mask out TCG unimplemented instructions as appropriate
pseries: Allow writes to KVM accelerated TCE table
KVM: PPC: Override host vmx/vsx/dfp only when information known
ppc: Fix up usermode only builds
pseries: Correct vmx/dfp handling in both KVM and TCG cases
PPC: Fail configure when libfdt is not available
ppc: Avoid decrementer related kvm exits
PPC: Disable non-440 CPUs for ppcemb target
PPC: Bump qemu-system-ppc to 64-bit physical address space
pseries: Under kvm use guest cpu = host cpu by default
ppc: Add cpu defs for POWER7 revisions 2.1 and 2.3
ppc: First cut implementation of -cpu host
ppc: Remove broken partial PVR matching
pseries: Update SLOF firmware image
pseries: Add device tree properties for VMX/VSX and DFP under kvm
ppc: Generalize the kvmppc_get_clockfreq() function
Set an invalid-bits mask for each SPE instructions
pseries: Update SLOF firmware image
pseries: Use Book3S-HV TCE acceleration capabilities
...
The pseries machine of qemu implements the TCE mechanism used as a
virtual IOMMU for the PAPR defined virtual IO devices. Because the
PAPR spec only defines a small DMA address space, the guest VIO
drivers need to update TCE mappings very frequently - the virtual
network device is particularly bad. This means many slow exits to
qemu to emulate the H_PUT_TCE hypercall.
Sufficiently recent kernels allow this to be mitigated by implementing
H_PUT_TCE in the host kernel. To make use of this, however, qemu
needs to initialize the necessary TCE tables, and map them into itself
so that the VIO device implementations can retrieve the mappings when
they access guest memory (which is treated as a virtual DMA
operation).
This patch adds the necessary calls to use the KVM TCE acceleration.
If the kernel does not support acceleration, or there is some other
error creating the accelerated TCE table, then it will still fall back
to full userspace TCE implementation.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
SCSI buses will need to read the children list first-to-last. This
requires using a QTAILQ, because hell breaks loose if you just try
inserting at the tail (thus reversing the order of all existing
visits from last-to-first to first-to-tail).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paulo Bonzini changed the original spapr code, which manually assigned irq
numbers for each virtual device, to allocate them automatically from the
device initialization. That allowed spapr virtual devices to be constructed
with -device, which is a good start. However, the way that patch worked
doesn't extend nicely for the future when we want to support devices other
than sPAPR VIO devices (e.g. virtio and PCI).
This patch rearranges the irq allocation to be global across the sPAPR
environment, so it can be used by other bus types as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This also lets the user see the irq in "info qtree".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Right now the spapr devices cannot be instantiated with -device,
because the IRQs need to be passed to the spapr_*_create functions.
Do this instead in the bus's init wrapper.
This is particularly important with the conversion from scsi-disk
to scsi-{cd,hd} that Markus made. After his patches, if you
specify a scsi-cd device attached to an if=none drive, the default
VSCSI controller will not be created and, without qdevification,
you will not be able to add yours.
NOTE from agraf: added small compile fix
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Usually, PAPR virtual IO devices use a virtual IOMMU mechanism, TCEs,
to mediate all DMA transfers. While this is necessary for some sorts of
operation, it can be complex to program and slow for others.
This patch implements a mechanism for bypassing TCE translation, treating
"IO" addresses as plain (guest) physical memory addresses. This has two
main uses:
* Simple, but 64-bit aware programs like firmwares can use the VIO devices
without the complexity of TCE setup.
* The guest OS can optionally use the TCE bypass to improve performance in
suitable situations.
The mechanism used is a per-device flag which disables TCE translation.
The flag is toggled with some (hypervisor-implemented) RTAS methods.
Signed-off-by: Ben Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch implements the infrastructure and hypercalls necessary for the
PAPR specified CRQ (Command Request Queue) mechanism. This general
request queueing system is used by many of the PAPR virtual IO devices,
including the virtual scsi adapter.
Signed-off-by: Ben Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch implements the necessary infrastructure and hypercalls for
sPAPR's TCE (Translation Control Entry) IOMMU mechanism. This is necessary
for all virtual IO devices which do DMA (i.e. nearly all of them).
Signed-off-by: Ben Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch adds infrastructure to support interrupts from PAPR virtual IO
devices. This includes correctly advertising those interrupts in the
device tree, and implementing the H_VIO_SIGNAL hypercall, used to
enable and disable individual device interrupts.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This extends the "pseries" (PAPR) machine to include a virtual IO bus
supporting the PAPR defined hypercall based virtual IO mechanisms.
So far only one VIO device is provided, the vty / vterm, providing
a full console (polled only, for now).
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>