Currently the pseries PCI code uses a somewhat strange scheme of PCI irq
allocation - one per slot up to a maximum that's greater than the usual 4.
This scheme more or less worked, because we were able to tell the guest the
irq mapping in the device tree, however it's a bit odd and may break
assumptions in the future. Worse, the array used to construct the dev
tree interrupt map was mis-sized, we got away with it only because it
happened that our SPAPR_PCI_NUM_LSI value was greater than 7.
This patch changes the pseries PCI code to use the same interrupt swizzling
scheme as is standardized for PCI to PCI bridges. This makes for better
consistency, deals better with any devices which use multiple interrupt
pins and will make life easier in the future when we add passthrough of
what may be either a host bridge or a PCI to PCI bridge. This won't break
existing guests, because they don't assume a particular mapping scheme for
host bridges, but just follow what we tell them in the device tree (also
updated to match, of course). This patch also fixes the allocation of the
irq map.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
On the pseries platform, access to PCI config space is via RTAS calls(
which go to the hypervisor) rather than MMIO. This means we don't use
the same code path as nearly everyone else which goes through pci_host.c
and we're missing some of the parameter checking along the way.
We do have some parameter checking in the RTAS calls, but it's not enough.
It checks for overruns, but does not check for unaligned accesses,
oversized accesses (which means the guest could trigger an assertion
failure from pci_host_config_{read,write}_common(). Worse it doesn't do
the basic checking for the number of RTAS arguments and results before
accessing them.
This patch fixes these bugs.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[AF: Fix typos spotted by mst]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Currently, the function spapr_create_phb() uses its parameters to
initialize the correct memory windows for the new PCI Host Bridge
(PHB). This is not the way things are supposed to be done with qdevs,
and means you can't create extra PHBs easily using -device.
Since pSeries machines can and do have many PHBs with various
configurations, this is a real limitation, not just a theoretical.
This patch, therefore, alters the PHB initialization code to use qdev
properties to set these parameters of the new bridge, moving most of
the code from spapr_create_phb() to spapr_phb_init().
While we're at it, we change the naming of each PCI bus and its
associated memory regions to be less arbitrary and make it easier to
relate the guest and qemu views of memory to each other.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
The pseries "xics" interrupt controller, like most interrupt
controllers can support both message (i.e. edge sensitive) interrupts
and level sensitive interrupts, but it needs to know which are which.
When I implemented the xics emulation for qemu, the only devices we
supported were the PAPR virtual IO devices. These devices only use
message interrupts, so they were the only ones I implemented in xics.
Since then, however, we have added support for PCI devices, which use
level sensitive interrupts. It turns out the message interrupt logic
still actually works most of the time for these, but there are
circumstances where we can lost interrupts due to the incorrect
interrupt logic.
This patch, therefore, implements the correct xics level-sensitive
interrupt logic. The type of the interrupt is set when a device
allocates a new xics interrupt.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The sPAPR PCI code defines a PCI device "spapr-pci-host-bridge-pci" which
is never used. This came over from the earlier bridge driver we used as
a template. Some other bridges appear on their own PCI bus as a device,
but that is not true of pSeries bridges, which are pure host to PCI with
no visible presence on the PCI side.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The 'bars' constant array was used in experimental device allocation code
which is no longer necessary now that we always run the SLOF firmware.
This patch removes the now redundant variable.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Most MemoryRegionOps already had the const attribute.
This patch adds it to the remaining ones.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Replace device_init() with generalized type_init().
While at it, unify naming convention: type_init([$prefix_]register_types)
Also, type_init() is a function, so add preceding blank line where
necessary and don't put a semicolon after the closing brace.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Anthony Liguori <anthony@codemonkey.ws>
Cc: malc <av1474@comtv.ru>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This was done in a mostly automated fashion. I did it in three steps and then
rebased it into a single step which avoids repeatedly touching every file in
the tree.
The first step was a sed-based addition of the parent type to the subclass
registration functions.
The second step was another sed-based removal of subclass registration functions
while also adding virtual functions from the base class into a class_init
function as appropriate.
Finally, a python script was used to convert the DeviceInfo structures and
qdev_register_subclass functions to TypeInfo structures, class_init functions,
and type_register_static calls.
We are almost fully converted to QOM after this commit.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This converts three devices because apic and ioapic are subclasses of sysbus.
Converting subclasses independently of their base class is prohibitively hard.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We call pci_host_config_{read,write}_common() which perform PCI config
accesses. However they don't do all limit checking the way we expect
it to.
So let's introduce a small wrapper around them, making them behave the
way we would without touching generic code.
This patch is based on a patch by David Gibson which put this logic into
the generic code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently on the pseries machine the SLOF firmware is used normally,
but we bypass it when -kernel is specified. Having these two
different boot paths can cause some confusion.
In particular at present we need to "probe" the (emulated) PCI bus and
produce device tree nodes for the PCI devices in qemu, for the -kernel
case. In the SLOF case, it takes the device tree from qemu adds some
stuff to it then passes it on to the kernel.
It's been decided that a better approach is to always boot through
SLOF, even when using -kernel. WIth this approach we can leave PCI
probing and device node creation to SLOF in all cases which removes a
bunch of code in qemu, and avoids iterating the PCI devices from the
machine specific init code which we're not supposed to do.
This patch changes qemu to always boot through SLOF, and not to create
PCI nodes. Simultaneously it updates the included version of SLOF
(submodule and binary image) to one which supports (and requires) the
new approach.
The new SLOF version also includes a number of unrelated enhancements:
support for booting from virtio-pci devices and e1000, greatly
improved FCode support and many bugfixes. It also makes SLOF ready to
be used even when specifying a kernel on the qemu command line.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The pseries machine expects a para-virtualized guest and so supplies RTAS
functions (via a hypercall) for performing PCI config space access.
Currently the implementation of these calls into
pci_default_{read,write}_config(). However this would be incorrect for
any PCI device which overrides the default config read/write functions.
AFAICT there's only one such device today, but we should still get it
right. In addition the pci_host_config_{read,write}_common() functions
which do correctly do this dispatch, perform bounds checking on the config
space address, lack of which currently leads to an exploitable bug.
This patch corrects the problem.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
On the pseries machine (which expexts a paravirtualized guest), guest
access to PCI config space is via host-provided RTAS functions. This
patch extends these RTAS functions to permit access to PCI extended
config space, as specified in PAPR.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
spapr_populate_pci_devices() containd a loop with PCI_NUM_REGIONS (7)
iterations. However this overruns the 'bars' global array, which only has
6 elements. In fact we only want to run this loop for things listed in the
bars array, so this patch corrects the loop bounds to reflect that.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
This patch adds a PCI bus to the pseries machine. This instantiates
the qemu generic PCI bus code, advertises a PCI host bridge in the
guest's device tree and implements the RTAS methods specified by PAPR
to access PCI config space. It also sets up the memory regions we
need to provide windows into the PCI memory and IO space, and
advertises those to the guest.
However, because qemu can't yet emulate an IOMMU, which is mandatory on
pseries, PCI devices which use DMA (i.e. most of them) will not work with
this code alone. Still, this is enough to support the virtio_pci device
(which probably _should_ use emulated PCI DMA, but is specced to use
direct hypervisor access to guest physical memory instead).
[agraf] remove typedef which could cause compile errors
Signed-off-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>