Not that long ago, every device implementation using DMA directly
accessed guest memory using cpu_physical_memory_*(). This meant that
adding support for a guest visible IOMMU would require changing every
one of these devices to go through IOMMU translation.
Shortly before qemu 1.0, I made a start on fixing this by providing
helper functions for PCI DMA. These are currently just stubs which
call the direct access functions, but mean that an IOMMU can be
implemented in one place, rather than for every PCI device.
Clearly, this doesn't help for non PCI devices, which could also be
IOMMU translated on some platforms. It is also problematic for the
devices which have both PCI and non-PCI version (e.g. OHCI, AHCI) - we
cannot use the the pci_dma_*() functions, because they assume the
presence of a PCIDevice, but we don't want to have to check between
pci_dma_*() and cpu_physical_memory_*() every time we do a DMA in the
device code.
This patch makes the first step on addressing both these problems, by
introducing new (stub) dma helper functions which can be used for any
DMA capable device.
These dma functions take a DMAContext *, a new (currently empty)
variable describing the DMA address space in which the operation is to
take place. NULL indicates untranslated DMA directly into guest
physical address space. The intention is that in future non-NULL
values will given information about any necessary IOMMU translation.
DMA using devices must obtain a DMAContext (or, potentially, contexts)
from their bus or platform. For now this patch just converts the PCI
wrappers to be implemented in terms of the universal wrappers,
converting other drivers can take place over time.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The purpose is to have a more generic pci_for_each_device by passing an extra
argument to the function called on every device.
This patch will be used in a next patch.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Valid range for devfn is -1 to 255 (-1 for automatic assignment). We do
not currently validate this due to devfn being stored as a uint32_t.
This can lead to segfaults and other strange behavior.
We could technically just cast it to int32_t to implement the checking,
but this will not work for visitor-based setting where we may do additional
bounds-checking based on target container type, which is int32_t for this
case.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Vector notifiers shall be triggered by the MSI/MSI-X core whenever a
relevant configuration change is programmed by the guest. In case of
MSI-X, changes are reported when the effective mask (global &&
per-vector) alters its state. On unmask, the current vector
configuration is included in the event report. This allows users - e.g.
virtio-pci layer - to transfer this information to external MSI-X
routing subsystems - like vhost + KVM in-kernel irqchip.
This implementation only provides MSI-X support, but extension to MSI is
feasible and will be provided later on when adding support for KVM PCI
device assignment.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This capability makes it possible for the guest to
report a unique chassis identifier to the user.
The spec also recommends making chassis indentifier
persist in eeprom.
This isn't implemented.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds support for SHPC interface, as defined by PCI Standard
Hot-Plug Controller and Subsystem Specification, Rev 1.0
http://www.pcisig.com/specifications/conventional/pci_hot_plug/SHPC_10
Only SHPC intergrated with a PCI-to-PCI bridge is supported,
SHPC integrated with a host bridge would need more work.
All main SHPC features are supported:
- MRL sensor
- Attention button
- Attention indicator
- Power indicator
Wake on hotplug and serr generation are stubbed out but unused
as we don't have interfaces to generate these events ATM.
One issue that isn't completely resolved is that qemu currently
expects an "eject" interface, which SHPC does not provide: it merely
removes the power to device and it's up to the user to remove the device
from slot. This patch works around that by ejecting the device
when power is removed and power LED goes off.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_regs.h specifies many registers by mask +
shifted register values.
There's always some duplication when using such:
for example to override device type, we would need:
pci_word_test_and_clear_mask(cap + PCI_EXP_FLAGS,
PCI_EXP_FLAGS_TYPE);
pci_word_test_and_set_mask(cap + PCI_EXP_FLAGS,
PCI_EXP_TYPE_ENDPOINT << (ffs(PCI_EXP_FLAGS_TYPE) - 1));
Getting such registers also uses some duplication:
word = pci_get_word(cap + PCI_EXP_FLAGS) & PCI_EXP_FLAGS_TYPE;
if ((word >> ffs((PCI_EXP_FLAGS_TYPE) - 1)) == PCI_EXP_TYPE_ENDPOINT)
Add API to access such registers in one line:
pci_set_word_by_mask(cap + PCI_EXP_FLAGS, PCI_EXP_FLAGS_TYPE,
PCI_EXP_TYPE_ENDPOINT)
and
word = pci_get_word_by_mask(cap + PCI_EXP_FLAGS, PCI_EXP_FLAGS_TYPE)
if (word == PCI_EXP_TYPE_ENDPOINT)
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This was done in a mostly automated fashion. I did it in three steps and then
rebased it into a single step which avoids repeatedly touching every file in
the tree.
The first step was a sed-based addition of the parent type to the subclass
registration functions.
The second step was another sed-based removal of subclass registration functions
while also adding virtual functions from the base class into a class_init
function as appropriate.
Finally, a python script was used to convert the DeviceInfo structures and
qdev_register_subclass functions to TypeInfo structures, class_init functions,
and type_register_static calls.
We are almost fully converted to QOM after this commit.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Improve VGA selection logic, push check for device availabilty to vl.c.
Create the devices at board level unconditionally.
Remove now unused pci_try_create*() functions.
Make PCI VGA devices optional.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Only go over the table when function is masked.
This is not really important for qemu.git but helps
fix a bug in qemu-kvm.git.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds functions to pci.[ch] to perform PCI DMA operations.
At present, these are just stubs which perform directly cpu physical
memory accesses. Stubs are included which are analogous to
cpu_physical_memory_{read,write}(), the stX_phys() and ldX_phys()
functions and cpu_physical_memory_{map,unmap}().
In addition, a wrapper around qemu_sglist_init() is provided, which
also takes a PCIDevice *. It's assumed that _init() is the only
sglist function which will need wrapping, the idea being that once we
have IOMMU support whatever IOMMU context handle the wrapper derives
from the PCI device will be stored within the sglist structure for
later use.
Using these stubs, however, distinguishes PCI device DMA transactions from
other accesses to physical memory, which will allow PCI IOMMU support to
be added in one place, rather than updating every PCI driver at that time.
That is, it allows us to update individual PCI drivers to support an IOMMU
without having yet determined the details of how the IOMMU emulation will
operate. This will let us remove the most bitrot-sensitive part of an
IOMMU patch in advance.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This also fixes a bug with the old version: QMP would invert device id
and vendor id. This would look ok on HMP because it was printing
"device:vendor" instead of "vendor:device".
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Returns the I/O address space. Useful for implementing
PCI-ISA bridge devices.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
Support bridge filtering on top of the memory
API as suggested by Avi Kivity:
Create a memory region for the bridge's address space. This region is
not directly added to system_memory or its descendants. Devices under
the bridge see this region as its pci_address_space(). The region is
as large as the entire address space - it does not take into account
any windows.
For each of the three windows (pref, non-pref, vga), create an alias
with the appropriate start and size. Map the alias into the bridge's
parent's pci_address_space(), as subregions.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
eepro100 was the last user. Now pci_add_capability is powerful enough.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Returns the PCI address space. Useful for bridges that can obscure
part of the PCI address space.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
There is only one function, so no need for a function pointer.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Superceded by pci_register_bar_region(). The implementations
are folded together.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Superceded by pci_register_bar_region().
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The msix table is defined as a subregion, to allow for a BAR that
mixes device specific regions with the msix table.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This lets us register BARs in the I/O address space.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Some (hacky) devices that have a back-channel to read this
address back outside the normal configuration mechanisms, such
as VMware svga.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Allow registering a BAR using a MemoryRegion. Once all users are converted,
pci_register_bar() and pci_register_bar_simple() will be removed.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This is now done sloppily, via get_system_memory(). Eventually callers
will be converted to stop using that.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
vender id/device id... in configuration space are read-only registers
which are commonly defined for all pci devices.
So move those initialization into common place.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is similar to pci_register_bar(), but automatically registers a single
memory region spanning the entire BAR.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce accessor function to know INTx levels.
It will be used later by q35.
Although piix_pci tracks the intx line levels, it can be eliminated
by this helper function.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(slot, fn) pair is somewhat confusing because of ARI.
So use devfn for pci_find_device() instead of (slot, fn).
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce symbol PCI_SLOT_MAX for the # of slots,
and replace the magic, 256.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
deassert intx on device reset.
So far pci_device_reset() is used for system reset.
In that case, interrupt controller is reset at the same time so that
all irq is are deasserted.
But now pci bus reset/flr is supported, and in that case irq needs to be
disabled explicitly.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds a field to PCIDeviceInfo to tag devices as being
not hotpluggable. Any attempt to plug-in or -out such a device
will throw an error.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduce a helper function to get PCIDevice from qdev id.
This function will be used later.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Support flr: trigger device reset on flr config write.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We need a PCI ID for our new AHCI adapter. I just picked an ICH-9
because that's the one in the Q35 chipset.
This patch adds a PCI ID define for an ICH-9 AHCI adapter.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
msi depends on pci but pci should not depend on msi.
The only dependency we have is a recent addition
of pci_msi_ functions, IMO they add little enough to
open-code in the small number of users.
Follow-up patches add more cleanups.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
pcie aer needs SERR bit to be writable, and the PCI spec requires
this as well. For compatibility, introduce compat global property
command_serr_enable and make this bit readonly for a pre 0.14 pc
machine.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>