The pci host config register is used to save PCI address for
read/write config data. If guest writes a value to config register,
and then QEMU pauses the vcpu to migrate, after the migration, the guest
will continue to write pci config data, and the write data will be ignored
because of new qemu process losing the config register state.
To trigger the bug:
1. guest is booting in seabios.
2. guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and then
expects to disable the SMRAM by pci_config_writeb.
3. after guest writes the pci host config register, QEMU pauses vcpu
to finish migration.
4. guest write of config data(0x0A) fails to disable the SMRAM because
the config register state is lost.
5. guest continues to boot and crashes in ipxe option ROM due to SMRAM
in enabled state.
Example Reproducer:
step 1. Make modifications to seabios and qemu for increase reproduction
efficiency, write 0xf0 to 0x402 port notify qemu to stop vcpu after
0x0cf8 port wrote i440 configure register. qemu stop vcpu when catch
0x402 port wrote 0xf0.
seabios:/src/hw/pci.c
@@ -52,6 +52,11 @@ void pci_config_writeb(u16 bdf, u32 addr, u8 val)
writeb(mmconfig_addr(bdf, addr), val);
} else {
outl(ioconfig_cmd(bdf, addr), PORT_PCI_CMD);
+ if (bdf == 0 && addr == 0x72 && val == 0xa) {
+ dprintf(1, "stop vcpu\n");
+ outb(0xf0, 0x402); // notify qemu to stop vcpu
+ dprintf(1, "resume vcpu\n");
+ }
outb(val, PORT_PCI_DATA + (addr & 3));
}
}
qemu:hw/char/debugcon.c
@@ -60,6 +61,9 @@ static void debugcon_ioport_write(void *opaque, hwaddr addr, uint64_t val,
printf(" [debugcon: write addr=0x%04" HWADDR_PRIx " val=0x%02" PRIx64 "]\n", addr, val);
#endif
+ if (ch == 0xf0) {
+ vm_stop(RUN_STATE_PAUSED);
+ }
/* XXX this blocks entire thread. Rewrite to use
* qemu_chr_fe_write and background I/O callbacks */
qemu_chr_fe_write_all(&s->chr, &ch, 1);
step 2. start vm1 by the following command line, and then vm stopped.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
-netdev tap,ifname=tap-test,id=hostnet0,vhost=on,downscript=no,script=no\
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
-device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
-chardev file,id=seabios,path=/var/log/test.seabios,append=on\
-device isa-debugcon,iobase=0x402,chardev=seabios\
-monitor stdio
step 3. start vm2 to accept vm1 state.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
-netdev tap,ifname=tap-test1,id=hostnet0,vhost=on,downscript=no,script=no\
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
-device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
-chardev file,id=seabios,path=/var/log/test.seabios,append=on\
-device isa-debugcon,iobase=0x402,chardev=seabios\
-monitor stdio \
-incoming tcp:127.0.0.1:8000
step 4. execute the following qmp command in vm1 to migrate.
(qemu) migrate tcp:127.0.0.1:8000
step 5. execute the following qmp command in vm2 to resume vcpu.
(qemu) cont
Before this patch, we get KVM "emulation failure" error on vm2.
This patch fixes it.
Cc: qemu-stable@nongnu.org
Signed-off-by: Hogan Wang <hogan.wang@huawei.com>
Message-Id: <20200727084621.3279-1-hogan.wang@huawei.com>
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qbus_set_hotplug_handler() is a simple wrapper around
object_property_set_link().
object_property_set_link() fails when the property doesn't exist, is
not settable, or its .check() method fails. These are all programming
errors here, so passing &error_abort to qbus_set_hotplug_handler() is
appropriate.
Most of its callers do. Exceptions:
* pcie_cap_slot_init(), shpc_init(), spapr_phb_realize() pass NULL,
i.e. they ignore errors.
* spapr_machine_init() passes &error_fatal.
* s390_pcihost_realize(), virtio_serial_device_realize(),
s390_pcihost_plug() pass the error to their callers. The latter two
keep going after the error, which looks wrong.
Drop the @errp parameter, and instead pass &error_abort to
object_property_set_link().
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200630090351.1247703-15-armbru@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200610053247.1583243-18-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
I'm converting from qdev_create()/qdev_init_nofail() to
qdev_new()/qdev_realize_and_unref(); recent commit "qdev: New
qdev_new(), qdev_realize(), etc." explains why.
PCI devices use qdev_create() through pci_create() and
pci_create_multifunction().
Provide pci_new(), pci_new_multifunction(), and
pci_realize_and_unref() for converting PCI devices.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200610053247.1583243-14-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
I'm going to convert device realization to qdev_realize() with the
help of Coccinelle. Convert bus realization to qbus_realize() first,
to get it out of Coccinelle's way. Readability improves.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200610053247.1583243-7-armbru@redhat.com>
Sometimes it would be good to be able to read the pin number along
with the IRQ number allocated. Since we'll dump the IRQ number, no
reason to not dump the pin information. For example, the vfio-pci
device will overwrite the pin with the hardware pin number. It would
be nice to know the pin number of one assigned device from QMP/HMP.
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
CC: Julia Suvorova <jusual@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200317195908.283800-1-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
QEMU currently aborts when being started with "-nic model=rocker" or with
"-net nic,model=rocker". This happens because the "rocker" device is not
a normal NIC but a switch, which has different properties. Thus we should
only consider real NIC devices for "-nic" and "-net". These devices can
be identified by the "netdev" property, so check for this property before
adding the device to the list.
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 52310c3fa7 ("net: allow using any PCI NICs in -net or -nic")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200527153152.9211-1-thuth@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This code is not related to hardware emulation.
Move it under accel/ with the other hypervisors.
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200508100222.7112-1-philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
IEC binary prefixes ease code review: the unit is explicit.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200601142930.29408-5-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
memory_region_set_size() handle the 16 Exabytes limit by
special-casing the UINT64_MAX value. This is not a problem
for the 32-bit maximum, 4 GiB.
By using the UINT32_MAX value, the pci_bridge_io MemoryRegion
ends up missing 1 byte:
(qemu) info mtree
memory-region: pci_bridge_io
0000000000000000-00000000fffffffe (prio 0, i/o): pci_bridge_io
0000000000000060-0000000000000060 (prio 0, i/o): i8042-data
0000000000000064-0000000000000064 (prio 0, i/o): i8042-cmd
00000000000001ce-00000000000001d1 (prio 0, i/o): vbe
0000000000000378-000000000000037f (prio 0, i/o): parallel
00000000000003b4-00000000000003b5 (prio 0, i/o): vga
...
Fix by using the correct value. We now have:
memory-region: pci_bridge_io
0000000000000000-00000000ffffffff (prio 0, i/o): pci_bridge_io
0000000000000060-0000000000000060 (prio 0, i/o): i8042-data
0000000000000064-0000000000000064 (prio 0, i/o): i8042-cmd
...
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200601142930.29408-4-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
While accessing PCI configuration bytes, assert that
'address + len' is within PCI configuration space.
Generally it is within bounds. This is more of a defensive
assert, in case a buggy device was to send 'address' which
may go out of bounds.
Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200604113525.58898-1-ppandit@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Check for hot plug capability earlier to avoid removing devices attached
during the initialization process.
Run qemu with an unattached drive:
-drive file=$FILE,if=none,id=drive0 \
-device pcie-root-port,id=rp0,slot=3,bus=pcie.0,hotplug=off
Hotplug a block device:
device_add virtio-blk-pci,id=blk0,drive=drive0,bus=rp0
If hotplug fails on plug_cb, drive0 will be deleted.
Fixes: 0501e1aa1d ("hw/pci/pcie: Forbid hot-plug if it's disabled on the slot")
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20200604125947.881210-1-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCI spec says:
For all accesses to MSI-X Table and MSI-X PBA fields, software must use
aligned full DWORD or aligned full QWORD transactions; otherwise, the
result is undefined.
However, since MSI-X was converted to use memory API, QEMU
started blocking qword transactions, only allowing DWORD
ones. Guests do not seem to use QWORD accesses, but let's
be spec compliant.
Fixes: 95524ae8dc ("msix: convert to memory API")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Devices may have component devices and buses.
Device realization may fail. Realization is recursive: a device's
realize() method realizes its components, and device_set_realized()
realizes its buses (which should in turn realize the devices on that
bus, except bus_set_realized() doesn't implement that, yet).
When realization of a component or bus fails, we need to roll back:
unrealize everything we realized so far. If any of these unrealizes
failed, the device would be left in an inconsistent state. Must not
happen.
device_set_realized() lets it happen: it ignores errors in the roll
back code starting at label child_realize_fail.
Since realization is recursive, unrealization must be recursive, too.
But how could a partly failed unrealize be rolled back? We'd have to
re-realize, which can fail. This design is fundamentally broken.
device_set_realized() does not roll back at all. Instead, it keeps
unrealizing, ignoring further errors.
It can screw up even for a device with no buses: if the lone
dc->unrealize() fails, it still unregisters vmstate, and calls
listeners' unrealize() callback.
bus_set_realized() does not roll back either. Instead, it stops
unrealizing.
Fortunately, no unrealize method can fail, as we'll see below.
To fix the design error, drop parameter @errp from all the unrealize
methods.
Any unrealize method that uses @errp now needs an update. This leads
us to unrealize() methods that can fail. Merely passing it to another
unrealize method cannot cause failure, though. Here are the ones that
do other things with @errp:
* virtio_serial_device_unrealize()
Fails when qbus_set_hotplug_handler() fails, but still does all the
other work. On failure, the device would stay realized with its
resources completely gone. Oops. Can't happen, because
qbus_set_hotplug_handler() can't actually fail here. Pass
&error_abort to qbus_set_hotplug_handler() instead.
* hw/ppc/spapr_drc.c's unrealize()
Fails when object_property_del() fails, but all the other work is
already done. On failure, the device would stay realized with its
vmstate registration gone. Oops. Can't happen, because
object_property_del() can't actually fail here. Pass &error_abort
to object_property_del() instead.
* spapr_phb_unrealize()
Fails and bails out when remove_drcs() fails, but other work is
already done. On failure, the device would stay realized with some
of its resources gone. Oops. remove_drcs() fails only when
chassis_from_bus()'s object_property_get_uint() fails, and it can't
here. Pass &error_abort to remove_drcs() instead.
Therefore, no unrealize method can fail before this patch.
device_set_realized()'s recursive unrealization via bus uses
object_property_set_bool(). Can't drop @errp there, so pass
&error_abort.
We similarly unrealize with object_property_set_bool() elsewhere,
always ignoring errors. Pass &error_abort instead.
Several unrealize methods no longer handle errors from other unrealize
methods: virtio_9p_device_unrealize(),
virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
Much of the deleted error handling looks wrong anyway.
One unrealize methods no longer ignore such errors:
usb_ehci_pci_exit().
Several realize methods no longer ignore errors when rolling back:
v9fs_device_realize_common(), pci_qdev_unrealize(),
spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
virtio_device_realize().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-17-armbru@redhat.com>
A little cleanup is possible because of hotplug_pdev introduction.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20200427182440.92433-3-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Raise an error when trying to hot-plug/unplug a device through QMP to a device
with disabled hot-plug capability. This makes the device behaviour more
consistent and provides an explanation of the failure in the case of
asynchronous unplug.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20200427182440.92433-2-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
The pci_do_device_reset() function (called from pci_device_reset)
clears the PCI_INTERRUPT_LINE config reg of devices on the bus but did
this without taking wmask into account. We'll have a device model now
that needs to set a constant value for this reg and this patch allows
to do that without additional workaround in device emulation to
reverse the effect of this PCI bus reset function.
Suggested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 20200313082444.2439-4-mark.cave-ayland@ilande.co.uk
Signed-off-by: John Snow <jsnow@redhat.com>
Make hot-plug/hot-unplug on PCIe Root Ports optional to allow libvirt
manage it and restrict unplug for the whole machine. This is going to
prevent user-initiated unplug in guests (Windows mostly).
Hotplug is enabled by default.
Usage:
-device pcie-root-port,hotplug=off,...
If you want to disable hot-unplug on some downstream ports of one
switch, disable hot-unplug on PCIe Root Port connected to the upstream
port as well as on the selected downstream ports.
Discussion related:
https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg00530.html
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20200226174607.205941-1-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Define the new macro VMSTATE_INSTANCE_ID_ANY for callers who wants to
auto-generate the vmstate instance ID. Previously it was hard coded
as -1 instead of this macro. It helps to change this default value in
the follow up patches. No functional change.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Both functions are called by MemoryRegionOps.[read/write] handlers
with unsigned 'size' argument. Both functions call
pci_host_config_[read/write]_common() which expect a uint32_t 'len'
parameter (also unsigned).
Since it is pointless (and confuse) to use a signed value, use a
unsigned type.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191216002134.18279-3-philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In commit 3bf4dfdd11 we introduced the pci_cfg_[read/write]
trace events in pci_host_config_[read/write]_common().
We have the following call trace:
pci_host_data_[read/write]()
- PCI_DPRINTF()
- pci_data_[read/write]()
- PCI_DPRINTF()
- pci_host_config_[read/write]_common()
trace_pci_cfg_[read/write]()
Since the PCI_DPRINTF() calls are redundant with the trace
events, remove them.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191216002134.18279-2-philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that the old pc-0.x machine types have been removed, this config
knob is not required anymore.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20191209125248.5849-4-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On x86, KVM needs some function from the PCI subsystem in order to set
up interrupt routes. Provide some stubs to support x86 machines that
lack PCI.
Reviewed-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
PCIe requester IDs are used by modern IOMMUs to differentiate devices
in order to provide a unique IOVA address space per device. These
requester IDs are composed of the bus/device/function (BDF) of the
requesting device. Conventional PCI pre-dates this concept and is
simply a shared parallel bus where transactions are claimed by
decoding target ranges rather than the packetized, point-to-point
mechanisms of PCI-express. In order to interface conventional PCI
to PCIe, the PCIe-to-PCI bridge creates and accepts packetized
transactions on behalf of all downstream devices, using one of two
potential forms of a requester ID relating to the bridge itself or its
subordinate bus. All downstream devices are therefore aliased by the
bridge's requester ID and it's not possible for the IOMMU to create
unique IOVA spaces for devices downstream of such buses.
At least that's how it works on bare metal. Until now point we've
ignored this nuance of vIOMMU support in QEMU, creating a unique
AddressSpace per device regardless of the virtual bus topology.
Aside from simply being true to bare metal behavior, there are aspects
of a shared address space that we can use to our advantage when
designing a VM. For instance, a PCI device assignment scenario where
we have the following IOMMU group on the host system:
$ ls /sys/kernel/iommu_groups/1/devices/
0000:00:01.0 0000:01:00.0 0000:01:00.1
An IOMMU group is considered the smallest set of devices which are
fully DMA isolated from other devices by the IOMMU. In this case the
root port at 00:01.0 does not guarantee that it prevents peer to peer
traffic between the endpoints on bus 01: and the devices are therefore
grouped together. VFIO considers an IOMMU group to be the smallest
unit of device ownership and allows only a single shared IOVA space
per group due to the limitations of the isolation.
Therefore, if we attempt to create the following VM, we get an error:
qemu-system-x86_64 -machine q35... \
-device intel-iommu,intremap=on \
-device pcie-root-port,addr=1e.0,id=pcie.1 \
-device vfio-pci,host=1:00.0,bus=pcie.1,addr=0.0,multifunction=on \
-device vfio-pci,host=1:00.1,bus=pcie.1,addr=0.1
qemu-system-x86_64: -device vfio-pci,host=1:00.1,bus=pcie.1,addr=0.1: vfio \
0000:01:00.1: group 1 used in multiple address spaces
VFIO only allows a single IOVA space (AddressSpace) for both devices,
but we've placed them into a topology where the vIOMMU expects a
separate AddressSpace for each device. On bare metal we know that
a conventional PCI bus would provide the sort of aliasing we need
here, forcing the IOMMU to consider these devices to be part of a
single shared IOVA space. The support provided here does the same
for QEMU, such that we can create a conventional PCI topology to
expose equivalent AddressSpace sharing requirements to the VM:
qemu-system-x86_64 -machine q35... \
-device intel-iommu,intremap=on \
-device pcie-pci-bridge,addr=1e.0,id=pci.1 \
-device vfio-pci,host=1:00.0,bus=pci.1,addr=1.0,multifunction=on \
-device vfio-pci,host=1:00.1,bus=pci.1,addr=1.1
There are pros and cons to this configuration; it's not necessarily
recommended, it's simply a tool we can use to create configurations
which may provide additional functionality in spite of host hardware
limitations or as a benefit to the guest configuration or resource
usage. An incomplete list of pros and cons:
Cons:
a) Extended PCI configuration space is unavailable to devices
downstream of a conventional PCI bus. The degree to which this
is a drawback depends on the device and guest drivers.
b) Applying this topology to devices which are already isolated by
the host IOMMU (singleton IOMMU groups) will result in devices
which appear to be non-isolated to the VM (non-singleton groups).
This can limit configurations within the guest, such as userspace
drivers or nested device assignment.
Pros:
a) QEMU better emulates bare metal.
b) Configurations as above are now possible.
c) Host IOMMU resources and VM locked memory requirements are reduced
in vIOMMU configurations due to shared IOMMU domains on the host
and avoidance of duplicate locked memory accounting.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <157187083548.5439.14747141504058604843.stgit@gimli.home>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In "b06424de62 migration: Disable hotplug/unplug during migration" we
added a check to disable unplug for all devices until we have figured
out what works. For failover primary devices qdev_unplug() is called
from the migration handler, i.e. during migration.
This patch adds a flag to DeviceState which is set to false for all
devices and makes an exception for PCI devices that are also
primary devices in a failover pair.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20191029114905.6856-8-jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Set pending_deleted_event in DeviceState for failover
primary devices that were successfully unplugged by the Guest OS.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20191029114905.6856-5-jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Only the guest unplug request was triggered. This is needed for
the failover feature. In case of a failed migration we need to
plug the device back to the guest.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20191029114905.6856-4-jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds a failover_pair_id property to PCIDev which is
used to link the primary device in a failover pair (the PCI dev) to
a standby (a virtio-net-pci) device.
It only supports ethernet devices. Also currently it only supports
PCIe devices. The requirement for PCIe is because it doesn't support
other hotplug controllers at the moment. The failover functionality can
be added to other hotplug controllers like ACPI, SHCP,... later on.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20191029114905.6856-3-jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 5400 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
hw/qdev-core.h includes sysemu/sysemu.h since recent commit e965ffa70a
"qdev: add qdev_add_vm_change_state_handler()". This is a bad idea:
hw/qdev-core.h is widely included.
Move the declaration of qdev_add_vm_change_state_handler() to
sysemu/sysemu.h, and drop the problematic include from hw/qdev-core.h.
Touching sysemu/sysemu.h now recompiles some 1800 objects.
qemu/uuid.h also drops from 5400 to 1800. A few more headers show
smaller improvement: qemu/notify.h drops from 5600 to 5200,
qemu/timer.h from 5600 to 4500, and qapi/qapi-types-run-state.h from
5500 to 5000.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190812052359.30071-28-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Commit e35704ba9c "numa: Move NUMA declarations from sysemu.h to
numa.h" left a few NUMA-related macros behind. Move them now.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-26-armbru@redhat.com>
In my "build everything" tree, changing hw/qdev-properties.h triggers
a recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
actually need only hw/qdev-core.h. Include hw/qdev-core.h there
instead.
hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.
While there, delete a few superfluous inclusions of hw/qdev-core.h.
Touching hw/qdev-properties.h now recompiles some 1200 objects.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-22-armbru@redhat.com>
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
The previous commits have left only the declaration of hw_error() in
hw/hw.h. This permits dropping most of its inclusions. Touching it
now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
hw/hw.h supposedly includes it for convenience. Several other headers
include it just to get VMStateDescription. The previous commit made
that unnecessary.
Include migration/vmstate.h only where it's still needed. Touching it
now recompiles only some 1600 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing hw/irq.h triggers a recompile
of some 5400 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
hw/hw.h supposedly includes it for convenience. Several other headers
include it just to get qemu_irq and.or qemu_irq_handler.
Move the qemu_irq and qemu_irq_handler typedefs from hw/irq.h to
qemu/typedefs.h, and then include hw/irq.h only where it's still
needed. Touching it now recompiles only some 500 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-13-armbru@redhat.com>
In my "build everything" tree, changing migration/qemu-file-types.h
triggers a recompile of some 2600 out of 6600 objects (not counting
tests and objects that don't depend on qemu/osdep.h).
The culprit is again hw/hw.h, which supposedly includes it for
convenience.
Include migration/qemu-file-types.h only where it's needed. Touching
it now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Rename function arguments to make intent clearer.
Better documentation for slot control logic.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
During boot, linux guests tend to clear all bits in pcie slot status
register which is used for hotplug.
If they clear bits that weren't set this is racy and will lose events:
not a big problem for manual hotplug on bare-metal, but a problem for us.
For example, the following is broken ATM:
/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -S -machine q35 \
-device pcie-root-port,id=pcie_root_port_0,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device virtio-balloon-pci,id=balloon,bus=pcie_root_port_0 \
-monitor stdio disk.qcow2
(qemu)device_del balloon
(qemu)cont
Balloon isn't deleted as it should.
As a work-around, detect this attempt to clear slot status and revert
status to what it was before the write.
Note: in theory this can be detected as a duplicate button press
which cancels the previous press. Does not seem to happen in
practice as guests seem to only have this bug during init.
Note2: the right thing to do is probably to fix Linux to
read status before clearing it, and act on the bits that are set.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
During boot, linux would sometimes overwrites control of a powered off
slot before powering it on. Unfortunately QEMU interprets that as a
power off request and ejects the device.
For example:
/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -S -machine q35 \
-device pcie-root-port,id=pcie_root_port_0,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-monitor stdio disk.qcow2
(qemu)device_add virtio-balloon-pci,id=balloon,bus=pcie_root_port_0
(qemu)cont
Balloon is deleted during guest boot.
To fix, save control beforehand and check that power
or led state actually change before ejecting.
Note: this is more a hack than a solution, ideally we'd
find a better way to detect ejects, or move away
from ejects completely and instead monitor whether
it's safe to delete device due to e.g. its power state.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
If we are trying to set multiple bits at once, testing that just one of
them is already set gives a false positive. As a result we won't
interrupt guest if e.g. presence detection change and attention button
press are both set. This happens with multi-function device removal.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
Rather than looking inside the definition of a BusState with "s->bus.qbus",
use the QOM prefered style: "BUS(&s->bus)".
This patch was generated using the following Coccinelle script:
// Use BUS() macros to access BusState.qbus
@use_bus_macro_to_access_qbus@
expression obj;
identifier bus;
@@
-&obj->bus.qbus
+BUS(&obj->bus)
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20190528164020.32250-4-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The only remaining caller of pci_get_bus_devfn() is pci_nic_init_nofail(),
itself an old compatibility function. Fold the two together to avoid
re-using the stale interface.
While we're there replace the explicit fprintf()s with error_report().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513061939.3464-6-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Since c2077e2c "pci: Adjust PCI config limit based on bus topology",
pci_adjust_config_limit() has been used in the config space read and write
paths to only permit access to extended config space on buses which permit
it. Specifically it prevents access on devices below a vanilla-PCI bus via
some combination of bridges, even if both the host bridge and the device
itself are PCI-E.
It accomplishes this with a somewhat complex call up the chain of bridges
to see if any of them prohibit extended config space access. This is
overly complex, since we can always know if the bus will support such
access at the point it is constructed.
This patch simplifies the test by using a flag in the PCIBus instance
indicating whether extended configuration space is accessible. It is
false for vanilla PCI buses. For PCI-E buses, it is true for root
buses and equal to the parent bus's's capability otherwise.
For the special case of sPAPR's paravirtualized PCI root bus, which
acts mostly like vanilla PCI, but does allow extended config space
access, we override the default value of the flag from the host bridge
code.
This should cause no behavioural change.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20190513061939.3464-4-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
'MSIX_CAP_LENGTH' is defined in two .c file. Move it
to hw/pci/msix.h file to reduce duplicated code.
CC: qemu-trivial@nongnu.org
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20190521151543.92274-5-liq3ea@163.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
pci_bus_is_root() currently relies on a method in the PCIBusClass.
But it's always known if a PCI bus is a root bus when we create it, so
using a dynamic method is overkill.
This replaces it with an IS_ROOT bit in a new flags field, which is set on
root buses and otherwise clear. As a bonus this removes the special
is_root logic from pci_expander_bridge, since it already creates its bus
as a root bus.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20190424041959.4087-3-david@gibson.dropbear.id.au>
These functions have an explicit test for accesses above the device's
config size. But pci_host_config_{read,write}_common() which they're
about to call already have checks against the config space limit and
do the right thing. So, remove the redundant tests.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190424041959.4087-2-david@gibson.dropbear.id.au>
Some machines have an AHCI adapter, but no PCI. To be able to
compile hw/ide/ahci.c without CONFIG_PCI, we still need the two
functions msi_enabled() and msi_notify() for linking.
This is required for the new Kconfig-like build system, if a user
wants to compile a QEMU binary with just one machine that has AHCI,
but no PCI, like the ARM "cubieboard" for example.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
LSI mapping in spapr currently open-codes standard PCI swizzling. It thus
duplicates the code of pci_swizzle_map_irq_fn().
Expose the swizzling formula so that it can be used with a slot number
when building the device tree. Simply drop pci_spapr_map_irq() and call
pci_swizzle_map_irq_fn() instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155448184841.8446.13959787238854054119.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Some PHB implementations, eg. PAPR used on pseries machine, act like
a regular PCI bus rather than a PCIe bus, but allow access to the
PCIe extended config space anyway.
Introduce a new PCI bus class method to modelize this behaviour and
use it when adjusting the config space size limit during accesses.
No behaviour change for existing PCI bus types.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155414130271.574858.4253514266378127489.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We spell out sub/dir/ in sub/dir/trace-events' comments pointing to
source files. That's because when trace-events got split up, the
comments were moved verbatim.
Delete the sub/dir/ part from these comments. Gets rid of several
misspellings.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190314180929.27722-3-armbru@redhat.com
Message-Id: <20190314180929.27722-3-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Not all interrupt controllers have a working implementation of
message-signalled interrupts; in some cases, the guest may expect
MSI to work but it won't due to the buggy or lacking emulation.
In QEMU this is represented by the "msi_nonbroken" variable. This
patch adds a new configuration symbol enabled whenever the binary
contains an interrupt controller that will set "msi_nonbroken". We
can then use it to remove devices that cannot be possibly added
to the machine, because they require MSI.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implementing an ACS capability on downstream ports and multifunction
endpoints indicates isolation and IOMMU visibility to a finer
granularity. This creates smaller IOMMU groups in the guest and thus
more flexibility in assigning endpoints to guest userspace or an L2
guest.
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Message-Id: <07489975121696f5573b0a92baaf3486ef51e35d.1550768238.git-series.knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Instead of including the same list of devices for each target,
set CONFIG_PCI to true, and make the devices default to present
whenever PCI is available. However, s390x does not want all the
PCI devices, so there is a separate symbol to enable them.
Done mostly with the following script:
while read i; do
i=${i%=y}; i=${i#CONFIG_}
sed -i -e'/^config '$i'$/!b' -en \
-e'a\' -e' default y if PCI_DEVICES\' -e' depends on PCI' \
`grep -lw $i hw/*/Kconfig`
done < default-configs/pci.mak
followed by replacing a few "depends on" clauses with "select"
whenever the symbol is not really related to PCI.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-31-yang.zhong@intel.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make pcie splited from pci and make it configurable.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-30-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Kconfig files were generated mostly with this script:
for i in `grep -ho CONFIG_[A-Z0-9_]* default-configs/* | sort -u`; do
set fnord `git grep -lw $i -- 'hw/*/Makefile.objs' `
shift
if test $# = 1; then
cat >> $(dirname $1)/Kconfig << EOF
config ${i#CONFIG_}
bool
EOF
git add $(dirname $1)/Kconfig
else
echo $i $*
fi
done
sed -i '$d' hw/*/Kconfig
for i in hw/*; do
if test -d $i && ! test -f $i/Kconfig; then
touch $i/Kconfig
git add $i/Kconfig
fi
done
Whenever a symbol is referenced from multiple subdirectories, the
script prints the list of directories that reference the symbol.
These symbols have to be added manually to the Kconfig files.
Kconfig.host and hw/Kconfig were created manually.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20190123065618.3520-27-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When unplugging a device, at one point the device will be destroyed
via object_unparent(). This will, one the one hand, unrealize the
removed device hierarchy, and on the other hand, destroy/free the
device hierarchy.
When chaining hotplug handlers, we want to overwrite a bus hotplug
handler by the machine hotplug handler, to be able to perform
some part of the plug/unplug and to forward the calls to the bus hotplug
handler.
For now, the bus hotplug handler would trigger an object_unparent(), not
allowing us to perform some unplug action on a device after we forwarded
the call to the bus hotplug handler. The device would be gone at that
point.
machine_unplug_handler(dev)
/* eventually do unplug stuff */
bus_unplug_handler(dev)
/* dev is gone, we can't do more unplug stuff */
So move the object_unparent() to the original caller of the unplug. For
now, keep the unrealize() at the original places of the
object_unparent(). For implicitly chained hotplug handlers (e.g. pc
code calling acpi hotplug handlers), the object_unparent() has to be
done by the outermost caller. So when calling hotplug_handler_unplug()
from inside an unplug handler, nothing is to be done.
hotplug_handler_unplug(dev) -> calls machine_unplug_handler()
machine_unplug_handler(dev) {
/* eventually do unplug stuff */
bus_unplug_handler(dev) -> calls unrealize(dev)
/* we can do more unplug stuff but device already unrealized */
}
object_unparent(dev)
In the long run, every unplug action should be factored out of the
unrealize() function into the unplug handler (especially for PCI). Then
we can get rid of the additonal unrealize() calls and object_unparent()
will properly unrealize the device hierarchy after the device has been
unplugged.
hotplug_handler_unplug(dev) -> calls machine_unplug_handler()
machine_unplug_handler(dev) {
/* eventually do unplug stuff */
bus_unplug_handler(dev) -> only unplugs, does not unrealize
/* we can do more unplug stuff */
}
object_unparent(dev) -> will unrealize
The original approach was suggested by Igor Mammedov for the PCI
part, but I extended it to all hotplug handlers. I consider this one
step into the right direction.
To summarize:
- object_unparent() on synchronous unplugs is done by common code
-- "Caller of hotplug_handler_unplug"
- object_unparent() on asynchronous unplugs ("unplug requests") has to
be done manually
-- "Caller of hotplug_handler_unplug"
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190228122849.4296-2-david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The entire link status register for SR-IOV VFs is defined as RsvdZ,
reads simply return zero. Usually this is nothing more than lspci
reporting inconsequentially broken values:
LnkSta: Speed unknown, Width x0, ...
However, now that we're using the downstream endpoint link status to
fill in the value at the parent downstream port, invalid values become
a problem. In particular, the PCIe hotplug driver in Linux looks for
a valid negotiated link width and will fail to enumerate hot-added
downstream endpoints without non-zero value here, ex:
pciehp 0000:00:02.0:pcie004: Slot(0): Attention button pressed
pciehp 0000:00:02.0:pcie004: Slot(0) Powering on due to button press
pciehp 0000:00:02.0:pcie004: Slot(0): Card present
pciehp 0000:00:02.0:pcie004: Slot(0): Link Up
pciehp 0000:00:02.0:pcie004: link training error: status 0x2000
pciehp 0000:00:02.0:pcie004: Failed to check link status
Resolve by using minimum width and speed values for the downstream
port link status when the endpoint fails to provide valid values.
Long term, we may want to implement emulation in the vfio-pci host
driver to suppliment this field with the PF value as the SR-IOV spec
seems to allow, but the solution here is compatible should that be
implemented later.
Fixes: 727b48661f ("pci: Sync PCIe downstream port LNKSTA on read")
Reported-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <155060310248.19547.14979269067689441201.stgit@gimli.home>
Tested-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Certain devices types, like memory/CPU, are now being handled using a
hotplug interface provided by a top-level MachineClass. Hotpluggable
host bridges are another such device where it makes sense to use a
machine-level hotplug handler. However, unlike those devices,
host-bridges have a parent bus (the main system bus), and devices with
a parent bus use a different mechanism for registering their hotplug
handlers: qbus_set_hotplug_handler(). This interface currently expects
a handler to be a subclass of DeviceClass, but this is not the case
for MachineClass, which derives directly from ObjectClass.
Internally, the interface only requires an ObjectClass, so expose that
in qbus_set_hotplug_handler().
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <154999589921.690774.3640149277362188566.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It is going to be used later on outside MSI code to detect whether one
MSI vector is masked out.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In msix_exclusive_bar the bar_pba_size is more than what the pba is
expected to have, although this never affects the bar size.
Specifically, the math in msix_init_exclusive_bar allocates too much
memory in some cases.
For example consider nentries = 8. msix_exclusive_bar will give us
bar_pba_size = 16. So 16 bytes. However 8 bytes would be enough - this
is all that the spec requires.
So in practice bar_pba_size sometimes allocates an extra 8 bytes but
never more.
Since each MSIX entry size is 16 bytes, and since we make sure that
table+pba is a power of two, this always leaves a multiple of 16 bytes
for the PBA, so extra 8 bytes have no effect.
However, its ugly to have pba size temporary variable have an incorrect
value. For consistency switch to the formula used in msix_init.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We better stop right away. For now, errors would be partially ignored
(so the guest might get informed or the device might get unplugged),
although actual plug/unplug will be reported as failed to the user.
While at it, properly move the check to the pre_plug handler for the plug
case, as we can test the slot state before the device will be realized.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Most files that have TABs only contain a handful of them. Change
them to spaces so that we don't confuse people.
disas, standard-headers, linux-headers and libdecnumber are imported
from other projects and probably should be exempted from the check.
Outside those, after this patch the following files still contain both
8-space and TAB sequences at the beginning of the line. Many of them
have a majority of TABs, or were initially committed with all tabs.
bsd-user/i386/target_syscall.h
bsd-user/x86_64/target_syscall.h
crypto/aes.c
hw/audio/fmopl.c
hw/audio/fmopl.h
hw/block/tc58128.c
hw/display/cirrus_vga.c
hw/display/xenfb.c
hw/dma/etraxfs_dma.c
hw/intc/sh_intc.c
hw/misc/mst_fpga.c
hw/net/pcnet.c
hw/sh4/sh7750.c
hw/timer/m48t59.c
hw/timer/sh_timer.c
include/crypto/aes.h
include/disas/bfd.h
include/hw/sh4/sh.h
libdecnumber/decNumber.c
linux-headers/asm-generic/unistd.h
linux-headers/linux/kvm.h
linux-user/alpha/target_syscall.h
linux-user/arm/nwfpe/double_cpdo.c
linux-user/arm/nwfpe/fpa11_cpdt.c
linux-user/arm/nwfpe/fpa11_cprt.c
linux-user/arm/nwfpe/fpa11.h
linux-user/flat.h
linux-user/flatload.c
linux-user/i386/target_syscall.h
linux-user/ppc/target_syscall.h
linux-user/sparc/target_syscall.h
linux-user/syscall.c
linux-user/syscall_defs.h
linux-user/x86_64/target_syscall.h
slirp/cksum.c
slirp/if.c
slirp/ip.h
slirp/ip_icmp.c
slirp/ip_icmp.h
slirp/ip_input.c
slirp/ip_output.c
slirp/mbuf.c
slirp/misc.c
slirp/sbuf.c
slirp/socket.c
slirp/socket.h
slirp/tcp_input.c
slirp/tcpip.h
slirp/tcp_output.c
slirp/tcp_subr.c
slirp/tcp_timer.c
slirp/tftp.c
slirp/udp.c
slirp/udp.h
target/cris/cpu.h
target/cris/mmu.c
target/cris/op_helper.c
target/sh4/helper.c
target/sh4/op_helper.c
target/sh4/translate.c
tcg/sparc/tcg-target.inc.c
tests/tcg/cris/check_addo.c
tests/tcg/cris/check_moveq.c
tests/tcg/cris/check_swap.c
tests/tcg/multiarch/test-mmap.c
ui/vnc-enc-hextile-template.h
ui/vnc-enc-zywrle.h
util/envlist.c
util/readline.c
The following have only TABs:
bsd-user/i386/target_signal.h
bsd-user/sparc64/target_signal.h
bsd-user/sparc64/target_syscall.h
bsd-user/sparc/target_signal.h
bsd-user/sparc/target_syscall.h
bsd-user/x86_64/target_signal.h
crypto/desrfb.c
hw/audio/intel-hda-defs.h
hw/core/uboot_image.h
hw/sh4/sh7750_regnames.c
hw/sh4/sh7750_regs.h
include/hw/cris/etraxfs_dma.h
linux-user/alpha/termbits.h
linux-user/arm/nwfpe/fpopcode.h
linux-user/arm/nwfpe/fpsr.h
linux-user/arm/syscall_nr.h
linux-user/arm/target_signal.h
linux-user/cris/target_signal.h
linux-user/i386/target_signal.h
linux-user/linux_loop.h
linux-user/m68k/target_signal.h
linux-user/microblaze/target_signal.h
linux-user/mips64/target_signal.h
linux-user/mips/target_signal.h
linux-user/mips/target_syscall.h
linux-user/mips/termbits.h
linux-user/ppc/target_signal.h
linux-user/sh4/target_signal.h
linux-user/sh4/termbits.h
linux-user/sparc64/target_syscall.h
linux-user/sparc/target_signal.h
linux-user/x86_64/target_signal.h
linux-user/x86_64/termbits.h
pc-bios/optionrom/optionrom.h
slirp/mbuf.h
slirp/misc.h
slirp/sbuf.h
slirp/tcp.h
slirp/tcp_timer.h
slirp/tcp_var.h
target/i386/svm.h
target/sparc/asi.h
target/xtensa/core-dc232b/xtensa-modules.inc.c
target/xtensa/core-dc233c/xtensa-modules.inc.c
target/xtensa/core-de212/core-isa.h
target/xtensa/core-de212/xtensa-modules.inc.c
target/xtensa/core-fsf/xtensa-modules.inc.c
target/xtensa/core-sample_controller/core-isa.h
target/xtensa/core-sample_controller/xtensa-modules.inc.c
target/xtensa/core-test_kc705_be/core-isa.h
target/xtensa/core-test_kc705_be/xtensa-modules.inc.c
tests/tcg/cris/check_abs.c
tests/tcg/cris/check_addc.c
tests/tcg/cris/check_addcm.c
tests/tcg/cris/check_addoq.c
tests/tcg/cris/check_bound.c
tests/tcg/cris/check_ftag.c
tests/tcg/cris/check_int64.c
tests/tcg/cris/check_lz.c
tests/tcg/cris/check_openpf5.c
tests/tcg/cris/check_sigalrm.c
tests/tcg/cris/crisutils.h
tests/tcg/cris/sys.c
tests/tcg/i386/test-i386-ssse3.c
ui/vgafont.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20181213223737.11793-3-pbonzini@redhat.com>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Stefan Markovic <smarkovic@wavecomp.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds cleanup counterparts to pci_register_root_bus(),
pci_root_bus_new(), and pci_bus_irqs().
These cleanup routines are needed in the case of hotpluggable
PCIHostBridge implementations. Currently we can rely on the
object_unparent()'ing of the PCIHostState recursively unparenting
and cleaning up it's child buses, but we need explicit calls
to also:
1) remove the PCIHostState from pci_host_bridges global list.
otherwise, we risk accessing freed memory when we access
the list later
2) clean up memory allocated in pci_bus_irqs()
Both are handled outside the context of any particular bus or
host bridge's init/realize functions, making it difficult to
avoid the need for explicit cleanup functions without remodeling
how PCIHostBridges are created. So keep it simple and just add
them for now.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A conventional PCI bus does not support config space accesses above
the standard 256 byte configuration space. PCIe-to-PCI bridges are
not permitted to forward transactions if the extended register address
field is non-zero and must handle it as an unsupported request (PCIe
bridge spec rev 1.0, 4.1.3, 4.1.4). Therefore, we should not support
extended config space if there is a conventional bus anywhere on the
path to a device.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce and use the "unplug" callback.
This is a preparation for multi-stage hotplug handlers, whereby the bus
hotplug handler is overwritten by the machine hotplug handler. This handler
will then pass control to the bus hotplug handler. So to get this running
cleanly, we also have to make sure to go via the hotplug handler chain when
actually unplugging a device after an unplug request. Lookup the hotplug
handler and call "unplug".
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce and use the "unplug" callback.
This is a preparation for multi-stage hotplug handlers, whereby the bus
hotplug handler is overwritten by the machine hotplug handler. This handler
will then pass control to the bus hotplug handler. So to get this running
cleanly, we also have to make sure to go via the hotplug handler chain when
actually unplugging a device after an unplug request. Lookup the hotplug
handler and call "unplug".
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The callbacks are also called for cold plugged devices. Drop the "hot"
to better match the actual callback names.
While at it, also rename shpc_device_hotplug_common() to
shpc_device_plug_common().
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The callbacks are also called for cold plugged devices. Drop the "hot"
to better match the actual callback names.
While at it, also rename pcie_cap_slot_hotplug_common() to
pcie_cap_slot_plug_common().
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make use of the PCIESlot speed and width fields to update link
information beyond those configured in pcie_cap_v1_fill(). This is
only called for devices supporting a version 2 capability and
automatically skips any non-PCIESlot devices. Only devices with
increased link values generate any visible config space differences.
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The PCIe link speed and width between a downstream device and its
upstream port is negotiated on real hardware and susceptible to
dynamic changes due to signal issues and power management. In the
emulated device case there is no real hardware link, but we still
might wish to have some consistency between endpoint and downstream
port via a virtual negotiation. There is of course a real link for
assigned devices and this same virtual negotiation allows the
downstream port to match the endpoint, synchronizing on every read
to support underlying physical hardware dynamically adjusting the
link.
This negotiation is intentionally unidirectional for compatibility.
If the endpoint exceeds the capabilities of the downstream port or
there is no endpoint device, the downstream port reports negotiation
to its maximum speed and width, matching the previous case where
negotiation was absent. De-tuning the endpoint to match a virtual
link doesn't seem to benefit anyone and is a condition we've thus
far reported without functional issues.
Note that PCI_EXP_LNKSTA is already ignored for migration
compatibility via pcie_cap_v1_fill().
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In preparation for reporting higher virtual link speeds and widths,
create enums and macros to help us manage them.
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When loadvm'ing a *running* snapshot qemu crashes due to an invalid
free. It's fortunately caught early by glibc heap memory corruption
protection and qemu gets killed with SIGABRT.
Steps to reproduce:
1) Create VM (e.g w/ virsh define)
2) Start the VM and take a snapshot while it's running and having a
PCI bridge attached
3) Destroy the VM and revert the running snapshot.
This commit fixes the issue.
Signed-off-by: Matthias Weckbecker <matthias@weckbecker.name>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When VM boots from the latest version of linux kernel, after
hot-unpluging virtio-blk disks which are hotplugged into
pcie-root-port, the VM's dmesg log shows:
[ 151.046242] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0001 from Slot Status
[ 151.046365] pciehp 0000:00:05.0:pcie004: Slot(0-3): Attention button pressed
[ 151.046369] pciehp 0000:00:05.0:pcie004: Slot(0-3): Powering off due to button press
[ 151.046420] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 151.046425] pciehp 0000:00:05.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 151.046464] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 151.046468] pciehp 0000:00:05.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[ 156.163421] pciehp 0000:00:05.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 2f1
[ 156.163427] pciehp 0000:00:05.0:pcie004: pciehp_unconfigure_device: domain🚌dev = 0000:06:00
[ 156.198736] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 156.198772] pciehp 0000:00:05.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[ 157.224124] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0018 from Slot Status
[ 157.224194] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[ 157.224220] pciehp 0000:00:05.0:pcie004: pciehp_check_link_active: lnk_status = 2011
[ 157.224223] pciehp 0000:00:05.0:pcie004: Slot(0-3): Link Up
[ 157.224233] pciehp 0000:00:05.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 7f1
[ 157.224281] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 157.224285] pciehp 0000:00:05.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
[ 157.224300] pciehp 0000:00:05.0:pcie004: __pciehp_link_set: lnk_ctrl = 0
[ 157.224336] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 157.224339] pciehp 0000:00:05.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 159.739294] pci 0000:06:00.0 id reading try 50 times with interval 20 ms to get ffffffff
[ 159.739315] pciehp 0000:00:05.0:pcie004: pciehp_check_link_status: lnk_status = 2011
[ 159.739318] pciehp 0000:00:05.0:pcie004: Failed to check link status
[ 159.739371] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 159.739394] pciehp 0000:00:05.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[ 160.771426] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 160.771452] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[ 160.771495] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 160.771499] pciehp 0000:00:05.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
[ 160.771535] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 160.771539] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
After analyzing the log information, it seems that qemu doesn't
change the Link Status from active to inactive after hot-unplug.
This results in the abnormal log after the linux kernel commit
d331710ea78fea merged.
Furthermore, If I hotplug the same virtio-blk disk after hot-unplug,
the virtio-blk would turn on and then back off.
So this patch set the Link Status inactive after hot-unplug and
active after hot-plug.
Signed-off-by: Zheng Xiang <zhengxiang9@huawei.com>
Signed-off-by: Zheng Xiang <xiang.zheng@linaro.org>
Cc: Wang Haibin <wanghaibin.wang@huawei.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The load_image() function is deprecated, as it does not let the
caller specify how large the buffer to read the file into is.
Instead use load_image_size().
While we are converting this code, add an error-check
for read failure.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20181130151712.2312-5-peter.maydell@linaro.org
Because they are supposed to remain const.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181114132931.22624-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to PCI specification, subsystem id and subsystem vendor id
are present only in type 0 and type 2 headers (at different offsets),
but not in type 1 headers.
Thus we should make this data optional in struct PciDeviceId and skip
reporting them via HMP if the information is not available.
Additional (wrong information) about PCI bridges (Type1 devices) has been
added in 5383a705 and fortunately not released. This patch fixes that
problem. The problem was spotted by Markus.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Message-Id: <20181002135538.12113-1-den@openvz.org>
Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This is a long story. Red Hat has relicensed Windows KVM device drivers
in 2018 and there was an agreement that to avoid WHQL driver conflict
software manufacturers should set proper PCI subsystem vendor ID in
their distributions. Thus PCI subsystem vendor id becomes actively used.
The problem is that this field is applied by us via hardware compats.
Thus technically it could be lost.
This patch adds PCI susbsystem id and vendor id to exportable parameters
for validation.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180918095852.28422-1-den@openvz.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Factor "bus_reserve", "io_reserve", "mem_reserve", "pref32_reserve"
and "pref64_reserve" fields of the "GenPCIERootPort" structure out
to "PCIResReserve" structure, so that other PCI bridges can
reuse it to add resource reserve capability.
Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All PCI devices are now QOM'ified.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Remove the hard-coded list of PCI NIC names; instead, fill an array
using all PCI devices listed under DEVICE_CATEGORY_NETWORK. Keep
the old shortcut "virtio" for virtio-net-pci.
Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The previous commit improved compile time by including less of the
generated QAPI headers. This is impossible for stuff defined directly
in qapi-schema.json, because that ends up in headers that that pull in
everything.
Move everything but include directives from qapi-schema.json to new
sub-module qapi/misc.json, then include just the "misc" shard where
possible.
It's possible everywhere, except:
* monitor.c needs qmp-command.h to get qmp_init_marshal()
* monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need
qapi-event.h to get enum QAPIEvent
Perhaps we'll get rid of those some other day.
Adding a type to qapi/migration.json now recompiles some 120 instead
of 2300 out of 5100 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-25-armbru@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
- new stats in virtio balloon
- virtio eventfd rework for boot speedup
- vhost memory rework for boot speedup
- fixes and cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJagxKDAAoJECgfDbjSjVRp5qAH/3gmgBaIzL3KRHd5i0RZifJv
PvyAVYgZd7h0+/1r9GM7guHKyEPZ08JtbHSm/HuDV4BD/Vf3/8joy8roExIfde2A
6k8fd6ANVQmE3t5zUxNXi9qiG4pO4xDIu4cMAbixzgN9x5ttlcfTw7fTT0e0VJxJ
8SN02/uCPPR/DY4/cpjah+slSyv6rBKT1v1ONy7djyRTYHi6h3Meoh05YfEALkwA
goxTKBZHi0L1IZ3HP/ZpXJDohQ5n2P09DX0fQgb8PgmW6WIWB/Qpi5pD53LZpMCV
n9waTF0U0ahneFd2FHo22QMMrwWvQyrjv+w5uXVr+qmHb/OyH2tUt7PgGF9+QKA=
=78s5
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio,vhost,pci,pc: features, fixes and cleanups
- new stats in virtio balloon
- virtio eventfd rework for boot speedup
- vhost memory rework for boot speedup
- fixes and cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 13 Feb 2018 16:29:55 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (22 commits)
virtio-balloon: include statistics of disk/file caches
acpi-test: update FADT
lpc: drop pcie host dependency
tests: acpi: fix FADT not being compared to reference table
hw/pci-bridge: fix pcie root port's IO hints capability
libvhost-user: Support across-memory-boundary access
libvhost-user: Fix resource leak
virtio-balloon: unref the memory region before continuing
pci: removed the is_express field since a uniform interface was inserted
virtio-blk: enable multiple vectors when using multiple I/O queues
pci/bus: let it has higher migration priority
pci-bridge/i82801b11: clear bridge registers on platform reset
vhost: Move log_dirty check
vhost: Merge and delete unused callbacks
vhost: Clean out old vhost_set_memory and friends
vhost: Regenerate region list from changed sections list
vhost: Merge sections added to temporary list
vhost: Simplify ring verification checks
vhost: Build temporary section list and deref after commit
virtio: improve virtio devices initialization time
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The gen_pcie_root_port mem-reserve and pref32-reserve properties are
defined as size (so uint64_t), but passed as uint32_t when building
the 'IO hints' vendor specific capability.
Passing 4G (or more) gets truncated and passed as a zero reservation.
Is not a huge issue since the guest firmware will always compare the
hints with the default value and take the maximum.
Fix it by passing the values as uint64_t and failing to init the
gen_pcie_root_port id invalid values are used.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-19-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/qmp/qdict.h
drop from 4550 (out of 4743) to 368 in my "build everything" tree.
For qapi/qmp/qobject.h, the number drops from 4552 to 390.
While there, separate #include from file comment with a blank line.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-13-armbru@redhat.com>
qapi/qmp/types.h is a convenience header to include a number of
qapi/qmp/ headers. Since we rarely need all of the headers
qapi/qmp/types.h includes, we bypass it most of the time. Most of the
places that use it don't need all the headers, either.
Include the necessary headers directly, and drop qapi/qmp/types.h.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-9-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.
While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
according to Eduardo Habkost's commit fd3b02c889 all PCIEs now implement
INTERFACE_PCIE_DEVICE so we don't need is_express field anymore.
Devices that implements only INTERFACE_PCIE_DEVICE (is_express == 1)
or
devices that implements only INTERFACE_CONVENTIONAL_PCI_DEVICE (is_express == 0)
where not affected by the change.
The only devices that were affected are those that are hybrid and also
had (is_express == 1) - therefor only:
- hw/vfio/pci.c
- hw/usb/hcd-xhci.c
- hw/xen/xen_pt.c
For those 3 I made sure that QEMU_PCI_CAP_EXPRESS is on in instance_init()
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Yoni Bettan <ybettan@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues where manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines where then manually tweaked to pass checkpatch.
A trailing '.' was removed in hw/pci/pci.c
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Conversions that aren't followed by exit() dropped, because they might
be inappropriate.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-9-armbru@redhat.com>
This function should be declared in generic header file so we can
utilize it.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_find_primary_bus() only has one user, in pc_xen_hvm_init(). That's
inside the machine construction code, so it already has easy access to the
machine's primary PCI bus.
Get it directly, and thereby remove pci_find_primary_bus(). This removes
one of only a handful of users of the ugly pci_host_bridges global.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
The bus pointer in PCIDevice is basically redundant with QOM information.
It's always initialized to the qdev_get_parent_bus(), the only difference
is the type.
Therefore this patch eliminates the field, instead creating a pci_get_bus()
helper to do the type mangling to derive it conveniently from the QOM
Device object underneath.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
A fair proportion of the users of pci_bus_num() want to get the bus
number on a specific device, so first have to look up the bus from the
device then call it. This adds a helper to do that (since we're going
to make looking up the bus slightly more verbose).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
pci_bus_init(), pci_bus_new_inplace(), pci_bus_new() and pci_register_bus()
are misleadingly named. They're not used for initializing *any* PCI bus,
but only for a root PCI bus.
Non-root buses - i.e. ones under a logical PCI to PCI bridge - are instead
created with a direct qbus_create_inplace() (see pci_bridge_initfn()).
This patch renames the functions to make it clear they're only used for
a root bus.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
This moves pci_dev->name initialization earlier so
pci_dev->bus_master_as could get a name instead of an empty string.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make sure we don't forget to add the Conventional PCI or PCI
Express interface names on PCI device classes in the future.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Revieed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Those two interfaces will be used to indicate which device types
support Conventional PCI or PCI Express buses. Management
software will be able to use the qom-list-types QMP command to
query that information.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCIe busses are always little endian, so set the endianness of the
memory region to little endian rather than native such that operations
work as expected on big endian targets.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Whilst the underlying PCI bridge implementation supports 32-bit PCI IO
accesses, unfortunately they are truncated at the legacy 64K limit.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds a simplistic emulation of the Sun GEM ethernet controller
found in Apple ASICs.
Currently we only support the Apple UniNorth 1.x variant, but the
other Apple or Sun variants should mostly be a matter of adding
PCI IDs options.
We have a very primitive emulation of a single Broadcom 5201 PHY
which is supported by the MacOS driver.
This model brings out-of-the-box networking to MacOS 9, and all
versions of OS X I tried with the mac99 platform.
Further improvements from Mark:
- Remove sungem.h file, moving constants into sungem.c as required
- Switch to using tracepoints for debugging
- Split register blocks into separate memory regions
- Use arrays in SunGEMState to hold register values
- Add state-saving support
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add a new slot_reserved_mask bitmask to PCIBus indicating whether or not each
PCI slot on the bus is reserved. Ensure that it is initialised to zero to
maintain the existing behaviour that all slots are available by default, and
add the additional check with appropriate error reporting to
do_pci_register_device().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Also touch up the logic in do_pci_register_device() accordingly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On PCI init PCI bridges may need some extra info about bus number,
IO, memory and prefetchable memory to reserve. QEMU can provide this
with a special vendor-specific PCI capability.
Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
iQIcBAABAgAGBQJZp+UNAAoJENro4Ql1lpzlL3AP/3gyYuAt4vR9FzeDx64XfPzB
x31p50TadRXRIrb5mmN69dXZbg0pmnk68m0HEeSXBl0wh+gQVVPL2xfaMow2UhIw
jd0v9IxkR8PH9ruEso3fJH1RbNGy9aRUlgCYQdGo3Y4W3IZhOsSOKwdmrU46rohy
Bq+RzEL0sWH5I6v+ylFJXktNrVY6n1P1epWY5BnldDm58+l727z/H1rnHPA3t6sL
FHoCmDypimXE4bOEXUQ9y30z1KGYlSmVE9Jm9ABGakcnK3LK0nZl758/DEJDZg02
Ma+TJT3lnwqbLWPIanikeAiP6pf2NkYVhaJN42rqrYhFbOsl6ge2yzHxK83dzju+
3b+Rk9yO932nQLwPTFGA1VGupAUqBtdDIMfZy8RpVD1anA83xgphBP2xPJh0Jsnj
SAFinRdl1XFFVERoTLpMUqJWujp2mBsR14Ljw9dnF0HEfvr2jLkEyTwb6LwHyInx
pAT06s9grsv0wlvaH+fZK5P1KviHr8TjX56qQM0YuGYr8LzvWAbd3mPor7c0EtR6
pr2GhbKQIhCq/foRD9nWMDlmUCWmJBjaCk++XUnmwFr61eegLku0jpRiClwFwPI3
I9dNfiJWrQFdtLFi2xi6A/ibtmCE9JS4lAZYw3ZVGnW8ulx0C2qev5HrgkcDtgq+
vmNfitmbOSG5ZvBn+3eC
=jCiK
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/elmarco/tags/tidy-pull-request' into staging
# gpg: Signature made Thu 31 Aug 2017 11:29:33 BST
# gpg: using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/tidy-pull-request: (29 commits)
eepro100: replace g_malloc()+memcpy() with g_memdup()
test-iov: replace g_malloc()+memcpy() with g_memdup()
i386: replace g_malloc()+memcpy() with g_memdup()
i386: introduce ELF_NOTE_SIZE macro
decnumber: use DIV_ROUND_UP
kvm: use DIV_ROUND_UP
i386/dump: use DIV_ROUND_UP
ppc: use DIV_ROUND_UP
msix: use DIV_ROUND_UP
usb-hub: use DIV_ROUND_UP
q35: use DIV_ROUND_UP
piix: use DIV_ROUND_UP
virtio-serial: use DIV_ROUND_UP
console: use DIV_ROUND_UP
monitor: use DIV_ROUND_UP
virtio-gpu: use DIV_ROUND_UP
vga: use DIV_ROUND_UP
ui: use DIV_ROUND_UP
vnc: use DIV_ROUND_UP
vvfat: use DIV_ROUND_UP
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
I used the clang-tidy qemu-round check to generate the fix:
https://github.com/elmarco/clang-tools-extra
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The msi routing code in kvm calls some pci functions: provide
some stubs to enable builds without pci.
Also, to make this more obvious, guard them via a pci_available boolean
(which also can be reused in other places).
Fixes: e1d4fb2de ("kvm-irqchip: x86: add msi route notify fn")
Fixes: 767a554a0 ("kvm-all: Pass requester ID to MSI routing functions")
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
In trace format '#' flag of printf is forbidden. Fix it to '0x%'.
This patch is created by the following:
check that we have a problem
> find . -name trace-events | xargs grep '%#' | wc -l
56
check that there are no cases with additional printf flags before '#'
> find . -name trace-events | xargs grep "%[-+ 0'I]+#" | wc -l
0
check that there are no wrong usage of '#' and '0x' together
> find . -name trace-events | xargs grep '0x%#' | wc -l
0
fix the problem
> find . -name trace-events | xargs sed -i 's/%#/0x%/g'
[Eric Blake noted that xargs grep '%[-+ 0'I]+#' should be xargs grep
"%[-+ 0'I]+#" instead so the shell quoting is correct.
--Stefan]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170731160135.12101-3-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
With the move of some docs/ to docs/devel/ on ac06724a71,
no references were updated.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since we pass the same DeviceState object to
memory_region_init_rom_nomigrate() and vmstate_register_ram(), we can
switch to using memory_region_init_rom() instead.
(This isn't entirely obvious from the code since it is using
&pdev->qdev rather than DEVICE(pdov) for some reason, but
PCIDevice does indeed use 'qdev' for its parent DeviceState member.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1499438577-7674-10-git-send-email-peter.maydell@linaro.org
Rename memory_region_init_rom() to memory_region_init_rom_nomigrate()
and memory_region_init_rom_device() to
memory_region_init_rom_device_nomigrate().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1499438577-7674-5-git-send-email-peter.maydell@linaro.org
The kludged field 'msi_nonbroken' is declared in "hw/pci/msi.h" and defined in
hw/pci/msi.c.
When using an ARM config with CONFIG_PCI disabled, hw/pci/msi.c is not included.
Without being PCI-related, the files hw/intc/arm_gicv[23*].c do access this
field (to enable the kludge if PCI is enabled).
The final link fails since hw/pci/msi.c is not included.
Defining this field in pci-stub is safe enough for configs without CONFIG_PCI.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In order to propagate error message better, convert shpc_init() to
Error also convert the pci_bridge_dev_initfn() to realize.
Cc: mst@redhat.com
Cc: marcel@redhat.com
Cc: armbru@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Convert i82801b11, io3130_upstream, io3130_downstream and
pcie_root_port devices to realize.
Cc: mst@redhat.com
Cc: marcel@redhat.com
Cc: armbru@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
After the patch 'Make errp the last parameter of pci_add_capability()',
pci_add_capability() and pci_add_capability2() now do exactly the same.
So drop the wrapper pci_add_capability() of pci_add_capability2(), then
replace the pci_add_capability2() with pci_add_capability() everywhere.
Cc: pbonzini@redhat.com
Cc: rth@twiddle.net
Cc: ehabkost@redhat.com
Cc: mst@redhat.com
Cc: dmitry@daynix.com
Cc: jasowang@redhat.com
Cc: marcel@redhat.com
Cc: alex.williamson@redhat.com
Cc: armbru@redhat.com
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Comments for pci_add_capability2() to explain the return
value. This may help to make a correct return value check
for its callers.
Cc: mst@redhat.com
Cc: marcel@redhat.com
Cc: armbru@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On success, pci_add_capability2() returns a positive value. On
failure, it sets an error and return a negative value.
pci_add_capability() laboriously checks this behavior. No other
caller does. Drop the checks from pci_add_capability().
Cc: mst@redhat.com
Cc: marcel@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In some cases a failing VMSTATE_*_EQUAL does not mean we detected a bug,
but it's actually the best we can do. Especially in these cases a verbose
error message is required.
Let's introduce infrastructure for specifying a error hint to be used if
equal check fails. Let's do this by adding a parameter to the _EQUAL
macros called _err_hint. Also change all current users to pass NULL as
last parameter so nothing changes for them.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20170623144823.42936-1-pasic@linux.vnet.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Meanwhile, abstract a function to detect msix masked bit.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494309644-18743-3-git-send-email-peterx@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a pci device is not reset by VM (by writing into config space)
and unplugged by VM, after that when VM reboots, qemu may assert:
pcibus_reset: Assertion `bus->irq_count[i] == 0' failed
Cc: qemu-stable@nongnu.org
Signed-off-by: herongguang <herongguang.he@huawei.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
No one outside of pcie_aer.h was using error injection; mark them
static for internal use.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-3-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
It's simpler to just use a C struct than it is to bundle things
into a QDict in one function just to pull them back out in the
caller. Plus, doing this gets rid of one more user of dynamic
JSON through qobject_from_jsonf(), as well as a memory leak of
the QDict.
While cleaning the code, fix things to report all errors (the
code was previously silently ignoring a failure of
pcie_aer_inject_error(), at a distance).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-2-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Normally pci_init_bus_master() would be called either via
bus->machine_done.notify or directly from do_pci_register_device().
However if a device's realize() failed, pci_init_bus_master() is not
called, and do_pci_unregister_device() fails on
memory_region_del_subregion() as it was not mapped.
This adds a check that subregion was mapped before unmapping it.
Fixes: c53598ed18 ("pci: Add missing drop of bus master AS reference")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: John Snow <jsnow@redhat.com>
The recent introduction of a bus master container added
memory_region_add_subregion() into the PCI device registering path but
missed memory_region_del_subregion() in the unregistering path leaving
a reference to the root memory region of the new container.
This adds missing memory_region_del_subregion().
Fixes: 3716d5902d ("pci: introduce a bus master container")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Make several Link Control Register flags writable to conform
with the PCI Express spec.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Absence of any Extended Capabilities is required to be
indicated by an Extended Capability header with a Capability ID of
0000h, a Capability Version of 0h, and a Next Capability Offset of 000h.
Instead of inserting a 'NULL' capability is simpler to mark the start
of the Extended Configuration Space as read-only to achieve the same
behaviour.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
96a8821d21 ("virtio: unbreak virtio-pci with IOMMU after caching ring
translations") tries to make IOMMU works with virtio memory region
cache, but it requires IOMMU to be created before any virtio
devices. This is sub optimal, fixing this by introduce a bus master
container to make sure address space can be initialized during device
registering, and then we can safely set alias and make
bus_master_enable_region as its subregion during bus master
initialization.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since commit 1d2d974244 "spapr_pci: enumerate and add PCI device tree", QEMU
populates the PCI device tree in the opposite order compared to SLOF.
Before 1d2d974244:
Populating /pci@800000020000000
00 0000 (D) : 1af4 1000 virtio [ net ]
00 0800 (D) : 1af4 1001 virtio [ block ]
00 1000 (D) : 1af4 1009 virtio [ network ]
Populating /pci@800000020000000/unknown-legacy-device@2
7e5294b8 : /pci@800000020000000
7e52b998 : |-- ethernet@0
7e52c0c8 : |-- scsi@1
7e52c7e8 : +-- unknown-legacy-device@2 ok
Since 1d2d974244:
Populating /pci@800000020000000
00 1000 (D) : 1af4 1009 virtio [ network ]
Populating /pci@800000020000000/unknown-legacy-device@2
00 0800 (D) : 1af4 1001 virtio [ block ]
00 0000 (D) : 1af4 1000 virtio [ net ]
7e5e8118 : /pci@800000020000000
7e5ea6a0 : |-- unknown-legacy-device@2
7e5eadb8 : |-- scsi@1
7e5eb4d8 : +-- ethernet@0 ok
This behaviour change is not actually a bug since no assumptions should be
made on DT ordering. But it has no real justification either, other than
being the consequence of the way fdt_add_subnode() inserts new elements
to the front of the FDT rather than adding them to the tail.
This patch reverts to the historical SLOF ordering by walking PCI devices
in reverse order. This reconciles pseries with x86 machine types behavior.
It is expected to make things easier when porting existing applications to
power.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
(slight update to the changelog)
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qobject_to_qdict(obj) returns NULL when obj isn't a QDict. Check
that instead of qobject_type(obj) == QTYPE_QDICT.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1487363905-9480-8-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When we add PCIe extended capabilities, we should be following the rule
that we add the head extended cap (at offset 0x100) first, then the rest
of them. Meanwhile, we are always adding new capability bits at the end
of the list. Here the "next" looks meaningless in all cases since it
should always be zero (along with the "header").
Simplify the function a bit, and it looks more readable now.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
VFIO actually wants to create a capability with ID == 0.
This is done to make guest drivers skip the given capability.
pcie_add_capability then trips up on this capability
when looking for end of capability list.
To support this use-case, it's easy enough to switch to
e.g. 0xffffffff for these comparisons - we can be sure
it will never match a 16-bit capability ID.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
msix_init() reports errors with error_report(), which is wrong when
it's used in realize(). The same issue was fixed for msi_init() in
commit 1108b2f. In order to make the API change as small as possible,
leave the return value check to later patch.
For some devices(like e1000e, vmxnet3, nvme) who won't fail because of
msix_init's failure, suppress the error report by passing NULL error
object.
Bonus: add comment for msix_init.
CC: Jiri Pirko <jiri@resnulli.us>
CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Dmitry Fleytman <dmitry@daynix.com>
CC: Jason Wang <jasowang@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Hannes Reinecke <hare@suse.de>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Looks like we didn't mark PCI ROMs as RO allowing
mischief such as guests writing there.
Further, e.g. vhost gets confused trying to allocate
enough space to log writes there. Fix it up.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
The vmstate_pci_device and vmstate_pcie_devices differ
just in the size of one buffer; combine the two using a _TEST
macro.
I think this is safe as long as everywhere which currently
uses either of these two uses the right type.
One thing that concerns me is that some places use pci_device_load/save
which does some irq mangling, but others just use the VMSTATE_PCI_DEVICE
macro - how are they getting the same irq mangling?
This passes a smoke test migrate of:
./x86_64-softmmu/qemu-system-x86_64 -M pc,accel=kvm -m 1024
./littlefed20.img -device e1000e -device virtio-net -device
e1000 -device virtio-rng -device megasas -device megasas-gen2 -device
ioh3420 -device nec-usb-xhci
to an unmodified qemu.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20161214195829.18241-1-dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Current migration code cannot handle some data structures such as
QTAILQ in qemu/queue.h. Here we extend the signatures of put/get
in VMStateInfo so that customized handling is supported. put now
will return int type.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-2-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
object_property_set_bool(OBJECT(dev), true, "realized", &err) in
pci_nic_init_nofail may release the object if device fails to
initialize which leads to use-after-free in error handling block.
qdev_init_nofail does the same thing while holding the reference.
(gdb) run -net nic
qemu-system-x86_64: failed to find romfile "efi-e1000.rom"
Program received signal SIGSEGV, Segmentation fault.
object_unparent (obj=0x7fffe96a0010) at qom/object.c:440
440 in qom/object.c
(gdb) bt
<nd_table>, rootbus=0x5555567ed990, default_model=<optimized out>,
default_devaddr=<optimized out>) at hw/pci/pci.c:1812
pci_bus=0x5555567ed990) at hw/i386/pc.c:1634
pci_type=0x555555c1a523 "i440FX", host_type=0x555555ba564e
"i440FX-pcihost") at hw/i386/pc_piix.c:241
out>, envp=<optimized out>) at vl.c:4481
Signed-off-by: Alex Kompel <barbos@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now, AER capa version is fixed to v2, if assigned device isn't v2,
then this value will be inconsistent between guest and host
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When user specify invalid value for property aer_log_max, device should
fail to create, and report appropriate message.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patches enable the Address Translation Service support for virtio
pci devices. This is needed for a guest visible Device IOTLB
implementation and will be required by vhost device IOTLB API
implementation for intel IOMMU.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCI Express downstream slot has a single PCI slot
behind it, using PCI_DEVFN(PCI_SLOT(devfn), 0)
does not give you function 0 in cases such as ARI
as well as some error cases.
This is exactly what we are hitting:
$ qemu-system-x86_64 -machine q35 -readconfig docs/q35-chipset.cfg
-monitor stdio
(qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=00
(qemu) device_add e1000e,bus=ich9-pcie-port-4,addr=08
Segmentation fault (core dumped)
The fix is to use the pci_get_function_0 API.
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Tested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
We changed link status register in pci express endpoint capability
over time. Specifically,
commit b2101eae63 ("pcie: Set the "link
active" in the link status register") set data link layer link active
bit in this register without adding compatibility to old machine types.
When migrating from qemu 2.3 and older this affects xhci devices which
under machine type 2.0 and older have a pci express endpoint capability
even if they are on a pci bus.
Add compatibility flags to make this bit value match what it was under
2.3.
Additionally, to avoid breaking migration from qemu 2.3 and up,
suppress checking link status during migration: this seems sane
since hardware can change link status at any time.
https://bugzilla.redhat.com/show_bug.cgi?id=1352860
Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Fixes: b2101eae63
("pcie: Set the "link active" in the link status register")
Cc: qemu-stable@nongnu.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
One more IEC notifier is added to let msi routes know about the IEC
changes. When interrupt invalidation happens, all registered msi routes
will be updated for all PCI devices.
Since both vfio and vhost are possible gsi route consumers, this patch
will go one step further to keep them safe in split irqchip mode and
when irqfd is enabled.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[move trace-events lines into target-i386/trace-events]
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
'qjson.h' is not a QObject subtype; include this file directly in
.c files that are using it, rather than abusing qmp/types.h for
that purpose.
Meanwhile, for files that include a list of individual QObject
subtypes, it's easier to just use qmp/types.h for that purpose.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-2-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
msi_init() reports errors with error_report(), which is wrong
when it's used in realize().
Fix by converting it to Error.
Fix its callers to handle failure instead of ignoring it.
For those callers who don't handle the failure, it might happen:
when user want msi on, but he doesn't get what he want because of
msi_init fails silently.
cc: Gerd Hoffmann <kraxel@redhat.com>
cc: John Snow <jsnow@redhat.com>
cc: Dmitry Fleytman <dmitry@daynix.com>
cc: Jason Wang <jasowang@redhat.com>
cc: Michael S. Tsirkin <mst@redhat.com>
cc: Hannes Reinecke <hare@suse.de>
cc: Paolo Bonzini <pbonzini@redhat.com>
cc: Alex Williamson <alex.williamson@redhat.com>
cc: Markus Armbruster <armbru@redhat.com>
cc: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Users of struct Range mess liberally with its members, which makes
refactoring hard. Create a set of methods, and convert all users to
call them instead of accessing members. The methods have carefully
worded contracts, and use assertions to check them.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
place relevant code tegother, make the code easier to read
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Skip bus_master_enable region creation on PCI device init
in order to be sure the IOMMU device (if present) would
be created in advance. Add this memory region at machine_done time.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move all trace-events for files in the hw/pci/ directory to
their own file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-28-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
No caller use its return value as msi capability offset, also in order
to make its return behaviour consistent with msix_init().
cc: Michael S. Tsirkin <mst@redhat.com>
cc: Paolo Bonzini <pbonzini@redhat.com>
cc: Hannes Reinecke <hare@suse.de>
cc: Markus Armbruster <armbru@redhat.com>
cc: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It has:
1. More newlines make the code block well separated.
2. Add more comments for msi_init.
3. Fix a indentation in vmxnet3.c.
4. ioh3420 & xio3130_downstream: put PCI Express capability init function
together, make it more readable.
cc: Michael S. Tsirkin <mst@redhat.com>
cc: Markus Armbruster <armbru@redhat.com>
cc: Marcel Apfelbaum <marcel@redhat.com>
cc: Dmitry Fleytman <dmitry@daynix.com>
cc: Jason Wang <jasowang@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ENOSPC is programming error, assert it for debugging.
cc: Michael S. Tsirkin <mst@redhat.com>
cc: Marcel Apfelbaum <marcel@redhat.com>
cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This fix SID verification failure when IOMMU IR is enabled with PCI
bridges. Existing pci_requester_id() is more like getting BDF info
only. Renaming it to pci_get_bdf(). Meanwhile, we provide the correct
implementation to get requester ID. VT-d spec 5.1.1 is a good reference
to go, though it talks only about interrupt delivery, the rule works
exactly the same for non-interrupt cases.
Currently, there are three use cases for pci_requester_id():
- PCIX status bits: here we need BDF only, not requester ID. Replacing
with pci_get_bdf().
- PCIe Error injection and MSI delivery: for both these cases, we are
looking for requester IDs. Here we should use the new impl.
To avoid a PCI walk every time we send MSI message, one requester_id
cache field is added to PCIDevice to cache the result when initialize
PCI device.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Added support for PCIe CAP v1, while reusing some of the existing v2
infrastructure.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This function will be used by e1000e device code.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)
Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
remove unused param
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
remove unused param, and rename the other to a meaningful one.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
For vfio device, we need to propagate the aer error to
Guest OS. we use the pcie_aer_msg() to send aer error
to guest.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
pcie_aer_init was used to emulate an aer capability for pcie device,
but for vfio device, the aer config space size is mutable and is not
always equal to PCI_ERR_SIZEOF(0x48). it depends on where the TLP Prefix
register required, so here we add a size argument.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Device's Offset and size can reach PCIE_CONFIG_SPACE_SIZE,
fix the corresponding assert.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Since it can`t fail. Also modify the callers.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
commit 428c3ece97 ("fix MSI injection on Xen")
inadvertently enabled the xen-specific logic unconditionally.
Limit it to only when xen is enabled.
Additionally, msix data should be read with pci_get_log
since the format is pci little-endian.
Reported-by: "Daniel P. Berrange" <berrange@redhat.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On Xen MSIs can be remapped into pirqs, which are a type of event
channels. It's mostly for the benefit of PCI passthrough devices, to
avoid the overhead of interacting with the emulated lapic.
However remapping interrupts and MSIs is also supported for emulated
devices, such as the e1000 and virtio-net.
When an interrupt or an MSI is remapped into a pirq, masking and
unmasking is done by masking and unmasking the event channel. The
masking bit on the PCI config space or MSI-X table should be ignored,
but it isn't at the moment.
As a consequence emulated devices which use MSI or MSI-X, such as
virtio-net, don't work properly (the guest doesn't receive any
notifications). The mechanism was working properly when xen_apic was
introduced, but I haven't narrowed down which commit in particular is
causing the regression.
Fix the issue by ignoring the masking bit for MSI and MSI-X which have
been remapped into pirqs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCI devices can't be plugged directly into PCI extra root bridges
because their resources can't be computed by firmware before the ACPI
tables are loaded.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-23-git-send-email-peter.maydell@linaro.org
Enable PCIe device multi-function hot-add, just ensure function 0 is added
last, then driver will get the notification to scan the slot.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In case user want to cancel the hot-add operation, should roll back,
device_del the added function that still don`t work.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qpci_msix_pending() writes on pba region, causing qemu to SEGV:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fba8c0 (LWP 25882)]
0x0000000000000000 in ?? ()
(gdb) bt
#0 0x0000000000000000 in ()
#1 0x00005555556556c5 in memory_region_oldmmio_write_accessor (mr=0x5555579f3f80, addr=0, value=0x7fffffffbf68, size=4, shift=0, mask=4294967295, attrs=...) at /home/elmarco/src/qemu/memory.c:434
#2 0x00005555556558e1 in access_with_adjusted_size (addr=0, value=0x7fffffffbf68, size=4, access_size_min=1, access_size_max=4, access=0x55555565563e <memory_region_oldmmio_write_accessor>, mr=0x5555579f3f80, attrs=...) at /home/elmarco/src/qemu/memory.c:506
#3 0x00005555556581eb in memory_region_dispatch_write (mr=0x5555579f3f80, addr=0, data=0, size=4, attrs=...) at /home/elmarco/src/qemu/memory.c:1176
#4 0x000055555560b6f9 in address_space_rw (as=0x555555eff4e0 <address_space_memory>, addr=3759147008, attrs=..., buf=0x7fffffffc1b0 "", len=4, is_write=true) at /home/elmarco/src/qemu/exec.c:2439
#5 0x000055555560baa2 in cpu_physical_memory_rw (addr=3759147008, buf=0x7fffffffc1b0 "", len=4, is_write=1) at /home/elmarco/src/qemu/exec.c:2534
#6 0x000055555564c005 in cpu_physical_memory_write (addr=3759147008, buf=0x7fffffffc1b0, len=4) at /home/elmarco/src/qemu/include/exec/cpu-common.h:80
#7 0x000055555564cd9c in qtest_process_command (chr=0x55555642b890, words=0x5555578de4b0) at /home/elmarco/src/qemu/qtest.c:378
#8 0x000055555564db77 in qtest_process_inbuf (chr=0x55555642b890, inbuf=0x55555641b340) at /home/elmarco/src/qemu/qtest.c:569
#9 0x000055555564dc07 in qtest_read (opaque=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", size=22) at /home/elmarco/src/qemu/qtest.c:581
#10 0x000055555574ce3e in qemu_chr_be_write (s=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", len=22) at qemu-char.c:306
#11 0x0000555555751263 in tcp_chr_read (chan=0x55555642bcf0, cond=G_IO_IN, opaque=0x55555642b890) at qemu-char.c:2876
#12 0x00007ffff64c9a8a in g_main_context_dispatch (context=0x55555641c400) at gmain.c:3122
(without this patch, this can be reproduced with the ivshmem qtest)
Implement an empty mmio write to avoid the crash.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
For GICv3 ITS implementation we are going to use requester IDs in KVM IRQ
routing code. This patch introduces reusable convenient way to obtain this
ID from the device pointer. The new function is now used in some places,
where the same calculation was used.
MemTxAttrs.stream_id also renamed to requester_id in order to better
reflect semantics of the field.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <5814bcb03a297f198e796b13ed9c35059c52f89b.1444916432.git.p.fedin@samsung.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Symptom:
$ qemu-system-x86_64 -m 10000000
Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456:
upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory
Aborted (core dumped)
Root cause: commit ef701d7 screwed up handling of out-of-memory
conditions. Before the commit, we report the error and exit(1), in
one place, ram_block_add(). The commit lifts the error handling up
the call chain some, to three places. Fine. Except it uses
&error_abort in these places, changing the behavior from exit(1) to
abort(), and thus undoing the work of commit 3922825 "exec: Don't
abort when we can't allocate guest memory".
The three places are:
* memory_region_init_ram()
Commit 4994653 (right after commit ef701d7) lifted the error
handling further, through memory_region_init_ram(), multiplying the
incorrect use of &error_abort. Later on, imitation of existing
(bad) code may have created more.
* memory_region_init_ram_ptr()
The &error_abort is still there.
* memory_region_init_rom_device()
Doesn't need fixing, because commit 33e0eb5 (soon after commit
ef701d7) lifted the error handling further, and in the process
changed it from &error_abort to passing it up the call chain.
Correct, because the callers are realize() methods.
Fix the error handling after memory_region_init_ram() with a
Coccinelle semantic patch:
@r@
expression mr, owner, name, size, err;
position p;
@@
memory_region_init_ram(mr, owner, name, size,
(
- &error_abort
+ &error_fatal
|
err@p
)
);
@script:python@
p << r.p;
@@
print "%s:%s:%s" % (p[0].file, p[0].line, p[0].column)
When the last argument is &error_abort, it gets replaced by
&error_fatal. This is the fix.
If the last argument is anything else, its position is reported. This
lets us check the fix is complete. Four positions get reported:
* ram_backend_memory_alloc()
Error is passed up the call chain, ultimately through
user_creatable_complete(). As far as I can tell, it's callers all
handle the error sanely.
* fsl_imx25_realize(), fsl_imx31_realize(), dp8393x_realize()
DeviceClass.realize() methods, errors handled sanely further up the
call chain.
We're good. Test case again behaves:
$ qemu-system-x86_64 -m 10000000
qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory
[Exit 1 ]
The next commits will repair the rest of commit ef701d7's damage.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1441983105-26376-3-git-send-email-armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
The spec says:
Undefined – The value read from this bit is
undefined. In previous versions of this
specification, this bit was used to indicate a Link
Training Error. System software must ignore the
value read from this bit. System software is
permitted to write any value to this bit.
Do not allow injecting it.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A number of files were including strings.h but not using any
of the functions it provides
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The current trace prototypes and (matching) trace calls lead to
"unorthodox" PCI BDF notation in at least the stderr trace backend. For
example, the four BARs of a QXL video card at 00:01.0 (bus 0, slot 1,
function 0) are traced like this (PID and timestamps removed):
pci_update_mappings_add d=0x7f14a73bf890 00:00.1 0,0x84000000+0x4000000
pci_update_mappings_add d=0x7f14a73bf890 00:00.1 1,0x80000000+0x4000000
pci_update_mappings_add d=0x7f14a73bf890 00:00.1 2,0x88200000+0x2000
pci_update_mappings_add d=0x7f14a73bf890 00:00.1 3,0xd060+0x20
The slot and function values are in reverse order.
Stick with the conventional BDF notation.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Don Koch <dkoch@verizon.com>
Cc: qemu-trivial@nongnu.org
Fixes: 7828d75045
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
he current code walks up the bus tree for an iommu, however it passes
to the iommu_fn() callback the bus/devfn of the immediate child of
the level where the callback was found, rather than the original
bus/devfn where the search started from.
This prevents iommu's like POWER8 (and in fact also Q35) to properly
provide an address space for a subset of devices that aren't immediate
children of the iommu.
PCIe carries the originator bdfn acccross to the iommu on all DMA
transactions, so we must be able to properly identify devices at all
levels.
This changes the function pci_device_iommu_address_space() to pass
the original pointers to the iommu_fn() callback instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A couple of places in hw/pci use an inline calculation to round a
size up to the next largest power of 2. We have a utility routine
for this, so use it.
(The behaviour of the old code is different if the size value
is 0 -- it would leave it as 0 rather than rounding up to 1,
but in both cases we know the size can't be 0.
In the case where the size value had bit 31 set, the old code
would invoke undefined behaviour; the new code will give a
result of 0. Presumably that could never happen either.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1437741192-20955-2-git-send-email-peter.maydell@linaro.org
Some kernels program a 0 address for io regions. PCI 3.0 spec
section 6.2.5.1 doesn't seem to disallow this.
based on patch by Michael Roth <mdroth@linux.vnet.ibm.com>
Add pci_allow_0_addr in MachineClass to conditionally
allow addr 0 for pseries, as this can break other architectures.
This patch allows to hotplug PCI card in pseries machine, as the first
added card BAR0 is always set to 0 address.
This as a temporary hack, waiting to fix PCI memory priorities for more
machine types...
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some firmwares can test that and assume the device hasn't come
up if that bit isn't set
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When loading migration fails due to a disagreement about
PCI config data we don't currently get any errors explaining
that was the cause of the problem or which byte in the config
data was at fault.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In particular, don't include it into headers.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
These macros expand into error class enumeration constant, comma,
string. Unclean. Has been that way since commit 13f59ae.
The error class is always ERROR_CLASS_GENERIC_ERROR since the previous
commit.
Clean up as follows:
* Prepend every use of a QERR_ macro by ERROR_CLASS_GENERIC_ERROR, and
delete it from the QERR_ macro. No change after preprocessing.
* Rewrite error_set(ERROR_CLASS_GENERIC_ERROR, ...) into
error_setg(...). Again, no change after preprocessing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Now that qbool is fixed, let's fix getting and setting a bool
value to a qdict member to also use C99 bool rather than int.
I audited all callers to ensure that the changed return type
will not cause any changed semantics.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Some convinience fluff: Add support for '-vga virtio', also add
virtio-vga to the list of vga cards so '-device virtio-vga' will
turn off the default vga.
Written by Dave Airlie and Gerd Hoffmann.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This includes pxb support by Marcel, as well as multiple enhancements all over
the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJVcC2WAAoJECgfDbjSjVRpwgcH/3mvFP3UmvmXyzf8mYtQ/1fR
ikvdTGHl2DR7TMQszNeCJn/p6NgH3oXRbXh39xM1xl9D2/dsZH9o1cUyFE04K9LK
am0cTmlty1OEyFN8BX1TtpngUxa5mpRA/+NYuWbh1FoTp6RoEPM6P+L1zLqtXYn1
REF++ehrsQI2Az2pibf4nul8bwuTWJLJeMS6TcCVCRGoaHsCESiVMu2sQrzEbWEW
E8ZWaXaiycLxLkW0/oU8BmZyrAk1PHdHwgbMUINV0kV5E2u+ZU+3KY79ezC2FyHW
NV7G9Rhh/5H828/cB6UP4CPZ4AYIYmg02iz5XBGKbd8WS9oPrJVK7EoqfU3oZfc=
=5AmP
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, acpi, virtio, tpm
This includes pxb support by Marcel, as well as multiple enhancements all over
the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu Jun 4 11:51:02 2015 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream: (28 commits)
vhost: logs sharing
hw/acpi: piix4_pm_init(): take fw_cfg object no more
hw/acpi: move "etc/system-states" fw_cfg file from PIIX4 to core
hw/acpi: acpi_pm1_cnt_init(): take "disable_s3" and "disable_s4"
pc-dimm: don't assert if pc-dimm alignment != hotpluggable mem range size
docs: Add PXB documentation
apci: fix PXB behaviour if used with unsupported BIOS
hw/pxb: add numa_node parameter
hw/pci: add support for NUMA nodes
hw/pxb: add map_irq func
hw/pci: inform bios if the system has extra pci root buses
hw/pci: introduce PCI Expander Bridge (PXB)
hw/pci: removed 'rootbus nr is 0' assumption from qmp_pci_query
hw/acpi: remove from root bus 0 the crs resources used by other buses.
hw/acpi: add _CRS method for extra root busses
hw/apci: add _PRT method for extra PCI root busses
hw/acpi: add support for i440fx 'snooping' root busses
hw/pci: extend PCI config access to support devices behind PXB
hw/i386: query only for q35/pc when looking for pci host bridge
hw/pci: made pci_bus_num a PCIBusClass method
...
Conflicts:
hw/i386/pc_piix.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We need to work with PCI BARs to generate OF properties
during PCI hotplug for sPAPR guests.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
PCI root buses can be attached to a specific NUMA node.
PCI buses are not attached by default to a NUMA node.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Use the newer pci_bus_num to correctly get the root bus number.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
PXB buses are assumed to be children of bus 0. Look for them
while scanning the buses.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Refactoring it as a method of PCIBusClass will allow
different implementations for subclasses.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Refactoring it as a method of PCIBusClass will allow
different implementations for subclasses.
Removed the assumption that the root bus does not
have a parent device because is specific only
to the default class implementation.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVbWZHAAoJEDhwtADrkYZTQ6oQAIv3FRNj0buUSwh8Gwx8NrxJ
DHz/NgdtoRV/olk2ZPYSmKUooTdKFcH1XsR7oYBoznh2m/CgiQXGeJQ7tTTcrWAj
hw4vt4Uq0bMGPNFYNOBgsVC5cejiZo53YfXybRBLAcUJvOlbnPVJ+g7ru8mg3qaC
rX0ZI8Cu0QmQWsGXHuKArVRGSsc+CN4SbXWyI4WmMH3g9OzTVjKVDkZekcU3NbXj
GZq1mJBeVUsDT6wzRLGXe3qxkDAhK9THLoRd999/aQDWSpvUabNtotyZew5JG5Ey
WKkmhSZsi0RgJTuPUcf5i83QKrWVN8EaADn67J5GkrnYYVPLdWuk/PhRgFiiWlVk
mHFxEVyfQL7neTbb8dcwcREHIaSSb4BBF4+dyFk5921Idxs3kzw/5l20SwNi6bQy
y23Tq/tBvbsie89O1gdXipZm3pz9xM9/9FtODjGnnVO9zb9Fi9TAyMMi8RmiS15k
YL649rCgk9cwKOlwsMLSUajXGq0G1cOzou7M5DeDOLw5db5YEMMZkcpmETKNLaXk
DWiX1E1K6PsDHDQkAipi9lmP9izVmpfXS4EWNVLaVCP3ht27VHgsZx4TaAAr8AVD
NdA4wq/GEcUMcJI+erJe3YxZgPofO+Ja8whwyqwhvLM4g6g8gQgy7PNKM+FodXY1
Z5+Ou90aprDpWjPIUHOv
=ySet
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2015-06-02' into staging
Monitor patches
# gpg: Signature made Tue Jun 2 09:16:07 2015 BST using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
* remotes/armbru/tags/pull-monitor-2015-06-02: (21 commits)
monitor: Change return type of monitor_cur_is_qmp() to bool
monitor: Rename monitor_ctrl_mode() to monitor_is_qmp()
monitor: Turn int command_mode into bool in_command_mode
monitor: Drop do_qmp_capabilities()'s superfluous QMP check
monitor: Unbox Monitor member mc and rename to qmp
monitor: Rename monitor_control_read(), monitor_control_event()
monitor: Rename handle_user_command() to handle_hmp_command()
monitor: Limit QError use to command handlers
monitor: Inline monitor_has_error() into its only caller
monitor: Wean monitor_protocol_emitter() off mon->error
monitor: Propagate errors through invalid_qmp_mode()
monitor: Propagate errors through qmp_check_input_obj()
monitor: Propagate errors through qmp_check_client_args()
monitor: Drop unused "new" HMP command interface
monitor: Use trad. command interface for HMP pcie_aer_inject_error
monitor: Use traditional command interface for HMP device_add
monitor: Use traditional command interface for HMP drive_del
monitor: Convert client_migrate_info to QAPI
monitor: Improve and document client_migrate_info protocol error
monitor: Clean up after previous commit
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It's being used by the hypervisor. For now simply mimic a device not
capable of masking, and fully emulate any accesses a guest may issue
nevertheless as simple reads/writes without side effects.
This is XSA-129.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
All QMP commands use the "new" handler interface (mhandler.cmd_new).
Most HMP commands still use the traditional interface (mhandler.cmd),
but a few use the "new" one. Complicates handle_user_command() for no
gain, so I'm converting these to the traditional interface.
pcie_aer_inject_error's implementation is split into the
hmp_pcie_aer_inject_error() and pcie_aer_inject_error_print(). The
former is a peculiar crossbreed between HMP and QMP handler. On
success, it works like a QMP handler: store QDict through ret_data
parameter, return 0. Printing the QDict is left to
pcie_aer_inject_error_print(). On failure, it works more like an HMP
handler: print error to monitor, return negative number.
To convert to the traditional interface, turn
pcie_aer_inject_error_print() into a command handler wrapping around
hmp_pcie_aer_inject_error(). By convention, this command handler
should be called hmp_pcie_aer_inject_error(), so rename the existing
hmp_pcie_aer_inject_error() to do_pcie_aer_inject_error().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
GICv3 ITS distinguishes between devices by using hardwired device IDs passed on the bus.
This patch implements passing these IDs in qemu.
SMMU is also known to use stream IDs, therefore this addition can also be useful for
implementing platforms with SMMU.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Changes from v1:
- Added bus number to the stream ID
- Added stream ID not only to MSI-X, but also to plain MSI. Some common code was made into
msi_send_message() function.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Memory hot-unplug support for pc, MSI-X
mapping update speedup for virtio-pci,
misc refactorings and bugfixes.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJVUFj/AAoJECgfDbjSjVRpteQH+gKoOMKilM6qvgdQS9vduFJ+
lDHNnmfgzWjVMEetiUOc9hImfEEyTyDFrkSI3wf4a8RZ7UnnDKD8hZR1nToySJPd
SuDP/EdtXYtInIMjc1MUUrJEP6qtjjgM+IbikVzHDxCeekrTMFz2w05MZ+V+hxI5
8b8ndPNfjX3ciIRjHKZ2u6hKEemhzxr1yyKTnJVGDN07hmfMbCyLsiWnFfShZwfv
g7USgiXjFfpvU5Q7QWpiCapfAaEpevRqieGzRjSbPy5Frm3XT7v+hWbFnvIJqUPj
5/SMV8I4qtKQe15Qah292HB//oaFM/AvRtHWvQkre3YIqFwyCYimQtjqoRCYC1E=
=x0ub
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, virtio enhancements
Memory hot-unplug support for pc, MSI-X
mapping update speedup for virtio-pci,
misc refactorings and bugfixes.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon May 11 08:23:43 2015 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream: (28 commits)
acpi: update expected files for memory unplug
virtio-scsi: Move DEFINE_VIRTIO_SCSI_FEATURES to virtio-scsi
virtio-net: Move DEFINE_VIRTIO_NET_FEATURES to virtio-net
pci: Merge pci_nic_init() into pci_nic_init_nofail()
acpi: add a missing backslash to the \_SB scope.
qmp-event: add event notification for memory hot unplug error
acpi: add hardware implementation for memory hot unplug
acpi: fix "Memory device control fields" register
acpi: extend aml_field() to support UpdateRule
acpi, mem-hotplug: add unplug cb for memory device
acpi, mem-hotplug: add unplug request cb for memory device
acpi, mem-hotplug: add acpi_memory_slot_status() to get MemStatus
docs: update documentation for memory hot unplug
virtio: coding style tweak
pci: remove hard-coded bar size in msix_init_exclusive_bar()
virtio-pci: speedup MSI-X masking and unmasking
virtio: introduce vector to virtqueues mapping
virtio-ccw: using VIRTIO_NO_VECTOR instead of 0 for invalid virtqueue
monitor: check return value of qemu_find_net_clients_except()
monitor: replace the magic number 255 with MAX_QUEUE_NUM
...
Conflicts:
hw/s390x/s390-virtio-bus.c
[PMM: fixed conflict in s390_virtio_scsi_properties and
s390_virtio_net_properties arrays; since the result of the
two conflicting patches is to empty the property arrays
completely, the conflict resolution is to remove them entirely.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A future patch will be using a 'name':{dictionary} entry in the
QAPI schema to specify a default value for an optional argument
(see previous commit message for more details why); but existing
use of inline nested structs conflicts with that goal. This patch
fixes one of only two commands relying on nested types, by
breaking the nesting into an explicit type; it means that the
type is now boxed instead of unboxed in C code, but the QMP wire
format is unaffected by this change.
Prefer the safer g_new0() while making the conversion, and reduce
some long lines.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The error reporting in pci_nic_init() is quite erratic: Some errors
are printed directly with error_report(), and some are passed back
to the caller pci_nic_init_nofail() via an Error pointer.
Since pci_nic_init() is only used by pci_nic_init_nofail(), the
functions can be simply merged to clean up this inconsistency.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
This commit was generated mechanically by coccinelle from the following
semantic patch:
@@
expression val;
@@
- (ffs(val) - 1)
+ ctz32(val)
The call sites have been audited to ensure the ffs(0) - 1 == -1 case
never occurs (due to input validation, asserts, etc). Therefore we
don't need to worry about the fact that ctz32(0) == 32.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-5-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch lets msix_init_exclusive_bar() can calculate the bar and
pba size based on the number of MSI-X vectors other than using a
hard-coded limit 4096. This is needed to allow device to have more
than 128 MSI_X vectors. To keep migration compatibility, keep using
4096 as bar size and 2048 for pba offset.
Notes: We don't care about the case that using vectors > 128 for
legacy machine type. Since we limit the queue max to 64, so vectors >=
65 is meaningless.
Virtio device will be the first user for this.
Cc: Keith Busch <keith.busch@intel.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Switch all the uses of ld/st*_phys to address_space_ld/st*,
except for those cases where the address space is the CPU's
(ie cs->as). This was done with the following script which
generates a Coccinelle patch.
A few over-80-columns lines in the result were rewrapped by
hand where Coccinelle failed to do the wrapping automatically,
as well as one location where it didn't put a line-continuation
'\' when wrapping lines on a change made to a match inside
a macro definition.
===begin===
#!/bin/sh -e
# Usage:
# ./ldst-phys.spatch.sh > ldst-phys.spatch
# spatch -sp_file ldst-phys.spatch -dir . | sed -e '/^+/s/\t/ /g' > out.patch
# patch -p1 < out.patch
for FN in ub uw_le uw_be l_le l_be q_le q_be uw l q; do
cat <<EOF
@ cpu_matches_ld_${FN} @
expression E1,E2;
identifier as;
@@
ld${FN}_phys(E1->as,E2)
@ other_matches_ld_${FN} depends on !cpu_matches_ld_${FN} @
expression E1,E2;
@@
-ld${FN}_phys(E1,E2)
+address_space_ld${FN}(E1,E2, MEMTXATTRS_UNSPECIFIED, NULL)
EOF
done
for FN in b w_le w_be l_le l_be q_le q_be w l q; do
cat <<EOF
@ cpu_matches_st_${FN} @
expression E1,E2,E3;
identifier as;
@@
st${FN}_phys(E1->as,E2,E3)
@ other_matches_st_${FN} depends on !cpu_matches_st_${FN} @
expression E1,E2,E3;
@@
-st${FN}_phys(E1,E2,E3)
+address_space_st${FN}(E1,E2,E3, MEMTXATTRS_UNSPECIFIED, NULL)
EOF
done
===endit===
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Current QEMU crashes when specifying an illegal model with the
"-net nic,model=xxx" option, e.g.:
$ qemu-system-x86_64 -net nic,model=n/a
qemu-system-x86_64: Unsupported NIC model: n/a
Program received signal SIGSEGV, Segmentation fault.
The gdb backtrace looks like this:
0x0000555555965fe0 in error_get_pretty (err=0x0) at util/error.c:152
152 return err->msg;
(gdb) bt
0 0x0000555555965fe0 in error_get_pretty (err=0x0) at util/error.c:152
1 0x0000555555965ffd in error_report_err (err=0x0) at util/error.c:157
2 0x0000555555809c90 in pci_nic_init_nofail (nd=0x555555e49860 <nd_table>, rootbus=0x5555564409b0,
default_model=0x55555598c37b "e1000", default_devaddr=0x0) at hw/pci/pci.c:1663
3 0x0000555555691e42 in pc_nic_init (isa_bus=0x555556f71900, pci_bus=0x5555564409b0)
at hw/i386/pc.c:1506
4 0x000055555569396b in pc_init1 (machine=0x5555562abbf0, pci_enabled=1, kvmclock_enabled=1)
at hw/i386/pc_piix.c:248
5 0x0000555555693d27 in pc_init_pci (machine=0x5555562abbf0) at hw/i386/pc_piix.c:310
6 0x000055555572ddf5 in main (argc=3, argv=0x7fffffffe018, envp=0x7fffffffe038) at vl.c:4226
The problem is that pci_nic_init_nofail() does not check whether the err
parameter from pci_nic_init has been set up and thus passes a NULL pointer
to error_report_err(). Fix it by correctly checking the err parameter.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Code comment says "table 6-2" but in fact it's is not a table, it is
"Figure 6-2" on page 479.
Cc: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Error Status Register, so this patch fix a wrong definition
for PCI_ERR_COR_STATUS register with w1cmask type.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Refer to "PCI Express Base Spec3.0", this comments can't
fit the description in spec, so we should fix them.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
when specify TLP Prefix log as using pcie_aer_inject_error,
the TLP prefix log is always discarded. because the check
is incorrect, the End-End TLP Prefix Supported bit
(PCI_EXP_DEVCAP2_EETLPP) should be in Device Capabilities 2 Register.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
from pcie spec 7.8.17, the End-End TLP Prefix Blocking bit local
is 15(e.g. 0x8000) in device control 2 register.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qdev_init() is deprecated, and will be removed when its callers have
been weaned off it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
clang undefined behaviour sanitizer reports:
> hw/pci/shpc.c:162:27: runtime error: left shift of 1 by 31 places
> cannot be represented in type 'int'
Caused by the usual lack of a 'U' qualifier on a constant 1 being
shifted left. Fix it up.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 79ca616 (v1.6.0) accidentally disabled legacy x86-only HMP
commands pci_add, pci_del: it defined CONFIG_PCI_HOTPLUG only as make
variable, not as preprocessor macro, killing the code conditional on
defined(CONFIG_PCI_HOTPLUG_OLD).
In all this time, nobody reported the loss. I only noticed it when I
tried to test some error reporting change that forced me to touch this
old crap again.
Fun: git-log hw/pci/pci-hotplug-old.c shows our faith in the backward
compatibility god has been strong enough to sacrifice at its altar
about a dozen times, but not strong enough to even once verify the
legacy feature's still there, let alone works.
Remove the commands along with the code backing them.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
None of them should be used in new code.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Call the new PCIDeviceClass method realize(). Default it to
pci_default_realize(), which calls old method init().
To convert a device model, make it implement realize() rather than
init().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Implement DeviceClass methods realize() and unrealize() instead of
init() and exit(). The core's initialization errors now get
propagated properly, and QMP sends them instead of an unspecific
"Device initialization failed" error. Unrealize can't fail, so no
change there.
PCIDeviceClass is unchanged: it still provides init() and exit().
Therefore, device models' errors are still not propagated.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Some are called do_COMMAND() (old ones, usually), some hmp_COMMAND(),
and sometimes COMMAND pointlessly differs in spelling.
Normalize to hmp_COMMAND(), where COMMAND is exactly the command name
with '-' replaced by '_'.
Exceptions:
* do_device_add() and client_migrate_info() *not* renamed to
hmp_device_add(), hmp_client_migrate_info(), because they're also
QMP handlers. They still need to be converted to QAPI.
* do_memory_dump(), do_physical_memory_dump(), do_ioport_read(),
do_ioport_write() renamed do hmp_* instead of hmp_x(), hmp_xp(),
hmp_i(), hmp_o(), because those names are too cryptic for my taste.
* do_info_help() renamed to hmp_info_help() instead of hmp_info(),
because it only covers help.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
object_unparent should not be called until the parent device is going to be
destroyed. Only remove the capability and do memory_region_del_subregion
at unrealize time. Freeing the data structures is left in shpc_free, to
be called from the instance_finalize callback.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This memory leak was introduced inadvertently by omitting object_unparent.
A better fix is to use the new memory_region_set_size instead of destroying
and recreating the MMIO region on the fly.
Also, ensure that unmapping and remapping the region is done atomically.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
One of the annoyances of the current migration format is the fact that
it's not self-describing. In fact, it's not properly describing at all.
Some code randomly scattered throughout QEMU elaborates roughly how to
read and write a stream of bytes.
We discussed an idea during KVM Forum 2013 to add a JSON description of
the migration protocol itself to the migration stream. This patch
adds a section after the VM_END migration end marker that contains
description data on what the device sections of the stream are composed of.
This approach is backwards compatible with any QEMU version reading the
stream, because QEMU just stops reading after the VM_END marker and ignores
any data following it.
With an additional external program this allows us to decipher the
contents of any migration stream and hopefully make migration bugs easier
to track down.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The mmcfg space is a memory region that allows access to PCI config space
in the PCIe world. To maintain abstraction layers, I would like to expose
the mmcfg space as a sysbus mmio region rather than have it mapped straight
into the system's memory address space though.
So this patch splits the initialization of the mmcfg space from the actual
mapping, allowing us to only have an mmfg memory region without the map.
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Reported-by:
https://bugs.launchpad.net/qemu/+bug/1393440
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the pci bridge enters in error flow as part
of init process it will only delete the shpc mmio
subregion but not remove it from the properties list,
resulting in segmentation fault when the bridge runs
the exit function.
Example: add a pci bridge without specifing the chassis number:
<qemu-bin> ... -device pci-bridge,id=p1
Result:
(qemu) qemu-system-x86_64: -device pci-bridge,id=p1: Bridge chassis not specified. Each bridge is required to be assigned a unique chassis id > 0.
qemu-system-x86_64: -device pci-bridge,id=p1: Device
initialization failed.
Segmentation fault (core dumped)
if (child->class->unparent) {
#0 0x00005555558d629b in object_finalize_child_property (obj=0x555556d2e830, name=0x555556d30630 "shpc-mmio[0]", opaque=0x555556a42fc8) at qom/object.c:1078
#1 0x00005555558d4b1f in object_property_del_all (obj=0x555556d2e830) at qom/object.c:367
#2 0x00005555558d4ca1 in object_finalize (data=0x555556d2e830) at qom/object.c:412
#3 0x00005555558d55a1 in object_unref (obj=0x555556d2e830) at qom/object.c:720
#4 0x000055555572c907 in qdev_device_add (opts=0x5555563544f0) at qdev-monitor.c:566
#5 0x0000555555744f16 in device_init_func (opts=0x5555563544f0, opaque=0x0) at vl.c:2213
#6 0x00005555559cf5f0 in qemu_opts_foreach (list=0x555555e0f8e0 <qemu_device_opts>, func=0x555555744efa <device_init_func>, opaque=0x0, abort_on_failure=1) at util/qemu-option.c:1057
#7 0x000055555574a11b in main (argc=16, argv=0x7fffffffdde8, envp=0x7fffffffde70) at vl.c:423
Unparent the shpc mmio region as part of shpc cleanup.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Hot-plugging a device that has a romfile (either supplied by user
or built-in) using rombar=0 option is a user error,
do not allow the device to be hot-plugged.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Verify return code for pci_add_option_rom.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>