Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
as noted in docs/igd-assign.txt in the Qemu source code.
Currently, when the xl toolstack is used to configure a Xen HVM guest with
Intel IGD passthrough to the guest with the Qemu upstream device model,
a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
a different slot. This problem often prevents the guest from booting.
The only available workarounds are not good: Configure Xen HVM guests to
use the old and no longer maintained Qemu traditional device model
available from xenbits.xen.org which does reserve slot 2 for the Intel
IGD or use the "pc" machine type instead of the "xenfv" machine type and
add the xen platform device at slot 3 using a command line option
instead of patching qemu to fix the "xenfv" machine type directly. The
second workaround causes some degredation in startup performance such as
a longer boot time and reduced resolution of the grub menu that is
displayed on the monitor. This patch avoids that reduced startup
performance when using the Qemu upstream device model for Xen HVM guests
configured with the igd-passthru=on option.
To implement this feature in the Qemu upstream device model for Xen HVM
guests, introduce the following new functions, types, and macros:
* XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
* XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
* typedef XenPTQdevRealize function pointer
* XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
* xen_igd_reserve_slot and xen_igd_clear_slot functions
Michael Tsirkin:
* Introduce XEN_PCI_IGD_DOMAIN, XEN_PCI_IGD_BUS, XEN_PCI_IGD_DEV, and
XEN_PCI_IGD_FN - use them to compute the value of XEN_PCI_IGD_SLOT_MASK
The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
the xl toolstack with the gfx_passthru option enabled, which sets the
igd-passthru=on option to Qemu for the Xen HVM machine type.
The new xen_igd_reserve_slot function also needs to be implemented in
hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
in which case it does nothing.
The new xen_igd_clear_slot function overrides qdev->realize of the parent
PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
created in hw/i386/pc_piix.c for the case when igd-passthru=on.
Move the call to xen_host_pci_device_get, and the associated error
handling, from xen_pt_realize to the new xen_igd_clear_slot function to
initialize the device class and vendor values which enables the checks for
the Intel IGD to succeed. The verification that the host device is an
Intel IGD to be passed through is done by checking the domain, bus, slot,
and function values as well as by checking that gfx_passthru is enabled,
the device class is VGA, and the device vendor in Intel.
Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <b1b4a21fe9a600b1322742dda55a40e9961daa57.1674346505.git.brchuckz@aol.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
hw/pci/pci_bridge.h and hw/cxl/cxl.h include each other.
Fortunately, breaking the loop is merely a matter of deleting
unnecessary includes from headers, and adding them back in places
where they are now missing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221222100330.380143-2-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that igd_passthrough_isa_bridge_create() is implemented within the
xen context it may use Xen* data types directly and become
xen_igd_passthrough_isa_bridge_create(). This resolves an indirection.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20220326165825.30794-3-shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
igd-passthrough-isa-bridge is only requested in xen_pt but was
implemented in pc_piix.c. This caused xen_pt to dependend on i386/pc
which is hereby resolved.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20220326165825.30794-2-shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
One less qemu-specific macro. It also helps to make some headers/units
only depend on glib, and thus moved in standalone projects eventually.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Some typedefs and macros are defined after the type check macros.
This makes it difficult to automatically replace their
definitions with OBJECT_DECLARE_TYPE.
Patch generated using:
$ ./scripts/codeconverter/converter.py -i \
--pattern=QOMStructTypedefSplit $(git grep -l '' -- '*.[ch]')
which will split "typdef struct { ... } TypedefName"
declarations.
Followed by:
$ ./scripts/codeconverter/converter.py -i --pattern=MoveSymbols \
$(git grep -l '' -- '*.[ch]')
which will:
- move the typedefs and #defines above the type check macros
- add missing #include "qom/object.h" lines if necessary
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20200831210740.126168-9-ehabkost@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20200831210740.126168-10-ehabkost@redhat.com>
Message-Id: <20200831210740.126168-11-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Xen PCI passthrough support may not be available and thus the global
variable "has_igd_gfx_passthru" might be compiled out. Common code
should not access it in that case.
Unfortunately, we can't use CONFIG_XEN_PCI_PASSTHROUGH directly in
xen-common.c so this patch instead move access to the
has_igd_gfx_passthru variable via function and those functions are
also implemented as stubs. The stubs will be used when QEMU is built
without passthrough support.
Now, when one will want to enable igd-passthru via the -machine
property, they will get an error message if QEMU is built without
passthrough support.
Fixes: 46472d8232 ('xen: convert "-machine igd-passthru" to an accelerator property')
Reported-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20200603160442.3151170-1-anthony.perard@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Description copied from Linux kernel commit from Gustavo A. R. Silva
(see [3]):
--v-- description start --v--
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to
declare variable-length types such as these ones is a flexible
array member [1], introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler
warning in case the flexible array does not occur last in the
structure, which will help us prevent some kind of undefined
behavior bugs from being unadvertenly introduced [2] to the
Linux codebase from now on.
--^-- description end --^--
Do the similar housekeeping in the QEMU codebase (which uses
C99 since commit 7be41675f7).
All these instances of code were found with the help of the
following Coccinelle script:
@@
identifier s, m, a;
type t, T;
@@
struct s {
...
t m;
- T a[0];
+ T a[];
};
@@
identifier s, m, a;
type t, T;
@@
struct s {
...
t m;
- T a[0];
+ T a[];
} QEMU_PACKED;
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76497732932f
[3] https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git/commit/?id=17642a2fbd2c1
Inspired-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
The xen pci_assign_dev_load_option_rom() currently creates a RAM
memory region with memory_region_init_ram_nomigrate(), and then
manually registers it with vmstate_register_ram(). In fact for
its only callsite, the 'owner' pointer we use for the init call
and the '&dev->qdev' pointer we use for the vmstate_register_ram()
call refer to the same object. Simplify the function to only
take a pointer to the device once instead of twice, and use
memory_region_init_ram() which automatically does the vmstate
register for us.
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When a MSI interrupt is bound to a guest using
xc_domain_update_msi_irq (XEN_DOMCTL_bind_pt_irq) the interrupt is
left masked by default.
This causes problems with guests that first configure interrupts and
clean the per-entry MSIX table mask bit and afterwards enable MSIX
globally. In such scenario the Xen internal msixtbl handlers would not
detect the unmasking of MSIX entries because vectors are not yet
registered since MSIX is not enabled, and vectors would be left
masked.
Introduce a new flag in the gflags field to signal Xen whether a MSI
interrupt should be unmasked after being bound.
This also requires to track the mask register for MSI interrupts, so
QEMU can also notify to Xen whether the MSI interrupt should be bound
masked or unmasked
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reported-by: Andreas Kinzler <hfp@posteo.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
To catch the error message. Also modify the caller
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
To catch the error message. Also modify the caller
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Introduce yet another mask for them, so that the generic routine can
handle them, at once rendering xen_pt_pmcsr_reg_write() superfluous.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The remaining log message in pci_msix_write() is wrong, as there guest
behavior may only appear to be wrong: For one, the old logic didn't
take the mask-all bit into account. And then this shouldn't depend on
host device state (i.e. the host may have masked the entry without the
guest having done so). Plus these writes shouldn't be dropped even when
an entry gets unmasked. Instead, if they can't be made take effect
right away, they should take effect on the next unmasking or enabling
operation - the specification explicitly describes such caching
behavior.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
msix->mmio is added to XenPCIPassthroughState's object as property.
object_finalize_child_property is called for XenPCIPassthroughState's
object, which calls object_property_del_all, which is going to try to
delete msix->mmio. object_finalize_child_property() will access
msix->mmio's obj. But the whole msix struct has already been freed
by xen_pt_msix_delete. This will cause segment fault when msix->mmio
has been overwritten.
This patch is to fix the issue.
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
To deal with xen_host_pci_[set|get]_ functions returning error values
and clearing ourselves in the init function we should make the
.exit (xen_pt_unregister_device) function be idempotent in case
the generic code starts calling .exit (or for fun does it before
calling .init!).
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
We do not want to have two entries to cache the guest configuration
registers: XenPTReg->data and dev.config. Instead we want to use
only the dev.config.
To do without much complications we rip out the ->data field
and replace it with an pointer to the dev.config. This way we
have the type-checking (uint8_t, uint16_t, etc) and as well
and pre-computed location.
Alternatively we could compute the offset in dev.config by
using the XenPTRRegInfo and XenPTRegGroup every time but
this way we have the pre-computed values.
This change also exposes some mis-use:
- In 'xen_pt_status_reg_init' we used u32 for the Capabilities Pointer
register, but said register is an an u16.
- In 'xen_pt_msgdata_reg_write' we used u32 but should have only use u16.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
As we do not use it outside our code.
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The OpRegion shouldn't be mapped 1:1 because the address in the host
can't be used in the guest directly.
This patch traps read and write access to the opregion of the Intel
GPU config space (offset 0xfc).
The original patch is from Jean Guyader <jean.guyader@eu.citrix.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Now we retrieve VGA bios like kvm stuff in qemu but we need to
fix Device Identification in case if its not matched with the
real IGD device since Seabios is always trying to compare this
ID to work out VGA BIOS.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
This is done indirectly by adjusting two typedefs and helps emphasizing
that the respective tables aren't supposed to be modified at runtime
(as they may be shared between devices).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
... by default. Add a per-device "permissive" mode similar to pciback's
to allow restoring previous behavior (and hence break security again,
i.e. should be used only for trusted guests).
This is part of XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>)
The adjustments are solely to make the subsequent patches work right
(and hence make the patch set consistent), namely if permissive mode
(introduced by the last patch) gets used (as both reserved registers
and reserved fields must be similarly protected from guest access in
default mode, but the guest should be allowed access to them in
permissive mode).
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Limit error messages resulting from bad guest behavior to avoid allowing
the guest to cause the control domain's disk to fill.
The first message in pci_msix_write() can simply be deleted, as this
is indeed bad guest behavior, but such out of bounds writes don't
really need to be logged.
The second one is more problematic, as there guest behavior may only
appear to be wrong: For one, the old logic didn't take the mask-all bit
into account. And then this shouldn't depend on host device state (i.e.
the host may have masked the entry without the guest having done so).
Plus these writes shouldn't be dropped even when an entry is unmasked.
Instead, if they can't be made take effect right away, they should take
effect on the next unmasking or enabling operation - the specification
explicitly describes such caching behavior. Until we can validly drop
the message (implementing such caching/latching behavior), issue the
message just once per MSI-X table entry.
Note that the log message in pci_msix_read() similar to the one being
removed here is not an issue: "addr" being of unsigned type, and the
maximum size of the MSI-X table being 32k, entry_nr simply can't be
negative and hence the conditonal guarding issuing of the message will
never be true.
This is XSA-130.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The old logic didn't work as intended when an access spanned multiple
fields (for example a 32-bit access to the location of the MSI Message
Data field with the high 16 bits not being covered by any known field).
Remove it and derive which fields not to write to from the accessed
fields' emulation masks: When they're all ones, there's no point in
doing any host write.
This fixes a secondary issue at once: We obviously shouldn't make any
host write attempt when already the host read failed.
This is XSA-128.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>