When the user specifies -nodefaults he can tell us that he doesn't want any
serial ports spawned by default. While we do honor that wish, we still create
device tree entries for those non-existent devices.
Make device tree generation depend on whether the device is actually available.
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently SPAPR PHB keeps track of all allocated MSI (here and below
MSI stands for both MSI and MSIX) interrupts because
XICS used to be unable to reuse interrupts. This is a problem for
dynamic MSI reconfiguration which happens when guest reloads a driver
or performs PCI hotplug. Another problem is that the existing
implementation can enable MSI on 32 devices maximum
(SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.
This makes use of new XICS ability to reuse interrupts.
This reorganizes MSI information storage in sPAPRPHBState. Instead of
a static array of 32 descriptors (one per PCI function), this patch adds
a GHashTable where @config_addr is the key and a (first_irq, num) pair is
the value. GHashTable can dynamically grow and shrink so the initial limit
of 32 devices is gone.
This changes migration stream as @msi_table was a static array while new
@msi_devs is a dynamic hash table. This adds temporary array which is
used for migration, it is populated in "spapr_pci"::pre_save() callback
and expanded into the hash table in post_load() callback. Since
the destination side does not know the number of MSI-enabled devices
in advance and cannot pre-allocate the temporary array to receive
migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro
which allocates the array automatically.
This resets the MSI configuration space when interrupts are released by
the ibm,change-msi RTAS call.
This fixes traces to be more informative.
This changes the vmstate_spapr_pci_msi name from "...lsi" to "...msi";
the old name was incorrect by accident. As the internal representation
changed, this also bumps the migration version number.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
This implements interrupt release function so IRQs can be returned back
to the pool for reuse in cases such as PCI hot plug.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
This removes @next_irq from sPAPREnvironment which was used in old
IRQ allocator as XICS is now responsible for IRQs and keeps track of
allocated IRQs.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The current allocator returns IRQ numbers from a pool and does not
support IRQ reuse in any form as it does not keep track of what it
previously returned; it only keeps the last returned IRQ. Some use
cases such as PCI hot(un)plug may require IRQ release and reallocation.
This moves an allocator from SPAPR to XICS.
This switches IRQ users to use new API.
This uses LSI/MSI flags to know if interrupt is allocated.
The interrupt release function will be posted as a separate patch.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Since islsi[] array has been merged into the ICSState struct,
we must not reset flags as they tell if the interrupt is in use.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
PAPR allows having multiple interrupt sources such as PHB.
This adds a source lookup function and makes use of it.
Since at the moment QEMU only supports a single source,
no change in behaviour is expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The existing interrupt allocation scheme in SPAPR assumes that
interrupts are allocated at start time, contiguously, and that the config
will not change. However, there are cases when this is not going to work
such as:
1. migration - we will have to have an ability to choose interrupt
numbers for devices in the command line and this will create gaps in
interrupt space.
2. PCI hotplug - interrupts from unplugged device need to be returned
back to interrupt pool, otherwise we will quickly run out of interrupts.
This replaces a separate islsi[] array with a byte in the ICSIRQState
struct and defines "LSI" and "MSI" flags. Neither of these flags set
signals that the descriptor is not allocated and not in use.
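A minimal sketch of the flag scheme described above, with allocation and
release added only to show why "no flag set" is a useful free marker. All
names, flag values and the pool size are hypothetical, not the QEMU ones:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and values; this is not the QEMU source. */
#define IRQ_FLAG_MSI  0x01 /* descriptor is in use as an MSI source */
#define IRQ_FLAG_LSI  0x02 /* descriptor is in use as an LSI source */
#define IRQ_FLAG_MASK (IRQ_FLAG_MSI | IRQ_FLAG_LSI)

#define NR_IRQS 8

typedef struct {
    uint8_t flags; /* neither LSI nor MSI set: descriptor is free */
} IrqState;

int irq_is_free(const IrqState *irq)
{
    return (irq->flags & IRQ_FLAG_MASK) == 0;
}

/* Claim the first free descriptor; returns its index or -1 if none is left. */
int irq_alloc(IrqState *irqs, int lsi)
{
    for (int i = 0; i < NR_IRQS; i++) {
        if (irq_is_free(&irqs[i])) {
            irqs[i].flags |= lsi ? IRQ_FLAG_LSI : IRQ_FLAG_MSI;
            return i;
        }
    }
    return -1;
}

/* Return a descriptor to the pool by clearing both type flags. */
void irq_free(IrqState *irqs, int n)
{
    assert(n >= 0 && n < NR_IRQS);
    irqs[n].flags &= (uint8_t)~IRQ_FLAG_MASK;
}
```

Because the type flags double as the "in use" marker, a reset path that
cleared them would also forget which interrupts are allocated.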
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add support for the SPLPAR Characteristics parameter to the emulated
RTAS call ibm,get-system-parameter.
The support provides just enough information to allow "cat
/proc/powerpc/lparcfg" to succeed without generating a kernel error
message.
Without this patch the above command will produce the following kernel
message: arch/powerpc/platforms/pseries/lparcfg.c \
parse_system_parameter_string Error calling get-system-parameter \
(0xfffffffd)
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add support for the UUID parameter to the emulated RTAS call
ibm,get-system-parameter.
Return the guest's UUID as the value for the RTAS UUID system
parameter, or null (a zero length result) if it is not set.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This allows the ibm,get-system-parameter RTAS call to succeed for the
DIAGNOSTICS_RUN_MODE system parameter.
The problem can be seen with "ppc64_cpu --run-mode" from the
powerpc-utils package which fails before this patch with "Machine does
not support diagnostic run mode".
This is corrected by using the rtas_st_buffer() function to write to
the buffer.
The RTAS constants are also moved out into a header file, some new
constants added and the surrounding code slightly simplified.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[agraf: remove some commentary]
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds a v2.1 machine to support backward compatibility
for newer machines in case they are ever implemented.
This adds a "pseries-2.1" machine as a child of the "pseries"
machine and only changes the visible machine name.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Every single sPAPR QOM object name starts with a lowercase "s".
Most (not all yet) QOM objects have a "State" suffix.
This replaces SPAPRMachine with sPAPRMachineState to conform with QEMU
code style and removes a redundant empty line.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Fix PCI hole size to match what is found on real hardware.
(OpenBIOS already uses the correct length.)
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
Change the order of creating devices for New World Mac emulation so
that devices on the motherboard are added first and PCI cards (VGA and
NIC) come later. As a side effect, this also causes OpenBIOS to map
the motherboard devices into the MMIO space to the same addresses as
on real hardware and allow clients that hardcode these addresses (e.g.
MorphOS) to find and use them until OpenBIOS is taught to map devices
to specific addresses. (On real hardware the graphics and network
cards are really on separate buses but we don't model that yet.) This
brings the memory map closer to what is found on PowerMac3,1.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
This turns the sPAPR support on and enables VFIO container use
in the kernel.
This extends vfio_connect_container to support VFIO_SPAPR_TCE_IOMMU type
in the host kernel.
This registers a memory listener which sPAPR IOMMU will notify when
executing H_PUT_TCE/etc DMA calls. The listener then will notify the host
kernel about DMA map/unmap operation via VFIO_IOMMU_MAP_DMA/
VFIO_IOMMU_UNMAP_DMA ioctls.
This executes VFIO_IOMMU_ENABLE ioctl to make sure that the IOMMU is free
of mappings and can be exclusively given to the user. At the moment SPAPR
is the only platform requiring this call to be implemented.
Note that the host kernel function implementing VFIO_IOMMU_DISABLE
is called automatically when container's fd is closed so there is
no need to call it explicitly from QEMU. We may need to call
VFIO_IOMMU_DISABLE explicitly in the future for some sort of dynamic
reconfiguration (PCI hotplug or dynamic IOMMU group management).
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The patch adds a spapr-pci-vfio-host-bridge device type
which is a PCI Host Bridge with VFIO support. The new device
inherits from the spapr-pci-host-bridge device and adds an "iommu"
property which is an IOMMU id. This ID represents a minimal entity
for which IOMMU isolation can be guaranteed. In SPAPR architecture IOMMU
group is called a Partitionable Endpoint (PE).
Current implementation supports one IOMMU id per QEMU VFIO PHB. Since
SPAPR allows multiple PHB for no extra cost, this does not seem to
be a problem. This limitation may change in the future though.
Example of use:
Configure and Add 3 functions of a multifunctional device to QEMU:
(the NEC PCI USB card is used as an example here):
-device spapr-pci-vfio-host-bridge,id=USB,iommu=4,index=7 \
-device vfio-pci,host=4:0:1.0,addr=1.0,bus=USB,multifunction=true \
-device vfio-pci,host=4:0:1.1,addr=1.1,bus=USB \
-device vfio-pci,host=4:0:1.2,addr=1.2,bus=USB
where:
* index=7 is a QEMU PHB index (used as source for MMIO/MSI/IO windows
offset);
* iommu=4 is an IOMMU id which can be found in sysfs:
[aik@vpl2 ~]$ cd /sys/bus/pci/devices/0004:00:00.0/
[aik@vpl2 0004:00:00.0]$ ls -l iommu_group
lrwxrwxrwx 1 root root 0 Jun 5 12:49 iommu_group -> ../../../kernel/iommu_groups/4
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
While most operations with VFIO IOMMU driver are generic and used inside
vfio.c, there are still some operations which only specific VFIO IOMMU
drivers implement. The first example of it will be reading a DMA window
start from the host.
This adds a helper which passes an ioctl request to the container's fd.
The helper will check if @req is known. For this, a stub is added. It
returns -1 on any request for now.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
POWER KVM supports a KVM_CAP_SPAPR_TCE capability which allows allocating
TCE tables in the host kernel memory and handling H_PUT_TCE requests
targeted to specific LIOBN (logical bus number) right in the host without
switching to QEMU. At the moment this is used for emulated devices only
and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE
handler finds a LIOBN and corresponding table, it will put a TCE to
the table and complete hypercall execution. The user space will not be
notified.
Upcoming VFIO support is going to use the same sPAPRTCETable device class
so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
tables for VFIO are going to be allocated in the host as well.
However VFIO operates with real IOMMU tables and simple copying of
a TCE to the real hardware TCE table will not work as guest physical
to host physical address translation is required.
So until the host kernel gets VFIO support for H_PUT_TCE, we had better
not register VFIO's TCE tables in the host.
This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not
in upstream yet and being discussed so now it is always false which means
that in-kernel VFIO acceleration is not supported.
This adds a bool @vfio_accel flag to the sPAPRTCETable device telling
that sPAPRTCETable should not try allocating TCE table in the host kernel
for VFIO. The flag is false now as at the moment there is no VFIO.
This adds a vfio_accel parameter to spapr_tce_new_table(); the semantics
are the same. Since there is only emulated PCI and VIO now, the flag is set
to false. Upcoming VFIO support will set it to true.
This is a preparation patch so no change in behaviour is expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
At the moment spapr_rtas_register() allocates a new token number for every
new RTAS callback so numbers are not fixed and depend on the number of
supported RTAS handlers and the exact order of spapr_rtas_register() calls.
These tokens are copied into the device tree and remain the same during
the guest lifetime.
When we start another guest to receive a migration, it calls
spapr_rtas_register() as well. If the number of RTAS handlers or their
order is different in QEMU on source and destination sides, the "/rtas"
node in the device tree will differ. Since migration overwrites the device
tree (as it overwrites the entire RAM), the actual RTAS config on
the destination side gets broken.
This defines global constant values for every RTAS token which QEMU
is using today.
This changes spapr_rtas_register() to accept a token number instead of
allocating one. This changes all users of spapr_rtas_register().
This changes XICS-KVM not to cache tokens registered with KVM as they
are constant now.
This makes TOKEN_BASE global as RTAS_XXX use TOKEN_BASE as
a base. TOKEN_MAX is moved and renamed too and its value is changed
to the last token + 1. Boundary checks for token values are adjusted.
This reserves token numbers for "os-term" handlers and PCI hotplug
which we are working on.
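A minimal sketch of registration with caller-supplied fixed tokens, as
described above. The token names, the TOKEN_BASE value and the table layout
are illustrative assumptions and are not taken from the QEMU source:

```c
#include <assert.h>
#include <stddef.h>

/* All names and values below are illustrative assumptions. */
#define TOKEN_BASE 0x2000

enum {
    RTAS_EXAMPLE_POWER_OFF     = TOKEN_BASE + 0x00,
    RTAS_EXAMPLE_SYSTEM_REBOOT = TOKEN_BASE + 0x01,
    RTAS_EXAMPLE_GET_SYSPARM   = TOKEN_BASE + 0x02,
    TOKEN_MAX, /* last token + 1 */
};

typedef int (*rtas_fn)(void);

static struct {
    const char *name;
    rtas_fn fn;
} rtas_table[TOKEN_MAX - TOKEN_BASE];

/* Register a handler under a fixed token chosen by the caller. */
int rtas_register(int token, const char *name, rtas_fn fn)
{
    assert(name != NULL);
    if (token < TOKEN_BASE || token >= TOKEN_MAX) {
        return -1; /* outside the reserved token range */
    }
    token -= TOKEN_BASE;
    if (rtas_table[token].name) {
        return -1; /* token already registered */
    }
    rtas_table[token].name = name;
    rtas_table[token].fn = fn;
    return 0;
}

/* Look up a handler name; the result does not depend on call order. */
const char *rtas_name(int token)
{
    if (token < TOKEN_BASE || token >= TOKEN_MAX) {
        return NULL;
    }
    return rtas_table[token - TOKEN_BASE].name;
}
```

With fixed tokens, source and destination produce the same token-to-handler
mapping even if they register handlers in a different order.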
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
This is required to enable boot menu display during booting.
Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Although we defined an eepro100_mdi_mask[] array indicating which bits
in the registers are read-only, we weren't actually doing anything with
it. Make the MDI register-write code use it rather than manually making
register 1 read-only and leaving the rest as reads-as-written. (The
special-case handling of register 0 remains as before since its mask is
all-zeros and the special casing happens before we apply the masking.)
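A minimal sketch of a mask-driven register write, assuming a set bit in the
mask marks a read-only bit. The register count and mask values here are
hypothetical and do not reproduce eepro100_mdi_mask[]:

```c
#include <assert.h>
#include <stdint.h>

#define NR_EXAMPLE_REGS 3

/* Hypothetical masks: a set bit means the register bit is read-only. */
static const uint16_t example_ro_mask[NR_EXAMPLE_REGS] = {
    0x0000, /* fully writable */
    0xffff, /* fully read-only */
    0xff00, /* upper byte read-only, lower byte writable */
};

/* Merge a guest write into a register, preserving the read-only bits. */
void example_reg_write(uint16_t *regs, unsigned reg, uint16_t data)
{
    assert(reg < NR_EXAMPLE_REGS);
    uint16_t mask = example_ro_mask[reg];
    regs[reg] = (uint16_t)((regs[reg] & mask) | (data & (uint16_t)~mask));
}
```

An all-zeros mask makes the write pass through unchanged, which is why a
register with such a mask needs no special case in the masking step itself.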
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402159924-13853-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It indicates the number of elements in the ncs field, so it makes sense to
have it inside NICPeers. Also, in parse_netdev we do not need to access the
container and can work with NICPeers only.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The function moxie_intc_create() is unused; remove it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Its only contents are a dead memcpy. Since it is optional,
drop the function altogether.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
'monitor.h' is still included in target-s390x/kvm.c, since I have
no good way to verify whether other code needs it on my x86 host.
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Param name is declared as optional, since in code it is an optional
one.
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
This patch also eliminates build time warning caused by no caller
of monitor_qapi_event_throttle().
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
This is a pc & q35 only machine opt.
If you add enough PCI devices then all mmio for them will not fit
below 4G which may not be the layout the user wanted. This allows
you to increase the below 4G address space that PCI devices can use
(aka decrease ram below 4G) and therefore in more cases not have any
mmio that is above 4G.
For example using "-machine pc,max-ram-below-4g=2G" on the command
line will limit the amount of ram that is below 4G to 2G.
Note: this machine option cannot be used to increase the amount
of ram below 4G.
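A minimal sketch of how such a cap could split guest RAM around the 4G
boundary. The function, the struct and the default_lowmem parameter are
hypothetical; only the "can lower, cannot raise" rule comes from the text
above:

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

typedef struct {
    uint64_t below_4g; /* RAM mapped below the 4G boundary */
    uint64_t above_4g; /* RAM mapped above the 4G boundary */
} RamSplit;

/*
 * default_lowmem stands for the board's usual low-memory limit (an
 * assumption of this sketch); max_below_4g mirrors the machine option.
 */
RamSplit split_ram(uint64_t ram_size, uint64_t default_lowmem,
                   uint64_t max_below_4g)
{
    RamSplit s;
    uint64_t lowmem = default_lowmem;

    assert(default_lowmem <= 4 * GiB);

    /* The option can only lower the limit, never raise it. */
    if (max_below_4g < lowmem) {
        lowmem = max_below_4g;
    }

    if (ram_size > lowmem) {
        s.below_4g = lowmem;
        s.above_4g = ram_size - lowmem;
    } else {
        s.below_4g = ram_size;
        s.above_4g = 0;
    }
    return s;
}
```

For instance, with 8G of RAM, an assumed 3G default limit and a 2G cap, 2G
ends up below the boundary and the remaining 6G above it.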
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
MST: fix 32 bit
In order to let event defines use existing types later, instead of
redefining new ones, some old type defines for spice and vnc are changed,
and BlockErrorAction is moved from block.h to qapi schema. Note that
BlockErrorAction is not merged with BlockdevOnError.
At this point, VncInfo is not made a child of VncBasicInfo, because
VncBasicInfo has mandatory fields where VncInfo makes them optional.
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
This commit expands all uses of the INLINE macro and drops it.
The reason for this is to avoid clashes with external libraries with
bad name conventions and also because renaming keywords is not a good
practice.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
This is just below_4g_mem_size and above_4g_mem_size, which are used later in QEMU.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The current code is broken: it does surprise removal which crashes guests.
Reimplemented the steps:
- Hotplug triggers both 'present detect change' and
'attention button pressed'.
- Hotunplug starts by triggering 'attention button pressed',
then waits for the OS to power off the device and only
then detaches it.
Fixes CVE-2014-3471.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It is needed by hot-unplug in order to get an indication
from the OS when the device can be physically detached.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Trivial issue, discovered while debugging.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pc-q35-1.4 was incorrectly using PC_COMPAT_1_4 instead of
PC_Q35_COMPAT_1_4.
The only side-effect was that the hpet compat property (inherited from
PC_Q35_COMPAT_1_7) was missing.
Without this patch, pc-q35-1.4 incorrectly initializes hpet-intcap to
0xff0104 (behavior introduced in QEMU 2.0, by commit
7a10ef51c2397ac4323bc786af02c58b413b5cd2).
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Currently vectors are silently cleared to 0 if initialization fails,
but the user should have at least one way to notice this.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Also fix minor indentation issues in the surrounding code.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Enable calling set_ics() from within e1000_autoneg_timer() without
the need for a forward declaration.
This patch contains no functional changes.
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Generate a link status change interrupt once link auto-negotiation
is successfully completed. This does not affect Linux and Windows
(XP and 7 tested) in any way, but is needed by the stock OS X driver
(AppleIntel8254XEthernet.kext), which would otherwise fail to notice
the link status change event.
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>