We currently size the msi window trap page according to the host's page
size so that we poke a working hole into a memory slot in case we overlap.
However, this is only ever necessary with KVM active. Without KVM, we should
rather try to be host platform agnostic and use a constant size: 4k.
This fixes a build breakage on win32 hosts.
Signed-off-by: Alexander Graf <agraf@suse.de>
We will use this in later patches to make sure we use the right load
functions when copying hpte entries.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This makes use of new error codes which load_elf() can return and
prints more informative error message.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently everybody uses ELF kernel images with "-kernel" option on
pseries machine but QEMU still tries to boot from an image even it
fails to recognize it is ELF. This produces undefined behaviour if
the user tries a kernel image compiled for another architecture.
This removes support of raw kernel images.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
[agraf: fix up stray quotes and newlines in strings]
Signed-off-by: Alexander Graf <agraf@suse.de>
Recent changes introduced cannot_instantiate_with_device_add_yet
and removed capability of adding yet another PCI host bridge via
command line for SPAPR platform (POWERPC64 server).
This brings the capability back and puts SPAPR PHB into "bridge"
category.
This is not much use for emulated PHB but it is absolutely required
for VFIO as we put an IOMMU group onto a separate PHB on SPAPR.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Targets like ppc64 support different types of KVM, one which use
hypervisor mode and the other which doesn't. Add a new machine
option kvm-type that helps in selecting the respective ones
We also add a new QEMUMachine callback get_vm_type that helps
in mapping the string representation of kvm type specified.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[agraf: spelling fixes, use error_report(), use qemumachine.h]
Signed-off-by: Alexander Graf <agraf@suse.de>
This is now obsolete - remove the header and all its inclusions.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Inline these usages. Converts these init to at least a semi-recent QOM
styling.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Inline these usages. Converts these init to at least a semi-recent QOM
styling.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Define macros for the interrupt and memory maps for the sake of self
documentation.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Replace them with uint8/32/64.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
While ISA address space in prep machine is currently the one returned
by get_system_io(), this depends of the implementation of i82378/raven
devices, and this may not be the case forever.
Use the right ISA address space when adding some more ports to it.
We can use whatever ISA device on the right ISA bus, as all ISA devices
on the same ISA bus share the same ISA address space.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Many PCI host bridges consist of a sysbus device and a PCI device.
You need both for the thing to work. Arguably, these bridges should
be modelled as a single, composite devices instead of pairs of
seemingly independent devices you can only use together, but we're not
there, yet.
Since the sysbus part can't be instantiated with device_add, yet,
permitting it with the PCI part is useless. We shouldn't offer
useless options to the user, so let's set
cannot_instantiate_with_device_add_yet for them.
It's already set for Bonito, Grackle, i440FX and Raven. Document why.
Set it for the others: dec-21154, e500-host-bridge, gt64120_pci, mch,
pbm-pci, ppc4xx-host-bridge, sh_pci_host, u3-agp, uni-north-agp,
uni-north-internal-pci, uni-north-pci, and versatile_pci_host.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
device_add plugs devices into suitable bus. For "real" buses, that
actually connects the device. For sysbus, the connections need to be
made separately, and device_add can't do that. The device would be
left unconnected, and could not possibly work.
Quite a few, but not all sysbus devices already set
cannot_instantiate_with_device_add_yet in their class init function.
Set it in their abstract base's class init function
sysbus_device_class_init(), and remove the now redundant assignments
from device class init functions.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
In an ideal world, machines can be built by wiring devices together
with configuration, not code. Unfortunately, that's not the world we
live in right now. We still have quite a few devices that need to be
wired up by code. If you try to device_add such a device, it'll fail
in sometimes mysterious ways. If you're lucky, you get an
unmysterious immediate crash.
To protect users from such badness, DeviceClass member no_user used to
make device models unavailable with -device / device_add, but that
regressed in commit 18b6dad. The device model is still omitted from
help, but is available anyway.
Attempts to fix the regression have been rejected with the argument
that the purpose of no_user isn't clear, and it's prone to misuse.
This commit clarifies no_user's purpose. Anthony suggested to rename
it cannot_instantiate_with_device_add_yet_due_to_internal_bugs, which
I shorten somewhat to keep checkpatch happy. While there, make it
bool.
Every use of cannot_instantiate_with_device_add_yet gets a FIXME
comment asking for rationale. The next few commits will clean them
all up, either by providing a rationale, or by getting rid of the use.
With that done, the regression fix is hopefully acceptable.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
This makes sure that all NUMA memory blocks reside within RAM or
have zero length.
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The SPAPR specification says that the RMA starts at the LPAR's logical
address 0 and is the first logical memory block reported in
the LPAR’s device tree.
So SLOF only maps the first block and that block needs to span
the full RMA.
This makes sure that the RMA area is where SLOF expects it.
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The qemu_devtree API is a wrapper around the fdt_ set of APIs.
Rename accordingly.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[agraf: also convert hw/arm/virt.c]
Signed-off-by: Alexander Graf <agraf@suse.de>
spapr-nvram's drive property is currently connected to a non-existent
"-machine nvram=<drivename>" option. Instead, tie it to -pflash like
other non-volatile RAM devices. This provides the following possibilities
for adding a backend for the sPAPR non-volatile RAM:
* -pflash filename
* -drive if=pflash,file=filename,format=raw,...
* -drive if=none,file=filename,format=raw,id=foo,... -global spapr-nvram.drive=foo
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds very basic handlers for ibm,get-system-parameter and
ibm,set-system-parameter RTAS calls.
The only parameter handled at the moment is
"platform-processor-diagnostics-run-mode" which is always disabled and
does not support changing. This is expected to make
"ppc64_cpu --run-mode=1" happy.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: s/papameter/parameter/g]
Signed-off-by: Alexander Graf <agraf@suse.de>
It doesn't make sense for a region to be INT64_MAX in size:
memory core uses UINT64_MAX as a special value meaning
"all 64 bit" this is what was meant here.
While this should never affect the spapr system which at the moment always
has < 63 bit size, this makes us hit all kind of corner case bugs with
sub-pages, so users are probably better off if we just use UINT64_MAX
instead.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Most code already used QEMUTimer without the redundant 'struct' keyword.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The default granularity for the FIT timer on 440 is on every 0x1000th
transition of TB from 0 to 1. Translated that means 48828 times a second.
Since interrupts are quite expensive for 440 and we don't really care
about the accuracy of the FIT to that significance, let's force FIT and
WDT to at best millisecond granularity.
This basically restores behavior as it was in QEMU 1.6, where timers
could only deal with millisecond granularities at all.
This patch greatly improves performance with the 440 target and restores
roughly the same performance level that QEMU 1.6 had for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 1385416015-22775-3-git-send-email-agraf@suse.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Today we fire FIT and WDT timer events every time the respective bit
position in TB flips from 0 -> 1.
However, there is no need to do this if the end result would be that
we're changing a TSR bit that is set to 1 to 1 again. No guest visible
change would have occured.
So whenever we see that the TSR bit to our timer is already set, don't
even bother to update the timer that would potentially fire it off.
However, we do need to make sure that we update our timer that notifies
us of the TB flip when the respective TSR bit gets unset. In that case
we do care about the flip and need to notify the guest again. So add
a callback into our timer handlers when TSR bits get unset.
This improves performance for me when the guest is busy processing things.
Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 1385416015-22775-2-git-send-email-agraf@suse.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
rom_add_blob never fails, and neither does rom_add_blob_fixed,
so there's no need to return value from it.
In fact, rom_add_blob_fixed was erroneously returning -1 unconditionally
which made the only system that checked the return value -M bamboo fail
to start.
Drop the return value and drop checks from ppc440_bamboo to
fix this failure.
Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
compatiblity -> compatibility
continously -> continuously
existance -> existence
usefull -> useful
shoudl -> should
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Instead of relying on cpu_model, obtain the device tree node label
per CPU. Use DeviceClass::fw_name as source.
Whenever DeviceClass::fw_name is unknown, default to "PowerPC,UNKNOWN".
As a consequence, spapr_fixup_cpu_dt() can operate on each CPU's fw_name,
obsoleting sPAPREnvironment::cpu_model, and spapr_create_fdt_skel() can
drop its cpu_model argument.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
This enables IRQFD for LSI (level triggered INTx interrupts) by adding
a spapr_route_intx_pin_to_irq() callback to the sPAPR PCI host bus. This
callback is called to know the global interrupt number to link resampling fd
with IRQFD's fd in KVM.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Recent (host) kernels support emulating the PAPR defined "XICS" interrupt
controller system within KVM. This patch allows qemu to initialize and
configure the in-kernel XICS, and keep its state in sync with qemu's XICS
state as necessary.
This should give considerable performance improvements. e.g. on a simple
IPI ping-pong test between hardware threads, using qemu XICS gives us
around 5,000 irqs/second, whereas the in-kernel XICS gives us around
70,000 irqs/s on the same hardware configuration.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[Mike Qiu <qiudayu@linux.vnet.ibm.com>: fixed mistype which caused ics_set_kvm_state() to fail]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
The upcoming XICS-KVM support will use bits of emulated XICS code.
So this introduces new level of hierarchy - "xics-common" class. Both
emulated XICS and XICS-KVM will inherit from it and override class
callbacks when required.
The new "xics-common" class implements:
1. replaces static "nr_irqs" and "nr_servers" properties with
the dynamic ones and adds callbacks to be executed when properties
are set.
2. xics_cpu_setup() callback renamed to xics_common_cpu_setup() as
it is a common part for both XICS'es
3. xics_reset() renamed to xics_common_reset() for the same reason.
The emulated XICS changes:
1. the part of xics_realize() which creates ICPs is moved to
the "nr_servers" property callback as realize() is too late to
create/initialize devices and instance_init() is too early to create
devices as the number of child devices comes via the "nr_servers"
property.
2. added ics_initfn() which does a little part of what xics_realize() did.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
This moves the xics_cpu_setup() call after kvmppc_set_papr()
in order to get VCPUs initialized as this is required by upcoming
XICS-KVM.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
On the real hardware, RTAS is called in real mode and therefore
top 4 bits of the address passed in the call are ignored.
So does the patch.
This converts h_rtas() to use existing rtas_ld() handlers.
This fixed rtas_ld()/rtas_st() to ignore top 4 bits.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
PAPR+ says that no "ibm,purr" tells the guest that H_PURR is not
supported. However some guests still try calling H_PURR on POWER7 unless
the property is present and equal to 0. This adds the property for CPUs
supporting the PURR special register.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
At the moment the size of the buffer is set to 64K which is
enough for approximately 150 VCPUs which is not the limit.
This increases the buffer up to 256K which allows having
a tree for approximately 600 VCPUs which is way beyond the real
number we need.
As only the real size of the tree is copied to the guest, there
will be no impact on existing configurations.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Try loading the kernel as little endian if it fails big endian.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
* Conversion of global CPU list to QTAILQ - preparing for CPU hot-unplug
* Document X86CPU magic numbers for CPUID cache info
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJSJgdaAAoJEPou0S0+fgE/WqAQAJ6pcTymZO86NLKwcY4dD5Dr
Es2aTs4XFs9V3+gpbH9vOA71n9HanFQp1s4ZUskQ2BVQU8cZeRUKlGhKJfqcEbPF
H5wkxskqgV2Sw8+XWjQk80J/X/W6k10Fit64CUpQqxzd3HwXXzT/QHXzM8t6p79i
KdEAsjaQYqR8/qa7+pd437lLcTiRb51FqB5u3ClbCbIKjnnjswr/ZypKr+CUc9WY
1AzP9UKg0qSxz1yCkgzYHt3eWjfuGhsqn8KXVQfc+37xFRZp0uYQYkCahhwrPRUO
jTg0eJKxoyH76t+2jIsnNHfd6r5zaTmVThGnun/SzJTGj8AFNrz81EfT1niJdp2/
6RdykpWdqqeA3usKoSzBgTEAXGL50tCL0xiREk7hPwflxJqjbjFuVuttkazEcHZf
Q2OS0tUFhYi3yUojms/YJYFUaNUhA033wJSjKGbFfSDdtJdjnxmB2r+LhsH4ByfS
4SPU5zr4up1Yr1dnmIlNUA5W/KMgZseT3shasLhFmODR7wGvrQ7DuEHRs87UQbbM
pedvN92VmWzByEvLNkICJGuaVer+mHznig9f1eOkxXlK4RdNBmAf5QYMU+oxbkUG
fwXu0w7/aUJKpcYl6aYUmkhgn9dB3Oe/WTVLkvfg54MUFKpo4b72AR01+fWT91XO
r8DQQYwP94htozAC6F9n
=/bSY
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging
QOM CPUState refactorings / X86CPU
* Conversion of global CPU list to QTAILQ - preparing for CPU hot-unplug
* Document X86CPU magic numbers for CPUID cache info
# gpg: Signature made Tue 03 Sep 2013 10:59:22 AM CDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found
# By Andreas Färber (3) and Eduardo Habkost (1)
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony:
target-i386: Use #defines instead of magic numbers for CPUID cache info
cpu: Replace qemu_for_each_cpu()
cpu: Use QTAILQ for CPU list
a15mpcore: Use qemu_get_cpu() for generic timers
This includes pc and pci cleanups and enhancements,
and a virtio bugfix for level interrupts.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSIveoAAoJECgfDbjSjVRp2C8IAL7DE0oM0jfEB5DAd8jlULHx
hA8RP21rFzyU8PwtHB+72+C1ImldBge4hvhI+qbsm6PoW3RCeV/lbESIRTiv8dCO
pGUOFmv8MfJAH+WWFsle5mRisoTksYQWWBMHCOqvmaY4JL9pBQOhCLHVhV1XfjtL
hO7uGrWmlijeILv5CxYyPMYuOEdVvRSZKzE+Fp2YKfNstiQrS5fJIlqmwCHrlneW
l2atnt2d9ZV1K8QYiGg4GRVbSAMJvA1wum+0F4gnXIz9yAeOt+Ht1s8cNKQDMouJ
r2OyVgPM9aS/XaO6ejct1Sjo7Vgh/Ublrpw3lFqV/qHix6rEHwy2I3JHFEJPjvk=
=SytJ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pc,pci,virtio fixes and cleanups
This includes pc and pci cleanups and enhancements,
and a virtio bugfix for level interrupts.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 01 Sep 2013 03:15:36 AM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
# By Michael S. Tsirkin (3) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
virtio_pci: fix level interrupts with irqfd
pc: reduce duplication, fix PIIX descriptions
hw: Clean up bogus default boot order
pci: add config space access traces
pc: fix regression for 64 bit PCI memory
pci: Introduce helper to retrieve a PCI device's DMA address space
Message-id: 1378023590-11109-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
This converts old style fprintf to traces.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: change patch subject]
Signed-off-by: Alexander Graf <agraf@suse.de>
PAPR+ requires two RTAS calls to be supported by the hypervisor in
order to allow hotplugging VCPUs from the guest. The "start-cpu" RTAS
call was already there but "stop-self" was not.
This adds the "stop-self" RTAS call.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
H_SET_MODE is used for controlling various partition settings. One
of these settings is the endianness a guest takes its exceptions in.
Signed-off-by: Anton Blanchard <anton@samba.org>
[agraf: fix whitespace]
Signed-off-by: Alexander Graf <agraf@suse.de>
On the sPAPR platform a guest allocates MSI/MSIX vectors via RTAS
hypercalls which return global IRQ numbers to a guest so it only
operates with those and never touches MSIMessage.
Therefore MSIMessage handling is completely hidden in QEMU.
Previously every sPAPR PCI host bridge implemented its own MSI window
to catch msi_notify()/msix_notify() calls from QEMU devices (virtio-pci
or vfio) and route them to the guest via qemu_pulse_irq().
MSIMessage used to be encoded as:
.addr - address within the PHB MSI window;
.data - the device index on PHB plus vector number.
The MSI MR write function translated this MSIMessage to a global IRQ
number and called qemu_pulse_irq().
However the total number of IRQs is not really big (at the moment it is
1024 IRQs starting from 4096) and even 16bit data field of MSIMessage
seems to be enough to store an IRQ number there.
This simplifies MSI handling in sPAPR PHB. Specifically, this does:
1. remove a MSI window from a PHB;
2. add a single memory region for all MSIs to sPAPREnvironment
and spapr_pci_msi_init() to initialize it;
3. encode MSIMessage as:
* .addr - a fixed address of SPAPR_PCI_MSI_WINDOW==0x40000000000ULL;
* .data as an IRQ number.
4. change IRQ allocator to align first IRQ number in a block for MSI.
MSI uses lower bits to specify the vector number so the first IRQ has to
be aligned. MSIX does not need any special allocator though.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
spapr-pci config space accessors use find_dev() to find a PCI device.
However find_dev() only searched on a primary bus and did not do
recursive search through secondary buses so config space access was not
possible for devices other that on a primary bus.
This fixed find_dev() by using the PCI API pci_find_device() function.
This effectively enabled pci bridges on spapr.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
QEMU has 'dtb' option for specifing the device tree file for the kernel.
The patch adds support for this option to the 'virtex_ml507' machine
implementation.
Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Today we generate the device tree once on machine initialization and then
store the finalized blob in memory to reload it on reset.
This is bad for 2 reasons. First we potentially waste a bunch of RAM for no
good reason, as we have all information required to regenerate the device
tree available anyways.
The second reason is even more important. On machine init when we generate
the device tree for the first time, we don't have all of the devices fully
initialized yet. But the device tree needs to potentially walk devices to
put information about them into the device tree.
Move the generation into a reset function. That way we just generate it new
every time we reset, solving both of the above issues.
Signed-off-by: Alexander Graf <agraf@suse.de>
This includes pc and pci cleanups, future-proofing of ROM files,
and a virtio bugfix correcting splice on virtio console.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSGvbsAAoJECgfDbjSjVRp0qsH/2N/b/5bNhL6gx5Kjigzv7Lt
bmCUsSsUm/Rnhke2SpDPVdIOU9W5WIX2zte8qiXuBJi3j/RS9procbgUZrCetpUI
GHQOn+n8e2TX/P9CIcHaN/bA1b+Kx+bvaVbDL/6P2PJvodbDGwDLp/9y+Tisv5Ye
kLow8Y4ZJcaTlmPf/Mh1AnhNa3gel231A3qwugVUNDSyITM6pG/0M07xk8YBj+JJ
6DYrlK0bKMNDxPu5St+YP94D1ODv2zM1aio/TdMUaNfqTZM1iqGTj3zKkBS2PjT0
RhuU2x4N91lY/uCdudtWSj5dYtfRT/xw7qwNAz9IHNz6RlGeX0n++ClMC8z9Y8A=
=Pc09
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into stable-1.5
pc,pci,virtio fixes and cleanups
This includes pc and pci cleanups, future-proofing of ROM files,
and a virtio bugfix correcting splice on virtio console.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 26 Aug 2013 01:34:20 AM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
# By Markus Armbruster (5) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
virtio: virtqueue_get_avail_bytes: fix desc_pa when loop over the indirect descriptor table
pc_piix: Kill pc_init1() memory region args
pc: pc_compat_1_4() now can call pc_compat_1_5()
pc: Create pc_compat_*() functions
pc: Kill pc_init_pci_1_0()
pc: Don't explode QEMUMachineInitArgs into local variables needlessly
pc: Don't prematurely explode QEMUMachineInitArgs
ppc: Don't duplicate QEMUMachineInitArgs in PPCE500Params
ppc: Don't explode QEMUMachineInitArgs into local variables needlessly
sun4: Don't prematurely explode QEMUMachineInitArgs
q35: Add PCIe switch to example q35 configuration
loader: store FW CFG ROM files in RAM
arch_init: align MR size to target page size
pc: cleanup 1.4 compat support
Message-id: 1377535318-30491-1-git-send-email-mst@redhat.com
We set default boot order "cad" in every single machine definition
except "pseries" and "moxiesim", even though very few boards actually
care for boot order, and "cad" makes sense for even fewer.
Machines that care:
* pc and its variants
Accept up to three letters 'a', 'b' (undocumented alias for 'a'),
'c', 'd' and 'n'. Reject all others (fatal with -boot).
* nseries (n800, n810)
Check whether order starts with 'n'. Silently ignored otherwise.
* prep, g3beige, mac99
Extract the first character the machine understands (subset of
'a'..'f'). Silently ignored otherwise.
* spapr
Accept an arbitrary string (vl.c restricts it to contain only
'a'..'p', no duplicates).
* sun4[mdc]
Use the first character. Silently ignored otherwise.
Strip characters these machines ignore from their default boot order.
For all other machines, remove the unused default boot order
alltogether.
Note that my rename of QEMUMachine member boot_order to
default_boot_order and QEMUMachineInitArgs member boot_device to
boot_order has a welcome side effect: it makes every use of boot
orders visible in this patch, for easy review.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is an autogenerated patch using scripts/switch-timer-api.
Switch the entire code base to using the new timer API.
Note this patch may introduce some line length issues.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Pass on the generic arguments unadulterated, and the machine-specific
ones as separate argument.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Don't explode when the variable is used just once, and never changed.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
'dprintf' is the name of a POSIX standard function so we should not be
stealing it for our debug macro. Rename to 'DPRINTF' (in line with
a number of other source files.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Acked-by: Richard Henderson <rth@twiddle.net>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1375100199-13934-5-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Basically, in HW the layout of the interrupt network is:
- One ICP per processor thread (the "presenter"). This contains the
registers to fetch a pending interrupt (ack), EOI, and control the
processor priority.
- One ICS per logical source of interrupts (ie, one per PCI host
bridge, and a few others here or there). This contains the per-interrupt
source configuration (target processor(s), priority, mask) and the
per-interrupt internal state.
Under PAPR, there is a single "virtual" ICS ... somewhat (it's a bit
oddball what pHyp does here, arguably there are two but we can ignore
that distinction). There is no register level access. A pair of firmware
(RTAS) calls is used to configure each virtual interrupt.
So our model here is somewhat the same. We have one ICS in the emulated
XICS which arguably *is* the emulated XICS, there's no point making it a
separate "device", that would just be gross, and each VCPU has an
associated ICP.
Yet we call the "XICS" struct icp_state and then the ICPs
'struct icp_server_state'. It's particularly confusing when all of the
functions have xics_prefixes yet take *icp arguments.
Rename:
struct icp_state -> XICSState
struct icp_server_state -> ICPState
struct ics_state -> ICSState
struct ics_irq_state -> ICSIRQState
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-id: 1374175984-8930-12-git-send-email-aliguori@us.ibm.com
[aik: added ics_resend() on post_load]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
At present, the savevm / migration support for the pseries machine will not
work when KVM is enabled. That's because KVM manages the guest's hash page
table in the host kernel, so qemu has no visibility of it. This patch
fixes this by using new kernel interfaces to extract and reinsert the
guest's hash table during the migration process.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 1374175984-8930-11-git-send-email-aliguori@us.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This adds the necessary support for saving the state of the PAPR virtual
PCI host bridge (or host bridges).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374175984-8930-10-git-send-email-aliguori@us.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This adds the necessary pieces to implement savevm / migration for the
pseries machine. The most complex part here is migrating the hash
table - for the paravirtualized pseries machine the guest's hash page
table is not stored within guest memory, but externally and the guest
accesses it via hypercalls.
This patch uses a hypervisor reserved bit of the HPTE as a dirty bit
(tracking changes to the HPTE itself, not the page it references).
This is used to implement a live migration style incremental save and
restore of the hash table contents.
Normally a hash table is 16MB but it can get bigger depending on how
much RAM the guest has. Due to its nature, updates to it are random so
the live migration style is used for it.
In addition it adds VMStateDescription information to save and restore
the (few) remaining pieces of state information needed by the pseries
machine.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374175984-8930-9-git-send-email-aliguori@us.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Model TCE tables as a device that's hooked up as a child object to
the owner. Besides the code cleanup, we get a few nice benefits:
1) free actually works now (it was dead code before)
2) the TCE information is visible in the device tree
3) we can expose table information as properties such that if we
change the window_size, we can use globals to keep migration
working.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-id: 1374175984-8930-6-git-send-email-aliguori@us.ibm.com
[dwg: pseries: savevm support for PAPR TCE tables]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[alexey: ppc kvm: fix to compile]
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds helpers to allow PAPR VIO devices to save state common
to all VIO devices during savevm.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374175984-8930-3-git-send-email-aliguori@us.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This fixes endianness bugs in I/O port access.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374501278-31549-10-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Do not swap endianness here, it will happen during cpu_{in,out}{b,w,l}.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374501278-31549-6-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This fixes endianness bugs in I/O port access.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374501278-31549-5-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This fixes endianness bugs in I/O port access.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374501278-31549-4-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This fixes endianness bugs in I/O port access.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374501278-31549-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We should only start processing DMA requests when we have data to process.
Hold off working through the DMA shuffling until the IDE core told us that
it's ready.
This is required because the guest can program the DMA engine or the IDE
transfer first. Both are legal.
Signed-off-by: Alexander Graf <agraf@suse.de>
We need to know when the IDE core starts a DMA transfer. Add a notifier
function so we have the chance to start transmitting data.
Signed-off-by: Alexander Graf <agraf@suse.de>
On a real G3 Beige the secondary IDE bus lives on the mac-io chip, not
on some random PCI device. Move it there to become more compatible.
While at it, also clean up the IDE channel connection logic.
Signed-off-by: Alexander Graf <agraf@suse.de>
We can tell the guest the frequency of its time base through fwcfg.
However, we tell it a different value from the speed tb actually runs
at. Let's fix it and make the tbfreq initialization and the fwcfg exposure
use the same values.
Signed-off-by: Alexander Graf <agraf@suse.de>
Allow the user to override the firmware file name rather than always
using "slof.bin".
Reported-by: Dinar Valeev <k0da@opensuse.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
The function returned a target_ulong which was made from unnamed enum
values. The target_ulong was then assigned to an int variable which
was used in a switch statement.
Using a named enum in both cases makes reviews easier.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
i686-w64-mingw32-gcc (GCC) 4.6.3 from Debian wheezy reports these warnings:
hw/ppc/spapr_hcall.c:188:1: warning:
control reaches end of non-void function [-Wreturn-type]
hw/ppc/spapr_pci.c:454:1: warning:
control reaches end of non-void function [-Wreturn-type]
Both warnings are fixed by using g_assert_not_reached instead of assert.
A second line with assert(0) in spapr_pci.c which did not raise a compiler
warning was modified, too, because g_assert_not_reached documents the
purpose of that statement and is not removed in release builds.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Since current_cpu is CPUState it no longer depends on CPUPPCState.
Move ppce500_set_mpic_proxy() to a new hw/ppc/ppc_e500.h because
hw/ppc/ppc.h is too heavily using CPUPPCState and PowerPCCPU.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Move next_cpu from CPU_COMMON to CPUState.
Move first_cpu variable to qom/cpu.h.
gdbstub needs to use CPUState::env_ptr for now.
cpu_copy() no longer needs to save and restore cpu_next.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Rebased, simplified cpu_copy()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The previous two commits fixed bugs in -machine option queries. I
can't find fault with the remaining queries, but let's use
qemu_get_machine_opts() everywhere, for consistency, simplicity and
robustness.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1372943363-24081-7-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Multiple -machine options with the same ID are merged. All but the
one without an ID are to be silently ignored.
In most places, we query these options with a null ID. This is
correct.
In some places, we instead query whatever options come first in the
list. This is wrong. When the -machine processed first happens to
have an ID, options are taken from that ID, and the ones specified
without ID are silently ignored.
Example:
$ upstream-qemu -nodefaults -S -display none -monitor stdio -machine id=foo -machine accel=kvm,usb=on
$ upstream-qemu -nodefaults -S -display none -monitor stdio -machine id=foo,accel=kvm,usb=on -machine accel=xen
$ upstream-qemu -nodefaults -S -display none -monitor stdio -machine accel=xen -machine id=foo,accel=kvm,usb=on
$ qemu-system-x86_64 -nodefaults -S -display none -monitor stdio -machine accel=kvm,usb=on
QEMU 1.5.50 monitor - type 'help' for more information
(qemu) info kvm
kvm support: enabled
(qemu) info usb
(qemu) q
$ qemu-system-x86_64 -nodefaults -S -display none -monitor stdio -machine id=foo -machine accel=kvm,usb=on
QEMU 1.5.50 monitor - type 'help' for more information
(qemu) info kvm
kvm support: disabled
(qemu) info usb
(qemu) q
$ qemu-system-x86_64 -nodefaults -S -display none -monitor stdio -machine id=foo,accel=kvm,usb=on -machine accel=xen
QEMU 1.5.50 monitor - type 'help' for more information
(qemu) info kvm
kvm support: enabled
(qemu) info usb
USB support not enabled
(qemu) q
$ qemu-system-x86_64 -nodefaults -S -display none -monitor stdio -machine accel=xen -machine id=foo,accel=kvm,usb=on
xc: error: Could not obtain handle on privileged command interface (2 = No such file or directory): Internal error
xen be core: can't open xen interface
failed to initialize Xen: Operation not permitted
Option usb is queried correctly, and the one without an ID wins,
regardless of option order.
Option accel is queried incorrectly, and which one wins depends on
option order and ID.
Affected options are accel (and its sugared forms -enable-kvm and
-no-kvm), kernel_irqchip, kvm_shadow_mem.
Additionally, option kernel_irqchip is normally on by default, except
it's off when no -machine options are given. Bug can't bite, because
kernel_irqchip is used only when KVM is enabled, KVM is off by
default, and enabling always creates -machine options. Downstreams
that enable KVM by default do get bitten, though.
Use qemu_get_machine_opts() to fix these bugs.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1372943363-24081-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This includes some pci enhancements:
Better support for systems with multiple PCI root buses
FW cfg interface for more robust pci programming in BIOS
Minor fixes/cleanups for fw cfg and cross-version migration -
because of dependencies with other patches
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJR2ctmAAoJECgfDbjSjVRpQpAH/Rk00yLrQ2R5ScNa8AL9LeaJ
gVFndBmmuRz4gdhyATx6lzR98ic32iTr0+YR5mL51btgmM5a0bEd/SIu34nXriWj
PsM0wdXfo/oEygdttxhvzJOH17tohRV9xg2WA2d8BEwDzrDyqoQ4J0VJlHlG7u3W
nq4KVDVUpLNQFKG8ZgJ2vW0WMw/mBSj2rluhQUALhcuvChphtvAFZ2rsSfJr6bzD
aBELrtIvfLvPGN/0WVeYs9qlp4EE03H3X6gN61QvV3/YElxubKUV5XyMDOX2dW3D
2j0NQi84LYHn0SFap2r/Kgm47/F6Q56SFk5lrgZrg60mhQTwocw7PfL8CGxjXRI=
=gxxc
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci,misc enhancements
This includes some pci enhancements:
Better support for systems with multiple PCI root buses
FW cfg interface for more robust pci programming in BIOS
Minor fixes/cleanups for fw cfg and cross-version migration -
because of dependencies with other patches
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 07 Jul 2013 03:11:18 PM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
# By David Gibson (10) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
pci: Fold host_buses list into PCIHostState functionality
pci: Remove domain from PCIHostBus
pci: Simpler implementation of primary PCI bus
pci: Add root bus parameter to pci_nic_init()
pci: Add root bus argument to pci_get_bus_devfn()
pci: Replace pci_find_domain() with more general pci_root_bus_path()
pci: Use helper to find device's root bus in pci_find_domain()
pci: Abolish pci_find_root_bus()
pci: Move pci_read_devaddr to pci-hotplug-old.c
pci: Cleanup configuration for pci-hotplug.c
pvpanic: fix fwcfg for big endian hosts
pvpanic: initialization cleanup
MAINTAINERS: s/Marcelo/Paolo/
e1000: cleanup process_tx_desc
pc_piix: cleanup init compat handling
pc: pass PCI hole ranges to Guests
pci: store PCI hole ranges in guestinfo structure
range: add Range structure
Message-id: 1373228271-31223-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
At present, pci_nic_init() and pci_nic_init_nofail() assume that they will
only create a NIC under the primary PCI root. As we add support for
multiple PCI roots, that may no longer be the case. This patch adds a root
bus parameter to pci_nic_init() (and updates callers accordingly) to allow
the machine init code using it to specify the right PCI root for NICs
created by old-style -net nic parameters. NICs created new-style, with
-device can of course be put anywhere.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_find_domain() is used in a number of places where we want an id for a
whole PCI domain (i.e. the subtree under a PCI root bus). The trouble is
that many platforms may support multiple independent host bridges with no
hardware supplied notion of domain number.
This patch, therefore, replaces calls to pci_find_domain() with calls to
a new pci_root_bus_path() returning a string. The new call is implemented
in terms of a new callback in the host bridge class, so it can be defined
in some way that's well defined for the platform. When no callback is
available we fall back on the qbus name.
Most current uses of pci_find_domain() are for error or informational
messages, so the change in identifiers should be harmless. The exception
is pci_get_dev_path(), whose results form part of migration streams. To
maintain compatibility with old migration streams, the PIIX PCI host is
altered to always supply "0000" for this path, which matches the old domain
number (since the code didn't actually support domains other than 0).
For the pseries (spapr) PCI bridge we use a different platform-unique
identifier (pseries machines can routinely have dozens of PCI host
bridges). Theoretically that breaks migration streams, but given that we
don't yet have migration support for pseries, it doesn't matter.
Any other machines that have working migration support including PCI
devices will need to be updated to maintain migration stream compatibility.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>