We need to set cs->halted to 1 before calling ppc_set_compat. The reason
is that ppc_set_compat kicks up the new thread created to manage the
hotplugged KVM virtual CPU and the code drives directly to KVM_RUN
ioctl. When cs->halted is 1, the code:
int kvm_cpu_exec(CPUState *cpu)
...
if (kvm_arch_process_async_events(cpu)) {
atomic_set(&cpu->exit_request, 0);
return EXCP_HLT;
}
...
returns before it reaches KVM_RUN, giving time to the main thread to
finish its job. Otherwise we can fall in a deadlock because the KVM
thread will issue the KVM_RUN ioctl while the main thread is setting up
KVM registers. Depending on how these jobs are scheduled we'll end up
freezing QEMU.
The following output shows kvm_vcpu_ioctl sleeping because it cannot get
the mutex and never will.
PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr.
STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL
PID: 61564 TASK: c000003e981e0780 CPU: 48 COMMAND: "qemu-system-ppc"
#0 [c000003e982679a0] __schedule at c000000000b10a44
#1 [c000003e98267a60] schedule at c000000000b113a8
#2 [c000003e98267a90] schedule_preempt_disabled at c000000000b11910
#3 [c000003e98267ab0] __mutex_lock at c000000000b132ec
#4 [c000003e98267bc0] kvm_vcpu_ioctl at c00800000ea03140 [kvm]
#5 [c000003e98267d20] do_vfs_ioctl at c000000000407d30
#6 [c000003e98267dc0] ksys_ioctl at c000000000408674
#7 [c000003e98267e10] sys_ioctl at c0000000004086f8
#8 [c000003e98267e30] system_call at c00000000000b488
crash> struct -x kvm.vcpus 0xc000003da0000000
vcpus = {0xc000003db4880000, 0xc000003d52b80000, 0xc0000039e9c80000, 0xc000003d0e200000, 0xc000003d58280000, 0x0, 0x0, ...}
crash> struct -x kvm_vcpu.mutex.owner 0xc000003d58280000
mutex.owner = {
counter = 0xc000003a23a5c881 <- flag 1: waiters
},
crash> bt 0xc000003a23a5c880
PID: 61579 TASK: c000003a23a5c880 CPU: 9 COMMAND: "CPU 4/KVM"
(active)
crash> struct -x kvm_vcpu.mutex.wait_list 0xc000003d58280000
mutex.wait_list = {
next = 0xc000003e98267b10,
prev = 0xc000003e98267b10
},
crash> struct -x mutex_waiter.task 0xc000003e98267b10
task = 0xc000003e981e0780
The following command-line was used to reproduce the problem (note: gdb
and trace can change the results).
$ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \
-enable-kvm -m 4096 \
-smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \
-display none -nographic \
-drive file=disk1.qcow2,format=qcow2
...
(qemu) device_add host-spapr-cpu-core,core-id=4
[no interaction is possible after it, only SIGKILL to take the terminal
back]
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_init_cpus() currently creates spapr-cpu-core objects via
object_new() and setting their realized property to true. This leaves
their reference count at two, because object_new() adds an initial
reference and the realization attaches them to a default parent object
which also increments the reference count.
This causes a problem if one of these cores is hot unplugged: no
delete event is generated for it because it's reference count doesn't
reach zero when it is detached from it's parent.
Correct this by adding a call to object_unref() in spapr_init_cpus().
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This enables the correct generation of bootdevice fw paths for in-built IDE
and virtio-pci-blk devices suitable for OpenBIOS.
Note we also set the MachineClass ignore_boot_device_suffixes property to true
since an additional disk node should not be added except for virtio devices.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This enables the correct generation of bootdevice fw paths for in-built IDE
and virtio-pci-blk devices suitable for OpenBIOS.
Note we also set the MachineClass ignore_boot_device_suffixes property to true
since an additional disk node should not be added except for virtio devices.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The generated qapi_event_send_FOO() take an Error ** argument. They
can't actually fail, because all they do with the argument is passing it
to functions that can't fail: the QObject output visitor, and the
@qmp_emit callback, which is either monitor_qapi_event_queue() or
event_test_emit().
Drop the argument, and pass &error_abort to the QObject output visitor
and @qmp_emit instead.
Suggested-by: Eric Blake <eblake@redhat.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180815133747.25032-4-peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message rewritten, update to qapi-code-gen.txt corrected]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Commit 2c88b098e7 added a call to SPAPR_MACHINE_GET_CLASS(spapr) in
spapr_phb_realize() before we check spapr isn't NULL. This causes QEMU
to crash when starting a non-pseries machine with a sPAPR PHB.
This could be fixed by setting the smc variable after the null check,
but it seems more explicit to use a ternary operator to skip the call
to SPAPR_MACHINE_GET_CLASS() if spapr is NULL, since spapr_phb_realize()
will return immediately in this case.
This was reported by Coverity (CID 1395170 and 1395183).
Fixes: 2c88b098e7
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduced in 04d595b300 ("spapr: do not use CPU_FOREACH_REVERSE",
2018-08-23)
Fixes: CID1395181
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There is no known available OS for ppc around anymore that uses page
sizes below 4k, so it does not make much sense that we keep wasting
our time on building and testing the ppcemb-softmmu target. It has
been deprecated since two releases, and nobody complained, so let's
remove this now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* qumu-guest-agent freeze-hook tweak (Christian)
* pm_smbus improvements (Corey)
* Move validation to pre_plug for pc-dimm (David)
* Fix memory leaks (Eduardo, Marc-André)
* synchronization profiler (Emilio)
* Convert the CPU list to RCU (Emilio)
* LSI support for PPR Extended Message (George)
* vhost-scsi support for protection information (Greg)
* Mark mptsas as a storage device in the help (Guenter)
* checkpatch tweak cherry-picked from Linux (me)
* Typos, cleanups and dead-code removal (Julia, Marc-André)
* qemu-pr-helper support for old libmultipath (Murilo)
* Annotate fallthroughs (me)
* MemoryRegionOps cleanup (me, Peter)
* Make s390 qtests independent from libqos, which doesn't actually support it (me)
* Make cpu_get_ticks independent from BQL (me)
* Introspection fixes (Thomas)
* Support QEMU_MODULE_DIR environment variable (ryang)
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlt+5OYUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPtxwf8CQM/F+0L+EKeYfYcVgVZsDhhOkLj
Pm61q0bZsWKLby5jCqIDYw7Z/vodJnSS1DO0slIRoXxvQ9DwlkbBnBy/aG/E9U0q
WF1vbCezibDIt7sGcsu9F5zXU9eqe+E6dZfxFrv8FQSOFVxn34TfeJagWLCtzg0d
LnVTF/e4zJD8IQiM7w6lJQxua3fz13ssPEg2KnMkguDhACMwvZ/K/cA2AJkHRMhY
sroPMwLHlrF1NOoeCIrWxYUmSGCRCAy1DmiPGiiSs0yBq/dL0UkAa5Eu6HMQ7rgI
zUff3JDmzEjixUSIEbpVRN+yPCN0/ACSOpJUrKLDxXbc4nZ+PBQ04YpyPQ==
=UZiV
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* x86 TCG fixes for 64-bit call gates (Andrew)
* qumu-guest-agent freeze-hook tweak (Christian)
* pm_smbus improvements (Corey)
* Move validation to pre_plug for pc-dimm (David)
* Fix memory leaks (Eduardo, Marc-André)
* synchronization profiler (Emilio)
* Convert the CPU list to RCU (Emilio)
* LSI support for PPR Extended Message (George)
* vhost-scsi support for protection information (Greg)
* Mark mptsas as a storage device in the help (Guenter)
* checkpatch tweak cherry-picked from Linux (me)
* Typos, cleanups and dead-code removal (Julia, Marc-André)
* qemu-pr-helper support for old libmultipath (Murilo)
* Annotate fallthroughs (me)
* MemoryRegionOps cleanup (me, Peter)
* Make s390 qtests independent from libqos, which doesn't actually support it (me)
* Make cpu_get_ticks independent from BQL (me)
* Introspection fixes (Thomas)
* Support QEMU_MODULE_DIR environment variable (ryang)
# gpg: Signature made Thu 23 Aug 2018 17:46:30 BST
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (69 commits)
KVM: cleanup unnecessary #ifdef KVM_CAP_...
target/i386: update MPX flags when CPL changes
i2c: pm_smbus: Add the ability to force block transfer enable
i2c: pm_smbus: Don't delay host status register busy bit when interrupts are enabled
i2c: pm_smbus: Add interrupt handling
i2c: pm_smbus: Add block transfer capability
i2c: pm_smbus: Make the I2C block read command read-only
i2c: pm_smbus: Fix the semantics of block I2C transfers
i2c: pm_smbus: Clean up some style issues
pc-dimm: assign and verify the "addr" property during pre_plug
pc: drop memory region alignment check for 0
util/oslib-win32: indicate alignment for qemu_anon_ram_alloc()
pc-dimm: assign and verify the "slot" property during pre_plug
ipmi: Use proper struct reference for BT vmstate
vhost-scsi: expose 't10_pi' property for VIRTIO_SCSI_F_T10_PI
vhost-scsi: unify vhost-scsi get_features implementations
vhost-user-scsi: move host_features into VHostSCSICommon
cpus: allow cpu_get_ticks out of BQL
cpus: protect TimerState writes with a spinlock
seqlock: add QemuLockable support
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We can assign and verify the address before realizing and trying to plug.
reading/writing the address property should never fail for DIMMs, so let's
reduce error handling a bit by using &error_abort. Getting access to the
memory region now might however fail. So forward errors from
get_memory_region() properly.
As all memory devices should use the alignment of the underlying memory
region for guest physical address asignment, do detection of the
alignment in pc_dimm_pre_plug(), but allow pc.c to overwrite the
alignment for compatibility handling.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180801133444.11269-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We can assign and verify the slot before realizing and trying to plug.
reading/writing the slot property should never fail, so let's reduce
error handling a bit by using &error_abort.
To do this during pre_plug, add and use (x86, ppc) pc_dimm_pre_plug().
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180801133444.11269-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This paves the way for implementing the CPU list with an RCU list,
which cannot be traversed in reverse order.
Note that this is the only caller of CPU_FOREACH_REVERSE.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20180819091335.22863-11-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is currently a funny problem with the "mc146818rtc" device:
1) Start QEMU like this:
qemu-system-ppc64 -M pseries -S
2) At the HMP monitor, enter "info qom-tree". Note that there is an
entry for "/rtc (spapr-rtc)".
3) Introspect the mc146818rtc device like this:
device_add mc146818rtc,help
4) Run "info qom-tree" again. The "/rtc" entry is gone now!
The rtc_finalize() function of the mc146818rtc device has two bugs: First,
it tries to remove a "rtc" property, while the rtc_realizefn() added a
"rtc-time" property instead. And second, it should have been done in an
unrealize function, not in a finalize function, to avoid that this causes
problems during introspection.
But since adding aliases to the global machine state should not be done
from a device's realize function anyway, let's rather fix this issue
by moving the creation of the alias to the code that creates the device
(and thus is run from the machine init functions instead), i.e. the
mc146818_rtc_init() function for most machines. The prep machines are
special, since the mc146818rtc device is created here in the realize
function of the i82378 device. Since we certainly don't want to add the
alias there, we add it to some code that is called from the ibm_40p_init()
machine init function instead.
Since the alias is now only created during the machine init, we can remove
the object_property_del() completely.
Fixes: 654a36d857
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1534419358-10932-5-git-send-email-thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It should save us some CPU cycles as these routines perform a lot of
checks.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead initialise the device via qdev to allow us to set device properties
directly as required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead initialise the device via qdev to allow us to set device properties
directly as required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead initialise the device via qdev to allow us to set device properties
directly as required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
- prep machine is a fictional machine, so has no specifications. Which
devices can be changed/added/removed without impact? Are interrupts
correctly mapped?
- prep firmware (OHW) has support only for IDE drives (no SCSI).
Booting from IDE has been broken approximatively 3 years ago, and nobody complained.
- OHW is limited on IDE boot to a specific set of OS loaders.
These operating systems are of the 2004 time frame.
- OHW can use -kernel. Linux kernel freezes a long time after PS/2 mouse
detection, and then screen becomes garbage. This was already broken in
QEMU v2.7, 2 years ago, and nobody complained.
On the other side:
- 40p is a real machine, so emulation can be checked against
hardware specifications
- OpenBIOS has support for SCSI block devices, including 40p LSI adapter
- OpenBIOS can start mostly all Linux kernels (including recent ones)
and recent operating system (like NetBSD 7.1.2)
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
[dwg: Drop prep from boot-serial test to avoid deprecation warnings]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This proposal moves all the related IRQ routines of the sPAPR machine
behind a sPAPR IRQ backend interface 'spapr_irq' to prepare for future
changes. First of which will be to increase the size of the IRQ number
space, then, will follow a new backend for the POWER9 XIVE IRQ controller.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Convert the devices in ppc405_uc away from using the old_mmio
MemoryRegion accessors:
* opba's 32-bit and 16-bit accessors were just calling the
8-bit accessors and assembling a big-endian order number,
which we can do by setting the .impl.max_access_size to 1
and the endianness to DEVICE_BIG_ENDIAN, and letting the
core memory code do the assembly
* ppc405_gpio's accessors were all just stubs
* ppc4xx_gpt's 8-bit and 16-bit accessors were treating the
access as invalid, which we can do by setting the
.valid.min_access_size and .valid.max_access_size fields
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Switch the ref405ep_fpga device away from using the old_mmio
MemoryRegion accessors.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The prep machine has some code which is stubs of accessors
for XCSR registers. This has been disabled via #if 0
since commit b6b8bd1819 in 2004, and doesn't have any
actual interesting content. It also uses the deprecated
old_mmio accessor functions. Remove it entirely.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This proposal introduces a new IRQ number space layout using static
numbers for all devices, depending on a device index, and a bitmap
allocator for the MSI IRQ numbers which are negotiated by the guest at
runtime.
As the VIO device model does not have a device index but a "reg"
property, we introduce a formula to compute an IRQ number from a "reg"
value. It should minimize most of the collisions.
The previous layout is kept in pre-3.1 machines raising the
'legacy_irq_allocation' machine class flag.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
VMStateDescription vmstate_spapr_cpu_state was added by commit
b94020268e (spapr_cpu_core: migrate per-CPU data) to migrate per-CPU
data with the required vmstate registration and unregistration calls.
However the unregistration is being done only from vcpu creation error path
and not from CPU delete path.
This causes migration to fail with the following error if migration is
attempted after a CPU unplug like this:
Unknown savevm section or instance 'spapr_cpu' 16
Additionally this leaves the source VM unresponsive after migration failure.
Fix this by ensuring the vmstate_unregister happens during CPU removal.
Fixing this becomes easier when vmstate (un)registration calls are moved to
vcpu (un)realize functions which is what this patch does.
Fixes: https://bugs.launchpad.net/qemu/+bug/1785972
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For the older machines (such as Mac and SPARC) the DT nodes representing
bootdevices for disk nodes are irregular for mainly historical reasons.
Since the majority of bootdevice nodes for these machines either do not have a
separate disk node or require different (custom) names then it is much easier
for processing to just disable all suffixes for a particular machine.
Introduce a new ignore_boot_device_suffixes MachineClass property to control
bootdevice suffix generation, defaulting to false in order to preserve
compatibility.
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20180810124027.10698-1-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The four interrupts of the PCI bus are connected to the same UIC pin
on the real Sam460ex. Evidence for this can be found in the UBoot
source for the Sam460ex in the Sam460ex.c file where
PCI_INTERRUPT_LINE is written. Change the ppc440_pcix model to behave
more like this.
This fixes the problem that can be observed when adding further PCI
cards that got their interrupt rotated to other interrupts than PCI
INT A. In particular, the bug was observed with an additional OHCI PCI
card or an ES1370 sound device.
Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Tested-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 51b0d834c changed error handling to report file name in error
message but forgot to move freeing it after usage. Noticed by Coverity.
Fixes: CID 1394217
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This function was introduced between v2.11 and v2.12 to replace obsolete
ways of specifying the NUMA nodes for DIMMs. It's used to find the correct
node for an LMB, by locating which DIMM object it lies within.
Unfortunately, one of the checks is inverted, so we check whether the
address is less than two different things, rather than actually checking
a range. This introduced a regression, meaning that after a reboot qemu
will advertise incorrect node information for memory to the guest.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
sam460ex_load_device_tree() handles nearly all possible errors by simply
exiting (within helper functions and macros). It handles two early error
cases by returning an error.
There's no particular point to this, so make it handle those directly as
well, removing the need for the caller to handle a failure. As a bonus it
gives us more specific error messages.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The qemu_fdt_*() helper functions already exit with a message instead of
returning errors, so we don't need to check for errors in the caller.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In a couple of places sam460ex_load_device_tree() calls "raw" libfdt
functions which can fail, but doesn't check for error codes. At best,
if these fail the guest will be silently started in a non-standard state,
or it could fail entirely.
Fix this by using the _FDT() helper macro which aborts on a libfdt failure.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 29f9cef "ppc: Include vga cirrus card into the compiling process"
changed the default display adapter for all PPC machines to cirrus. Unfortunately
it missed setting the default display type to stdvga for both PReP machines
causing the display to fail to initialise under OpenHackWare.
Update the MachineClass for both prep and 40p machines so that the default
std(vga) display adapter is the default if no options are specified
which fixes the display for the PReP machines.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
commit efe2add7cb ("spapr/vio: deprecate the "irq" property")
introduced get/set accessors for the "irq" property to warn of its
usage, but the warning in the get pollutes the monitor 'info qtree'.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 29f9cef39e "ppc: Include vga cirrus card into the compiling process"
changed the default display adapter for all PPC machines to cirrus. Unfortunately
it missed setting the default display type to stdvga for both Mac machines
causing the display to fail to initialise under OpenBIOS.
Update the MachineClass for both Old World and New World Macs so that the
default std(vga) display adapter is the default if no options are specified
which fixes the display for the Mac machines.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Here's a last minue pull request before today's soft freeze. Ideally
I would have sent this earlier, but I was waiting for a couple of
extra fixes I knew were close. And the freeze crept up on me, like
always.
Most of the changes here are bugfixes in any case. There are some
cleanups as well, which have been in my staging tree for a little
while. There are a couple of truly new features (some extensions to
the sam460ex platform), but these are low risk, since they only affect
a new and not really stabilized machine type anyway.
Higlights are:
* Mac platform improvements from Mark Cave-Ayland
* Sam460ex improvements from BALATON Zoltan et al.
* XICS interrupt handler cleanups from Cédric Le Goater
* TCG improvements for atomic loads and stores from Richard
Henderson
* Assorted other bugfixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAls7D8oACgkQbDjKyiDZ
s5Lxmg//YzPfC/nKqTTKkyJPzh/NnSC+kRTMAT3mbxdRIc7yfgMqJtWGGbS1iKgK
EeJ9hl5Qm0HfscfDuzf0xasU62ZEv3kNdLnWJEIgkqiXrxoO5KCnC0y4D8NN1W03
mvINNCa8+QDg2OsirGmNUTkriiG3wLIrHTpLZ4+JuC2Bd9H3nTHZgJ0MXON/1VWY
oRgr6kMZ5+IAzPhvYLFR6l3nPI883fgJOFyRo7YqYrkVBKFrFkfK0Xjw6vpsNxcx
2dE/YCHhNIriLuBG5noewL7GuqZRtLnl6rjjee5VAKIe1EmFeR+jsXwNjzGOVOJg
dhjOtsJsQQ3WdEw5uImJzE64kV228WCgmkeXzZd1010JBLr7sUkrd2EuoZ23vvat
uvZAHVSBrJg5WvzMo1VMEoPU3VeeZQ5HL+MI80iKiU6oUgRK11gVJcebtA0sEKt+
zhJC4JiUlHtZLTGIpMBmU8DJZ3Tyk1cBEm+Ky+SaPE+dsz16UHI0fazFQXJnXphE
MLHEGAyQgzWYp7kIcAjUFev0Geq/Uovy4JKIGI6ISop1wRPEQDxkthfkfRyQxQkE
zuse4EBcEH/Undw9KrmEQa0hCe+8BRkxklVbPesFPPdqH3PKNxtHYuWpSShQF0PW
XMjw43O2Rbsl8kBUHCpy4pYSugD1hpfgaw/mVUOU1u/M1O6toTw=
=AHrx
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-3.0-20180703' into staging
ppc patch queue 2018-07-03
Here's a last minue pull request before today's soft freeze. Ideally
I would have sent this earlier, but I was waiting for a couple of
extra fixes I knew were close. And the freeze crept up on me, like
always.
Most of the changes here are bugfixes in any case. There are some
cleanups as well, which have been in my staging tree for a little
while. There are a couple of truly new features (some extensions to
the sam460ex platform), but these are low risk, since they only affect
a new and not really stabilized machine type anyway.
Higlights are:
* Mac platform improvements from Mark Cave-Ayland
* Sam460ex improvements from BALATON Zoltan et al.
* XICS interrupt handler cleanups from Cédric Le Goater
* TCG improvements for atomic loads and stores from Richard
Henderson
* Assorted other bugfixes
# gpg: Signature made Tue 03 Jul 2018 06:55:22 BST
# gpg: using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-3.0-20180703: (35 commits)
ppc: Include vga cirrus card into the compiling process
target/ppc: Relax reserved bitmask of indexed store instructions
target/ppc: set is_jmp on ppc_tr_breakpoint_check
spapr: compute default value of "hpt-max-page-size" later
target/ppc/kvm: don't pass cpu to kvm_get_smmu_info()
target/ppc/kvm: get rid of kvm_get_fallback_smmu_info()
ppc440_uc: Basic emulation of PPC440 DMA controller
sam460ex: Add RTC device
hw/timer: Add basic M41T80 emulation
ppc4xx_i2c: Rewrite to model hardware more closely
hw/ppc: Give sam46ex its own config option
fpu_helper.c: fix setting FPSCR[FI] bit
target/ppc: Implement the rest of gen_st_atomic
target/ppc: Implement the rest of gen_ld_atomic
target/ppc: Use atomic min/max helpers
target/ppc: Use MO_ALIGN for EXIWX and ECOWX
target/ppc: Split out gen_st_atomic
target/ppc: Split out gen_ld_atomic
target/ppc: Split out gen_load_locked
target/ppc: Tidy gen_conditional_store
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# hw/ppc/spapr.c
Drivers for this card exists on PPC-based AmigaOS guests so it is useful to
allow users to emulate the graphics card for PPC machines.
As cirrus vga is currently preferred over std(vga) in absence of any user
choice, this change also sets the default display of spapr machines to
std as otherwise qemu refuses to start these machines. Not specifying an
explicit graphics mode is for instance done by 'make check'.
Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It is currently not possible to run a pseries-2.12 or older machine
with HV KVM. QEMU prints the following and exits right away.
qemu-system-ppc64: KVM doesn't support for base page shift 34
The "hpt-max-page-size" capability was recently added to spapr to hide
host configuration details from HPT mode guests. Its default value for
newer machine types is 64k.
For backwards compatibility, pseries-2.12 and older machine types need
a different value. This is handled as usual in a class init function.
The default value is 16G, ie, all page sizes supported by POWER7 and
newer CPUs, but HV KVM requires guest pages to be hpa contiguous as
well as gpa contiguous. The default value is the page size used to
back the guest RAM in this case.
Unfortunately kvmppc_hpt_needs_host_contiguous_pages()->kvm_enabled() is
called way before KVM init and returns false, even if the user requested
KVM. We thus end up selecting 16G, which isn't supported by HV KVM. The
default value must be set during machine init, because we can safely
assume that KVM is initialized at this point.
We fix this by moving the logic to default_caps_with_cpu(). Since the
user cannot pass cap-hpt-max-page-size=0, we set the default to 0 in
the pseries-2.12 class init function and use that as a flag to do the
real work.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
PPC440 SoCs such as the AMCC 460EX have a DMA controller which is used
by AmigaOS on the sam460ex. Implement the parts used by AmigaOS so it
can get further booting on the sam460ex machine.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Sam460ex has an M41T80 serial RTC chip on I2C bus 0 at address 0x68.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
At present the Sam460ex board is activated by the general CONFIG_PPC4XX
option. However that includes the board for both ppc-softmmu and
(deprecated) ppcemb-softmmu builds. As Sam460ex is developed, that would
require adding more things into ppcemb-softmmu, which we don't want to do.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
commit d35aefa9ae ("ppc/pnv: introduce a new intc_create() operation
to the chip model") changed the object link in the pnv_core_realize()
routine but a return was forgotten in case of error, which can lead to
more problems afterwards (segv)
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
With the previous changes, we can now let the ICS_KVM class inherit
directly from ICS_BASE class and not from the intermediate ICS_SIMPLE.
It makes the class hierarchy much cleaner.
What is left in the top classes is the low level interface to access
the KVM XICS device in ICS_KVM and the XICS emulating handlers in
ICS_SIMPLE.
This should not break migration compatibility.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
sam460ex (or at least this emulation) does not support the "ibm,cpm" power
management. As a result, Linux crashes when trying to access it. Remove
its device tree node. Also, if/when we boot the Linux kernel directly,
serial port clock frequencies in the device tree file will be unset, and
serial port initialization will fail. Add valid frequency values to
the serial ports to be able to use it. Also set valid values for the other
clock nodes otherwise set by u-boot.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 84051eb400 "adb: add property to disable direct reg 3 writes" added a
workaround for MacOS 9 incorrectly setting the mouse address during boot of
PMU machines.
Further testing has shown that since fb6649f172 "adb: fix read reg 3 byte
ordering" this can still sometimes happen with the CUDA mac99 machine,
so let's enable this workaround for all New World machines using ADB for now.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It eases code review, unit is explicit.
Patch generated using:
$ git grep -E '(1024|2048|4096|8192|(<<|>>).?(10|20|30))' hw/ include/hw/
and modified manually.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20180625124238.25339-33-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These files don't use anything exposed by "qemu/cutils.h",
simplify preprocessing including directly "qemu/units.h".
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts)
Message-Id: <20180625124238.25339-7-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's try to reduce error handling a bit. In the plug/unplug case, the
device was realized and therefore we can assume that getting access to
the memory region will not fail.
For get_vmstate_memory_region() this is already handled that way.
Document both cases.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-13-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's rename it to make it look more consistent.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-4-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently during KVM initialization on POWER, kvm_fixup_page_sizes()
rewrites a bunch of information in the cpu state to reflect the
capabilities of the host MMU and KVM. This overwrites the information
that's already there reflecting how the TCG implementation of the MMU will
operate.
This means that we can get guest-visibly different behaviour between KVM
and TCG (and between different KVM implementations). That's bad. It also
prevents migration between KVM and TCG.
The pseries machine type now has filtering of the pagesizes it allows the
guest to use which means it can present a consistent model of the MMU
across all accelerators.
So, we can now replace kvm_fixup_page_sizes() with kvm_check_mmu() which
merely verifies that the expected cpu model can be faithfully handled by
KVM, rather than updating the cpu model to match KVM.
We call kvm_check_mmu() from the spapr cpu reset code. This is a hack:
conceptually it makes more sense where fixup_page_sizes() was - in the KVM
cpu init path. However, doing that would require moving the platform's
pagesize filtering much earlier, which would require a lot of work making
further adjustments. There wouldn't be a lot of concrete point to doing
that, since the only KVM implementation which has the awkward MMU
restrictions is KVM HV, which can only work with an spapr guest anyway.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
KVM HV has some limitations (deriving from the hardware) that mean not all
host-cpu supported pagesizes may be usable in the guest. At present this
means that KVM guests and TCG guests may see different available page sizes
even if they notionally have the same vcpu model. This is confusing and
also prevents migration between TCG and KVM.
This patch makes the environment consistent by always allowing the same set
of pagesizes. Since we can't remove the KVM limitations, we do this by
always applying the same limitations it has, even to TCG guests.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
The way we used to handle KVM allowable guest pagesizes for PAPR guests
required some convoluted checking of memory attached to the guest.
The allowable pagesizes advertised to the guest cpus depended on the memory
which was attached at boot, but then we needed to ensure that any memory
later hotplugged didn't change which pagesizes were allowed.
Now that we have an explicit machine option to control the allowable
maximum pagesize we can simplify this. We just check all memory backends
against that declared pagesize. We check base and cold-plugged memory at
reset time, and hotplugged memory at pre_plug() time.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
that every page that the guest puts in the pagetables must be truly
physically contiguous, not just GPA-contiguous. In effect this means that
an HPT guest can't use any pagesizes greater than the host page size used
to back its memory.
At present we handle this by changing what we advertise to the guest based
on the backing pagesizes. This is pretty bad, because it means the guest
sees a different environment depending on what should be host configuration
details.
As a start on fixing this, we add a new capability parameter to the
pseries machine type which gives the maximum allowed pagesizes for an
HPT guest. For now we just create and validate the parameter without
making it do anything.
For backwards compatibility, on older machine types we set it to the max
available page size for the host. For the 3.0 machine type, we fix it to
16, the intention being to only allow HPT pagesizes up to 64kiB by default
in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
spapr_irq_alloc_block and spapr_irq_alloc() are now deprecated.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Today, when a device requests for IRQ number in a sPAPR machine, the
spapr_irq_alloc() routine first scans the ICSState status array to
find an empty slot and then performs the assignement of the selected
numbers. Split this sequence in two distinct routines : spapr_irq_find()
for lookups and spapr_irq_claim() for claiming the IRQ numbers.
This will ease the introduction of a static layout of IRQ numbers.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr capabilities have an apply hook to actually activate (or deactivate)
the feature in the system at reset time. However, a number of capabilities
affect the setup of cpus, and need to be applied to each of them -
including hotplugged cpus for extra complication. To make this simpler,
add an optional cpu_apply hook that is called from spapr_cpu_reset().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Previously, the effective values of the various spapr capability flags
were only determined at machine reset time. That was a lazy way of making
sure it was after cpu initialization so it could use the cpu object to
inform the defaults.
But we've now improved the compat checking code so that we don't need to
instantiate the cpus to use it. That lets us move the resolution of the
capability defaults much earlier.
This is going to be necessary for some future capabilities.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
ppc_check_compat() is used in a number of places to check if a cpu object
supports a certain compatiblity mode, subject to various constraints.
It takes a PowerPCCPU *, however it really only depends on the cpu's class.
We have upcoming cases where it would be useful to make compatibility
checks before we fully instantiate the cpu objects.
ppc_type_check_compat() will now make an equivalent check, but based on a
CPU's QOM typename instead of an instantiated CPU object.
We make use of the new interface in several places in spapr, where we're
essentially making a global check, rather than one specific to a particular
cpu. This avoids some ugly uses of first_cpu to grab a "representative"
instance.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
The device tree node of the ISA bus was being partially done in
different places. Move all the nodes creation under the same routine.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It introduces a base PnvChip class from which the specific processor
chip classes, Pnv8Chip and Pnv9Chip, inherit. Each of them needs to
define an init and a realize routine which will create the controllers
of the target processor. For the moment, the base PnvChip class
handles the XSCOM bus and the cores.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
QEMU implements the "Shared Processor LPAR" (SPLPAR) option, which allows
the hypervisor to time-slice a physical processor into multiple virtual
processor. The intent is to allow more guests to run, and to optimize
processor utilization.
The guest OS can cede idle VCPUs, so that their processing capacity may
be used by other VCPUs, with the H_CEDE hcall. The guest OS can also
optimize spinlocks, by confering the time-slice of a spinning VCPU to the
spinlock holder if it's currently notrunning, with the H_CONFER hcall.
Both hcalls depend on a "Virtual Processor Area" (VPA) to be registered
by the guest OS, generally during early boot. Other per-VCPU areas can
be registered: the "SLB Shadow Buffer" which allows a more efficient
dispatching of VCPUs, and the "Dispatch Trace Log Buffer" (DTL) which
is used to compute time stolen by the hypervisor. Both DTL and SLB Shadow
areas depend on the VPA to be registered.
The VPA/SLB Shadow/DTL are state that QEMU should migrate, but this doesn't
happen, for no apparent reason other than it was just never coded. This
causes the features listed above to stop working after migration, and it
breaks the logic of the H_REGISTER_VPA hcall in the destination.
The VPA is set at the guest request, ie, we don't have to migrate
it before the guest has actually set it. This patch hence adds an
"spapr_cpu/vpa" subsection to the recently introduced per-CPU machine
data migration stream.
Since DTL and SLB Shadow are optional and both depend on VPA, they get
their own subsections "spapr_cpu/vpa/slb_shadow" and "spapr_cpu/vpa/dtl"
hanging from the "spapr_cpu/vpa" subsection.
Note that this won't break migration to older QEMUs. Is is already handled
by only registering the vmstate handler for per-CPU data with newer machine
types.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A per-CPU machine data pointer was recently added to PowerPCCPU. The
motivation is to to hide platform specific details from the core CPU
code. This per-CPU data can hold state which is relevant to the guest
though, eg, Virtual Processor Areas, and we should migrate this state.
This patch adds the plumbing so that we can migrate the per-CPU data
for PAPR guests. We only do this for newer machine types for the sake
of backward compatibility. No state is migrated for the moment: the
vmstate_spapr_cpu_state structure will be populated by subsequent
patches.
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fix some trivial spelling and spacing errors]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This moves the details of the ISA bus creation under the LPC model but
more important, the new PnvChip operation will let us choose the chip
class to use when we introduce the different chip classes for Power9
and Power8. It hides away the processor chip controllers from the
machine.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On Power9, the thread interrupt presenter has a different type and is
linked to the chip owning the cores.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 3d85885a1b tried to fix error handling, but it actually
went into the wrong direction by dropping the local Error *.
In the default KVM case, the rationale is to try the in-kernel XICS first,
and if not possible, to fallback to userland XICS. Passing errp everywhere
makes this fallback impossible if errp is &error_fatal (which happens to
be the case). And anyway, if the caller would pass a regular &local_err,
things would be worse: we could possibly pass an already set *errp to
error_setg() and crash, or return an error even in case of success.
So we definitely need a local Error * and only propagate it when we're
done with the fallback logic. This is what this patch does.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
CPUPPCState currently contains a number of fields containing the state of
the VPA. The VPA is a PAPR specific concept covering several guest/host
shared memory areas used to communicate some information with the
hypervisor.
As a PAPR concept this is really machine specific information, although it
is per-cpu, so it doesn't really belong in the core CPU state structure.
There's also other information that's per-cpu, but platform/machine
specific. So create a (void *)machine_data in PowerPCCPU which can be
used by the machine to locate per-cpu data. Intialization, lifetime and
cleanup of machine_data is entirely up to the machine type.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
This extracts from the PvChip realize routine the part creating the
cores. On Power9, we will need to create the cores after the Xive
interrupt controller is created.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This moves some code out from spapr_cpu_core_realize() for clarity. No
functional change.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr_realize_vcpu() function doesn't rollback in case of error.
This isn't a problem with coldplugged CPUs because the machine won't
start and QEMU will exit. Hotplug is a different story though: the
CPU thread is started under object_property_set_bool() and it assumes
it can access the CPU object.
If icp_create() fails, we return an error without unregistering the
reset handler for this CPU, and we let the underlying QEMU thread for
this CPU alive. Since spapr_cpu_core_realize() doesn't care to unrealize
already realized CPUs either, but happily frees all of them anyway, the
CPU thread crashes instantly:
(qemu) device_add host-spapr-cpu-core,core-id=1,id=gku
GKU: failing icp_create (cpu 0x11497fd0)
^^^^^^^^^^
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffee3feaa0 (LWP 24725)]
0x00000000104c8374 in object_dynamic_cast_assert (obj=0x11497fd0,
^^^^^^^^^^^^^^
pointer to the CPU object
623 trace_object_dynamic_cast_assert(obj ? obj->class->type->name
(gdb) p obj->class->type
$1 = (Type) 0x0
(gdb) p * obj
$2 = {class = 0x10ea9c10, free = 0x11244620,
^^^^^^^^^^
should be g_free
(gdb) p g_free
$3 = {<text variable, no debug info>} 0x7ffff282bef0 <g_free>
obj is a dangling pointer to the CPU that was just destroyed in
spapr_cpu_core_realize().
This patch adds proper rollback to both spapr_realize_vcpu() and
spapr_cpu_core_realize().
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fixed a conflict due to a change in my tree]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 94ad93bd97 (QEMU 2.12) switched to instantiate CPUs separately
but it missed to adapt the error path accordingly. If something fails in
the CPU creation loop, then the CPU object that was just created is leaked.
The error paths in this function are a bit obfuscated, and adding
yet another label to free this CPU object makes it worse. We should
move the block of the loop to a separate function, with a proper
rollback path, but this is a bigger cleanup.
For now, let's just fix the bug by adding the missing calls to
object_unref(). This will allow easier backport to older QEMU
versions.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently we don't have any unrealize path for pnv cpu cores. We get away
with this because we don't yet support cpu hotplug for pnv.
However, we're going to want it eventually, and in the meantime, it makes
it non-obvious why there are a bunch of allocations on the realize() path
that don't have matching frees.
So, implement the missing unrealize path.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
pnv_cpu_init() is only called from the the pnv cpu core realize path, and
really only can be called from there. So fold it into its caller, which
we also rename for brevity.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Currently, we allocate space for all the cpu objects within a single core
in one big block. This was copied from an older version of the spapr code
and requires some ugly pointer manipulation to extract the individual
objects.
This design was due to a misunderstanding of qemu lifetime conventions and
has already been changed in spapr (in 94ad93bd "spapr_cpu_core: instantiate
CPUs separately".
Make an equivalent change in pnv_core to get rid of the nasty pointer
arithmetic.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
In pnv_core_realize() we call two functions with an Error * parameter in
succession, which will go badly if they both cause errors. In fact, a
failure in either of them indicates a qemu internal error, so we can just
use &error_abort in both cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
spapr_cpu_init() and spapr_cpu_destroy() are only called from the spapr
cpu core realize/unrealize paths, and really can only be called from there.
Those are all short functions, so fold the pairs together for simplicity.
While we're there rename some functions and change some parameter types
for brevity and clarity.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
The PMU device supercedes the CUDA device found on older New World Macs and
is supported by a larger number of guest OSs from OS 9 to OS X 10.5.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
PMU-enabled New World Macs expose their GPIOs via a separate memory region
within the macio device.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This option allows the VIA configuration to be controlled between 3
different possible setups: cuda, pmu-adb and pmu with USB rather than ADB
keyboard/mouse.
For the moment we don't do anything with the configuration except to pass
it to the macio device (the via-cuda parent) and also to the firmware via
the fw_cfg interface so that it can present the correct device tree.
The default is cuda which is the current default and so will have no
change in behaviour.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is in preparation for adding configuration controlled via machine
options.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If the negotiated compat mode can't be set, but raw mode is supported,
we decide to ignore the error. An so, we should free it to prevent a
memory leak.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In default_caps_with_cpu() we set spapr_cap_cfpc to broken for POWER8
processors and before.
Since we no longer require private l1d cache on POWER8 for this cap to
be set to workaround change this to default to broken for POWER7
processors and before.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add an IOMMU index argument to the translate method of
IOMMUs. Since all of our current IOMMU implementations
support only a single IOMMU index, this has no effect
on the behaviour.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-4-peter.maydell@linaro.org
Add support for multiple IOMMU indexes to the IOMMU notifier APIs.
When initializing a notifier with iommu_notifier_init(), the caller
must pass the IOMMU index that it is interested in. When a change
happens, the IOMMU implementation must pass
memory_region_notify_iommu() the IOMMU index that has changed and
that notifiers must be called for.
IOMMUs which support only a single index don't need to change.
Callers which only really support working with IOMMUs with a single
index can use the result of passing MEMTXATTRS_UNSPECIFIED to
memory_region_iommu_attrs_to_index().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-3-peter.maydell@linaro.org
We banned use of certain g_assert_FOO() functions outside tests, and
made checkpatch.pl flag them (commit 6e9389563e). We neglected to
purge existing uses. Do that now.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180608170231.27912-1-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: John Snow <jsnow@redhat.com>
By default, the IOMMU model built into the spapr virtual PCI host bridge
supports 4kiB and 64kiB IOMMU page sizes. However this can be overridden
which may be desirable to allow larger IOMMU page sizes when running a
guest with hugepage backing and passthrough devices. For that reason a
warning was printed when the device wasn't configured to allow the pagesize
with which guest RAM is backed.
Experience has proven, however, that this message is more confusing than
useful. Worse it sometimes makes little sense when the host-available page
sizes don't match those available on the guest, which can happen with
a POWER8 guest running on a POWER9 KVM host.
Long term we do want better handling to allow large IOMMU page sizes to be
used, but for now this parameter and warning don't really accomplish it.
So, remove the message, pending a better solution.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A specific MemoryRegion is required for the LPC HC Firmware address
space.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Factor out cpu core unplug into separate function from
spapr_core_release(). Then use generic hotplug_handler_unplug() to trigger
cpu core unplug, which would call spapr_machine_device_unplug() ->
spapr_core_unplug() in the end.
This way unplug operation is not buried in spapr internals and located
in the same place like in other targets, following similar
logic/call chain across targets.
Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Factor out memory unplug into separate function from spapr_lmb_release().
Then use generic hotplug_handler_unplug() to trigger memory unplug,
which will call spapr_machine_device_unplug() -> spapr_memory_unplug()
in the end.
This way unplug operation is not buried in lmb internals and located in
the same place like in other targets, following similar logic/call chain
across targets.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We'll be handling unplug of e.g. CPUs and PCDIMMs via the general
hotplug handler soon, so let's add that handler function.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Let's finish cleaning up the hotplug handler. This check can be
performed in the pre_plug code as the very first thing.
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Let's clean the hotplug handler up by moving lookup of the node into
the function where it is actually being used.
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The node property can always be queried and the value has already been
verified in pc_dimm_realize().
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commits b6712ea391 removed the macio_init() function but missed the header
prototype in mac.h. Remove it since it is no longer needed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commits 7b19318bee and 8ce3f743c7 removed the pci_pmac_init() and
pci_pmac_u3_init() functions but missed the header prototypes in mac.h. Remove
them since they are no longer needed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
VIO devices have an "irq" property that can be used by the sPAPR IRQ
allocator as an IRQ number hint. But it is not set in QEMU nor in
libvirt. It brings unnecessary complexity to the underlying layers
managing the IRQ number space and it is in full opposition with the
new static IRQ allocator we want to introduce in sPAPR.
Let's deprecate it to simplify the spapr_irq_alloc routine in the
future.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Check qtest_enabled() to suppress bogus warnings from make check]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 72d3d8f052 "hw/isa/superio: Add a keyboard/mouse controller (8042)"
added an 8042 keyboard device to the PC87312 superio device to replace that
being used by the prep machine.
Unfortunately this commit didn't do the same for the 40p machine which broke
the keyboard by registering two 8042 keyboard devices at the same address.
Resolve this by similarly removing the 8042 keyboard from the 40p machine as
done for the prep machine in commit 72d3d8f052.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Linux sandalfoot zImage has an initialisation process which resets the
VGA controller by setting all the BAR addresses to zero to access the VGA
ioports at their legacy addresses.
Unfortunately setting the framebuffer BAR to address 0 makes the framebuffer
memory overlap the internal VGA memory causing accesses to fail, and so
prevents the kernel from switching successfully to text mode.
Since OpenHackWare configures the framebuffer BAR address outside of the legacy
VGA internal memory space, remove pci_allow_0_address from the 40p machine class
which causes the BAR reprogramming to zero to fail and so the VGA internal
memory can be accessed correctly again.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Use error_report() + abort() instead of error_setg(&error_abort),
as suggested by the "qapi/error.h" documentation:
Please don't error_setg(&error_fatal, ...), use error_report() and
exit(), because that's more obvious.
Likewise, don't error_setg(&error_abort, ...), use assert().
Use abort() instead of the suggested assert() because the error message
already got displayed.
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 20180606152128.449-5-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbEXNvAAoJECgfDbjSjVRpc8gH/R8xrcFrV+k9wwbgYcOcGb6Y
LWjseE31pqJcxRV80vLOdzYEuLStZQKQQY7xBDMlA5vdyvZxIA6FLO2IsiJSbFAk
EK8pclwhpwQAahr8BfzenabohBv2UO7zu5+dqSvuJCiMWF3jGtPAIMxInfjXaOZY
odc1zY2D2EgsC7wZZ1hfraRbISBOiRaez9BoGDKPOyBY9G1ASEgxJgleFgoBLfsK
a1XU+fDM6hAVdxftfkTm0nibyf7PWPDyzqghLqjR9WXLvZP3Cqud4p8N29mY51pR
KSTjA4FYk6Z9EVMltyBHfdJs6RQzglKjxcNGdlrvacDfyFi79fGdiosVllrjfJM=
=3+V0
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
acpi, vhost, misc: fixes, features
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 01 Jun 2018 17:25:19 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (31 commits)
vhost-blk: turn on pre-defined RO feature bit
ACPI testing: test NFIT platform capabilities
nvdimm, acpi: support NFIT platform capabilities
tests/.gitignore: add entry for generated file
arch_init: sort architectures
ui: use local path for local headers
qga: use local path for local headers
colo: use local path for local headers
migration: use local path for local headers
usb: use local path for local headers
sd: fix up include
vhost-scsi: drop an unused include
ppc: use local path for local headers
rocker: drop an unused include
e1000e: use local path for local headers
ioapic: fix up includes
ide: use local path for local headers
display: use local path for local headers
trace: use local path for local headers
migration: drop an unused include
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When pulling in headers that are in the same directory as the C file (as
opposed to one in include/), we should use its relative path, without a
directory.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Remove those unneeded includes to speed up the compilation
process a little bit. (Continue 7eceff5b5a cleanup)
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-13-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename the 2.13 machines to match the number we're going to
use for the next release.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-id: 20180522104000.9044-5-peter.maydell@linaro.org
platform-bus were using machine_done notifier to get and map
(assign irq/mmio resources) dynamically added sysbus devices
after all '-device' options had been processed.
That however creates non obvious dependencies on ordering of
machine_done notifiers and requires carefull line juggling
to keep it working. For example see comment above
create_platform_bus() and 'straitforward' arm_load_kernel()
had to converted to machine_done notifier and that lead to
yet another machine_done notifier to keep it working
arm_register_platform_bus_fdt_creator().
Instead of hiding resource assignment in platform-bus-device
to magically initialize sysbus devices, use device plug
callback and assign resources explicitly at board level
at the moment each -device option is being processed.
That adds a bunch of machine declaration boiler plate to
e500plat board, similar to ARM/x86 but gets rid of hidden
machine_done notifier and would allow to remove the dependent
notifiers in ARM code simplifying it and making code flow
easier to follow.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 1525691524-32265-3-git-send-email-imammedo@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* pc-dimm: factor out MemoryDevice
(virtio-pmem and virtio-mem will make use of the new abstraction later)
* scripts/device-crash-test: Removed fixed CAN entries
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJa8IZ2AAoJECgHk2+YTcWmmD0P/2Lddw+ilGhGS/CWarq4uLSF
ILtEMwNgbJeJAEza6IQx/IIuUER3H5UcxgZhO49nELpurobhl5yW9JKP1qjH9z9i
7hVPORGioiyGkjgjbm8jWtljePAloTIwEiIcrqYkVHpWDCUJaZ7SES2VQL7ltY/W
AU3uSFQQMDfVqr/MXDxZq084wFK3Jm2aIE+p8a0MF7B+29RSHdFU9iKysCC1Wu/1
AllXCkQ4yWHCGoSRBfzFz9EWBb4VlzM+VNj9nhHu75zdF3hm7J05yIiGuZLiOjmB
MDOkvKhSeXNj+21mXVLmSxkfI65z6jrq3aI7iTp4+orrd2SCXoHsOZoj4Q2cRSnw
kJlY62+p85H9NYIKTgMCM/oURpL2ZnqPKmCto1NRFywSBGLXll2weyKpX9ByvXe2
gL8hqra/K8eUPW4zSsPYbbN1b16EnK4MY2nkYvG0Y/aAXGZF6V9zQwKNT4/F5GyY
SRMC4c2OtQOgZNDSuPdgZ5Lu5PXfetvvcqWCj0tXNdaScOp6Omsc/i/YCUtu6r/3
IbBIclJ+K5aD+U4QP4DKZ+DJbEkIGMU4pSHgR2i8bK7MmoJpJcAIB1mL5nA/TknP
/RVgtnP7gVbfGIVVwjUw9bMurvOti4PBp0/DxC/VqUqGs9e8avE1yb9grVJdj/jA
oEGJ6EIsmO1URbk1+f93
=Hhge
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging
Machine queue, 2018-05-07
* pc-dimm: factor out MemoryDevice
(virtio-pmem and virtio-mem will make use of the new abstraction later)
* scripts/device-crash-test: Removed fixed CAN entries
# gpg: Signature made Mon 07 May 2018 18:01:42 BST
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/machine-next-pull-request:
scripts/device-crash-test: Removed fixed CAN entries
vl: allow 'maxmem' without 'slot'
spapr: rename "hotplug memory" terminology to "device memory"
pc: rename "hotplug memory" terminology to "device memory"
machine: rename MemoryHotplugState to DeviceMemoryState
pc-dimm: move actual plug/unplug of a memory region to MemoryDevice
pc-dimm: factor out capacity and slot checks into MemoryDevice
pc-dimm: factor out address search into MemoryDevice code
pc-dimm: pass in the machine and to the MemoryHotplugState
pc-dimm: no need to pass the memory region
machine: make MemoryHotplugState accessible via the machine
pc-dimm: factor out MemoryDevice interface
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qemu-system-ppc fails to build with GCC 8.0.1:
/home/hsp/src/qemu-master/hw/ppc/e500.c: In function ‘ppce500_load_device_tree’:
/home/hsp/src/qemu-master/hw/ppc/e500.c:442:37: error: ‘/pic@’
directive output may be truncated writing 5 bytes into a region of
size between 1 and 128 [-Werror=format-truncation=]
snprintf(mpic, sizeof(mpic), "%s/pic@%llx", soc, MPC8544_MPIC_REGS_OFFSET);
^~~~~
In file included from /usr/include/stdio.h:862,
from /home/hsp/src/qemu-master/include/qemu/osdep.h:68,
from /home/hsp/src/qemu-master/hw/ppc/e500.c:17:
/usr/include/bits/stdio2.h:64:10: note: ‘__builtin___snprintf_chk’
output between 11 and 138 bytes into a destination of size 128
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/hsp/src/qemu-master/hw/ppc/e500.c:470:39: error:
‘/global-utilities@’ directive output may be truncated writing 18
bytes into a region of size between 1 and 128
[-Werror=format-truncation=]
snprintf(gutil, sizeof(gutil), "%s/global-utilities@%llx", soc,
^~~~~~~~~~~~~~~~~~
In file included from /usr/include/stdio.h:862,
from /home/hsp/src/qemu-master/include/qemu/osdep.h:68,
from /home/hsp/src/qemu-master/hw/ppc/e500.c:17:
/usr/include/bits/stdio2.h:64:10: note: ‘__builtin___snprintf_chk’
output between 24 and 151 bytes into a destination of size 128
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/hsp/src/qemu-master/hw/ppc/e500.c:477:36: error: ‘/msi@’
directive output may be truncated writing 5 bytes into a region of
size between 0 and 127 [-Werror=format-truncation=]
snprintf(msi, sizeof(msi), "/%s/msi@%llx", soc, MPC8544_MSI_REGS_OFFSET);
^~~~~
In file included from /usr/include/stdio.h:862,
from /home/hsp/src/qemu-master/include/qemu/osdep.h:68,
from /home/hsp/src/qemu-master/hw/ppc/e500.c:17:
/usr/include/bits/stdio2.h:64:10: note: ‘__builtin___snprintf_chk’
output between 12 and 139 bytes into a destination of size 128
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by converting e500 to use g_strdup_printf()+g_free() instead
of snprintf(). This is done globally, even for call sites that don't
break build, since this is the preferred practice in QEMU.
Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 152568372989.443627.900708381919207053.stgit@bahia.lan
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let's make it clear at relevant places that we are dealing with device
memory. That it can be used for memory hotplug is just a special case.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-11-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
[ehabkost: rebased series, solved conflicts at spapr.c]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Rename it to better match the new terminology.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-9-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We use the machine internally either way, so let's just pass it in then.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We can just query it ourselves. When unplugging, we should always be
able to the region (as it was previously plugged). E.g. PPC already
assumed that and used &error_abort.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's allow to query the MemoryHotplugState directly from the machine.
If the pointer is NULL, the machine does not support memory devices. If
the pointer is !NULL, the machine supports memory devices and the
data structure contains information about the applicable physical
guest address space region.
This allows us to generically detect if a certain machine has support
for memory devices, and to generically manage it (find free address
range, plug/unplug a memory region).
We will rename "MemoryHotplugState" to something more meaningful
("DeviceMemory") after we completed factoring out the pc-dimm code into
MemoryDevice code.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
[ehabkost: rebased series, solved conflicts at spapr.c]
[ehabkost: squashed fix to use g_malloc0()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
On the qmp level, we already have the concept of memory devices:
"query-memory-devices"
Right now, we only support NVDIMM and PCDIMM.
We want to map other devices later into the address space of the guest.
Such device could e.g. be virtio devices. These devices will have a
guest memory range assigned but won't be exposed via e.g. ACPI. We want
to make them look like memory device, but not glued to pc-dimm.
Especially, it will not always be possible to have TYPE_PC_DIMM as a parent
class (e.g. virtio devices). Let's use an interface instead. As a first
part, convert handling of
- qmp_pc_dimm_device_list
- get_plugged_memory_size
to our new model. plug/unplug stuff etc. will follow later.
A memory device will have to provide the following functions:
- get_addr(): Necessary, as the property "addr" can e.g. not be used for
virtio devices (already defined).
- get_plugged_size(): The amount this device offers to the guest as of
now.
- get_region_size(): Because this can later on be bigger than the
plugged size.
- fill_device_info(): Fill MemoryDeviceInfo, e.g. for qmp.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJa7BLUAAoJEDhwtADrkYZTumIQAJC6wXmN+wBYc2MoR2Y8SQgY
+gTM9J6R6H50ijb7RkkERLTgys7IxCDD/jy2p0yX/Re3ReXbYwzYQXmSFpF1KWGe
SXB84uDtwSILbvR5iS0TBdQSyO+u5DRboukuLfTEZHjYQUP+guT1we3YwqWGzIKp
o5kV/7Nq0vPWO5Sbs4FWB0t9hWzWV3Kef9b4gRPn05sWPaq2/sU6A3xai+ty6qS7
PCm7VwT4z5SACdR4LRiL45h3HdThgr/alJJ6lUr2kaNCBiDBvM4h6d7W+lI/Vi3Y
rG+wqyPQFyWLXf0uuI3AmSScVUzfYv9C4TcBTJkFnebrFcybPsGwEJLGtaIgFnBU
1Mcz/TCl1bB4fDvhwV2qexxlXryOWXKn+ygdu9sBSY/QSA+NEqbJQo6cCDqMQ9Qy
6zqrGxUrM/peVLvhfle4cIbyPslGRGn2s95oQzCJi8TlZxBj8lgW1x1kr7OhSlf4
rNteSYAHDNSiNVL1PcW3vOS7ndTA6O0vHAtGa+0vbQzAf+RUfFG0sfggG6350O8e
97Hp4LKT3VpGEuwyQEw6wk3zODNfAgtkkwjQHTnQYHriKB/fcVfY3g7gpYp4zMLF
GJ3h5KZj71JNoFoxVJniAgkWY8+IP11ggXMyYWSMxMZ3M81EqQ/rbvOvGxn1wjd8
kHbpUEMmGBHF1VmKs7e1
=Kukn
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2018-05-04' into staging
QAPI patches for 2018-05-04
# gpg: Signature made Fri 04 May 2018 08:59:16 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2018-05-04:
qapi: deprecate CpuInfoFast.arch
qapi: discriminate CpuInfoFast on SysEmuTarget, not CpuInfoArch
qapi: change the type of TargetInfo.arch from string to enum SysEmuTarget
qapi: add SysEmuTarget to "common.json"
qapi: fill in CpuInfoFast.arch in query-cpus-fast
qobject: Modify qobject_ref() to return obj
qobject: Replace qobject_incref/QINCREF qobject_decref/QDECREF
qobject: use a QObjectBase_ struct
qobject: Ensure base is at offset 0
qobject: Use qobject_to() instead of type cast
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Now that we can safely call QOBJECT() on QObject * as well as its
subtypes, we can have macros qobject_ref() / qobject_unref() that work
everywhere instead of having to use QINCREF() / QDECREF() for QObject
and qobject_incref() / qobject_decref() for its subtypes.
The replacement is mechanical, except I broke a long line, and added a
cast in monitor_qmp_cleanup_req_queue_locked(). Unlike
qobject_decref(), qobject_unref() doesn't accept void *.
Note that the new macros evaluate their argument exactly once, thus no
need to shout them.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180419150145.24795-4-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Rebased, semantic conflict resolved, commit message improved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
On a POWER9 host, if a guest runs in pre POWER9 compat mode, it necessarily
uses the hash MMU mode. In this case, we shouldn't advertise radix GTSE in
the ibm,arch-vec-5-platform-support DT property as the current code does.
The first reason is that it doesn't make sense, and the second one is that
causes the CAS-negotiated options subsection to be migrated. This breaks
backward migration to QEMU 2.7 and older versions on POWER8 hosts:
qemu-system-ppc64: error while loading state for instance 0x0 of device
'spapr'
qemu-system-ppc64: load of migration failed: No such file or directory
This patch hence initialize CPUs a bit earlier so that we can check the
requested compat mode, and don't set OV5_MMU_RADIX_GTSE for power8 and
older.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
a324d6f166 "spapr: Support ibm,dynamic-memory-v2 property" added
a new feature in the set of CAS-negotiatable options. This causes
the CAS-negotiated options subsection to be migrated, even for old
machine types that don't know about it, and breaks backward migration
to QEMU 2.7 and older versions:
qemu-system-ppc64: error while loading state for instance 0x0 of device
'spapr'
qemu-system-ppc64: load of migration failed: No such file or directory
Since this feature only affects boot time behaviour, it should be
filtered out when we decide to migrate CAS-negotiated options, like
we already do with OV5_FORM1_AFFINITY and OV5_DRCONF_MEMORY.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since the macio device has a link to the PIC device, we can now wire up the
IRQs directly via qdev GPIOs rather than having to use an intermediate array.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce constants for the pre-defined New World IRQs to help keep things
readable.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 4e46dcdbd3 "PPC: Newworld: Add uninorth token register" added a TODO
which was to convert the uninorth registers hack to a proper device. Move
these registers to a new uninorth device, removing the old hacks from
mac_newworld.c.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
To prevent spurious wakeups on cpus that are supposed to be disabled, we
need to clear the LPCR bits which control certain wakeup events.
spapr_cpu_reset() has separate cases here for boot and non-boot (initially
inactive) cpus. rtas_start_cpu() then turns the LPCR bits on when the
non-boot cpus are activated.
But explicit checks against first_cpu are not how we usually do things:
instead spapr_cpu_reset() generally sets things up for non-boot (inactive)
cpus, then spapr_machine_reset() and/or rtas_start_cpu() override as
necessary.
So, do that instead. Because the LPCR activation is identical for boot
cpus and non-boot cpus just activated with rtas_start_cpu() we can put the
code common in spapr_cpu_set_entry_state().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
cpu_ppc_set_papr() does several things:
1) it sets up the virtual hypervisor interface
2) it prevents the cpu from ever entering hypervisor mode
3) it tells KVM that we're emulating a cpu in PAPR mode
and 4) it configures the LPCR and AMOR (hypervisor privileged registers)
so that TCG will behave correctly for PAPR guests, without
attempting to emulate the cpu in hypervisor mode
(1) & (2) make sense for any virtual hypervisor (if another one ever
exists).
(3) belongs more properly in the machine type specific to a PAPR guest, so
move it to spapr_cpu_init(). While we're at it, remove an ugly test on
kvm_enabled() by making kvmppc_set_papr() a safe no-op on non-KVM.
(4) also belongs more properly in the machine type specific code. (4) is
done by mangling the default values of the SPRs, so that they will be set
correctly at reset time. Manipulating usually-static parameters of the cpu
model like this is kind of ugly, especially since the values used really
have more to do with the platform than the cpu.
The spapr code already has places for PAPR specific initializations of
register state in spapr_cpu_reset(), so move this handling there.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
In cpu_ppc_set_papr() the UPRT and GTSE bits of the LPCR default value are
initialized based on on ppc64_radix_guest(). Which seems reasonable,
except that ppc64_radix_guest() is based on spapr->patb_entry which is
only set up in spapr_machine_reset, called _after_ cpu_ppc_set_papr() for
boot cpus. Well, and the fact that modifying the SPR default value for an
instance rather than a class is kind of yucky.
The initialization here is really only necessary or valid for
hotplugged cpus; the base cpu initialization already sets a value
that's good enough for the boot cpus until the guest uses an hcall to
configure it's preferred MMU mode.
So, move this initialization to the rtas_start_cpu() path, at which point
ppc64_radix_guest() will have a sensible value, to make sure secondary cpus
come up in an MMU mode matching the existing cpus.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
There are several places in spapr_hcall.c where we need to update the LPCR
value on all CPUs. We do this with the set_spr() helper. That's not
really correct because this directly sets the SPR value, without going
through the ppc_store_lpcr() helper which may need to update state based
on the LPCR change.
In fact, set_spr() is only ever used for the LPCR, so replace it with an
explicit LPCR updated which uses the right low-level helper. While we're
there, move the CPU_FOREACH() which was in every one of the callers into
the new helper: set_all_lpcrs().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Under PAPR, only the boot CPU is active when the system starts. Other cpus
must be explicitly activated using an RTAS call. The entry state for the
boot and secondary cpus isn't identical, but it has some things in common.
We're going to add a bit more common setup later, too, so to simplify
make a helper which sets up the common entry state for both boot and
secondary cpu threads.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
rtas_start_cpu() calls spapr_cpu_update_tb_offset() and
spapr_cpu_set_endianness() to initialize certain things in the new cpu's
state. This is the only caller of those helpers, and they're each only
a few lines long, so we might as well just fold them into the caller.
In addition, those helpers initialize state on the new cpu to match that of
the first cpu. That will generally work, but might be at least logically
incorrect if the first cpu has been set offline by the guest. So, instead
base the state on that of the cpu invoking the RTAS call, which is
obviously active already.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
This makes several minor cleanups to these functions:
* Follow usual convention of an early exit on error, rather than having
most of the body in an if
* Clearer naming of cpu and cpu_. Now callcpu is the cpu from which the
RTAS call is invoked, newcpu is the cpu which we're starting
* Use cpu_synchronize_state() instead of kvm_cpu_synchronize_state()
directly
* Remove pointless comment describing what cpu_synchronize_state() does
* Use ppc_store_lpcr() instead of directly writing the register field
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Current POWER cpus allow for a VRMA, a special mapping which describes a
guest's view of memory when in real mode (MMU off, from the guest's point
of view). Older cpus didn't have that which meant that to support a guest
a special host-contiguous region of memory was needed to give the guest its
Real Mode Area (RMA).
KVM used to provide special calls to allocate a contiguous RMA for those
cases. This was useful in the early days of KVM on Power to allow it to be
tested on PowerPC 970 chips as used in Macintosh G5 machines. Now, those
machines are so old as to be almost irrelevant.
The normal qemu deprecation process would require this to be marked
deprecated then removed in 2 releases. However, this can only be used
with corresponding support in the host kernel - which was dropped
years ago (in c17b98cf "KVM: PPC: Book3S HV: Remove code for PPC970
processors" of 2014-12-03 to be precise). Therefore it should be ok
to drop this immediately.
Just to be clear this only affects *KVM HV* guests with PowerPC 970,
and those already require an ancient host kernel. TCG and KVM PR
guests with PowerPC 970 should still work.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Thomas Huth <thuth@redhat.com>
Although the order doesn't really matter at the moment, it's possible
other initializastions could depend on the compatiblity mode, so make sure
we set it first in spapr_cpu_reset().
While we're at it drop the test against first_cpu. Setting the compat mode
to the value it already has is redundant, but harmless, so we might as well
make a small simplification to the code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
The new property ibm,dynamic-memory-v2 allows memory to be represented
in a more compact manner in device tree.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Convert PPCE500Params to PCCE500MachineClass which it essentially is,
and introduce PCCE500MachineState to keep track of E500 specific
state instead of adding global variables or extra parameters to
functions when we need to keep data beyond machine init
(i.e. make it look like typical fully defined machine).
It's pretty shallow conversion instead of currently used trivial
DEFINE_MACHINE() macro. It adds extra 60LOC of boilerplate code
of full machine definition.
The patch on top[1] will use PCCE500MachineState to keep track of
platform_bus device and add E500Plate specific machine class
to use HOTPLUG_HANDLER for explicitly initializing dynamic
sysbus devices at the time they are added instead of delaying
it to machine done time by platform_bus_init_notify() which is
being removed.
1) <1523551221-11612-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now recent kernels (i.e. since linux-stable commit a346137e9142
("powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes")
support this property to mark initially memory-less NUMA nodes as "possible"
to allow further memory hot-add to them.
Advertise this property for pSeries machines to let guest kernels detect
maximum supported node configuration and benefit from kernel side change
when hot-add memory to specific, possibly empty before, NUMA node.
Signed-off-by: Serhii Popovych <spopovyc@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The env->slb_nr field gives the size of the SLB (Segment Lookaside Buffer).
This is another static-after-initialization parameter of the specific
version of the 64-bit hash MMU in the CPU. So, this patch folds the field
into PPCHash64Options with the other hash MMU options.
This is a bit more complicated that the things previously put in there,
because slb_nr was foolishly included in the migration stream. So we need
some of the usual dance to handle backwards compatible migration.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
The ci_large_pages boolean in CPUPPCState is only relevant to 64-bit hash
MMU machines, indicating whether it's possible to map large (> 4kiB) pages
as cache-inhibitied (i.e. for IO, rather than memory). Fold it as another
flag into the PPCHash64Options structure.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Currently env->mmu_model is a bit of an unholy mess of an enum of distinct
MMU types, with various flag bits as well. This makes which bits of the
field should be compared pretty confusing.
Make a start on cleaning that up by moving two of the flags bits -
POWERPC_MMU_1TSEG and POWERPC_MMU_AMR - which are specific to the 64-bit
hash MMU into a new flags field in PPCHash64Options structure.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
env->sps contains page size encoding information as an embedded structure.
Since this information is specific to 64-bit hash MMUs, split it out into
a separately allocated structure, to reduce the basic env size for other
cpus. Along the way we make a few other cleanups:
* Rename to PPCHash64Options which is more in line with qemu name
conventions, and reflects that we're going to merge some more hash64
mmu specific details in there in future. Also rename its
substructures to match qemu conventions.
* Move structure definitions to the mmu-hash64.[ch] files.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
As a rule we prefer to pass PowerPCCPU instead of CPUPPCState, and this
change will make some things simpler later on.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Since commit 7da79a167a, the machine class init function registers
dynamic sysbus device types it supports. Passing an unsupported device
type on the command line causes QEMU to exit with an error message
just after machine init.
It is hence not needed to do the same sanity check at machine reset.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This reverts commit b556854bd8.
Leave change @node type from uint32_t to to int from reverted commit
because node < 0 is always false.
Note that implementing capability or some trick to detect if guest
kernel does not support hot-add to memory: this returns previous
behavour where memory added to first non-empty node.
Signed-off-by: Serhii Popovych <spopovyc@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Both spapr_irq_alloc() and spapr_irq_alloc_block() have an errp
parameter, but they don't use it if XICS hasn't been initialized
yet.
This is doubly wrong:
- all callers do pass a non-null Error **, ie, they expect an error
to be propagated in case of failure
- XICS obviously needs to be initialized before anything starts allocating
IRQs
So this patch turns the check into an assert.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The existing UNINState actually represents the PCI/AGP host bridge stage so
rename it accordingly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Do this for both the uninorth main and uninorth u3 AGP buses, using the main
PCI bus for each machine (this ensures the IO addresses still match those
used by OpenBIOS).
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that the OpenPIC is wired up via the board, we can now remove our temporary
PIC qdev pointer property and replace it with an object link instead.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead wire up the PCI/AGP host bridges in mac_newworld.c. Now this is complete
it is possible to move the initialisation of the PCI hole alias into
pci_u3_agp_init().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead wire up the PCI/AGP host bridges in mac_newworld.c. Now this is complete
it is possible to move the initialisation of the PCI hole alias into
pci_unin_main_init().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since the IO address space is fixed to use the standard system IO address
space then we can also use the opportunity to remove the address_space_io
parameter from pci_pmac_init() and pci_pmac_u3_init().
Note we also move the default mac99 PCI bus to the end of the initialisation
list so that it becomes the default destination for any devices specified
via -device without an explicit PCI bus provided.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since the macio device has a link to the PIC device, we can now wire up the
IRQs directly via qdev GPIOs rather than having to use an intermediate array.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce constants for the pre-defined Old World IRQs to help keep things
readable.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This simplifies the Old World machine to simply mapping the ISA memory region
into the main address space.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead wire up the grackle device inside the Mac Old World machine.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is the first step towards removing the old-style pci_grackle_init()
function. Following on from the previous commit we can now pass the heathrow
device as an object link and wire up the heathrow IRQs via qdev GPIOs.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead wire up heathrow to the CPU and grackle PCI host using qdev GPIOs.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is in preparation for moving the device wiring into the New World machine.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
After QOMification this is clearly no longer needed (and possibly hasn't been
for some time).
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 593c181160: "PPC: Newworld: Add second uninorth control register set"
added a second set of uninorth registers at 0xf3000000.
Testing MacOS 9.2 to MacOS X 10.4 reveals no accesses to this address and I
can't find any reference to it in Apple's Core99.cpp source so I'm assuming
that this was the result of another bug that has now been fixed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Create a new function serial_max_hds() which returns the number of
serial ports defined by the user. This is needed only by spapr.
This allows us to remove the MAX_SERIAL_PORTS define.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180420145249.32435-14-peter.maydell@linaro.org
The ISA serial port handling in serial-isa.c imposes a limit
of 4 serial ports. This is because we only know of 4 IO port
and IRQ settings for them, and is unrelated to the generic
MAX_SERIAL_PORTS limit, though they happen to both be set at
4 currently.
Use a new MAX_ISA_SERIAL_PORTS wherever that is the correct
limit to be checking against.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180420145249.32435-11-peter.maydell@linaro.org
Change all the uses of serial_hds[] to go via the new
serial_hd() function. Code change produced with:
find hw -name '*.[ch]' | xargs sed -i -e 's/serial_hds\[\([^]]*\)\]/serial_hd(\1)/g'
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20180420145249.32435-8-peter.maydell@linaro.org
We only emulate timer running at CPU frequency which is what most
guests expect so set the frequency to match real hardware. This also
allows setting clock multipliers which caused slowdown previously due
to wrong timer frequency.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
At the moment the device tree produced by the H_CAS handler has no
reserved map initialized at all which is not correct as at least one
empty record is required to be present as a marker of the end.
This does not cause problems now as the only consumer is SLOF which
does not look at the reserved map area.
However when DTC's "Improve libfdt's memory safety" changeset hits
the QEMU upstream, there will be errors reported and crashes observed.
This fixes the problem by adding an empty entry to the reserved map,
just like create_device_tree() does already.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Make qmp_pc_dimm_device_list() return sorted by start address
list of devices so that it could be reused in places that
would need sorted list*. Reuse existing pc_dimm_built_list()
to get sorted list.
While at it hide recursive callbacks from callers, so that:
qmp_pc_dimm_device_list(qdev_get_machine(), &list);
could be replaced with simpler:
list = qmp_pc_dimm_device_list();
* follow up patch will use it in build_srat()
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> for ppc part
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Using log unimp is more appropriate for these messages and this also
silences them by default so they won't clobber make check output when
tests are added for this board.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
With the new "--nic" command line parameter option, the "old" way of
specifying a NIC model via the nd_table[] is becoming more prominent
again. But for the pseries "spapr-vlan" device, there is a confusing
discrepancy between the model name that is used for "--device" (i.e.
"spapr-vlan") and the model name that has to be used for "--net nic"
or the new "--nic" parameter (i.e. "ibmveth"). Since "spapr-vlan" is
the "real" name of the device, let's allow "spapr-vlan" to be used
as model name for the nd_table[] entries, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch moves the gap between u-boot and kernel at the correct location.
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The global hack for creating SCSI devices has recently been removed,
but this apparently broke SCSI devices on some boards that were not
ready for this change yet. For the 40p machine you now get:
$ ppc64-softmmu/qemu-system-ppc64 -M 40p -cdrom x.iso
qemu-system-ppc64: -cdrom x.iso: machine type does not support if=scsi,bus=0,unit=2
Fix it by providing a lsi53c810_create() function that takes care
of calling scsi_bus_legacy_handle_cmdline() after creating the
corresponding SCSI controller.
Fixes: 1454509726
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* SCSI fix to pass maximum transfer size (Daniel Barboza)
* chardev fixes and improved iothread support (Daniel Berrangé, Peter)
* checkpatch tweak (Eric)
* make help tweak (Marc-André)
* make more PCI NICs available with -net or -nic (myself)
* change default q35 NIC to e1000e (myself)
* SCSI support for NDOB bit (myself)
* membarrier system call support (myself)
* SuperIO refactoring (Philippe)
* miscellaneous cleanups and fixes (Thomas)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJapqaMAAoJEL/70l94x66DQoUH/Rvg+a8giz/SrEA4P8D3Cb2z
4GNbNUUoy4oU0ltD5IAMskMwpOsvl1batE0D+pKIlfO9NV4+Cj2kpgo0p9TxoYqM
VCby3wRtx27zb5nVytC6M++iIKXmeEMqXmFw61I6umddNPSl4IR3hiHEE0DM+7dV
UPIOvJeEiazyQaw3Iw+ZctNn8dDBKc/+6oxP9xRcYTaZ6hB4G9RZkqGNNSLcJkk7
R0UotdjzIZhyWMOkjIwlpTF4sWv8gsYUV4bPYKMYho5B0Obda2dBM3I1kpA8yDa/
xZ5lheOaAVBZvM5aMIcaQPa65MO9hLyXFmhMOgyfpJhLBBz6Qpa4OLLI6DeTN+0=
=UAgA
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Record-replay lockstep execution, log dumper and fixes (Alex, Pavel)
* SCSI fix to pass maximum transfer size (Daniel Barboza)
* chardev fixes and improved iothread support (Daniel Berrangé, Peter)
* checkpatch tweak (Eric)
* make help tweak (Marc-André)
* make more PCI NICs available with -net or -nic (myself)
* change default q35 NIC to e1000e (myself)
* SCSI support for NDOB bit (myself)
* membarrier system call support (myself)
* SuperIO refactoring (Philippe)
* miscellaneous cleanups and fixes (Thomas)
# gpg: Signature made Mon 12 Mar 2018 16:10:52 GMT
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (69 commits)
tcg: fix cpu_io_recompile
replay: update documentation
replay: save vmstate of the asynchronous events
replay: don't process async events when warping the clock
scripts/replay-dump.py: replay log dumper
replay: avoid recursive call of checkpoints
replay: check return values of fwrite
replay: push replay_mutex_lock up the call tree
replay: don't destroy mutex at exit
replay: make locking visible outside replay code
replay/replay-internal.c: track holding of replay_lock
replay/replay.c: bump REPLAY_VERSION again
replay: save prior value of the host clock
replay: added replay log format description
replay: fix save/load vm for non-empty queue
replay: fixed replay_enable_events
replay: fix processing async events
cpu-exec: fix exception_index handling
hw/i386/pc: Factor out the superio code
hw/alpha/dp264: Use the TYPE_SMC37C669_SUPERIO
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# default-configs/i386-softmmu.mak
# default-configs/x86_64-softmmu.mak
This adds a possibility for the platform to tell VFIO not to emulate MSIX
so MMIO memory regions do not get split into chunks in flatview and
the entire page can be registered as a KVM memory slot and make direct
MMIO access possible for the guest.
This enables the entire MSIX BAR mapping to the guest for the pseries
platform in order to achieve the maximum MMIO preformance for certain
devices.
Tested on:
LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Since the PC87312 inherits this abstract model, we remove the I8042
instance in the PREP machine.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20180308223946.26784-14-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (hw/ppc)
Message-Id: <20180308223946.26784-6-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (hw/ppc)
Message-Id: <20180308223946.26784-4-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After reviewing a patch from Philippe that removes block-backend.h
from hw/lm32/milkymist.c, I noticed that this header is included
unnecessarily in a lot of other files, too. Remove those unneeded
includes to speed up the compilation process a little bit.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1518684912-31637-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove the hard-coded list of PCI NIC names; instead, fill an array
using all PCI devices listed under DEVICE_CATEGORY_NETWORK. Keep
the old shortcut "virtio" for virtio-net-pci.
Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch fixes an incorrect behavior when the -kernel argument has been
specified without -bios. In this case the kernel was loaded twice. At address
32M as a raw image and afterwards by load_elf/load_uimage at the
corresponding load address. In this case the region for the device tree and
the raw kernel image may overlap.
The patch fixes the behavior by loading the kernel image once with
load_elf/load_uimage and skips loading the raw image.
When here do not use bios_name/size for the kernel and use a more generic
name called payload_name/size.
New in v3: dtb must be stored between kernel and initrd because Linux can
handle the dtb only within the first 64MB. Add a comment to
clarify the behavior.
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Linux kernel commit 2a9d832cc9aae21ea827520fef635b6c49a06c6d
(of: Add bindings for chosen node, stdout-path) deprecated chosen property
"linux,stdout-path" and "stdout".
Introduce the new property "stdout-path" and continue supporting the older
property to remain compatible with existing/older firmware. This older property
can be deprecated after 5 years.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The sxxm (speculative execution exploit mitigation) machine type is a
variant of the 2.12 machine type with workarounds for speculative
execution vulnerabilities enabled by default.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Convert cap-ibs (indirect branch speculation) to a custom spapr-cap
type.
All tristate caps have now been converted to custom spapr-caps, so
remove the remaining support for them.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Don't explicitly list "?"/help option, trust convention]
[dwg: Fold tristate removal into here, to not break bisect]
[dwg: Fix minor style problems]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There are currently 2 implemented types of spapr-caps, boolean and
tristate. However there may be a need for caps which don't fit either of
these options. Add a custom capability type for which a list of custom
valid strings can be specified and implement the get/set functions for
these. Also add a field for help text to describe the available options.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Change "help" option to "?" matching qemu conventions]
[dwg: Add ATTRIBUTE_UNUSED to avoid breaking bisect]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Move the remaining comment into macio.c for reference, then remove the
macio_init() function and instantiate the macio devices for both Old World
and New World machines via qdev_init_nofail() directly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Also switch macio_newworld_realize() over to use it rather than using the pic_mem
memory region directly.
Now that both Old World and New World macio devices no longer make use of the
pic_mem memory region directly, we can remove it.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is needed before the next patch because the target-dependent kvm stub
uses the existing kvm_openpic_connect_vcpu() declaration, making it impossible
to move the device-specific declarations into the same file without breaking
ppc-linux-user compilation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Also switch macio_oldworld_realize() over to use it rather than using the pic_mem
memory region directly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This enables the device to be made available during the setup of the Old World
machine. In order to pass back the previous set of IRQs we temporarily introduce
a new pic_irqs parameter until it can be removed.
An additional benefit of this change is that it is also possible to remove the
pic_mem pointer used for macio by accessing the memory region via sysbus.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that the ESCC device is instantiated directly via qdev, move it to within
the macio device and wire up the IRQs and memory regions using the sysbus API.
This enables to remove the now-obsolete escc_mem parameter to the macio_init()
function.
(Note this patch also contains small touch-ups to the formatting in
macio_escc_legacy_setup() and ppc_heathrow_init() in order to keep checkpatch
happy)
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
VSMT must be set in order to compute VCPU ids. This means that the
following functions must not be called before spapr_set_vsmt_mode()
was called:
- spapr_vcpu_id()
- spapr_is_thread0_in_vcore()
- xics_max_server_number()
We had a recent regression where the latter would be called before VSMT
was set, and broke migration of some old machine types. This patch
adds assert() in the above functions to avoid problems in the future.
Also, since VSMT is really a CPU related thing, spapr_set_vsmt_mode() is
now called from spapr_init_cpus(), just before the first VSMT user.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Some older machine types create more ICPs than needed. We hence
need to register up to xics_max_server_number() dummy ICPs to
accomodate the migration of these machine types.
Recent VSMT rework changed xics_max_server_number() to return
DIV_ROUND_UP(max_cpus * spapr->vsmt, smp_threads)
instead of
DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(), smp_threads);
The change is okay but it requires spapr->vsmt to be set, which
isn't the case with the current code. This causes the formula to
return zero and we don't create dummy ICPs. This breaks migration
of older guests as reported here:
https://bugzilla.redhat.com/show_bug.cgi?id=1549087
The dummy ICP workaround doesn't really have a dependency on XICS
itself. But it does depend on proper VCPU id numbering and it must
be applied before creating vCPUs (ie, creating real ICPs). So this
patch moves the workaround to spapr_init_cpus(), which already
assumes VSMT to be set.
Fixes: 72194664c8 ("spapr: use spapr->vsmt to compute VCPU ids")
Reported-by: Lukas Doktor <ldoktor@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add emulation of aCube Sam460ex board based on AMCC 460EX embedded SoC.
This is not a complete implementation yet with a lot of components
still missing but enough for the U-Boot firmware to start and to boot
a Linux kernel or AROS.
Signed-off-by: François Revol <revol@free.fr>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is the PCIX controller found in newer 440 core SoCs e.g. the
AMMC 460EX. The device tree refers to this as plb-pcix compared to
the plb-pci controller in older 440 SoCs.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
[dwg: Remove hwaddr from trace-events, that doesn't work with some
trace backends]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 5d0fb1508e "spapr: consolidate the VCPU id numbering logic
in a single place" introduced a helper to detect thread0 of a virtual
core based on its VCPU id. This is used to create CPU core nodes in
the DT, but it is broken in TCG.
$ qemu-system-ppc64 -nographic -accel tcg -machine dumpdtb=dtb.bin \
-smp cores=16,maxcpus=16,threads=1
$ dtc -f -O dts dtb.bin | grep POWER8
PowerPC,POWER8@0 {
PowerPC,POWER8@8 {
instead of the expected 16 cores that we get with KVM:
$ dtc -f -O dts dtb.bin | grep POWER8
PowerPC,POWER8@0 {
PowerPC,POWER8@8 {
PowerPC,POWER8@10 {
PowerPC,POWER8@18 {
PowerPC,POWER8@20 {
PowerPC,POWER8@28 {
PowerPC,POWER8@30 {
PowerPC,POWER8@38 {
PowerPC,POWER8@40 {
PowerPC,POWER8@48 {
PowerPC,POWER8@50 {
PowerPC,POWER8@58 {
PowerPC,POWER8@60 {
PowerPC,POWER8@68 {
PowerPC,POWER8@70 {
PowerPC,POWER8@78 {
This happens because spapr_get_vcpu_id() maps VCPU ids to
cs->cpu_index in TCG mode. This confuses the code in
spapr_is_thread0_in_vcore(), since it assumes thread0 VCPU
ids to have a spapr->vsmt spacing.
spapr_get_vcpu_id(cpu) % spapr->vsmt == 0
Actually, there's no real reason to expose cs->cpu_index instead
of the VCPU id, since we also generate it with TCG. Also we already
set it explicitly in spapr_set_vcpu_id(), so there's no real reason
either to call kvm_arch_vcpu_id() with KVM.
This patch unifies spapr_get_vcpu_id() to always return the computed
VCPU id both in TCG and KVM. This is one step forward towards KVM<->TCG
migration.
Fixes: 5d0fb1508e
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The previous commit improved compile time by including less of the
generated QAPI headers. This is impossible for stuff defined directly
in qapi-schema.json, because that ends up in headers that that pull in
everything.
Move everything but include directives from qapi-schema.json to new
sub-module qapi/misc.json, then include just the "misc" shard where
possible.
It's possible everywhere, except:
* monitor.c needs qmp-command.h to get qmp_init_marshal()
* monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need
qapi-event.h to get enum QAPIEvent
Perhaps we'll get rid of those some other day.
Adding a type to qapi/migration.json now recompiles some 120 instead
of 2300 out of 5100 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-25-armbru@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.
The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h. Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.
To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects. The next commit will
improve it further.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
These devices are found in newer SoCs based on 440 core e.g. the 460EX
(http://www.embeddeddeveloper.com/assets/processors/amcc/datasheets/
PP460EX_DS2063.pdf)
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr-cap cap-ibs can only have values broken or fixed as there is
no explicit workaround required. Currently setting the value workaround
for this cap will hit an assert if the guest makes the hcall
h_get_cpu_characteristics.
Report an error when attempting to apply the setting with a more helpful
error message.
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Several places in the code need to calculate a VCPU id:
(cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads
(core_id / smp_threads) * spapr->vsmt (1 user)
index * spapr->vsmt (2 users)
or guess that the VCPU id of a given VCPU is the first thread of a virtual
core:
index % spapr->vsmt != 0
Even if the numbering logic isn't that complex, it is rather fragile to
have these assumptions open-coded in several places. FWIW this was
proved with recent issues related to VSMT.
This patch moves the VCPU id formula to a single function to be called
everywhere the code needs to compute one. It also adds an helper to
guess if a VCPU is the first thread of a VCORE.
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Rename spapr_is_vcore() to spapr_is_thread0_in_vcore() for clarity]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr_vcpu_id() function is an accessor actually. Let's rename it
for symmetry with the recently added spapr_set_vcpu_id() helper.
The motivation behind this is that a later patch will consolidate
the VCPU id formula in a function and spapr_vcpu_id looks like an
appropriate name.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The VCPU ids are currently computed and assigned to each individual
CPU threads in spapr_cpu_core_realize(). But the numbering logic
of VCPU ids is actually a machine-level concept, and many places
in hw/ppc/spapr.c also have to compute VCPU ids out of CPU indexes.
The current formula used in spapr_cpu_core_realize() is:
vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i
where:
cc->core_id is a multiple of smp_threads
cpu_index = cc->core_id + i
0 <= i < smp_threads
So we have:
cpu_index % smp_threads == i
cc->core_id / smp_threads == cpu_index / smp_threads
hence:
vcpu_id =
(cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;
This formula was used before VSMT at the time VCPU ids where computed
at the target emulation level. It has the advantage of being useable
to derive a VPCU id out of a CPU index only. It is fitted for all the
places where the machine code has to compute a VCPU id.
This patch introduces an accessor to set the VCPU id in a PowerPCCPU object
using the above formula. It is a first step to consolidate all the VCPU id
logic in a single place.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since the introduction of VSMT in 2.11, the spacing of VCPU ids
between cores is controllable through a machine property instead
of being only dictated by the SMT mode of the host:
cpu->vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i
Until recently, the machine code would try to change the SMT mode
of the host to be equal to VSMT or exit. This allowed the rest of
the code to assume that kvmppc_smt_threads() == spapr->vsmt is
always true.
Recent commit "8904e5a75005 spapr: Adjust default VSMT value for
better migration compatibility" relaxed the rule. If the VSMT
mode cannot be set in KVM for some reasons, but the requested
CPU topology is compatible with the current SMT mode, then we
let the guest run with kvmppc_smt_threads() != spapr->vsmt.
This breaks quite a few places in the code, in particular when
calculating DRC indexes.
This is what happens on a POWER host with subcores-per-core=2 (ie,
supports up to SMT4) when passing the following topology:
-smp threads=4,maxcpus=16 \
-device host-spapr-cpu-core,core-id=4,id=core1 \
-device host-spapr-cpu-core,core-id=8,id=core2
qemu-system-ppc64: warning: Failed to set KVM's VSMT mode to 8 (errno -22)
This is expected since KVM is limited to SMT4, but the guest is started
anyway because this topology can run on SMT4 even with a VSMT8 spacing.
But when we look at the DT, things get nastier:
cpus {
...
ibm,drc-indexes = <0x4 0x10000000 0x10000004 0x10000008 0x1000000c>;
This means that we have the following association:
CPU core device | DRC | VCPU id
-----------------+------------+---------
boot core | 0x10000000 | 0
core1 | 0x10000004 | 4
core2 | 0x10000008 | 8
core3 | 0x1000000c | 12
But since the spacing of VCPU ids is 8, the DRC for core1 points to a
VCPU that doesn't exist, the DRC for core2 points to the first VCPU of
core1 and and so on...
...
PowerPC,POWER8@0 {
...
ibm,my-drc-index = <0x10000000>;
...
};
PowerPC,POWER8@8 {
...
ibm,my-drc-index = <0x10000008>;
...
};
PowerPC,POWER8@10 {
...
No ibm,my-drc-index property for this core since 0x10000010 doesn't
exist in ibm,drc-indexes above.
...
};
};
...
interrupt-controller {
...
ibm,interrupt-server-ranges = <0x0 0x10>;
With a spacing of 8, the highest VCPU id for the given topology should be:
16 * 8 / 4 = 32 and not 16
...
linux,phandle = <0x7e7323b8>;
interrupt-controller;
};
And CPU hot-plug/unplug is broken:
(qemu) device_del core1
pseries-hotplug-cpu: Cannot find CPU (drc index 10000004) to remove
(qemu) device_del core2
cpu 4 (hwid 8) Ready to die...
cpu 5 (hwid 9) Ready to die...
cpu 6 (hwid 10) Ready to die...
cpu 7 (hwid 11) Ready to die...
These are the VCPU ids of core1 actually
(qemu) device_add host-spapr-cpu-core,core-id=12,id=core3
(qemu) device_del core3
pseries-hotplug-cpu: Cannot find CPU (drc index 1000000c) to remove
This patches all the code in hw/ppc/spapr.c to assume the VSMT
spacing when manipulating VCPU ids.
Fixes: 8904e5a750
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Change the macro that generates the vmstate migration field and the needed
function for the spapr-caps to take the full spapr-cap name. This has
the benefit of meaning this instance will be picked up when greping
for the spapr-caps and making it more obvious what this macro is doing.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Move necessary stuff in escc.h and update type names.
Remove slavio_serial_ms_kbd_init().
Fix code style problems reported by checkpatch.pl
Update mac_newworld, mac_oldworld and sun4m to use directly the
QDEV interface.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Newer kernels have a htab resize capability when adding or remove
memory. At these situations, the guest kernel might reallocate its
htab to a more suitable size based on the resulting memory.
However, we're not setting the new value back into the machine state
when a KVM guest resizes its htab. At first this doesn't seem harmful,
but when migrating or saving the guest state (via virsh managedsave,
for instance) this mismatch between the htab size of QEMU and the
kernel makes the guest hangs when trying to load its state.
Inside h_resize_hpt_commit, the hypercall that commits the hash page
resize changes, let's set spapr->htab_shift to the new value if we're
sure that kvmppc_resize_hpt_commit were successful.
While we're here, add a "not RADIX" sanity check as it is already done
in the related hypercall h_resize_hpt_prepare.
Fixes: https://github.com/open-power-host-os/qemu/issues/28
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add the relevant hooks as required for the MacOS timer calibration and delayed
SR interrupt.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This allows us to more easily differentiate between the timebase frequency used
to calibrate the MacOS timers and the actual frequency of the hardware clock as
indicated by CUDA_TIMER_FREQ.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[dwg: Revert some extraneous changes which break compile]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We ignore silently the value of smp_threads when we set
the default VSMT value, and if smp_threads is greater than VSMT
kernel is going into trouble later.
Fixes: 8904e5a750
("spapr: Adjust default VSMT value for better migration compatibility")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit bcb5ce08cf ("spapr: Rename machine init functions for clarity")
renamed ppc_spapr_reset to spapr_machine_reset and ppc_spapr_init
to spapr_machine_init. Let's also rename the references in
comments.
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Detected by Coverity (CID 1385702). This fixes the recently added hypercall
to let guests properly apply Spectre and Meltdown workarounds.
Fixes: c59704b254 "target/ppc/spapr: Add H-Call H_GET_CPU_CHARACTERISTICS"
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qemu-common.h includes qemu/option.h, but most places that include the
former don't actually need the latter. Drop the include, and add it
to the places that actually need it.
While there, drop superfluous includes of both headers, and
separate #include from file comment with a blank line.
This cleanup makes the number of objects depending on qemu/option.h
drop from 4545 (out of 4743) to 284 in my "build everything" tree.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-20-armbru@redhat.com>
[Semantic conflict with commit bdd6a90a9e in block/nvme.c resolved]
The macro expansions of qdict_put_TYPE() and qlist_append_TYPE() need
qbool.h, qnull.h, qnum.h and qstring.h to compile. We include qnull.h
and qnum.h in the headers, but not qbool.h and qstring.h. Works,
because we include those wherever the macros get used.
Open-coding these helpers is of dubious value. Turn them into
functions and drop the includes from the headers.
This cleanup makes the number of objects depending on qapi/qmp/qnum.h
from 4551 (out of 4743) to 46 in my "build everything" tree. For
qapi/qmp/qnull.h, the number drops from 4552 to 21.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-10-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.
While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
In order to enable TCE operations support in KVM, we have to inform
the KVM about VFIO groups being attached to specific LIOBNs;
the necessary bits are implemented already by IOMMU MR and VFIO.
This defines get_attr() for the SPAPR TCE IOMMU MR which makes VFIO
call the KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl and establish
LIOBN-to-IOMMU link.
This changes spapr_tce_set_need_vfio() to avoid TCE table reallocation
if the kernel supports the TCE acceleration.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
[aw - remove unnecessary sys/ioctl.h include]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues where manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch and some curly
braces were added to match QEMU style.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: qemu-ppc@nongnu.org
Conversions that aren't followed by exit() dropped, because they might
be inappropriate.
Also trim trailing punctuation from error messages.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-10-armbru@redhat.com>
The new H-Call H_GET_CPU_CHARACTERISTICS is used by the guest to query
behaviours and available characteristics of the cpu.
Implement the handler for this new H-Call which formulates its response
based on the setting of the spapr_caps cap-cfpc, cap-sbbc and cap-ibs.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add new tristate cap cap-ibs to represent the indirect branch
serialisation capability.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add new tristate cap cap-sbbc to represent the speculation barrier
bounds checking capability.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add new tristate cap cap-cfpc to represent the cache flush on privilege
change capability.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_caps are used to represent the level of support for various
capabilities related to the spapr machine type. Currently there is
only support for boolean capabilities.
Add support for tristate capabilities by implementing their get/set
functions. These capabilities can have the values 0, 1 or 2
corresponding to broken, workaround and fixed.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In various place we don't correctly check if the device supports MSI or
MSI-X. This can cause devices to be advertised with MSI support, even
if they only support MSI-X (like virtio-pci-* devices for example):
ethernet@0 {
ibm,req#msi = <0x1>; <--- wrong!
.
ibm,loc-code = "qemu_virtio-net-pci:0000:00:00.0";
.
ibm,req#msi-x = <0x3>;
};
Worse, this can also cause the "ibm,change-msi" RTAS call to corrupt the
PCI status and cause migration to fail:
qemu-system-ppc64: get_pci_config_device: Bad config data: i=0x6
read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
^^
PCI_STATUS_CAP_LIST bit which is assumed to be constant
This patch changes spapr_populate_pci_child_dt() to properly check for
MSI support using msi_present(): this ensures that PCIDevice::msi_cap
was set by msi_init() and that msi_nr_vectors_allocated() will look at
the right place in the config space.
Checking PCIDevice::msix_entries_nr is enough for MSI-X but let's add
a call to msix_present() there as well for consistency.
It also changes rtas_ibm_change_msi() to select the appropriate MSI
type in Function 1 instead of always selecting plain MSI. This new
behaviour is compliant with LoPAPR 1.1, as described in "Table 71.
ibm,change-msi Argument Call Buffer":
Function 1: If Number Outputs is equal to 3, request to set to a new
number of MSIs (including set to 0).
If the “ibm,change-msix-capable” property exists and Number
Outputs is equal to 4, request is to set to a new number of
MSI or MSI-X (platform choice) interrupts (including set to
0).
Since MSI is the the platform default (LoPAPR 6.2.3 MSI Option), let's
check for MSI support first.
And finally, it checks the input parameters are valid, as described in
LoPAPR 1.1 "R1–7.3.10.5.1–3":
For the MSI option: The platform must return a Status of -3 (Parameter
error) from ibm,change-msi, with no change in interrupt assignments if
the PCI configuration address does not support MSI and Function 3 was
requested (that is, the “ibm,req#msi” property must exist for the PCI
configuration address in order to use Function 3), or does not support
MSI-X and Function 4 is requested (that is, the “ibm,req#msi-x” property
must exist for the PCI configuration address in order to use Function 4),
or if neither MSIs nor MSI-Xs are supported and Function 1 is requested.
This ensures that the ret_intr_type variable contains a valid MSI type
for this device, and that spapr_msi_setmsg() won't corrupt the PCI status.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qemu-system-ppcemb has been once split of qemu-system-ppc to support
CPU page sizes < 4096 for some of the embedded 4xx PowerPC CPUs.
However, there was hardly any OS available in the wild that really
used such small page sizes (Linux uses 4096 on PPC), so there is
no known recent use case for this separate build anymore. It's
rather cumbersome to maintain a separate set of config switches for
this, and it's wasting compile and test time of all the developers
who have to build all QEMU targets to verify that their changes did
not break anything.
Except for the small CPU page sizes, qemu-system-ppc can be used as
a full replacement for qemu-system-ppcemb since it contains all the
embedded 4xx PPC boards and CPUs, too. Thus let's start the deprecation
process for qemu-system-ppcemb to see whether somebody still needs
the small page sizes or whether we could finally remove this unloved
separate build.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The vmstate description and the contained needed function for migration
of spapr_caps is the same for each cap, with the name of the cap
substituted. As such introduce a macro to allow for easier generation of
these.
Convert the three existing spapr_caps (htm, vsx, and dfp) to use this
macro.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 51f84465dd changed the compatility mode setting logic:
- machine reset only sets compatibility mode for the boot CPU
- compatibility mode is set for other CPUs when they are put online
by the guest with the "start-cpu" RTAS call
This causes a regression for machines started with max-compat-cpu:
the device tree nodes related to secondary CPU cores contain wrong
"cpu-version" and "ibm,pa-features" values, as shown below.
Guest started on a POWER8 host with:
-smp cores=2 -machine pseries,max-cpu-compat=compat7
ibm,pa-features = [18 00 f6 3f c7 c0 80 f0 80 00
00 00 00 00 00 00 00 00 80 00 80 00 80 00 00 00];
cpu-version = <0x4d0200>;
^^^
second CPU core
ibm,pa-features = <0x600f63f 0xc70080c0>;
cpu-version = <0xf000003>;
^^^
boot CPU core
The second core is advertised in raw POWER8 mode. This happens because
CAS assumes all CPUs to have the same compatibility mode. Since the
boot CPU already has the requested compatibility mode, the CAS code
does not set it for the secondary one, and exposes the bogus device
tree properties in in the CAS response to the guest.
A similar situation is observed when hot-plugging a CPU core. The
related device tree properties are generated and exposed to guest
with the "ibm,configure-connector" RTAS before "start-cpu" is called.
The CPU core is advertised to the guest in raw mode as well.
It both cases, it boils down to the fact that "start-cpu" happens too
late. This can be fixed globally by propagating the compatibility mode
of the boot CPU to the other CPUs during reset. For this to work, the
compatibility mode of the boot CPU must be set before the machine code
actually resets all CPUs.
It is not needed to set the compatibility mode in "start-cpu" anymore,
so the code is dropped.
Fixes: 51f84465dd
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A variable is already defined at the begining of the function to
hold a pointer to the CPU core object:
sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev));
No need to define it again in the pre-2.10 compatibility code snipplet.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We've got the config switch CONFIG_PPC4XX, so we should use it
in the Makefile accordingly and only include the PPC4xx boards
if this switch has been enabled. (Note: Unfortunately, the files
ppc4xx_devs.c and ppc405_uc.c still have to be included in the
build anyway to fulfil some complicated linker dependencies ...
so these are subject to a more thourough clean-up later)
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Remove dependency of possible_cpus on 1st CPU instance,
which decouples configuration data from CPU instances that
are created using that data.
Also later it would be used for enabling early cpu to numa node
configuration at runtime qmp_query_hotpluggable_cpus() should
provide a list of available cpu slots at early stage,
before machine_init() is called and the 1st cpu is created,
so that mgmt might be able to call it and use output to set
numa mapping.
Use MachineClass::possible_cpu_arch_ids() callback to set
cpu type info, along with the rest of possible cpu properties,
to let machine define which cpu type* will be used.
* for SPAPR it will be a spapr core type and for ARM/s390x/x86
a respective descendant of CPUClass.
Move parse_numa_opts() in vl.c after cpu_model is parsed into
cpu_type so that possible_cpu_arch_ids() would know which
cpu_type to use during layout initialization.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1515597770-268979-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
TYPE_SPAPR_PCI_HOST_BRIDGE is the only dynamic sysbus device not
rejected by ppc_spapr_reset(), so it can be the only entry on the
allowed list.
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alexander Graf <agraf@suse.de>
Cc: qemu-ppc@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20171125151610.20547-5-ehabkost@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
platform_bus_create_devtree() already rejects all dynamic sysbus
devices except TYPE_ETSEC_COMMON, so register it as the only
allowed dynamic sysbus device for the ppce500 machine-type.
Cc: Alexander Graf <agraf@suse.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: qemu-ppc@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20171125151610.20547-4-ehabkost@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The existing has_dynamic_sysbus flag makes the machine accept
every user-creatable sysbus device type on the command-line.
Replace it with a list of allowed device types, so machines can
easily accept some sysbus devices while rejecting others.
To keep exactly the same behavior as before, the existing
has_dynamic_sysbus=true assignments are replaced with a
TYPE_SYS_BUS_DEVICE entry on the allowed list. Other patches
will replace the TYPE_SYS_BUS_DEVICE entries with more specific
lists of devices.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: qemu-arm@nongnu.org
Cc: qemu-ppc@nongnu.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20171125151610.20547-2-ehabkost@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When skiboot starts, it first clears the CPU structs for all possible
CPUs on a system :
for (i = 0; i <= cpu_max_pir; i++)
memset(&cpu_stacks[i].cpu, 0, sizeof(struct cpu_thread));
On POWER9, cpu_max_pir is quite big, 0x7fff, and the skiboot cpu_stacks
array overlaps with the memory region in which QEMU maps the initramfs
file. Move it upwards in memory to keep it safe.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XSCOM base address of the core chiplet was wrongly calculated. Use
the OPAL macros to fix that and do a couple of renames.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These are useful when instantiating device models which are shared
between the POWER8 and the POWER9 processor families.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When addressed by XSCOM, the first core has the 0x20 chiplet ID but
the CPU PIR can start at 0x0.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
commit 1ed9c8af50 ("target/ppc: Add POWER9 DD2.0 model information")
deprecated the POWER9 model v1.0.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the
"vsmt" parameter for the pseries machine type, which controls the spacing
of the vcpu ids of thread 0 for each virtual core. This was done to bring
some consistency and stability to how that was done, while still allowing
backwards compatibility for migration and otherwise.
The default value we used for vsmt was set to the max of the host's
advertised default number of threads and the number of vthreads per vcore
in the guest. This was done to continue running without extra parameters
on older KVM versions which don't allow the VSMT value to be changed.
Unfortunately, even that smaller than before leakage of host configuration
into guest visible configuration still breaks things. Specifically a guest
with 4 (or less) vthread/vcore will get a different vsmt value when
running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host. That means the
vcpu ids don't line up so you can't migrate between them, though you should
be able to.
Long term we really want to make vsmt == smp_threads for sufficiently
new machine types. However, that means that qemu will then require a
sufficiently recent KVM (one which supports changing VSMT) - that's still
not widely enough deployed to be really comfortable to do.
In the meantime we need some default that will work as often as
possible. This patch changes that default to 8 in all circumstances.
This does change guest visible behaviour (including for existing
machine versions) for many cases - just not the most common/important
case.
Following is case by case justification for why this is still the least
worst option. Note that any of the old behaviours can still be duplicated
after this patch, it's just that it requires manual intervention by
setting the vsmt property on the command line.
KVM HV on POWER8 host:
This is the overwhelmingly common case in production setups, and is
unchanged by design. POWER8 hosts will advertise a default VSMT mode
of 8, and > 8 vthreads/vcore isn't permitted
KVM HV on POWER7 host:
Will break, but POWER7s allowing KVM were never released to the public.
KVM HV on POWER9 host:
Not yet released to the public, breaking this now will reduce other
breakage later.
KVM HV on PowerPC 970:
Will theoretically break it, but it was barely supported to begin with
and already required various user visible hacks to work. Also so old
that I just don't care.
TCG:
This is the nastiest one; it means migration of TCG guests (without
manual vsmt setting) will break. Since TCG is rarely used in production
I think this is worth it for the other benefits. It does also remove
one more barrier to TCG<->KVM migration which could be interesting for
debugging applications.
KVM PR:
As with TCG, this will break migration of existing configurations,
without adding extra manual vsmt options. As with TCG, it is rare in
production so I think the benefits outweigh breakages.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
At present if we require a vsmt mode that's not equal to the kernel's
default, and the kernel doesn't let us change it (e.g. because it's an old
kernel without support) then we always fail.
But in fact we can cope with the kernel having a different vsmt as long as
a) it's >= the actual number of vthreads/vcore (so that guest threads
that are supposed to be on the same core act like it)
b) it's a submultiple of the requested vsmt mode (so that guest threads
spaced by the vsmt value will act like they're on different cores)
Allowing this case gives us a bit more freedom to adjust the vsmt behaviour
without breaking existing cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>