This is an autogenerated patch using scripts/switch-timer-api.
Switch the entire code base to using the new timer API.
Note this patch may introduce some line length issues.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
'dprintf' is the name of a POSIX standard function so we should not be
stealing it for our debug macro. Rename to 'DPRINTF' (in line with
a number of other source files.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Acked-by: Richard Henderson <rth@twiddle.net>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1375100199-13934-4-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
At present, the savevm / migration support for the pseries machine will not
work when KVM is enabled. That's because KVM manages the guest's hash page
table in the host kernel, so qemu has no visibility of it. This patch
fixes this by using new kernel interfaces to extract and reinsert the
guest's hash table during the migration process.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 1374175984-8930-11-git-send-email-aliguori@us.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Model TCE tables as a device that's hooked up as a child object to
the owner. Besides the code cleanup, we get a few nice benefits:
1) free actually works now (it was dead code before)
2) the TCE information is visible in the device tree
3) we can expose table information as properties such that if we
change the window_size, we can use globals to keep migration
working.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-id: 1374175984-8930-6-git-send-email-aliguori@us.ibm.com
[dwg: pseries: savevm support for PAPR TCE tables]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[alexey: ppc kvm: fix to compile]
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move next_cpu from CPU_COMMON to CPUState.
Move first_cpu variable to qom/cpu.h.
gdbstub needs to use CPUState::env_ptr for now.
cpu_copy() no longer needs to save and restore cpu_next.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Rebased, simplified cpu_copy()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This adds a missing code to save CR (condition register) via
kvm_arch_put_registers(). kvm_arch_get_registers() already has it.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The common KVM code insists on calling kvm_arch_init_irq_routing()
as soon as it sees kernel header support for it (regardless of whether
QEMU supports it). Provide a dummy function to satisfy this.
Unlike x86, PPC does not have one default irqchip, so there's no common
code that we'd stick here. Even if you ignore the routes themselves,
which even on x86 are not set up in this function, the initial XICS
kernel implementation will not support IRQ routing, so it's best to
leave even the general feature flags up to the specific irqchip code.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Some source files #include the same header more than
once for no good reason. Remove second #includes in
such cases.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For PAPR guests, KVM tracks the various areas registered with the
H_REGISTER_VPA hypercall. For full emulation, of course, these are tracked
within qemu. At present these values are not synchronized. This is a
problem for reset (qemu's reset of the VPA address is not pushed to KVM)
and will also be a problem for savevm / migration.
The kernel now supports accessing the VPA state via the ONE_REG interface,
this patch adds code to qemu to use that interface to keep the qemu and
KVM ideas of the VPA state synchronized.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
PAPR requires that the device tree's CPU nodes have several properties
with information about the L1 cache. We already create two of these
properties, but with incorrect names - "[id]cache-block-size" instead
of "[id]-cache-block-size" (note the extra hyphen).
We were also missing some of the required cache properties. This
patch adds the [id]-cache-line-size properties (which have the same
values as the block size properties in all current cases). We also
add the [id]-cache-size properties.
Adding the cache sizes requires some extra infrastructure in the
general target-ppc code to (optionally) set the cache sizes for
various CPUs. The CPU family descriptions in translate_init.c can set
these sizes - this patch adds correct information for POWER7, I'm
leaving other CPU types to people who have a physical example to
verify against. In addition, for -cpu host we take the values
advertised by the host (if available) and use those to override the
information based on PVR.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
For the pseries machine, we need to advertise to the guest the size of its
RMA - that is the amount of memory it can access with the MMU off. For HV
KVM, this is constrained by the hardware limitations on the virtual RMA of
one hash PTE per PTE group in the hash page table. We already had code to
calculate this, but it was assuming the VRMA page size was the same as the
(host) backing page size for guest RAM.
In the case of a host kernel configured for 64k base page size, but running
on hardware (or firmware) which only allows 4k pages, the hose will do all
its allocations with a 64k page size, but still use 4k hardware pages for
actual mappings. Usually that's transparent to things running under the
host, but in the case of the maximum VRMA size it's not.
This patch refines the RMA size calculation to instead use the largest
available hardware page size (as reported by the SMMU_INFO call) which is
less than or equal to the backing page size. This now gives the correct
RMA size in all cases I've tested.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Enable the KVM emulated watchdog if KVM supports (use the
capability enablement in watchdog handler). Also watchdog exit
(KVM_EXIT_WATCHDOG) handling is added.
Watchdog state machine is cleared whenever VM state changes to running.
This is to handle the cases like return from debug halt etc.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
[agraf: rebase to current code base, fix non-kvm cases]
Signed-off-by: Alexander Graf <agraf@suse.de>
Older KVM versions don't support EPR which breaks guests when we announce
MPIC variants that support EPR.
Catch that case and expose only MPIC version 2.0 which tells the guest that
we don't support the EPR capability yet.
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
[agraf: Add comment, route cap check through kvm_ppc.c]
Signed-off-by: Alexander Graf <agraf@suse.de>
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently cpu.h contains a number of definitions relating to the 64-bit
hash MMU. Some are used in the MMU emulation code, but some are only used
in the spapr MMU management hcall implementations.
This patch moves these definitions (except for a few that are needed
more widely) into mmu-hash64.h header, shared between the MMU emulation
code and the spapr hcall code. The MMU emulation code is also updated to
actually use a number of those definitions in place of hard coded
constants.
Similarly, we add new analogous definitions to mmu-hash32.h and use those
in place of many hard-coded constants in mmu-hash32.c
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: fix 32-bit hosts]
Signed-off-by: Alexander Graf <agraf@suse.de>
target-ppc/kvm.c has an #ifdef on CONFIG_PSERIES, for the handling of
KVM exits due to a PAPR hypercall from the guest. However, since commit
e4c8b28cde "ppc: express FDT dependency of
pSeries and e500 boards via default-configs/", this hasn't worked properly.
That patch altered the configuration setup so that although CONFIG_PSERIES
is visible from the Makefiles, it is not visible from C files. This broke
the pseries machine when KVM is in use.
This patch makes a quick and dirty fix, by removing the CONFIG_PSERIES
dependency, replacing it with TARGET_PPC64 (since removing it entirely
leads to type mismatch errors). Technically this breaks the build when
configured with --disable-fdt, since that disables CONFIG_PSERIES on
TARGET_PPC64. However, it turns out the build was already broken in that
case, so this fixes pseries kvm without breaking anything extra. I'm
looking into how to fix that build breakage, but I don't think that need
delay applying this patch.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Both fields are used in VMState, thus need to be moved together.
Explicitly zero them on reset since they were located before
breakpoints.
Pass PowerPCCPU to kvmppc_handle_halt().
Signed-off-by: Andreas Färber <afaerber@suse.de>
This avoids assigning individual class fields and contributors
forgetting to add field assignments in KVM-only code.
ppc_cpu_class_find_by_pvr() requires the CPU model classes to be
registered, so defer host CPU type registration to kvm_arch_init().
Only register the host CPU type if there is a class with matching PVR.
This lets us drop error handling from instance_init.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently qemu does not get and put the state of the floating point and
vector registers to KVM. This is obviously a problem for savevm, as well
as possibly being problematic for debugging of FP-using guests.
This patch fixes this by using new extensions to the ONE_REG interface to
synchronize the qemu floating point state with KVM.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently when runing under KVM on ppc, we synchronize a certain number of
vital SPRs to KVM through the SET_SREGS call. This leaves out quite a lot
of important SPRs which are maintained in KVM. It would be helpful to
have their contents in qemu for debugging purposes, and when we implement
migration it will be vital, since they include important guest state that
will need to be restored on the target.
This patch sets up for synchronization of any registers supported by the
KVM ONE_REG calls. A new variant on spr_register() allows a ONE_REG id to
be stored with the SPR information. When we set/get information to KVM
we also synchronize any SPRs so registered.
For now we set this mechanism up to synchronize a handful of important
registers that already have ONE_REG IDs, notably the DAR and DSISR.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Turn the array of model definitions into a set of self-registering QOM
types with their own class_init. Unique identifiers are obtained from
the combination of PVR, SVR and family identifiers; this requires all
alias #defines to be removed from the list. Possibly there are some more
left after this commit that are not currently being compiled.
Prepares for introducing abstract intermediate CPU types for families.
Keep the right-aligned macro line breaks within 78 chars to aid
three-way merges.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
In preparation for more efficient setting of these fields.
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This will allow each architecture to define how the VCPU ID is set on
the KVM_CREATE_VCPU ioctl call.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Note that target-alpha accesses this field from TCG, now using a
negative offset. Therefore the field is placed last in CPUState.
Pass PowerPCCPU to [kvm]ppc_fixup_cpu() to facilitate this change.
Move common parts of mips cpu_state_reset() to mips_cpu_reset().
Acked-by: Richard Henderson <rth@twiddle.net> (for alpha)
[AF: Rebased onto ppc CPU subclasses and openpic changes]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Previously we silently exited, with subclasses we got an opcode warning.
Instead, explicitly tell the user what's wrong.
An indication for this is -cpu ? showing "host" with an all-zero PVR.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Since the model list is highly macrofied, keep ppc_def_t for now and
save a pointer to it in PowerPCCPUClass. This results in a flat list of
subclasses including aliases, to be refined later.
Move cpu_ppc_init() to translate_init.c and drop helper.c.
Long-term the idea is to turn translate_init.c into a standalone cpu.c.
Inline cpu_ppc_usable() into type registration.
Split cpu_ppc_register() in two by code movement into the initfn and
by turning the remaining part into a realizefn.
Move qemu_init_vcpu() call into the new realizefn and adapt
create_ppc_opcodes() to return an Error.
Change ppc_find_by_pvr() -> ppc_cpu_class_by_pvr().
Change ppc_find_by_name() -> ppc_cpu_class_by_name().
Turn -cpu host into its own subclass. This requires to move the
kvm_enabled() check in ppc_cpu_class_by_name() to avoid the class being
found via the normal name lookup in the !kvm_enabled() case.
Turn kvmppc_host_cpu_def() into the class_init and add an initfn that
asserts KVM is in fact enabled.
Implement -cpu ? and the QMP equivalent in terms of subclasses.
This newly exposes -cpu host to the user, ordered last for -cpu ?.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
On e500mc, the platform doesn't provide a way for the CPU to go idle.
To still not uselessly burn CPU time, expose an idle hypercall to the guest
if kvm supports it.
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
[agraf: adjust for current code base, add patch description, fix non-kvm case]
Signed-off-by: Alexander Graf <agraf@suse.de>
* 'ppc-for-upstream' of git://repo.or.cz/qemu/agraf: (35 commits)
PPC: KVM: Fix BAT put
PPC: e500: Only expose even TLB sizes in initial TLB
ppc/pseries: Reset VPA registration on CPU reset
pseries: Don't test for MSR_PR for hypercalls under KVM
PPC: e500: calculate initrd_base like dt_base
PPC: e500: increase DTC_LOAD_PAD
device tree: simplify dumpdtb code
fdt: move dumpdtb interpretation code to device_tree.c
target-ppc: Remove unused power_mode field from cpu state
pseries: Set hash table size based on RAM size
pseries: Remove unnecessary locking from PAPR hash table hcalls
ppc405_uc: Fix buffer overflow
target-ppc: KVM: Fix some kernel version edge cases for kvmppc_reset_htab()
pseries: Fix semantics of RTAS int-on, int-off and set-xive functions
pseries: Rework implementation of TCE bypass
pseries: Remove never used flags field from spapr vio devices
pseries: Remove XICS irq type enum type
pseries: Remove C bitfields from xics code
pseries: Small cleanup to H_CEDE implementation
pseries: Fix XICS reset
...
A terminal NUL is required by caller's use of strchr.
It's better not to use strncpy at all, since there is no need
to zero out hundreds of trailing bytes for each iteration.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In the sregs API, upper and lower 32bit segments of the BAT registers
are swapped when doing a set. Since we need to support old kernels out
there, don't bother to fix it in the kernel, but instead work around
the problem in QEMU by swapping on put.
Signed-off-by: Alexander Graf <agraf@suse.de>
The kvmppc_reset_htab() function invokes the KVM_PPC_ALLOCATE_HTAB vm ioctl
to request KVM to allocate and reset a hash page table for the guest - it
returns the size of hash table allocated, or 0 to indicate that qemu needs
to allocate the hash table itself. In practice qemu needs to allocate the
htab for full emulation and with Book3sPR KVM, but the kernel has to
allocate it for Book3sHV KVM (the hash table needs to be physically
contiguous in that case).
Unfortunately, the logic in this function is incorrect for some existing
kernels. Specifically:
* at least some PR KVM versions advertise the relevant capability but
don't actually implement the ioctl(), returning ENOTTY.
* For old kernels which don't have the capability, we currently return 0.
This is correct for PV KVM, where we need to allocate the htab, but not for
HV KVM - kernels of this era always allocate a 16MB hash table per guest.
This patch corrects both of these edge cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds support for then new "reset htab" ioctl which allows qemu
to properly cleanup the MMU hash table when the guest is reset. With
the corresponding kernel support, reset of a guest now works properly.
This also paves the way for indicating a different size hash table
to the kernel and for the kernel to be able to impose limits on
the requested size.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
At least when invoked with high enough 'level' arguments,
kvm_arch_put_registers() is supposed to copy essentially all the cpu state
as encoded in qemu's internal structures into the kvm state. Currently
the ppc version does not do this - it never calls KVM_SET_SREGS, for
example, and therefore never sets the SDR1 and various other important
though rarely changed registers.
Instead, the code paths which need to set these registers need to
explicitly make (conditional) kvm calls which transfer the changes to kvm.
This breaks the usual model of handling state updates in qemu, where code
just changes the internal model and has it flushed out to kvm automatically
at some later point.
This patch fixes this for Book S ppc CPUs by adding a suitable call to
KVM_SET_SREGS and als to KVM_SET_ONE_REG to set the HIOR (the only register
that is set with that call so far). This lets us remove the hacks to
explicitly set these registers from the kvmppc_set_papr() function.
The problem still exists for Book E CPUs (which use a different version of
the kvm_sregs structure). But fixing that has some complications of its
own so can be left to another day.
Lkewise, there is still some ugly code for setting the PVR through special
calls to SET_SREGS which is left in for now. The PVR needs to be set
especially early because it can affect what other features are available
on the CPU, so I need to do more thinking to see if it can be integrated
into the normal paths or not.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently for powerpc, kvm_arch_handle_exit() always returns 1, meaning
that its caller - kvm_cpu_exec() - will always exit immediately afterwards
to the loop in qemu_kvm_cpu_thread_fn().
There's no need to do this. Once we've handled the hypercall there's no
reason we can't go straight around and KVM_RUN again, which is what ret = 0
will signal. The only exception might be for hypercalls which affect the
state of cpu_can_run(), however the only one that might do this is H_CEDE
and for kvm that is always handled in the kernel, not qemu.
Furtherm setting ret = 0 means that when exit_requested is set from a
hypercall, we will enter KVM_RUN once more with a signal which lets the
the kernel do its internal logic to complete the hypercall with out
actually executing any more guest code. This is important if our hypercall
also triggered a reset, which previously would re-initialize everything
without completing the hypercall. This caused the kernel to get confused
because it thought the guest was still in the middle of a hypercall when
it has actually been reset.
This patch therefore changes to ret = 0, which is both a bugfix and a small
optimization.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The pseries platform already contains an IOMMU implementation, since it is
essential for the platform's paravirtualized VIO devices. This IOMMU
support is currently built into the implementation of the VIO "bus" and
the various VIO devices.
This patch converts this code to make use of the new common IOMMU
infrastructure.
We don't yet handle synchronization of map/unmap callbacks vs. invalidations,
this will require some complex interaction with the kernel and is not a
major concern at this stage.
Cc: Alex Graf <agraf@suse.de>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
More recent Power server chips (i.e. based on the 64 bit hash MMU)
support more than just the traditional 4k and 16M page sizes. This
can get quite complicated, because which page sizes are supported,
which combinations are supported within an MMU segment and how these
page sizes are encoded both in the SLB entry and the hash PTE can vary
depending on the CPU model (they are not specified by the
architecture). In addition the firmware or hypervisor may not permit
use of certain page sizes, for various reasons. Whether various page
sizes are supported on KVM, for example, depends on whether the PR or
HV variant of KVM is in use, and on the page size of the memory
backing the guest's RAM.
This patch adds information to the CPUState and cpu defs to describe
the supported page sizes and encodings. Since TCG does not yet
support any extended page sizes, we just set this to NULL in the
static CPU definitions, expanding this to the default 4k and 16M page
sizes when we initialize the cpu state. When using KVM, however, we
instead determine available page sizes using the new
KVM_PPC_GET_SMMU_INFO call. For old kernels without that call, we use
some defaults, with some guesswork which should do the right thing for
existing HV and PR implementations. The fallback might not be correct
for future versions, but that's ok, because they'll have
KVM_PPC_GET_SMMU_INFO.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
On target-ppc, our table of CPU types and features encodes the features as
found on the hardware, regardless of whether these features are actually
usable under TCG or KVM. We already have cases where the information from
the cpu table must be fixed up to account for limitations in the emulation
method we're using. e.g. TCG does not support the DFP and VSX instructions
and KVM needs different numbering of the CPUs in order to tell it the
correct thread to core mappings.
This patch cleans up these hacks to handle emulation limitations by
consolidating them into a pair of functions specifically for the purpose.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[AF: Style and typo fixes, rename new functions and drop ppc_def_t arg]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The official spelling is QEMU.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
[blauwirbel@gmail.com: fixed comment style in hw/sun4m.c]
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
For the pseries machine, TCE (IOMMU) tables can either be directly
malloc()ed in qemu or, when running on a KVM which supports it, mmap()ed
from a KVM ioctl. The latter option is used when available, because it
allows the (frequent bottlenext) H_PUT_TCE hypercall to be KVM accelerated.
However, even when KVM is persent, TCE acceleration is not always possible.
Only KVM HV supports this ioctl(), not KVM PR, or the kernel could run out
of contiguous memory to allocate the new table. In this case we need to
fall back on the malloc()ed table.
When a device is removed, and we need to remove the TCE table, we need to
either munmap() or free() the table as appropriate for how it was
allocated. The code is supposed to do that, but we buggily fail to
initialize the tcet->fd variable in the malloc() case, which is used as a
flag to determine which is the right choice.
This patch fixes the bug, and cleans up error messages relating to this
path while we're at it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>