Move MMU, TLB, SLB and BAT ops to mmu_helper.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add an explicit CPUPPCState parameter instead of relying on AREG0.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[fix unwanted whitespace line in Makefile.target]
Signed-off-by: Alexander Graf <agraf@suse.de>
Move integer and vector ops to int_helper.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add an explicit CPUPPCState parameter instead of relying on AREG0.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Move FPU and SPE helpers from op_helper.c to fpu_helper.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Move exception helpers from helper.c to excp_helper.c and
make cpu_dump_rfi() static.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
helper.c will be spilt by the next patches, fix
style issues before that.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add an explicit CPUPPCState parameter instead of relying on AREG0.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Move exception helpers from op_helper.c to excp_helper.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
op_helper.c will be split by the next patches, fix
style issues before that.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
The file is located in target-ppc/, not hw/.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Anthony Liguori <anthony@codemonkey.ws>
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
In commit 1bba0dc932 cpu_reset()
was renamed to cpu_state_reset(), to allow introducing a new cpu_reset()
that would operate on QOM objects.
All callers have been updated except for one in target-mips, so drop all
implementations except for the one in target-mips and move the
declaration there until MIPSCPU reset can be fully QOM'ified.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Michael Walle <michael@walle.cc> (for lm32)
Acked-by: Max Filippov <jcmvbkbc@gmail.com> (for xtensa)
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com> (for mb + cris)
Acked-by: Alexander Graf <agraf@suse.de> (for ppc)
Acked-by: Blue Swirl <blauwirbel@gmail.com>
Adapt e500 mpc8544ds machine accordingly.
Turn cpu_init() into a static inline function returning CPUPPCState for
backwards compatibility.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Alexander Graf <agraf@suse.de>
When initializing the e500 code, we need to expose its
cache line size for user and system mode, while the mmu
details are only interesting for system emulation.
Split the 2 switch statements apart, allowing us to #ifdef
out the mmu parts for user mode emulation while keeping all
cache information consistent.
Signed-off-by: Alexander Graf <agraf@suse.de>
machine.c is only compiled for softmmu targets, so checks for
!defined(CONFIG_USER_ONLY) are unnecessary and can be dropped.
Signed-off-by: Juan Quintela <quintela@redhat.com>
[AF: Use more verbose commit message suggested by PMM]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
commit f7aa558396 pulled the dcache and icache
line size initialization inside of a '#if !defined(CONFIG_USER_ONLY)' block.
This is not correct because instructions like 'dcbz' need the dcache size
initialized even for user mode.
Signed-off-by: Meador Inge <meadori@codesourcery.com>
Cc: Varun Sethi <Varun.Sethi@freescale.com>
[AF: Simplify #ifdefs by using cache line size 32 for *-user as before]
Suggested-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Move code from cpu_state_reset() into ppc_cpu_reset().
Reorder #include of helper_regs.h to use it in translate_init.c.
Adjust whitespace and add braces.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Move code not dependent on ppc_def_t from cpu_ppc_init() into an initfn.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Embed CPUPPCState as first member of PowerPCCPU.
Distinguish between "powerpc-cpu", "powerpc64-cpu" and
"embedded-powerpc-cpu".
Let CPUClass::reset() call cpu_state_reset() for now.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
On target-ppc, our table of CPU types and features encodes the features as
found on the hardware, regardless of whether these features are actually
usable under TCG or KVM. We already have cases where the information from
the cpu table must be fixed up to account for limitations in the emulation
method we're using. e.g. TCG does not support the DFP and VSX instructions
and KVM needs different numbering of the CPUs in order to tell it the
correct thread to core mappings.
This patch cleans up these hacks to handle emulation limitations by
consolidating them into a pair of functions specifically for the purpose.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[AF: Style and typo fixes, rename new functions and drop ppc_def_t arg]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Commit 41557447d3 also introduced a subtle TLB
flush bug. By applying a mask to the interrupt MSR which cleared the IR/DR
bits at the start of the interrupt handler, the logic towards the end of the
handler to force a TLB flush if either one of these bits were set would never
be triggered.
This patch simply changes the IR/DR bit check in the TLB flush logic to use
the original MSR value (albeit with some interrupt-specific bits cleared) so
that the IR/DR bits are preserved at the point where the check takes place.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Use uintptr_t instead of void * or unsigned long in
several op related functions, env->mem_io_pc and
GETPC() macro.
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The official spelling is QEMU.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
[blauwirbel@gmail.com: fixed comment style in hw/sun4m.c]
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When we dump the CPU registers, there's a certain chance they haven't been
synchronized with KVM yet, so we have to manually trigger that.
This aligns the code with x86 and fixes a bug where the register state was
bogus on invalid/unknown kvm exit reasons.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
'POWERPC_INSNS2_DEFAULT' was defined incorrectly which was causing the
opcode table creation code to erroneously register 'eieio' and 'mbar'
for the "default" processor:
** ERROR: opcode 1a already assigned in opcode table 16
*** ERROR: unable to insert opcode [1f-16-1a]
*** ERROR initializing PowerPC instruction 0x1f 0x16 0x1a
Signed-off-by: Meador Inge <meadori@codesourcery.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Fix large page support in TCG. The old code would overwrite the large page
table entry with the fake 4 KB one generated here whenever the ref/change bits
were updated, causing it to point to the wrong area of memory.
Signed-off-by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Acked-by: David Gibson <david@gibson.drobpear.id.au>
[agraf: fix whitespace, braces]
Signed-off-by: Alexander Graf <agraf@suse.de>
The POWER7 emulation is missing the Processor Identification Register,
mandatory in recent POWER CPUs, that is required for SMP on at least
some operating systems (e.g. FreeBSD) to function properly. This patch
copies the existing PIR code from the other CPUs that implement it.
Signed-off-by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
These instructions for loading and storing byte-swapped 64-bit values have
been introduced in PowerISA 2.06.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
For the pseries machine, TCE (IOMMU) tables can either be directly
malloc()ed in qemu or, when running on a KVM which supports it, mmap()ed
from a KVM ioctl. The latter option is used when available, because it
allows the (frequent bottlenext) H_PUT_TCE hypercall to be KVM accelerated.
However, even when KVM is persent, TCE acceleration is not always possible.
Only KVM HV supports this ioctl(), not KVM PR, or the kernel could run out
of contiguous memory to allocate the new table. In this case we need to
fall back on the malloc()ed table.
When a device is removed, and we need to remove the TCE table, we need to
either munmap() or free() the table as appropriate for how it was
allocated. The code is supposed to do that, but we buggily fail to
initialize the tcet->fd variable in the malloc() case, which is used as a
flag to determine which is the right choice.
This patch fixes the bug, and cleans up error messages relating to this
path while we're at it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Scripted conversion:
for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
sed -i "s/CPUState/CPUArchState/g" $file
done
All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Scripted conversion:
sed -i "s/CPUState/CPUPPCState/g" target-ppc/*.[hc]
sed -i "s/#define CPUPPCState/#define CPUState/" target-ppc/cpu.h
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Frees the identifier cpu_reset for QOM CPUs (manual rename).
Don't hide the parameter type behind explicit casts, use static
functions with strongly typed argument to indirect.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
On ppc405ep there is a register that allows for software to reset the
core, but not the whole system. Implement this reset using a reset
interrupt.
This gets rid of a bunch of #if 0'ed code.
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Fix this error:
/src/qemu/target-ppc/helper.c: In function 'booke206_tlb_to_page_size':
/src/qemu/target-ppc/helper.c:1296:14: error: variable 'tlbncfg' set but not used [-Werror=unused-but-set-variable]
Tested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When running Linux on e500 with powersave-nap enabled, Linux tries to
read out the L1CFG0 register and calculates some things from it. Passing
0 there ends up in a division by 0, resulting in -1, resulting in badness.
So let's populate the L1CFG0 register with reasonable defaults. That way
guests aren't completely confused.
Reported-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The e500mc implements Embedded.Processor Control, so enable it and
thus enable guests to IPI each other. This makes -smp work with -cpu
e500mc.
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch implements the msgsnd instruction. It is part of the
Embedded.Processor Control specification and allows one CPU to
IPI another CPU without going through an interrupt controller.
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch implements the msgclr instruction. It is part of the
Embedded.Processor Control specification and clears pending doorbell
interrupts on the current CPU.
Signed-off-by: Alexander Graf <agraf@suse.de>
We already had all the code available to have doorbell exceptions
be handled properly. It was just disabled.
Enable it, so we can rely on it.
Signed-off-by: Alexander Graf <agraf@suse.de>
We're going to introduce doorbell instructions (called processor
control in the spec) soon. Add some defines for easier patch
readability later.
Signed-off-by: Alexander Graf <agraf@suse.de>
Our EXCP list is getting outdated. By now, 3 new exception vectors have
been introduced. Update the list so we have everything at one place.
Signed-off-by: Alexander Graf <agraf@suse.de>
We can have TLBs that only support a single page size. This is defined
by the absence of the AVAIL flag in TLBnCFG. If this is the case, we
currently write invalid size info into the TLB, but override it on
internal fault.
Let's move the check over to tlbwe, so we don't have the AVAIL check in
the hotter fault path.
Signed-off-by: Alexander Graf <agraf@suse.de>
Our internal helpers to fetch TLB entries were not able to tell us
that an entry doesn't even exist. Pass an error out if we hit such
a case to not accidently pass beyond the TLB array.
Signed-off-by: Alexander Graf <agraf@suse.de>
The PowerPC 2.06 BookE ISA defines an opcode called "tlbilx" which is used
to flush TLB entries. It's the recommended way of flushing in virtualized
environments.
So far we got away without implementing it, but Linux for e500mc uses this
instruction, so we better add it :).
Signed-off-by: Alexander Graf <agraf@suse.de>
When setting a TLB entry, we need to check if the TLB we're putting it in
actually supports the given size. According to the 2.06 PowerPC ISA, a
value that's out of range can either be redefined to something implementation
dependent or we can raise an illegal opcode exception. We do the latter.
Signed-off-by: Alexander Graf <agraf@suse.de>
When using MAV 2.0 TLB registers, we have another range of TLB registers
available to read the supported page sizes from.
Add SPR definitions for those and add a helper function that we can use
to receive such a bitmap even when using MAV 1.0.
Signed-off-by: Alexander Graf <agraf@suse.de>
We might want to call the tlb check function without actually caring about
the real address resolution. Check if we really should write the value
back.
Signed-off-by: Alexander Graf <agraf@suse.de>
The msync instruction as defined today is only valid on 4xx cores, not
on e500 which also supports msync, but treats it the same way as sync.
Rename it to reflect that it's 4xx only.
Signed-off-by: Alexander Graf <agraf@suse.de>
The e500 CPUs don't use 440's msync which falls on the same opcode IDs,
but instead use the real powerpc sync instruction. This is important,
since the invalid mask differs between the two.
Signed-off-by: Alexander Graf <agraf@suse.de>
Our code only knows IVORs up to 37. Add the new ones defined in ISA 2.06
from 38 - 42.
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Unfortunately the HIOR setting code slipped into upstream QEMU
before it was pulled into upstream KVM. And since Murphy is always
right, comments on the patches only emerged on the pull request
leading to changes in the interface.
So here's an update to the HIOR setting. While at it, I also relaxed
it a bit since for HV KVM we can already run fine without and 3.2
works just fine with HV KVM but when not setting HIOR. We will only
need this when running PAPR in PR KVM.
Since we accidently changed the ABI and API along the way, we have
to update the underlying kernel headers together with the code that
uses it to not break bisectability.
Signed-off-by: Alexander Graf <agraf@suse.de>
Now that we have 440 TLB emulation, we can also support running the 440EP
CPU target in system emulation mode.
Signed-off-by: Alexander Graf <agraf@suse.de>
Commit c5705a772 ("vmstate, memory: decouple vmstate from memory API") changed
the signature of memory_region_init_ram_ptr() but did not update a caller in
the ppc kvm module. Fix.
Signed-off-by: Avi Kivity <avi@redhat.com>
This core is found on chips such as p4080, p3041, p2040, and p5020.
More needs to be done to make this viable for TCG (such as missing SPRs
and instructions), but this suffices to get KVM running with appropriate
kernel support.
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
[scottwood@freescale.com: tweak some flags]
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
When guest reset, we need to halt secondary cpus until guest kick them.
This already works for tcg. The patch add the support for kvm.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf: remove in-kernel irqchip code]
When run with a PPC Book3S (server) CPU Currently 'info tlb' in the
qemu monitor reports "dump_mmu: unimplemented". However, during
bringup work, it can be quite handy to have the SLB entries, which are
available in the CPUPPCState. This patch adds an implementation of
info tlb for book3s, which dumps the SLB.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Alexander Graf <agraf@suse.de>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
When using gdb to single step a ppc interrupt routine, the execution
flow passes the rfi instruction without actually returning from the
interrupt.
The patch fixes this by avoiding to update the nip when the debug
exception is raised and a previous POWERPC_EXCP_SYNC was set.
The latter is the case only, if code for rfi or a related instruction
was generated.
Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: Alexander Graf <agraf@suse.de>
The CPU state contains two bitmaps, initialized from the CPU spec
which describes which instructions are implemented on the CPU. A
couple of bits are defined which cover instructions (VSX and DFP)
which are not currently implemented in TCG. So far, these are only
used to handle the case of -cpu host because a KVM guest can use
the instructions when the host CPU supports them.
However, it's a mild layering violation to simply not include those
bits in the CPU descriptions for those CPUs that do support them,
just because we can't handle them in TCG. This patch corrects the
situation, so that the instruction bits _are_ shown correctly in the
cpu spec table, but are masked out from the cpu state in the non-KVM
case.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Sufficiently recent kernels include a KVM call to accelerate use of
PAPR TCE tables (IOMMU), which are used by PAPR virtual IO devices.
This involves qemu mapping the TCE table in from a kernel obtained fd,
which currently we do with PROT_READ only. This is a hangover from
early (never released) versions of this kernel interface which only
permitted read-only mappings and required us to destroy and recreate
the table when we needed to clear it from qemu.
Now, the kernel permits read-write mappings, and we rely on this to
clear the table in spapr_vio_quiesce_one(). However, due to
insufficient testing, I forgot to update the actual mapping of the
table in kvmppc_create_spapr_tce() to add PROT_WRITE to the mmap().
This patch corrects the oversight.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The -cpu host feature tries to find out the host capabilities based
on device tree information. However, we don't always have that available
because it's an optional property in dt.
So instead of force unsetting values depending on an unreliable source
of information, let's just try to be clever about it and not override
capabilities when we don't know the device tree pieces.
This fixes altivec with -cpu host on YDL PowerStations.
Reported-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The recent usage of MemoryRegion in kvm_ppc.h breaks builds with
CONFIG_USER_ONLY=y. This patch fixes it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, when KVM is enabled, the pseries machine checks if the host
CPU supports VMX, VSX and/or DFP instructions and advertises
accordingly in the guest device tree. It does this regardless of what
CPU is selected on the command line. On the other hand, when in TCG
mode, it never advertises any of these facilities, even basic VMX
(Altivec) which is supported in TCG.
Now that we have a -cpu host option for ppc, it is fairly
straightforward to fix both problems. This patch changes the -cpu
host code to override the basic cpu spec derived from the PVR with
information queried from the host avout VMX, VSX and DFP capability.
The pseries code then uses the instruction availability advertised in
the cpu state to set the guest device tree correctly for both the KVM
and TCG cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The sole reason we have the ppcemb target is to support MMUs that have
less than the usual 4k possible page size. There are very few of these
chips and I don't want to add additional QA and testing burden to everyone
to ensure that code still works when TARGET_PAGE_SIZE is not 4k.
So this patch disables all CPUs except for MMU_BOOKE capable ones from
the ppcemb target.
Signed-off-by: Alexander Graf <agraf@suse.de>
Some 32-bit PPC CPUs can use up to 36 bit of physical address space.
Treat them accordingly in the qemu-system-ppc binary type.
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch adds cpu specs to the table for POWER7 revisions 2.1 and 2.3.
This allows -cpu host to be used on these host cpus.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
For convenience with kvm, x86 allows the user to specify -cpu host on the
qemu command line, which means make the guest cpu the same as the host
cpu. This patch implements the same option for ppc targets.
For now, this just read the host PVR (Processor Version Register) and
selects one of our existing CPU specs based on it. This means that the
option will not work if the host cpu is not supported by TCG, even if that
wouldn't matter for use under kvm.
In future, we can extend this in future to override parts of the cpu spec
based on information obtained from the host (via /proc/cpuinfo, the host
device tree, or explicit KVM calls). That will let us handle cases where
the real kvm-virtualized CPU doesn't behave exactly like the TCG-emulated
CPU. With appropriate annotation of the CPU specs we'll also then be able
to use host cpus under kvm even when there isn't a matching full TCG model.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The ppc target contains a ppc_find_by_pvr() function, which looks up a
CPU spec based on a PVR (that is, based on the value in the target cpu's
Processor Version Register). PVR values contain information on both the
cpu model (upper 16 bits, usually) and on the precise revision (low 16
bits, usually).
ppc_find_by_pvr, as well as making exact PVR matches, attempts to find
"close" PVR matches, when we don't have a CPU spec for the exact revision
specified. This sounds like a good idea, execpt that the current logic
is completely nonsensical.
It seems to assume CPU families are subdivided bit by bit in the PVR in a
way they just aren't. Specifically, it requires a match on all bits of the
specified pvr up to the last non-zero bit. This has the bizarre effect
that when the low bits are simply a sequential revision number (a common
though not universal pattern), then odd specified revisions must be matched
exactly, whereas even specified revisions will also match the next odd
revision, likewise for powers of 4, 8 and so forth.
To correctly do inexact matching we'd need to re-organize the table of CPU
specs to include a mask showing what PVR range the spec is compatible with
(similar to the cputable code in the Linux kernel).
For now, just remove the bogosity by only permitting exact PVR matches.
That at least makes the matching simple and consistent. If we need inexact
matching we can add the necessary per-subfamily masks later.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Sufficiently recent PAPR specifications define properties "ibm,vmx"
and "ibm,dfp" on the CPU node which advertise whether the VMX vector
extensions (or the later VSX version) and/or the Decimal Floating
Point operations from IBM's recent POWER CPUs are available.
Currently we do not put these in the guest device tree and the guest
kernel will consequently assume they are not available. This is good,
because they are not supported under TCG. VMX is similar enough to
Altivec that it might be trivial to support, but VSX and DFP would
both require significant work to support in TCG.
However, when running under kvm on a host which supports these
instructions, there's no reason not to let the guest use them. This
patch, therefore, checks for the relevant support on the host CPU
and, if present, advertises them to the guest as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the kvmppc_get_clockfreq() function reads the host's clock
frequency from /proc/device-tree, which is useful to past to the guest
in KVM setups. However, there are some other host properties
advertised in the device tree which can also be relevant to the
guests.
This patch, therefore, replaces kvmppc_get_clockfreq() which can
retrieve any named, single integer property from the host device
tree's CPU node.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
SPE instructions are defined by pairs. Currently, the invalid-bits mask is set
for the first instruction, but the second one can have a different mask.
example:
GEN_SPE(efdcmpeq, efdcfs, 0x17, 0x0B, 0x00600000, 0x00180000, PPC_SPE_DOUBLE),
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The pseries machine of qemu implements the TCE mechanism used as a
virtual IOMMU for the PAPR defined virtual IO devices. Because the
PAPR spec only defines a small DMA address space, the guest VIO
drivers need to update TCE mappings very frequently - the virtual
network device is particularly bad. This means many slow exits to
qemu to emulate the H_PUT_TCE hypercall.
Sufficiently recent kernels allow this to be mitigated by implementing
H_PUT_TCE in the host kernel. To make use of this, however, qemu
needs to initialize the necessary TCE tables, and map them into itself
so that the VIO device implementations can retrieve the mappings when
they access guest memory (which is treated as a virtual DMA
operation).
This patch adds the necessary calls to use the KVM TCE acceleration.
If the kernel does not support acceleration, or there is some other
error creating the accelerated TCE table, then it will still fall back
to full userspace TCE implementation.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
At present, using the hypervisor aware Book3S-HV KVM will only work
with qemu on POWER7 CPUs. PPC970 CPUs also have hypervisor
capability, but they lack the VRMA feature which makes assigning guest
memory easier.
In order to allow KVM Book3S-HV on PPC970, we need to specially
allocate the first chunk of guest memory (the "Real Mode Area" or
RMA), so that it is physically contiguous.
Sufficiently recent host kernels allow such contiguous RMAs to be
allocated, with a kvm capability advertising whether the feature is
available and/or necessary on this hardware. This patch enables qemu
to use this support, thus allowing kvm acceleration of pseries qemu
machines on PPC970 hardware.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
agraf: fix to use memory api
Alex Graf has already made qemu support KVM for the pseries machine
when using the Book3S-PR KVM variant (which runs the guest in
usermode, emulating supervisor operations). This code allows gets us
very close to also working with KVM Book3S-HV (using the hypervisor
capabilities of recent POWER CPUs).
This patch moves us another step towards Book3S-HV support by
correctly handling SMT (multithreaded) POWER CPUs. There are two
parts to this:
* Querying KVM to check SMT capability, and if present, adjusting the
cpu numbers that qemu assigns to cause KVM to assign guest threads
to cores in the right way (this isn't automatic, because the POWER
HV support has a limitation that different threads on a single core
cannot be in different guests at the same time).
* Correctly informing the guest OS of the SMT thread to core mappings
via the device tree.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
While working on the emulation of the freescale p2010 (e500v2) I realized that
there's no implementation of booke's timers features. Currently mpc8544 uses
ppc_emb (ppc_emb_timers_init) which is close but not exactly like booke (for
example booke uses different SPR).
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
When running with PR KVM, we need to set HIOR directly. Thankfully there
is now a new interface to set registers individually so we can just use that
and poke HIOR into the guest vcpu's HIOR register.
While at it, this also sets SDR1 because -M pseries requires it to run.
With this patch, -M pseries works properly with PR KVM.
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch implements support for the CFAR SPR on POWER7 (Come From
Address Register), which snapshots the PC value at the time of a branch or
an rfid. The latest powerpc-next kernel also catches it and can show it in
xmon or in the signal frames.
This works well enough to let recent kernels boot (which otherwise oops
on the CFAR access). It hasn't been tested enough to be confident that the
CFAR values are actually accurate, but one thing at a time.
Signed-off-by: Ben Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This definition is backward compatible with MAV=1.0 as long as
the guest does not set reserved bits in MAS1/MAS4.
Also, fix the shift in booke206_tlb_to_page_size -- it's the base
that should be able to hold a 4G page size, not the shift count.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Share the TLB array with KVM. This allows us to set the initial TLB
both on initial boot and reset, is useful for debugging, and could
eventually be used to support migration.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
When running PR style KVM, we need to tell the kernel that we want
to run in PAPR mode now. This means that we need to pass some more
register information down and enable papr mode. We also need to align
the HTAB to htab_size boundary.
Using this patch, -M pseries works with kvm even on non-hv kvm
implementations, as long as the preceding kernel patches are in.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- match on CONFIG_PSERIES
v2 -> v3:
- remove HIOR pieces from PAPR patch (ABI breakage)
We have a bunch of helper functions that don't have any stubs for them in case
we don't have CONFIG_KVM enabled. That didn't bite us so far, because gcc can
optimize them out pretty well, but we should really provide them.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- use uint64_t for clockfreq
We need to find out the host's clock-frequency when running on KVM, so
let's export a respective function.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- enable 64bit values
qemu_service_io was mainly an alias to qemu_notify_event,
currently used only by PPC for timer hack, so call
qemu_notify_event directly.
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Those blanks violate the coding conventions, see
scripts/checkpatch.pl.
Blanks missing after colons in the changed lines were added.
This patch does not try to fix tabs, long lines and other
problems in the changed lines, therefore checkpatch.pl reports
many violations.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When an exception occurs on BookE, we need to set ESR bits to expose
to the guest information on what exactly happened. Add the obvious ones.
Reported-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>