Sufficiently recent kernels include a KVM call to accelerate use of
PAPR TCE tables (IOMMU), which are used by PAPR virtual IO devices.
This involves qemu mapping the TCE table in from a kernel obtained fd,
which currently we do with PROT_READ only. This is a hangover from
early (never released) versions of this kernel interface which only
permitted read-only mappings and required us to destroy and recreate
the table when we needed to clear it from qemu.
Now, the kernel permits read-write mappings, and we rely on this to
clear the table in spapr_vio_quiesce_one(). However, due to
insufficient testing, I forgot to update the actual mapping of the
table in kvmppc_create_spapr_tce() to add PROT_WRITE to the mmap().
This patch corrects the oversight.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The -cpu host feature tries to find out the host capabilities based
on device tree information. However, we don't always have that available
because it's an optional property in dt.
So instead of force unsetting values depending on an unreliable source
of information, let's just try to be clever about it and not override
capabilities when we don't know the device tree pieces.
This fixes altivec with -cpu host on YDL PowerStations.
Reported-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The recent usage of MemoryRegion in kvm_ppc.h breaks builds with
CONFIG_USER_ONLY=y. This patch fixes it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, when KVM is enabled, the pseries machine checks if the host
CPU supports VMX, VSX and/or DFP instructions and advertises
accordingly in the guest device tree. It does this regardless of what
CPU is selected on the command line. On the other hand, when in TCG
mode, it never advertises any of these facilities, even basic VMX
(Altivec) which is supported in TCG.
Now that we have a -cpu host option for ppc, it is fairly
straightforward to fix both problems. This patch changes the -cpu
host code to override the basic cpu spec derived from the PVR with
information queried from the host avout VMX, VSX and DFP capability.
The pseries code then uses the instruction availability advertised in
the cpu state to set the guest device tree correctly for both the KVM
and TCG cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The sole reason we have the ppcemb target is to support MMUs that have
less than the usual 4k possible page size. There are very few of these
chips and I don't want to add additional QA and testing burden to everyone
to ensure that code still works when TARGET_PAGE_SIZE is not 4k.
So this patch disables all CPUs except for MMU_BOOKE capable ones from
the ppcemb target.
Signed-off-by: Alexander Graf <agraf@suse.de>
Some 32-bit PPC CPUs can use up to 36 bit of physical address space.
Treat them accordingly in the qemu-system-ppc binary type.
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch adds cpu specs to the table for POWER7 revisions 2.1 and 2.3.
This allows -cpu host to be used on these host cpus.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
For convenience with kvm, x86 allows the user to specify -cpu host on the
qemu command line, which means make the guest cpu the same as the host
cpu. This patch implements the same option for ppc targets.
For now, this just read the host PVR (Processor Version Register) and
selects one of our existing CPU specs based on it. This means that the
option will not work if the host cpu is not supported by TCG, even if that
wouldn't matter for use under kvm.
In future, we can extend this in future to override parts of the cpu spec
based on information obtained from the host (via /proc/cpuinfo, the host
device tree, or explicit KVM calls). That will let us handle cases where
the real kvm-virtualized CPU doesn't behave exactly like the TCG-emulated
CPU. With appropriate annotation of the CPU specs we'll also then be able
to use host cpus under kvm even when there isn't a matching full TCG model.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The ppc target contains a ppc_find_by_pvr() function, which looks up a
CPU spec based on a PVR (that is, based on the value in the target cpu's
Processor Version Register). PVR values contain information on both the
cpu model (upper 16 bits, usually) and on the precise revision (low 16
bits, usually).
ppc_find_by_pvr, as well as making exact PVR matches, attempts to find
"close" PVR matches, when we don't have a CPU spec for the exact revision
specified. This sounds like a good idea, execpt that the current logic
is completely nonsensical.
It seems to assume CPU families are subdivided bit by bit in the PVR in a
way they just aren't. Specifically, it requires a match on all bits of the
specified pvr up to the last non-zero bit. This has the bizarre effect
that when the low bits are simply a sequential revision number (a common
though not universal pattern), then odd specified revisions must be matched
exactly, whereas even specified revisions will also match the next odd
revision, likewise for powers of 4, 8 and so forth.
To correctly do inexact matching we'd need to re-organize the table of CPU
specs to include a mask showing what PVR range the spec is compatible with
(similar to the cputable code in the Linux kernel).
For now, just remove the bogosity by only permitting exact PVR matches.
That at least makes the matching simple and consistent. If we need inexact
matching we can add the necessary per-subfamily masks later.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Sufficiently recent PAPR specifications define properties "ibm,vmx"
and "ibm,dfp" on the CPU node which advertise whether the VMX vector
extensions (or the later VSX version) and/or the Decimal Floating
Point operations from IBM's recent POWER CPUs are available.
Currently we do not put these in the guest device tree and the guest
kernel will consequently assume they are not available. This is good,
because they are not supported under TCG. VMX is similar enough to
Altivec that it might be trivial to support, but VSX and DFP would
both require significant work to support in TCG.
However, when running under kvm on a host which supports these
instructions, there's no reason not to let the guest use them. This
patch, therefore, checks for the relevant support on the host CPU
and, if present, advertises them to the guest as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the kvmppc_get_clockfreq() function reads the host's clock
frequency from /proc/device-tree, which is useful to past to the guest
in KVM setups. However, there are some other host properties
advertised in the device tree which can also be relevant to the
guests.
This patch, therefore, replaces kvmppc_get_clockfreq() which can
retrieve any named, single integer property from the host device
tree's CPU node.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
SPE instructions are defined by pairs. Currently, the invalid-bits mask is set
for the first instruction, but the second one can have a different mask.
example:
GEN_SPE(efdcmpeq, efdcfs, 0x17, 0x0B, 0x00600000, 0x00180000, PPC_SPE_DOUBLE),
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The pseries machine of qemu implements the TCE mechanism used as a
virtual IOMMU for the PAPR defined virtual IO devices. Because the
PAPR spec only defines a small DMA address space, the guest VIO
drivers need to update TCE mappings very frequently - the virtual
network device is particularly bad. This means many slow exits to
qemu to emulate the H_PUT_TCE hypercall.
Sufficiently recent kernels allow this to be mitigated by implementing
H_PUT_TCE in the host kernel. To make use of this, however, qemu
needs to initialize the necessary TCE tables, and map them into itself
so that the VIO device implementations can retrieve the mappings when
they access guest memory (which is treated as a virtual DMA
operation).
This patch adds the necessary calls to use the KVM TCE acceleration.
If the kernel does not support acceleration, or there is some other
error creating the accelerated TCE table, then it will still fall back
to full userspace TCE implementation.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
At present, using the hypervisor aware Book3S-HV KVM will only work
with qemu on POWER7 CPUs. PPC970 CPUs also have hypervisor
capability, but they lack the VRMA feature which makes assigning guest
memory easier.
In order to allow KVM Book3S-HV on PPC970, we need to specially
allocate the first chunk of guest memory (the "Real Mode Area" or
RMA), so that it is physically contiguous.
Sufficiently recent host kernels allow such contiguous RMAs to be
allocated, with a kvm capability advertising whether the feature is
available and/or necessary on this hardware. This patch enables qemu
to use this support, thus allowing kvm acceleration of pseries qemu
machines on PPC970 hardware.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
agraf: fix to use memory api
Alex Graf has already made qemu support KVM for the pseries machine
when using the Book3S-PR KVM variant (which runs the guest in
usermode, emulating supervisor operations). This code allows gets us
very close to also working with KVM Book3S-HV (using the hypervisor
capabilities of recent POWER CPUs).
This patch moves us another step towards Book3S-HV support by
correctly handling SMT (multithreaded) POWER CPUs. There are two
parts to this:
* Querying KVM to check SMT capability, and if present, adjusting the
cpu numbers that qemu assigns to cause KVM to assign guest threads
to cores in the right way (this isn't automatic, because the POWER
HV support has a limitation that different threads on a single core
cannot be in different guests at the same time).
* Correctly informing the guest OS of the SMT thread to core mappings
via the device tree.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
While working on the emulation of the freescale p2010 (e500v2) I realized that
there's no implementation of booke's timers features. Currently mpc8544 uses
ppc_emb (ppc_emb_timers_init) which is close but not exactly like booke (for
example booke uses different SPR).
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
When running with PR KVM, we need to set HIOR directly. Thankfully there
is now a new interface to set registers individually so we can just use that
and poke HIOR into the guest vcpu's HIOR register.
While at it, this also sets SDR1 because -M pseries requires it to run.
With this patch, -M pseries works properly with PR KVM.
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch implements support for the CFAR SPR on POWER7 (Come From
Address Register), which snapshots the PC value at the time of a branch or
an rfid. The latest powerpc-next kernel also catches it and can show it in
xmon or in the signal frames.
This works well enough to let recent kernels boot (which otherwise oops
on the CFAR access). It hasn't been tested enough to be confident that the
CFAR values are actually accurate, but one thing at a time.
Signed-off-by: Ben Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This definition is backward compatible with MAV=1.0 as long as
the guest does not set reserved bits in MAS1/MAS4.
Also, fix the shift in booke206_tlb_to_page_size -- it's the base
that should be able to hold a 4G page size, not the shift count.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Share the TLB array with KVM. This allows us to set the initial TLB
both on initial boot and reset, is useful for debugging, and could
eventually be used to support migration.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
When running PR style KVM, we need to tell the kernel that we want
to run in PAPR mode now. This means that we need to pass some more
register information down and enable papr mode. We also need to align
the HTAB to htab_size boundary.
Using this patch, -M pseries works with kvm even on non-hv kvm
implementations, as long as the preceding kernel patches are in.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- match on CONFIG_PSERIES
v2 -> v3:
- remove HIOR pieces from PAPR patch (ABI breakage)
We have a bunch of helper functions that don't have any stubs for them in case
we don't have CONFIG_KVM enabled. That didn't bite us so far, because gcc can
optimize them out pretty well, but we should really provide them.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- use uint64_t for clockfreq
We need to find out the host's clock-frequency when running on KVM, so
let's export a respective function.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- enable 64bit values
qemu_service_io was mainly an alias to qemu_notify_event,
currently used only by PPC for timer hack, so call
qemu_notify_event directly.
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Those blanks violate the coding conventions, see
scripts/checkpatch.pl.
Blanks missing after colons in the changed lines were added.
This patch does not try to fix tabs, long lines and other
problems in the changed lines, therefore checkpatch.pl reports
many violations.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When an exception occurs on BookE, we need to set ESR bits to expose
to the guest information on what exactly happened. Add the obvious ones.
Reported-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
When accessing an SPE instruction despite it being not available,
throw an SPE exception instead of an APU exception. That way the
guest knows what's going on and actually uses SPE.
Reported-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
The BookE spec specifies a number of ESR bits. Add defines for them
so we can use them later on.
Reported-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Parameter is_softmmu (and its evil mutant twin brother is_softmuu)
is not used in cpu_*_handle_mmu_fault() functions, remove them
and adjust callers.
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Do not allocate TCG-only resources like the translation buffer when
running over KVM or XEN. Saves a "few" bytes in the qemu address space
and is also conceptually cleaner.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move softmmu_exec.h include directives from target-*/exec.h to
target-*/op_helper.c. Move also various other stuff only used in
op_helper.c there.
Define global env in dyngen-exec.h.
For i386, move wrappers for segment and FPU helpers from user-exec.c
to op_helper.c. Implement raise_exception_err_env() to handle dynamic
CPUState. Move the function declarations to cpu.h since they can be
used outside of op_helper.c context.
LM32, s390x, UniCore32: remove unused cpu_halted(), regs_to_env() and
env_to_regs().
ARM: make raise_exception() static.
Convert
#include "exec.h"
to
#include "cpu.h"
#include "dyngen-exec.h"
and remove now unused target-*/exec.h.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Remove the include of setjmp.h from the cpu.h of target-alpha
and target-ppc. This is unnecessary because cpu-defs.h already
includes this header; this change brings these two targets
into line with all the rest.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* 'ppc-next' of git://repo.or.cz/qemu/agraf:
PPC: move TLBs to their own arrays
PPC: 440: Use 440 style MMU as default, so Qemu knows the MMU type
PPC: E500: Use MAS registers instead of internal TLB representation
PPC: Only set lower 32bits with mtmsr
PPC: update openbios firmware
PPC: mpc8544ds: Add hypervisor node
PPC: calculate kernel,initrd,cmdline locations dynamically
target-ppc: Handle memory-forced I/O controller access
PPC: E500: Implement reboot controller
Move functions cpu_has_work() and cpu_pc_from_tb() from exec.h to cpu.h. This is
needed by later patches.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Before the next patch, fix coding style of the areas affected.
Change the type of the return value from cpu_has_work() and
qemu_cpu_has_work() to bool.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
No longer needed with accompanied kernel headers.
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Required header support is now unconditionally available.
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Until now, we've created a union over multiple different TLB types and
allocated that union. While it's a waste of memory (and cache) to allocate
TLB information for a TLB type with much information when you only need
little, it also inflicts another issue.
With the new KVM API, we can now share the TLB between KVM and qemu, but
for that to work we need to have both be in the same layout. We can't just
stretch it over to fit some internal different TLB representation.
Hence this patch moves all TLB types to their own array, allowing us to only
address and allocate exactly the boundaries required for the specific TLB
type at hand.
Signed-off-by: Alexander Graf <agraf@suse.de>
The natural format for e500 cores to do TLB manipulation with are the MAS
registers. Instead of converting them into some internal representation
and back again when the guest reads them, we can just keep the data
identical to the way the guest passed it to us.
The main advantage of this approach is that we're getting closer to being
able to share MMU data with KVM using shared memory, so that we don't need
to copy lots of MMU data back and forth all the time. For this to work
however, another patch is required that gets rid of the TLB union, as that
destroys our memory layout that needs to be identical with the kernel one.
Signed-off-by: Alexander Graf <agraf@suse.de>
As Nathan pointed out correctly, the mtmsr instruction does not modify
the high 32 bits of MSR. It also doesn't matter if SF is set or not,
the instruction always behaves the same.
This patch moves it a bit closer to the spec.
Reported-by: Nathan Whitehorn <nwhitehorn@freebsd.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
On at least the PowerPC 601, a direct-store (T=1) with bus unit ID 0x07F
is special-cased as memory-forced I/O controller access. It is supposed
to be checked immediately if T=1, bypassing all protection mechanisms
and acting cache-inhibited and global.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Simplified by avoiding reindentation. Added explanatory comments.
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch removes all references to signal.h when qemu-common.h is included
as they become redundant.
Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
target-ppc has been switched to softfloat only long ago, but a
few #ifdef CONFIG_SOFTFLOAT have been forgotten. Remove them.
Cc: Alexander Graf <agraf@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When compiling qemu with kvm support on BookE PPC machines, I get
the following error:
cc1: warnings being treated as errors
/tmp/qemu/target-ppc/kvm.c: In function 'kvm_arch_get_registers':
/tmp/qemu/target-ppc/kvm.c:188: error: unused variable 'sregs'
This is due to overly ambitious #ifdef'ery introduced in 90dc88.
Fix it by keeping code that doesn't depend on new headers alive
for the compiler, but never executed due to failing capability
checks.
CC: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
When QEMU was configured with --enable-debug-tcg,
compilation fails in spr_write_booke206_mmucsr0() and in
spr_write_booke_pid(). Similar changes are also needed
in conditional code which is normally unused.
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
* 'ppc-next' of git://repo.or.cz/qemu/agraf:
Fix a bug in mtsr/mtsrin emulation on ppc64
pSeries: Clean up write-only variables
w32: Fix compilation and replace non-portable usage of ulong
tb_invalidate_page_range() was intended to be used to invalidate an
area of a TB which the guest explicitly flushes from i-cache. However,
QEMU detects writes to code areas where TBs have been generated, so
his has never been useful.
Delete the function, adjust callers.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Early ppc64 CPUs include a hack to partially simulate the ppc32 segment
registers, by translating writes to them into writes to the SLB. This is
not used by any current Linux kernel, but it is used by the openbios used
in the qemu mac99 model.
Commit 81762d6dd0, cleaning up the SLB
handling introduced a bug in this code, breaking the openbios currently in
qemu. Specifically, there was an off by one error bitshuffling the
register format used by mtsr into the format needed for the SLB load,
causing the flag bits to end up in the wrong place. This caused the
storage keys to be wrong under openbios, meaning that the translation code
incorrectly thought a legitimate access was a permission violation.
This patch fixes the bug, at the same time it fixes some build bug in the
MMU debugging code (only exposed when DEBUG_MMU is enabled).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
ulong is undefined for w32 (and maybe other) compilations.
Replace it by uintptr_t (which also fixes compilation for w64
and is a better choice for pointer to integer conversions).
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
* 'ppc-next' of git://repo.or.cz/qemu/agraf:
PPC: Qdev'ify e500 pci
PPC MPC7544DS: Use new TLB helper function
PPC: Implement e500 (FSL) MMU
PPC: Add another 64 bits to instruction feature mask
PPC: Add GS MSR definition
PPC: Make MPC8544DS emulation work w/o KVM
PPC: Make MPC8544DS obey -cpu switch
Fix off-by-one error in sizing pSeries hcall table
ppc64: Fix out-of-tree builds
kvm: ppc: warn user on PAGE_SIZE mismatch
kvm: ppc: detect old headers
monitor: add PPC BookE SPRs
kvm: ppc: fixes for KVM_SET_SREGS on init
ppc64: Don't try to build sPAPR RTAS on Darwin
Place pseries vty devices at addresses more similar to existing machines
Make pSeries 'model' property more closely resemble real hardware
pseries: Increase maximum CPUs to 256
Most of the code to support e500 style MMUs is already in place, but
we're missing on some of the special TLB0-TLB1 handling code and slightly
different TLB modification.
This patch adds support for the FSL style MMU.
Signed-off-by: Alexander Graf <agraf@suse.de>
To enable quick runtime detection of instruction groups to the currently
selected CPU emulation, we have a feature mask of what exactly the respective
instruction supports.
This feature mask is 64 bits long and we just successfully exceeded those 64
bits. To add more features, we need to think of something.
The easiest solution that came to my mind was to simply add another 64 bits
that we can also match on. Since the comparison is only done on start of the
qemu process to generate an internal opcode calling table, we should be fine
on any performance penalties here.
Signed-off-by: Alexander Graf <agraf@suse.de>
When compiling Qemu with older kernel headers, the PVR setting
mechanism isn't available yet. Unfortunately, back then I didn't add
a capability we could check against, so all we can do is add a configure
test to see if we support PVR setting. For BookE, we don't care yet.
This fixes compilation errors with KVM enabled on older kernel headers
(like 2.6.32).
Signed-off-by: Alexander Graf <agraf@suse.de>
Read them via KVM_GET_SREGS in kvm_arch_get_registers(),
and display them in "info registers".
Also get CR and PID from the existing KVM_GET_REGS.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Classic/server ppc has had SREGS for a while now (though I think not
always?), but it's still missing for booke. Check the capability before
calling KVM_SET_SREGS.
Without this, booke kvm fails to boot as of commit
84b4915dd2 (kvm: Handle kvm_init_vcpu
errors).
Also, don't write random stack state into the non-PVR sregs fields --
have kvm fill it in first.
Eventually booke will have sregs and it will have its own capability to
be tested here. However, we will want a way for platform code to request
to look like the actual CPU we're running on, especially if SoC devices
are being directly assigned.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The previous patch removed the need for parameter puc.
Is is now unused, so remove it.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Function gen_pc_load was introduced in commit
d2856f1ad4.
The only reason for parameter searched_pc was
a debug statement in target-i386/translate.c.
Parameter puc was needed by target-sparc until
commit d7da2a1040.
Remove searched_pc from the debug statement and remove both
parameters from the parameter list of gen_pc_load.
As the function name gen_pc_load was also misleading,
it is now called restore_state_to_opc. This new name
was suggested by Peter Maydell, thanks.
v2: Remove last parameter, too, and rename the function.
v3: Fix [] typo in target-arm/translate.c.
Fix wrong SHA1 object name in commit message (copy+paste error).
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
efstst*() functions are fast SPE funtions which do not take into account
special values (infinites, NaN, etc.), while efscmp*() functions are
IEEE754 compliant.
Given that float32_*() functions are IEEE754 compliant, the efscmp*()
functions are correctly implemented, while efstst*() are not. This
patch reverse the implementation of this two groups of functions and
fix the comments. It also use float32_eq() instead of float32_eq_quiet()
as qNaNs should not be ignored.
Cc: Alexander Graf <agraf@suse.de>
Cc: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
float*_eq functions have a different semantics than other comparison
functions. Fix that by first renaming float*_quiet() into float*_eq_quiet().
Note that it is purely mechanical, and the behaviour should be unchanged.
That said it clearly highlight problems due to this different semantics,
they are fixed later in this patch series.
Cc: Alexander Graf <agraf@suse.de>
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now that PPC defaults to softfloat which always provides float128
support, there is no need to keep two version of the code, depending if
float128 support is available or not. Suggested by Peter Maydell.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
tcg_gen_exit_tb takes a parameter of type tcg_target_long,
so the type casts of pointer to long should be replaced by
type casts of pointer to tcg_target_long (suggested by Blue Swirl).
These changes are needed for build environments where
sizeof(long) != sizeof(void *), especially for w64.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When configured with --enable-debug, we compile without optimization.
This means that the function mpc8544_copy_soc_cell() in ppce500_mpc8544ds.c
is not optimized out, even though it is never called without kvm. That in
turn causes a link failure, because it calls the function
kvmppc_read_host_property() which is in kvm_ppc.o and therefore not
included in a --disable-kvm build.
This patch fixes the problem by providing a dummy stub for
kvmppc_read_host_property() in kvm_ppc.h when !CONFIG_KVM.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The recent patches adding partial support for POWER7 cpu emulation included
implementing the popcntd instruction. The support for this was open coded,
but host-utils.h already included a function implementing an equivalent
population count function, which uses a gcc builtin (which can use special
host instructions) if available.
This patch makes the popcntd implementation use the existing, potentially
faster, implementation.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Shared-processor partitions are those where a CPU is time-sliced between
partitions, rather than being permanently dedicated to a single
partition. qemu emulated partitions, since they are just scheduled with
the qemu user process, behave mostly like shared processor partitions.
In order to better support shared processor partitions (splpar), PAPR
defines the "VPA" (Virtual Processor Area), a shared memory communication
channel between the hypervisor and partitions. There are also two
additional shared memory communication areas for specialized purposes
associated with the VPA.
A VPA is not essential for operating an splpar, though it can be necessary
for obtaining accurate performance measurements in the presence of
runtime partition switching.
Most importantly, however, the VPA is a prerequisite for PAPR's H_CEDE,
hypercall, which allows a partition OS to give up it's shared processor
timeslices to other partitions when idle.
This patch implements the VPA and H_CEDE hypercalls in qemu. We don't
implement any of the more advanced statistics which can be communicated
through the VPA. However, this is enough to make normal pSeries kernels
do an effective power-save idle on an emulated pSeries, significantly
reducing the host load of a qemu emulated pSeries running an idle guest OS.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch implements the infrastructure and hypercalls necessary for the
PAPR specified CRQ (Command Request Queue) mechanism. This general
request queueing system is used by many of the PAPR virtual IO devices,
including the virtual scsi adapter.
Signed-off-by: Ben Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
On pSeries logical partitions, excepting the old POWER4-style full system
partitions, the guest does not have direct access to the hardware page
table. Instead, the pagetable exists in hypervisor memory, and the guest
must manipulate it with hypercalls.
However, our current pSeries emulation more closely resembles the old
style where the guest must set up and handle the pagetables itself. This
patch converts it to act like a modern partition.
This involves two things: first, the hash translation path is modified to
permit the has table to be stored externally to the emulated machine's
RAM. The pSeries machine init code configures the CPUs to use this mode.
Secondly, we emulate the PAPR hypercalls for manipulating the external
hashed page table.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds emulation support for the recent POWER7 cpu to qemu. It's far
from perfect - it's missing a number of POWER7 features so far, including
any support for VSX or decimal floating point instructions. However, it's
close enough to boot a kernel with the POWER7 PVR.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Traditionally, the "segments" used for the two-stage translation used on
powerpc MMUs were 256MB in size. This was the only option on all hash
page table based 32-bit powerpc cpus, and on the earlier 64-bit hash page
table based cpus. However, newer 64-bit cpus also permit 1TB segments
This patch adds support for 1TB segment translation to the qemu code.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the path handling hash page table translation in get_segment()
has a mix of common and 32 or 64 bit specific code. However the
division is not done terribly well which results in a lot of messy code
flipping between common and divided paths.
This patch improves the organization, consolidating several divided paths
into one. This in turn allows simplification of some code in
get_segment(), removing a number of ugly interim variables.
This new factorization will also make it easier to add support for the 1T
segments added in newer CPUs.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, get_segment() has a variable called hash. However it doesn't
(quite) get the hash value for the ppc hashed page table. Instead it
gets the hash shifted - effectively the offset of the hash bucket within
the hash page table.
As well, as being different to the normal use of plain "hash" in the
architecture documentation, this usage necessitates some awkward 32/64
dependent masks and shifts which clutter up the path in get_segment().
This patch alters the code to use raw hash values through get_segment()
including storing raw hashes instead of pte group offsets in the ctx
structure. This cleans up the path noticeably.
This does necessitate 32/64 dependent shifts when the hash values are
taken out of the ctx structure and used, but those paths already have
32/64 bit variants so this is less awkward than it was in get_segment().
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
On ppc machines with hash table MMUs, the special purpose register SDR1
contains both the base address of the encoded size (hashed) page tables.
At present, we interpret the SDR1 value within the address translation
path. But because the encodings of the size for 32-bit and 64-bit are
different this makes for a confusing branch on the MMU type with a bunch
of curly shifts and masks in the middle of the translate path.
This patch cleans things up by moving the interpretation on SDR1 into the
helper function handling the write to the register. This leaves a simple
pre-sanitized base address and mask for the hash table in the CPUState
structure which is easier to work with in the translation path.
This makes the translation path more readable. It addresses the FIXME
comment currently in the mtsdr1 helper, by validating the SDR1 value during
interpretation. Finally it opens the way for emulating a pSeries-style
partition where the hash table used for translation is not mapped into
the guests's RAM.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The slb_lookup() function, used in the ppc translation path returns a
number of slb entry fields in reference parameters. However, only one
of the two callers of slb_lookup() actually wants this information.
This patch, therefore, makes slb_lookup() return a simple pointer to the
located SLB entry (or NULL), and the caller which needs the fields can
extract them itself.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
qemu already includes support for the popcntb instruction introduced
in POWER5 (although it doesn't actually allow you to choose POWER5).
However, the logic is slightly incorrect: it will generate results
truncated to 32-bits when the CPU is in 32-bit mode. This is not
normal for powerpc - generally arithmetic instructions on a 64-bit
powerpc cpu will generate full 64 bit results, it's just that only the
low 32 bits will be significant for condition codes.
This patch corrects this nit, which actually simplifies the code slightly.
In addition, this patch implements the popcntw and popcntd
instructions added in POWER7, in preparation for allowing POWER7 as an
emulated CPU.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The PURR (Processor Utilization Resource Register) is a register found
on recent POWER CPUs. The guts of implementing it at least enough to
get by are already present in qemu, however some of the helper
functions needed to actually wire it up are missing.
This patch adds the necessary glue, so that the PURR can be wired up
when we implement newer POWER CPU targets which include it.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
For a 64-bit PowerPC target, qemu correctly implements translation
through the segment lookaside buffer. Likewise it supports the
slbmte instruction which is used to load entries into the SLB.
However, it does not emulate the slbmfee and slbmfev instructions
which read SLB entries back into registers. Because these are
only occasionally used in guests (mostly for debugging) we get
away with it.
However, given the recent SLB cleanups, it becomes quite easy to
implement these, and thereby allow, amongst other things, a guest
Linux to use xmon's command to dump the SLB.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
PowerPC and POWER chips since the POWER4 and 970 have a special
hypervisor mode, and a corresponding form of the system call
instruction which traps to the hypervisor.
qemu currently has stub implementations of hypervisor mode. That
is, the outline is there to allow qemu to run a PowerPC hypervisor
under emulation. There are a number of details missing so this
won't actually work at present, but the idea is there.
What there is no provision at all, is for qemu to instead emulate
the hypervisor itself. That is to have hypercalls trap into qemu
and their result be emulated from qemu, rather than running
hypervisor code within the emulated system.
Hypervisor hardware aware KVM implementations are in the works and
it would be useful for debugging and development to also allow
full emulation of the same para-virtualized guests as such a KVM.
Therefore, this patch adds a hook which will allow a machine to
set up emulation of hypervisor calls.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the SLB information when emulating a PowerPC 970 is
storeed in a structure with the unhelpfully named fields 'tmp'
and 'tmp64'. While the layout in these fields does match the
description of the SLB in the architecture document, it is not
convenient either for looking up the SLB, or for emulating the
slbmte instruction.
This patch, therefore, reorganizes the SLB entry structure to be
divided in the the "ESID related" and "VSID related" fields as
they are divided in instructions accessing the SLB.
In addition to making the code smaller and more readable, this will
make it easier to implement for the 1TB segments used in more
recent PowerPC chips.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This was done with:
sed -i 's/qemu_get_clock\>/qemu_get_clock_ns/' \
$(git grep -l 'qemu_get_clock\>' )
sed -i 's/qemu_new_timer\>/qemu_new_timer_ns/' \
$(git grep -l 'qemu_new_timer\>' )
after checking that get_clock and new_timer never occur twice
on the same line. There were no missed occurrences; however, even
if there had been, they would have been caught by the compiler.
There was exactly one false positive in qemu_run_timers:
- current_time = qemu_get_clock (clock);
+ current_time = qemu_get_clock_ns (clock);
which is of course not in this patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make the return code of kvm_arch_handle_exit directly usable for
kvm_cpu_exec. This is straightforward for x86 and ppc, just s390
would require more work. Avoid this for now by pushing the return code
translation logic into s390's kvm_arch_handle_exit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
We will broaden the scope of this function on x86 beyond irqchip events.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Commit 7a39fe5882 failed to convert the right arch function.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
All implementations are now the same, and there is only one caller,
so inline the function there.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
We do not check them, and the only arch with non-empty implementations
always returns 0 (this is also true for qemu-kvm).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Provide arch-independent kvm_on_sigbus* stubs to remove the #ifdef'ery
from cpus.c. This patch also fixes --disable-kvm build by providing the
missing kvm_on_sigbus_vcpu kvm-stub.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Instead of splattering the code with #ifdefs and runtime checks for
capabilities we cannot work without anyway, provide central test
infrastructure for verifying their availability both at build and
runtime.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Ensure that we stop the guest whenever we face a fatal or unknown exit
reason. If we stop, we also have to enforce a cpu loop exit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Some tests in FPU emulation code were wrongly using float64_is_nan()
before commit 185698715d, and wrongly
using float64_is_quiet_nan() after. Fix them by using float64_is_any_nan()
instead.
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The current FPU code returns 0.0 if one of the operand is a
signaling NaN and the VXSNAN exception is disabled.
fload_invalid_op_excp() doesn't return a qNaN in case of a VXSNAN
exception as the operand should be propagated instead of a new
qNaN to be generated. Fix that by calling fload_invalid_op_excp()
only for the exception generation (if enabled), and use the softfloat
code to correctly compute the result.
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the new function float32_is_any_nan() instead of
float32_is_quiet_nan() || float32_is_signaling_nan().
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The PRECISE_EMULATION is "hardcoded" to one in target-ppc/exec.h and not
something easily tunable. Remove it and non-precise emulation code as
it doesn't make a noticeable difference in speed. People wanting speed
improvement should use softfloat-native instead.
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The softfloat functions float*_is_nan() were badly misnamed,
because they return true only for quiet NaNs, not for all NaNs.
Rename them to float*_is_quiet_nan() to more accurately reflect
what they do.
This change was produced by:
perl -p -i -e 's/_is_nan/_is_quiet_nan/g' $(git grep -l is_nan)
(with the results manually checked.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
I get a warning on a signed comparison with an unsigned variable, so
let's make the variable signed and be happy.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
fprintf_function uses format checking with GCC_FMT_ATTR.
Format errors were fixed in
* target-i386/helper.c
* target-mips/translate.c
* target-ppc/translate.c
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Compiling with GCC 4.6.0 20100925 produced warnings:
/src/qemu/target-ppc/op_helper.c: In function 'helper_icbi':
/src/qemu/target-ppc/op_helper.c:351:14: error: variable 'tmp' set but not used [-Werror=unused-but-set-variable]
/src/qemu/target-ppc/op_helper.c: In function 'do_6xx_tlb':
/src/qemu/target-ppc/op_helper.c:3805:28: error: variable 'EPN' set but not used [-Werror=unused-but-set-variable]
/src/qemu/target-ppc/op_helper.c: In function 'do_74xx_tlb':
/src/qemu/target-ppc/op_helper.c:3838:28: error: variable 'EPN' set but not used [-Werror=unused-but-set-variable]
Fix by adding a dummy cast so that the variable is not unused. Delete tmp.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Only Mac-on-Linux stuff used video.x, OpenBIOS does not need it.
Remove video.x MoL hacks.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* Fix swapped reading of tlblo/hi.
* Fix tlb exec permissions
Signed-off-by: John Clark <clarkjc@runbox.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Add a powerpc 440x5 with the model ID on the Xilinx virtex5.
Connect the 440x5 to the 40x interrupt logic.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The hack added by c5b76b3810 was not
enough to avoid warnings with gcc flag -Wtype-limits. Add a new macro
to fix both problems.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
According to the Book3S spec, the interrupt context starts with an MSR
value that is rather simple. If we leave out the HV case, it's almost
always 0.
To reflect this, let's redesign the way that MSR value gets calculated.
Using this, we also squash the bug where MSR_POW can slip through into
the interrupt handler MSR.
Reported-by: Thomas Monjalon <thomas.monjalon@openwide.fr>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
The lwarx and ldarx instructions have a bit to give some hint to the
CPU which is safe to ignore. We currently refuse to accept any instruction
with that bit set, as it used to be declared MBZ.
Let's remove the reserved bit and make the instruction work as expected.
This fixes Linux boot for ppc64.
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
KVM on PowerPC used to have completely broken interrupt logic. Usually,
interrupts work by having a PIC that pulls a line up/down, so the CPU knows
that an interrupt is active. This line stays active until some action is
done to the PIC to release the line.
On KVM for PPC, we just checked if there was an interrupt pending and pulled
a line in the kernel module. We never released it though, hoping that kernel
space would just declare an interrupt as released when injected - which is
wrong.
To fix this, we need to completely redesign the interrupt injection logic.
Whenever an interrupt line gets triggered, we need to notify kernel space
that the line is up. Whenever it gets released, we do the same. This way
we can assure that the interrupt state is always known to kernel space.
This fixes random stalls in KVM guests on PowerPC that were waiting for
an interrupt while everyone else thought they received it already.
Signed-off-by: Alexander Graf <agraf@suse.de>
On KVM for PPC we need to tell the guest which instructions to use when
doing a hypercall. The clean way to do this is to go through an ioctl
from userspace and passing it on to the guest using the device tree.
So let's do the qemu part here: read out the hypercall and pass it on
to the guest's fw_cfg so openBIOS can read it out and expose it again.
Signed-off-by: Alexander Graf <agraf@suse.de>
Some hosts (amd64, ia64) have an ABI that ignores the high bits
of the 64-bit register when passing 32-bit arguments. Others
require the value to be properly sign-extended for the type.
I.e. "int32_t" must be sign-extended and "uint32_t" must be
zero-extended to 64-bits.
To effect this, extend the "sizemask" parameter to tcg_gen_callN
to include the signedness of the type of each parameter. If the
tcg target requires it, extend each 32-bit argument into a 64-bit
temp and pass that to the function call.
This ABI feature is required by sparc64, ppc64 and s390x.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This line was a bit clear.
The next lines set or reset this bit (LE) depending of another bit (ILE).
So the first line is useless.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Since commit 2ada0ed, "Return From Interrupt" is broken for PPC processors
because some interrupt specifics bits of SRR1 are copied to MSR.
SRR1 is a save of MSR during interrupt.
During RFI, MSR must be restored from SRR1.
But some bits of SRR1 are interrupt-specific and are not used for MSR saving.
This is the specification (ISA 2.06) at chapter 6.4.3 (Interrupt Processing):
"2. Bits 33:36 and 42:47 of SRR1 or HSRR1 are loaded with information specific
to the interrupt type.
3. Bits 0:32, 37:41, and 48:63 of SRR1 or HSRR1 are loaded with a copy of the
corresponding bits of the MSR."
Below is a representation of MSR bits which are not saved:
0:15 16:31 32 33:36 37:41 42:47 48:63
——— | ——— | — X X X X — — — — — X X X X X X | ————
0000 0000 | 7 | 8 | 3 | F | 0000
History:
In the initial Qemu implementation (e1833e1), the mask 0x783F0000 was used for
saving MSR in SRR1. But all the bits 32:47 were cleared during RFI restoring.
This was wrong. The commit 2ada0ed explains that this breaks Altivec.
Indeed, bit 38 (for Altivec support) must be saved and restored.
The change of 2ada0ed was to restore all the bits of SRR1 to MSR.
But it's also wrong.
Explanation:
As an example, let's see what's happening after a TLB miss.
According to the e300 manual (E300CORERM table 5-6), the TLB miss interrupts
set the bits 44-47 for KEY, I/D, WAY and S/L. These bits are specifics to the
interrupt and must not be copied into MSR at the end of the interrupt.
With the current implementation, a TLB miss overwrite bits POW, TGPR and ILE.
Fix:
It shouldn't be needed to filter-out bits on MSR saving when interrupt occurs.
Specific bits overwrite MSR ones in SRR1.
But at the end of interrupt (RFI), specifics bits must be cleared before
restoring MSR from SRR1. The mask 0x783F0000 apply here.
Discussion:
The bits of the mask 0x783F0000 are cleared after an interrupt.
I cannot find a specification which talks about this
but I assume it is the truth since Linux can run this way.
Maybe it's not perfect but it's better (works for e300).
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When running with --enable-io-thread the timer we have doesn't help,
because it doesn't wake up the CPU thread. So instead we need to
actually kick it.
While at it I refined the logic a bit to not dumbly trigger a timer
every 500ms, but rather do it more often after an interrupt got injected.
If there's no level based interrupt to be expected, we don't need the
timer anyways.
This makes qemu-system-ppc with --enable-io-thread work when using KVM.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Continue vcpu execution in case emulation failure happened while vcpu
was in userspace. In this case #UD will be injected into the guest
allowing guest OS to kill offending process and continue.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Time base SPRs TBL/TBU should be accessible in user/priv modes for reading
as specified in POWER ISA documentation. Therefore SPRs permissions were
changed in gen_tbl function.
Signed-off-by: Dmitry Ilyevsky <ilyevsky@gmail.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
QEMU uses a fixed page size for the CPU TLB. If the guest uses large
pages then we effectively split these into multiple smaller pages, and
populate the corresponding TLB entries on demand.
When the guest invalidates the TLB by virtual address we must invalidate
all entries covered by the large page. However the address used to
invalidate the entry may not be present in the QEMU TLB, so we do not
know which regions to clear.
Implementing a full vaiable size TLB is hard and slow, so just keep a
simple address/mask pair to record which addresses may have been mapped by
large pages. If the guest invalidates this region then flush the
whole TLB.
Signed-off-by: Paul Brook <paul@codesourcery.com>
Removes a set of ifdefs from exec.c.
Introduce TARGET_VIRT_ADDR_SPACE_BITS for all targets other
than Alpha. This will be used for page_find_alloc, which is
supposed to be using virtual addresses in the first place.
Signed-off-by: Richard Henderson <rth@twiddle.net>
This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:
- cpu_synchronize_all_states in qemu_savevm_state_complete
(initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
(writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
(writeback after system reset)
These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:
- KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE (on init or vmload, all VCPUs stopped as well)
This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.
cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.
Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Invalid opcode messages can be perfectly normal, for example if this
code is never executed. Don't print an error message on the console,
but keep the message in the log for debugging purposes.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The shifts in the gen_evsplat* functions were expecting rA to be masked,
not extracted, and so used the wrong shift amounts to sign-extend or pad
with zeroes.
Signed-off-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The CRF_{CH,CL,CH_OR_CL,CH_AND_CL} constants were all off by one bit
position. Because of this, the SPE evcmp* family of instructions would
store values in the result condition register that were also off by one
bit position.
Fixed by using the CRF_{LT,GT,EQ,SO} constants for the shift amounts.
Signed-off-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For some odd reason we sometimes hang inside KVM forever. I'd guess it's
a race condition where we actually have a level triggered interrupt, but
the infrastructure can't expose that yet, so the guest ACKs it, goes to
sleep and never gets notified that there's still an interrupt pending.
As a quick workaround, let's just wake up every 500 ms. That way we can
assure that we're always reinjecting interrupts in time.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We were masking 1TB SLB entries on the feature bit of 16 MB pages. Obviously
that breaks, so let's just ignore 1TB SLB entries for now and instead do
16MB pages correctly.
This fixes PPC64 Linux boot with -m above 256.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Our guest systems need to know by how much the timebase increases every second,
so there usually is a "timebase-frequency" property in the cpu leaf of the
device tree.
This property is missing in OpenBIOS.
With qemu, Linux's fallback timebase speed and qemu's internal timebase speed
match up. With KVM, that is no longer true. The guest is running at the same
timebase speed as the host.
This leads to massive timing problems. On my test machine, a "sleep 2" takes
about 14 seconds with KVM enabled.
This patch exports the timebase frequency to OpenBIOS, so it can then put them
into the device tree.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The recent transition to always have the DCR helper functions take 32 bit
values broke the PPC64 target, as target_long became 64 bits there.
This patch changes DCR helpers to target_long arguments, and cast the values
to 32 bit when needed.
Fixes PPC64 build with --enable-debug-tcg
Based on a patch from Alexander Graf <agraf@suse.de>
Reported-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For what I know DCR is always 32 bits wide, so we should also use uint32_t to
pass it along the stacks.
This fixes a warning when compiling qemu-system-ppc64 with KVM enabled, making
it compile without --disable-werror
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix the alternate time base the same way as the default timebase. SPR_ATBL
should return a 64-bit value on 64 bit implementations.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
On PPC we have a 64-bit time base. Usually (PPC32) this is accessed using
two separate 32 bit SPR accesses to SPR_TBU and SPR_TBL.
On PPC64 the SPR_TBL register acts as 64 bit though, so we get the full
64 bits as return value. If we only take the lower ones, fine. But Linux
wants to see all 64 bits or it breaks.
This patch makes PPC64 Linux work even after TB crossed the 32-bit boundary,
which usually happened a few seconds after bootup.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
My segment sync patch broke compilation on PPC32, because it was trying to
sync the SLB even though ppc32 CPUs don't have an SLB.
So let's only sync it when we're on a PP64 one!
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
While x86 only needs to sync cr0-4 to know all about its MMU state and enable
qemu to resolve virtual to physical addresses, we need to sync all of the
segment registers on PPC to know which mapping we're in.
So let's grab the segment register contents to be able to use the "x" monitor
command and also enable the gdbstub to resolve virtual addresses.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
No need to alias e300 core for each CPU package.
Differences between microcontrollers have to be implemented in a higher layer
than translate_init.c
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add CPU declarations of MPC8343, MPC8343E, MPC8347 and MPC8347E.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Declare HID2 register.
Use high BATs for e300 (8 instead of 4).
Fix index of high BATs registers.
Before the fix, IBAT4-7 were overwriting IBAT0-3.
Signed-off-by: François Armand <francois.armand@os4i.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
In the very least, a change like this requires discussion on the list.
The naming convention is goofy and it causes a massive merge problem. Something
like this _must_ be presented on the list first so people can provide input
and cope with it.
This reverts commit 99a0949b72.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Problem: Our file sys-queue.h is a copy of the BSD file, but there are
some additions and it's not entirely compatible. Because of that, there have
been conflicts with system headers on BSD systems. Some hacks have been
introduced in the commits 15cc923584,
f40d753718,
96555a96d7 and
3990d09adf but the fixes were fragile.
Solution: Avoid the conflict entirely by renaming the functions and the
file. Revert the previous hacks.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
cpu_synchronize_state() is a little unreadable since the 'modified'
argument isn't self-explanatory. Simplify it by making it always
synchronize the kernel state into qemu, and automatically flush the
registers back to the kernel if they've been synchronized on this
exit.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
handle_cpu_signal is very nearly copy-paste code for each target, with a
few minor variations. This patch sets up appropriate defaults for a
generic handle_cpu_signal and provides overrides for particular targets
that did things differently. Fixing things like the persistent (XXX:
use sigsetjmp) should now become somewhat easier.
Previous comments on this patch suggest that the "activate soft MMU for
this block" comments refer to defunct functionality. I have removed
such blocks for the appropriate targets in this patch.
Signed-off-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We do this so we can check on the corresponding stc{w,d}x. whether the
value has changed. It's a poor man's form of implementing atomic
operations and is valid only for NPTL usermode Linux emulation.
Signed-off-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: malc <av1474@comtv.ru>
We only need to make sure that the clone syscall looks like it
succeeded, not clobber 60% of the register set.
Signed-off-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: malc <av1474@comtv.ru>
440 and desktop codes use different input constants for interrupt indication.
Let's use the respective ones for KVM.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We need to tell the kernel about some initial CPU state we don't have yet,
so let's use the "sregs" IOCTL for that and simply put the Processor Version
Register in there.
Now the kernel knows which guest CPU to virtualize.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>