Move the slow path out of line, as the TODO's mention.
This allows the fast path to be unconditional, which can
speed up the fast path as well, depending on the core.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Work better with branch predition when we have movw+movt,
as the size of the code is the same. Perhaps re-evaluate
when we have a proper constant pool.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The schedule was fully serial, with no possibility for dual issue.
The old schedule had a minimal issue of 7 cycles; the new schedule
has a minimal issue of 5 cycles.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Share code between qemu_ld and qemu_st to process the tlb.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use even more primitive helper functions to avoid lots of duplicated code.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Make the code more readable by only having one copy of the magic
numbers, swapping registers as needed prior to that. Speed the
compiler by not applying the rd == rn avoidance for v6 or later.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
R12 is call clobbered, while R8 is call saved. This change
gives tcg one more call saved register for real data.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We get to re-use the _rIN and _rIK subroutines to handle the various
combinations of add vs sub. Fold the << 21 into the opcode enum values
so that we can explicitly add TO_CPSR as desired.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This greatly improves code generation for addition of small
negative constants.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This greatly improves the code we can produce for deposit
without armv7 support.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This makes it easier to verify changes to the code
generating the prologue.
[Aurelien: change the format from %i to %zu]
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We were not allocating TCG_STATIC_CALL_ARGS_SIZE, so this meant that
any helper with more than 4 arguments would clobber the saved regs.
Realizing that we're supposed to have this memory pre-allocated means
we can clean up the tcg_out_arg functions, which were trying to do
more stack allocation.
Allocate stack memory for the TCG temporaries while we're at it.
Signed-off-by: Richard Henderson <rth@twiddle.net>
On 32-bit TCG targets, when emulating deposit_i64 with a mov_i32 +
deposit_i32, care should be taken to not overwrite the low part of
the second argument before the deposit when it is the same the
destination.
This fixes the shld instruction in qemu-system-x86_64, which in turns
fixes booting "system rescue CD version 2.8.0" on this target.
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
* 'ppc-for-upstream' of git://github.com/agraf/qemu: (30 commits)
target-ppc: add support for extended mtfsf/mtfsfi forms
target-ppc: emulate store doubleword pair instructions
target-ppc: emulate load doubleword pair instructions
target-ppc: emulate lfiwax instruction
target-ppc: emulate fcpsgn instruction
target-ppc: emulate prtyw and prtyd instructions
target-ppc: emulate cmpb instruction
target-ppc: add instruction flags for Book I 2.05
disas: Disassemble all ppc insns for the guest
target-ppc: optimize fabs, fnabs, fneg
PPC: Fix dcbz for linux-user on 970
powerpc: correctly handle fpu exceptions.
pseries: Generate device paths for VIO devices
pseries: Convert VIO code to QOM style type safe(ish) casts
target-ppc: Synchronize VPA state with KVM
pseries: Fix some small errors in XICS logic
target-ppc: Add more stubs for POWER7 PMU registers
pseries: Fixes and enhancements to L1 cache properties
pseries: Fix incorrect calculation of RMA size in certain configurations
PPC: Fix compile with profiling enabled
...
Power ISA 2.05 adds support for extended mtfsf/mtfsfi form, with a new
W field to select the upper part of the FPCSR register.
For that the helper is changed to handle 64-bit input values and mask with
up to 16 bits. The mtfsf/mtfsfi instructions do not have the W bit
marked as invalid anymore. Instead this is checked in the helper, which
therefore needs to access to the insns/insns_flags2. They are added in
the DisasContext struct. Finally change all accesses to the opcode fields
through extract helpers, prefixed with FP for consistency.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
Needed for Power ISA version 2.05 compliance. The check for odd register
pairs is done using the invalid bits.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
Needed for Power ISA version 2.05 compliance. The check for odd register
pairs is done using the invalid bits.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
Needed for Power ISA version 2.05 compliance.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
[agraf: fix tcg debug error]
Signed-off-by: Alexander Graf <agraf@suse.de>
Needed for Power ISA version 2.05 compliance.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
Needed for Power ISA version 2.05 compliance.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
[agraf: fix 32-bit host compile, simplify code]
Signed-off-by: Alexander Graf <agraf@suse.de>
Needed for Power ISA version 2.05 compliance.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
.. and enable it on POWER7 CPU.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
fabs, fnabs and fneg are just flipping the bit sign of an FP register,
this can be implemented in TCG instead of using softfloat.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
The default with linux-user for dcbz on 970 is to emulate 32 byte clears.
However, redoing the dcbzl support we added a check to not honor the bit
in HID5 that sets this.
Remove the #ifdef check on linux user, so that we get 32 byte clears again.
Reported-by: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Alexander Graf <agraf@suse.de>
Raise the exception on the first occurence, do not wait for the next
floating point operation.
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch implements a get_dev_path qdev hook for the pseries paravirtual
VIO bus. With upcoming savevm support, this will become very important for
scsi disks hanging of VIO virtual SCSI adapters. scsibus_get_dev_path
uses the get_dev_path of the parent adapter if available, but otherwise
just uses a local channel/target/lun number to identify the device. So if
two disks are present in the system having the same target and lun on
seperate VIO scsi adapters, savevm cannot distinguish them. Since the
conventional way of using VSCSI adapters is to have just one disk per
adapter, such a conflict is very likely.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Curerntly the pseries VIO device code contains quite a few explicit
uses of DO_UPCAST and plain C casts. This is (obviously) type unsafe,
and not the conventional way of doing things in the QOM model. This
patch converts the code to use the QOM convention of per-type macros
to do verified casts with OBJECT_CHECK().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
For PAPR guests, KVM tracks the various areas registered with the
H_REGISTER_VPA hypercall. For full emulation, of course, these are tracked
within qemu. At present these values are not synchronized. This is a
problem for reset (qemu's reset of the VPA address is not pushed to KVM)
and will also be a problem for savevm / migration.
The kernel now supports accessing the VPA state via the ONE_REG interface,
this patch adds code to qemu to use that interface to keep the qemu and
KVM ideas of the VPA state synchronized.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Under certain circumstances the emulation for the pseries "XICS" interrupt
controller was clearing a pending interrupt from the XISR register, without
also clearing the corresponding priority variable. This will cause
problems later when can trigger sanity checks in the under-development
in-kernel XICS implementation.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
In addition to the performance monitor registers found on nearly all
6xx chips, the POWER7 has two additional counters (PMC5 & PMC6) and an
extra control register (MMCRA). This patch adds stub support for them to
qemu - the registers won't do anything, but with this change won't cause
illegal instruction traps accessing them. They're also registered with
their ONE_REG ids, so their value will be kept in sync with KVM where
appropriate.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
PAPR requires that the device tree's CPU nodes have several properties
with information about the L1 cache. We already create two of these
properties, but with incorrect names - "[id]cache-block-size" instead
of "[id]-cache-block-size" (note the extra hyphen).
We were also missing some of the required cache properties. This
patch adds the [id]-cache-line-size properties (which have the same
values as the block size properties in all current cases). We also
add the [id]-cache-size properties.
Adding the cache sizes requires some extra infrastructure in the
general target-ppc code to (optionally) set the cache sizes for
various CPUs. The CPU family descriptions in translate_init.c can set
these sizes - this patch adds correct information for POWER7, I'm
leaving other CPU types to people who have a physical example to
verify against. In addition, for -cpu host we take the values
advertised by the host (if available) and use those to override the
information based on PVR.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
For the pseries machine, we need to advertise to the guest the size of its
RMA - that is the amount of memory it can access with the MMU off. For HV
KVM, this is constrained by the hardware limitations on the virtual RMA of
one hash PTE per PTE group in the hash page table. We already had code to
calculate this, but it was assuming the VRMA page size was the same as the
(host) backing page size for guest RAM.
In the case of a host kernel configured for 64k base page size, but running
on hardware (or firmware) which only allows 4k pages, the hose will do all
its allocations with a 64k page size, but still use 4k hardware pages for
actual mappings. Usually that's transparent to things running under the
host, but in the case of the maximum VRMA size it's not.
This patch refines the RMA size calculation to instead use the largest
available hardware page size (as reported by the SMMU_INFO call) which is
less than or equal to the backing page size. This now gives the correct
RMA size in all cases I've tested.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
When using profiling, we rely on profile_getclock() being available
at our disposal. Somehow that function got moved from an indirect
include we used to have in translate-init.c, so that we were now
left not properly compiling anymore.
Add an explicit include to timer.h which defines profile_getclock,
so that we can compile again.
Signed-off-by: Alexander Graf <agraf@suse.de>
On -M mac99, we can run 970 CPUs. However, these CPUs define the initial
instruction pointer they start execution at as part of their bootup protocol,
so effectively it's up to the board to decide where they start.
This went unnoticed, because they used to boot at the same location our flash
was mapped to, but due to the recent reset changes our 970 CPUs want to reset
to 0x100 now, which is always a 0 instruction.
Set the initial IP to something reasonable for -M mac99.
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Fabien Chouteau <chouteau@adacore.com>
Enable the KVM emulated watchdog if KVM supports (use the
capability enablement in watchdog handler). Also watchdog exit
(KVM_EXIT_WATCHDOG) handling is added.
Watchdog state machine is cleared whenever VM state changes to running.
This is to handle the cases like return from debug halt etc.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
[agraf: rebase to current code base, fix non-kvm cases]
Signed-off-by: Alexander Graf <agraf@suse.de>
Broken in b5a73f8d8a, the carry itself was
fixed in 79482e5ab3. But we still need to
produce the full 64-bit addition.
Simplify the conditions at the top of the functions for when we need a
new temporary. Only plain addition is important enough to warrent avoiding
the temporary, and the extra tcg move op that would come with it.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
According to the different user's manuals, the vector offset for system
reset (both /HRESET and /SRESET) is 0x00100.
This patch may break support of some executables, as the power-on start
address may change. For a specific board, if the power-on start address
is different than HRESET vector (i.e. 0x00000100 or 0xfff00100), this
should be fixed in board's initialization code.
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The overflow computation of nego and subf*o instructions has been broken
in commit ffe30937. Contrary to other targets, the instruction is subtract
from an not subtract on PowerPC.
This patch fixes the issue by using the correct argument in the xor
computation. Thanks to Peter Maydell for the hint.
With this change the PPC emulation passes the Gwenole Beauchesne
testsuite again.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
This value is not needed if we use correctly the MSR[IP] bit.
excp_prefix is always 0x00000000, except when the MSR[IP] bit is
implemented and set to 1, in that case excp_prefix is 0xfff00000.
The handling of MSR[IP] was already implemented but not used at reset
because the value of env->msr was changed "manually".
The patch uses the function hreg_store_msr() to set env->msr, this
ensures a good handling of MSR[IP] at reset, and therefore a good value
for excp_prefix.
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>