Since I found these two instructions already implemented with TCG, I
refactored them so they are consistent with the other similar
implementations introduced in this patch.
Also, a new dual macro GEN_VXFORM_TRANS_DUAL is added. This macro is
used if one instruction is realized with direct translation and the
second one with a helper.
Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Message-Id: <1566898663-25858-4-git-send-email-stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The underflow and inexact exceptions are not mutually exclusive.
Check for both of them. Tidy the reset of FPSCR[FI].
Fixes: https://bugs.launchpad.net/bugs/1841442
Reported-by: Paul Clarke <pc@us.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Paul Clarke <pc@us.ibm.com>
Message-Id: <20190826165434.18403-2-richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
As defined in Power 3.0 section 4.4.4 "Underflow Exception",
a tiny result is detected before rounding.
Fixes: https://bugs.launchpad.net/qemu/+bug/1841491
Reported-by: Paul Clarke <pc@us.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190827020013.27154-1-richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The xscvdpspn instruction implements a non-arithmetic conversion.
In particular, NaNs are not silenced and rounding is not performed.
Rewrite to match the pseudocode for ConvertDPtoSP_NS() in the
Power 3.0B manual.
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
Message-Id: <1566321964-1447-1-git-send-email-pc@us.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[dwg: Replaced description with clearer version from rth]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A class of instructions of the form:
op Target,A,B
which operate like:
Target = Target * A + B
have a bit set which distinguishes them from instructions that operate as:
Target = Target * B + A
This bit is not being checked properly (using PPC_BIT macro), so all
instructions in this class are operating incorrectly as the second form
above. The bit was being checked as if it were part of a 64-bit
instruction opcode, rather than a proper 32-bit opcode. Fix by using the
macro (PPC_BIT32) which treats the opcode as a 32-bit quantity.
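A minimal sketch of why the 64-bit mask can never match, assuming IBM bit
numbering (bit 0 is the most significant bit). The SKETCH_* macros are local
stand-ins modelled on the two macros named above, and the bit position 25 is
a hypothetical value chosen only for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins, not the QEMU definitions: bit 0 is the MSB. */
#define SKETCH_PPC_BIT(bit)   (0x8000000000000000ULL >> (bit))
#define SKETCH_PPC_BIT32(bit) (0x80000000U >> (bit))

/* Hypothetical position of the form-selecting bit. */
#define SKETCH_FORM_BIT 25

/* Buggy variant: the mask is 1 << 38, which lies above any 32-bit opcode. */
static bool form_bit_set_64(uint32_t opcode)
{
    return (opcode & SKETCH_PPC_BIT(SKETCH_FORM_BIT)) != 0;
}

/* Fixed variant: the mask is 0x40, counted from the top of a 32-bit word. */
static bool form_bit_set_32(uint32_t opcode)
{
    return (opcode & SKETCH_PPC_BIT32(SKETCH_FORM_BIT)) != 0;
}
```

With an opcode that has only that bit set (0x40), the 32-bit check reports
true while the 64-bit check reports false, so every instruction would be
decoded as the same form.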
Fixes: c9f4e4d8b6 ("target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro")
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
Message-Id: <1566401321-22419-1-git-send-email-pc@us.ibm.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit e41c945297 ("target/alpha: Convert to CPUClass::tlb_fill")
slightly changed the way the trap_arg2 value is computed in case of TLB
fill. The type of the variable used in the ternary operator has been
changed from an int to an enum. This causes the -1 value to not be
sign-extended to 64-bit in case of an instruction fetch. The trap_arg2
ends up with 0xffffffff instead of 0xffffffffffffffff. Fix that by
changing the -1 into -1LL.
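The conversion rule behind this can be sketched as follows. This is an
illustration only: sketch_access_t is a hypothetical stand-in that models an
enum whose underlying type is unsigned int, and the sketch assumes a 32-bit
int and a 64-bit long long:

```c
#include <stdint.h>

/* Hypothetical stand-in for an enum with an unsigned int underlying type. */
typedef unsigned int sketch_access_t;
#define SKETCH_INST_FETCH 2u

/*
 * int and unsigned int operands: the conditional expression has type
 * unsigned int, so -1 becomes 0xffffffff and is then zero-extended.
 */
static uint64_t trap_arg_int(sketch_access_t t)
{
    return t == SKETCH_INST_FETCH ? -1 : t;
}

/*
 * long long and unsigned int operands: the expression has type long long,
 * so -1 survives and converts to 0xffffffffffffffff.
 */
static uint64_t trap_arg_ll(sketch_access_t t)
{
    return t == SKETCH_INST_FETCH ? -1LL : t;
}
```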
This fixes the execution of user space processes in qemu-system-alpha.
Fixes: e41c945297
Cc: qemu-stable@nongnu.org
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
[rth: Test MMU_DATA_LOAD and MMU_DATA_STORE instead of implying them.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Factor it out, add a comment explaining how it all works, and also use
it in the REAL MMU.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Any access sets the reference bit. In case we have a read-fault, we
should not allow writes to the TLB entry if the change bit was not
already set.
This is a preparation for proper storage-key reference/change bit handling
in TCG and a fix for KVM whereby read accesses would set the change
bit (old KVM versions without the ioctl to carry out the translation).
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Whenever we modify a storage key, we should flush the TLBs of all CPUs,
so the MMU fault handling code can properly consider the changed storage
key (e.g., to properly set the reference and change bit on the next
accesses).
These functions are barely used in modern Linux guests, so the performance
implications are negligible for now.
This is a preparation for better reference and change bit handling for
TCG, which will require more MMU changes.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-5-david@redhat.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Instructions are always fetched from primary address space, except when
in home address mode. Perform the selection directly in cpu_mmu_index().
get_mem_index() is only used to perform data access; instructions are
fetched via cpu_lduw_code(), which translates to cpu_mmu_index(env, true).
We don't care about restricting the access permissions of the TLB
entries anymore, as we no longer enter PRIMARY entries into the
SECONDARY MMU. Clean up related code a bit.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190816084708.602-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's select the ASC before calling the function. This is a preparation
to remove the ASC magic depending on the access mode from mmu_translate.
There is currently no way to distinguish if we have code or data access.
For now, we were using code access, because especially when debugging with
the gdbstub, we want to read and disassemble what we single-step.
Note: KVM guests can no longer be crashed using qmp/hmp/gdbstub if they
happen to be in AR mode.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We want to trace the actual return value, not "0".
Fixes: 0f5f669147 ("s390x: Enable new s390-storage-keys device")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190816084708.602-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Wrong order of operands. The constant always comes last. Makes QEMU crash
reliably on specific git fetch invocations.
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190814151242.27199-1-david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Fixes: 5c4b0ab460 ("s390x/tcg: Implement VECTOR ELEMENT ROTATE AND INSERT UNDER MASK")
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
we now fetch 2 bytes first, check whether we have a 32 bit insn, and only then
fetch another 2 bytes. We also make sure that a 16 bit insn that still fits
into the current page does not end up in the next page.
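The fetch order described above can be sketched as follows. This is not the
translator code: all sketch_* names are hypothetical, guest memory is
modelled as a plain byte array, and the sketch assumes that a set low bit in
the first halfword marks a 32 bit encoding:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 0 of the first halfword set means a 32 bit insn. */
static int sketch_is_32bit(uint16_t first_half)
{
    return first_half & 1;
}

/* Read one little-endian halfword from the modelled memory. */
static uint16_t sketch_lduw(const uint8_t *mem, size_t addr)
{
    return (uint16_t)(mem[addr] | (mem[addr + 1] << 8));
}

/*
 * Fetch 2 bytes, decide on the length, and only then fetch 2 more.
 * Returns the insn length in bytes and stores the opcode in *insn.
 */
static size_t sketch_fetch(const uint8_t *mem, size_t pc, uint32_t *insn)
{
    uint16_t lo = sketch_lduw(mem, pc);

    if (!sketch_is_32bit(lo)) {
        *insn = lo;            /* never touches pc + 2 */
        return 2;
    }
    *insn = lo | ((uint32_t)sketch_lduw(mem, pc + 2) << 16);
    return 4;
}
```

Because the second halfword is read only for 32 bit insns, a 16 bit insn at
the very end of a page does not cause an access to the following page.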
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
this helper is only used to raise qemu specific exceptions. We use this
helper to raise it on breakpoints.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
otherwise we have to pass env down through all functions which blocks
the usage of translator_loop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
this gets rid of the copied fields of TriCore's DisasContext and now
uses the shared DisasContextBase, which is necessary for the conversion
to translator_loop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
First ppc and spapr pull request for qemu-4.2. Includes:
* Some TCG emulation fixes and performance improvements
* Support for the mffsl instruction in TCG
* Added missing DPDES SPR
* Some enhancements to the emulation of the XIVE interrupt
controller
* Cleanups to spapr MSI management
* Some new suspend/resume infrastructure and a draft suspend
implementation for spapr
* New spapr hypercall for TPM communication (will be needed for
secure guests under an Ultravisor)
* Fix several memory leaks
And a few other assorted fixes.
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.2-20190821' into staging
ppc patch queue for 2019-08-21
# gpg: Signature made Wed 21 Aug 2019 08:24:44 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.2-20190821: (42 commits)
ppc: Fix emulated single to double denormalized conversions
ppc: Fix emulated INFINITY and NAN conversions
ppc: conform to processor User's Manual for xscvdpspn
ppc: Add support for 'mffsl' instruction
target/ppc: Add Directed Privileged Door-bell Exception State (DPDES) SPR
spapr/xive: Mask the EAS when allocating an IRQ
spapr: Implement better workaround in spapr-vty device
spapr/irq: Drop spapr_irq_msi_reset()
spapr/pci: Free MSIs during reset
spapr/pci: Consolidate de-allocation of MSIs
ppc: remove idle_timer logic
spapr: Implement ibm,suspend-me
i386: use machine class ->wakeup method
machine: Add wakeup method to MachineClass
ppc/xive: Improve 'info pic' support
ppc/xive: Provide silent escalation support
ppc/xive: Provide unconditional escalation support
ppc/xive: Provide escalation support
ppc/xive: Provide backlog support
ppc/xive: Implement TM_PULL_OS_CTX special command
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190709152053.16670-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h
in comments replaced]
GCC9 is confused by this comment when building with CFLAG
-Wimplicit-fallthrough=2:
target/ppc/mmu_helper.c: In function ‘dump_mmu’:
target/ppc/mmu_helper.c:1349:12: error: this statement may fall through [-Werror=implicit-fallthrough=]
1349 | if (ppc64_v3_radix(env_archcpu(env))) {
| ^
target/ppc/mmu_helper.c:1356:5: note: here
1356 | default:
| ^~~~~~~
cc1: all warnings being treated as errors
Rewrite the comment using 'fall through' which is recognized by
GCC and static analyzers.
Reported-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190719131425.10835-6-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
helper_todouble() was not properly converting any denormalized 32 bit
float to 64 bit double.
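For reference, converting a denormalized single to a double amounts to
normalizing the fraction and rebiasing the exponent. The following is a
minimal sketch working on raw IEEE 754 bit patterns, not the code of
helper_todouble(); the sketch_* names are hypothetical and the input is
assumed to be a denormal (exponent field 0, non-zero fraction):

```c
#include <stdint.h>

/* Count leading zeros of a 32 bit value; returns 32 for 0. */
static int sketch_clz32(uint32_t x)
{
    int n = 0;

    if (x == 0) {
        return 32;
    }
    while (!(x & 0x80000000U)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* arg must be a denormal single: exponent field 0, fraction != 0. */
static uint64_t sketch_denormal_to_double_bits(uint32_t arg)
{
    uint64_t sign = (uint64_t)(arg >> 31) << 63;
    uint32_t frac = arg & 0x007fffffU;
    /* Shift needed to move the leading 1 into the implicit bit (bit 23). */
    int shift = sketch_clz32(frac) - 8;
    /* Value is 1.x * 2^(-126 - shift); the double bias is 1023. */
    uint64_t exp = (uint64_t)(897 - shift);
    /* Drop the implicit bit and widen 23 fraction bits to 52. */
    uint64_t mant = (uint64_t)((frac << shift) & 0x007fffffU) << 29;

    return sign | (exp << 52) | mant;
}
```

Every denormal single is exactly representable as a normalized double, so
no rounding is involved.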
Fix-suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
v2:
- Splitting patch "ppc: Three floating point fixes"; this is just one part.
- Original suggested "fix" was likely flawed. v2 is rewritten by
Richard Henderson (Thanks, Richard!); I reformatted the comments in a
couple of places, compiled, and tested.
Message-Id: <1566250936-14538-1-git-send-email-pc@us.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
helper_todouble() was not properly converting INFINITY from 32 bit
float to 64 bit double.
(Normalized operand conversion is unchanged, other than indentation.)
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
Message-Id: <1566242388-9244-1-git-send-email-pc@us.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The POWER8 and POWER9 User's Manuals specify the implementation's
behavior in a case that the ISA leaves "undefined" for the xscvdpspn
and xscvdpsp instructions. This patch corrects the QEMU implementation
to match the hardware implementation for that case.
ISA 3.0B has xscvdpspn leaving its result in word 0 of the target register,
with the other words of the target register left "undefined".
The User's Manuals specify:
VSX scalar convert from double-precision to single-precision (xscvdpsp,
xscvdpspn).
VSR[32:63] is set to VSR[0:31].
So, words 0 and 1 both contain the result.
Note: this is important because GCC as of version 8 or so, assumes and takes
advantage of this behavior to optimize the following sequence:
xscvdpspn vs0,vs1
mffprwz r8,f0
ISA 3.0B has xscvdpspn leaving its result in word 0 of the target register,
and mffprwz expecting its input to come from word 1 of the source register.
This sequence fails with QEMU, as a shift is required between those two
instructions. However, since the hardware splats the result to both words 0
and 1 of its output register, the shift is not necessary.
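A minimal sketch of the splat, modelling only doubleword 0 of the target
register as a uint64_t (word 0 in the upper half, word 1 in the lower half).
The sketch_* helper names are hypothetical:

```c
#include <stdint.h>

/* Place the 32 bit result in both word 0 and word 1 of doubleword 0. */
static uint64_t sketch_splat_word(uint32_t result)
{
    return ((uint64_t)result << 32) | result;
}

/* Word 1 of doubleword 0, i.e. the low 32 bits, where the move reads. */
static uint32_t sketch_word1(uint64_t dw0)
{
    return (uint32_t)dw0;
}
```

If the result were left in word 0 only, sketch_word1() would return 0 and an
explicit shift would be needed between the two instructions.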
Expect a future revision of the ISA to specify this behavior.
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
v2
- Splitting patch "ppc: Three floating point fixes"; this is just one part.
- Updated commit message to clarify behavior is documented in User's Manuals.
- Updated commit message to correct which words are in output and source of
xscvdpspn and mffprwz.
- No source changes to this part of the original patch.
Message-Id: <1566236601-22954-1-git-send-email-pc@us.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR)
instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl.
This patch adds support for 'mffsl'.
'mffsl' is identical to 'mffs', except it only returns mode, status, and enable
bits from the FPSCR.
On CPUs without support for 'mffsl' (below ISA 3.0), the 'mffsl' instruction
will execute identically to 'mffs'.
Note: I renamed FPSCR_RN to FPSCR_RN0 so I could create an FPSCR_RN mask which
is both bits of the FPSCR rounding mode, as defined in the ISA.
I also fixed a typo in the definition of FPSCR_FR.
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
v4:
- nit: added some braces to resolve a checkpatch complaint.
v3:
- Changed tcg_gen_and_i64 to tcg_gen_andi_i64, eliminating the need for a
temporary, per review from Richard Henderson.
v2:
- I found that I copied too much of the 'mffs' implementation.
The 'Rc' condition code bits are not needed for 'mffsl'. Removed.
- I now free the (renamed) 'tmask' temporary.
- I now bail early for older ISA to the original 'mffs' implementation.
Message-Id: <1565982203-11048-1-git-send-email-pc@us.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
DPDES stores the status of a doorbell message, and if it is lost in
migration, the destination CPU won't receive it. This does not hit us
much, as IPIs complete too quickly to catch a pending one, and even if
we missed one, broadcasts happen often enough to wake that CPU.
This defines DPDES and registers it with KVM for migration.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190816061733.53572-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The logic is broken for multiple vcpu guests, also causing a memory leak.
The logic is in place to handle kvm not having KVM_CAP_PPC_IRQ_LEVEL,
which has been part of the kernel since 2.6.37. Instead of fixing the
leak, drop the redundant logic, which is no longer exercised on new
kernels. Exit with an error on older kernels.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Message-Id: <156406409479.19996.7606556689856621111.stgit@lep8c.aus.stglabs.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Implement cpu_exec_enter/exit on ppc which calls into new methods of
the same name in PPCVirtualHypervisorClass. These are used by spapr
to implement the splpar VPA dispatch counter initially.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20190718034214.14948-2-npiggin@gmail.com>
[dwg: Removed unnecessary CONFIG_USER_ONLY checks as suggested by gkurz]
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Optimize Altivec instruction vclzw (Vector Count Leading Zeros Word).
This instruction counts the number of leading zeros of each word element
in source register and places result in the appropriate word element of
destination register.
Counting is performed in four iterations of a for loop (one for each
word element of source register vB). Every iteration consists of loading
the appropriate word element from the source register, counting leading
zeros with tcg_gen_clzi_i32, and saving the result in the appropriate word
element of the destination register.
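The per-element behaviour can be sketched in plain C as a reference model.
This is not the TCG translation itself; the ref_* names are hypothetical and
the registers are modelled as arrays of four words:

```c
#include <stdint.h>

/* Count leading zeros of one word; a zero word yields 32. */
static uint32_t ref_clz32(uint32_t x)
{
    uint32_t n = 0;

    if (x == 0) {
        return 32;
    }
    while (!(x & 0x80000000U)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* One iteration per word element: load, count, store. */
static void ref_vclzw(uint32_t vd[4], const uint32_t vb[4])
{
    for (int i = 0; i < 4; i++) {
        vd[i] = ref_clz32(vb[i]);
    }
}
```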
Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-7-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Optimize Altivec instruction vclzd (Vector Count Leading Zeros Doubleword).
This instruction counts the number of leading zeros of each doubleword element
in source register and places result in the appropriate doubleword element of
destination register.
Use TCG's count-leading-zeros operation twice (once for each
doubleword element of source register vB) and place the result in the
appropriate doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-6-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Optimize Altivec instruction vgbbd (Vector Gather Bits by Bytes by Doubleword).
All ith bits (i in range 1 to 8) of each byte of a doubleword element in
the source register are concatenated and placed into the ith byte of the
appropriate doubleword element in the destination register.
The following solution is applied to both doubleword elements of the source
register in parallel, in order to reduce the number of instructions needed
(that's why arrays are used):
First, both doubleword elements of source register vB are placed in the
appropriate element of array avr. Bits are gathered in 2x8 iterations (2 for
loops). In the first iteration, bit 1 of byte 1, bit 2 of byte 2, ... bit 8 of
byte 8 are in their final spots, so avr[i], i={0,1}, can be and-ed with
tcg_mask. For every following iteration, both avr[i] and tcg_mask
have to be shifted right by 7 and 8 places, respectively, in order to get
bit 1 of byte 2, bit 2 of byte 3, ... bit 7 of byte 8 into their final spots,
so the shifted avr values (saved in tmp) can be and-ed with the new value of
tcg_mask. After the first 8 iterations (first loop), all the first bits are
in their final places, all second bits except the second bit of the eighth
byte are in their places, and so on (only one eighth bit, the one from the
eighth byte, is in its place). In the second loop we do all
operations symmetrically, in order to get the other half of the bits into
their final spots. Results for the first and second doubleword elements are
saved in result[0] and result[1] respectively. In the end those results are
saved in the appropriate doubleword element of destination register vD.
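The gather is an 8x8 bit-matrix transpose of each doubleword. A naive
reference model for one doubleword is sketched below; it is deliberately
not the shift-and-mask scheme described above, just a bit-by-bit version
that such a scheme can be checked against. The function name is
hypothetical, and bytes and bits are counted from the most significant end,
starting at 0:

```c
#include <stdint.h>

/* Bit i of source byte j becomes bit j of destination byte i. */
static uint64_t sketch_gather_bits(uint64_t src)
{
    uint64_t dst = 0;

    for (int i = 0; i < 8; i++) {        /* destination byte */
        for (int j = 0; j < 8; j++) {    /* source byte */
            uint64_t bit = (src >> (63 - (8 * j + i))) & 1;

            dst |= bit << (63 - (8 * i + j));
        }
    }
    return dst;
}
```

Bits on the diagonal (bit n of byte n) do not move, which matches the
observation that they are already in their final spots in the first
iteration, and applying the function twice returns the original value.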
Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-5-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The opcode decode tables aren't really part of the CPUPPCState but an
internal implementation detail for the translator. This can cause
problems with memcpy in cpu_copy as any table created during
ppc_cpu_realize get written over causing a memory leak. To avoid this
move the tables into PowerPCCPU which is better suited to hold
internal implementation details.
Attempts to fix: https://bugs.launchpad.net/qemu/+bug/1836558
Cc: 1836558@bugs.launchpad.net
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190716121352.302-1-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Optimization of Altivec instructions vsl and vsr (Vector Shift Left/Right).
Perform a shift operation (left and right respectively) on the 128 bit value
of register vA by the value specified in bits 125-127 of register vB. The
lowest 3 bits in each byte element of register vB must be identical or the
result is undefined.
For the vsl instruction, the first step is to save bits 125-127 of register
vB in variable sh. Then, the highest sh bits of the lower
doubleword element of register vA are saved in variable shifted,
in order not to lose those bits when the shift operation is performed on
the lower doubleword element of register vA, which is the next
step. After shifting the lower doubleword element, the shift operation
is performed on the higher doubleword element of vA, with replacement of
the lowest sh bits (that are now 0) with the bits saved in shifted.
For the vsr instruction, the first step is likewise to save bits 125-127 of
register vB in variable sh. Then, the lowest sh bits of the higher
doubleword element of register vA are saved in variable shifted,
in order not to lose those bits when the shift operation is
performed on the higher doubleword element of register vA, which is
the next step. After shifting the higher doubleword element, the shift
operation is performed on the lower doubleword element of vA, with
replacement of the highest sh bits (that are now 0) with the bits saved in
shifted.
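The two-halves scheme can be sketched in plain C as follows. This is a
reference model, not the TCG code; the sketch_* names are hypothetical, the
128 bit register is modelled as two 64 bit halves, and sh is the 3 bit
shift count already extracted from vB:

```c
#include <stdint.h>

typedef struct {
    uint64_t hi;   /* higher doubleword element */
    uint64_t lo;   /* lower doubleword element */
} sketch_u128;

static sketch_u128 sketch_vsl(sketch_u128 a, unsigned sh)
{
    sketch_u128 r;

    sh &= 7;
    /* Highest sh bits of the lower half; guard sh == 0 (shift by 64). */
    uint64_t shifted = sh ? a.lo >> (64 - sh) : 0;

    r.lo = a.lo << sh;
    r.hi = (a.hi << sh) | shifted;
    return r;
}

static sketch_u128 sketch_vsr(sketch_u128 a, unsigned sh)
{
    sketch_u128 r;

    sh &= 7;
    /* Lowest sh bits of the higher half, moved to the top of the lower. */
    uint64_t shifted = sh ? a.hi << (64 - sh) : 0;

    r.hi = a.hi >> sh;
    r.lo = (a.lo >> sh) | shifted;
    return r;
}
```

The sh == 0 guard exists only in this C model, because shifting a 64 bit
value by 64 is undefined in C.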
Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-3-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add a simple macro that calls the TCG implementation of the appropriate
instruction if Altivec support is active.
Optimization of altivec instruction lvsl (Load Vector for Shift Left).
Place bytes sh:sh+15 of value 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F
in destination register. Sh is calculated by adding 2 source registers and
getting bits 60-63 of result.
First, the bits [28-31] are placed from EA to variable sh. After that,
the bytes are created in the following way:
sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101
followed by addition of the result with 0x0001020304050607. Value obtained
is placed in higher doubleword element of vD.
(sh+8):(sh+15) by adding the result of previous multiplication with
0x08090a0b0c0d0e0f. Value obtained is placed in lower doubleword element
of vD.
Optimization of altivec instruction lvsr (Load Vector for Shift Right).
Place bytes 16-sh:31-sh of value 0x00 || 0x01 || 0x02 || ... || 0x1E ||
0x1F in destination register. Sh is calculated by adding 2 source
registers and getting bits 60-63 of result.
First, the bits [28-31] are placed from EA to variable sh. After that,
the bytes are created in the following way:
sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101
followed by subtraction of the result from 0x1011121314151617. Value
obtained is placed in higher doubleword element of vD.
(sh+8):(sh+15) by subtracting the result of previous multiplication from
0x18191a1b1c1d1e1f. Value obtained is placed in lower doubleword element
of vD.
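The constant arithmetic for both instructions can be sketched in plain C.
This is a reference model using the constants quoted above, not the TCG
code; the function names are hypothetical and sh is assumed to be the
4 bit value already extracted from EA:

```c
#include <stdint.h>

/* lvsl: bytes sh .. sh+15, split into higher and lower doubleword. */
static void sketch_lvsl(unsigned sh, uint64_t *hi, uint64_t *lo)
{
    /* Replicate sh into every byte of a doubleword. */
    uint64_t splat = (uint64_t)(sh & 0xf) * 0x0101010101010101ULL;

    *hi = splat + 0x0001020304050607ULL;
    *lo = splat + 0x08090a0b0c0d0e0fULL;
}

/* lvsr: bytes 16-sh .. 31-sh, split the same way. */
static void sketch_lvsr(unsigned sh, uint64_t *hi, uint64_t *lo)
{
    uint64_t splat = (uint64_t)(sh & 0xf) * 0x0101010101010101ULL;

    *hi = 0x1011121314151617ULL - splat;
    *lo = 0x18191a1b1c1d1e1fULL - splat;
}
```

Since sh is at most 15, no byte sum exceeds 0x1e and no byte difference
drops below 0x01, so no carry or borrow crosses a byte boundary and a single
64 bit add or subtract handles eight bytes at once.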
Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-2-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Re-read the timebase before migrate was ported from x86 commit:
6053a86fe7: kvmclock: reduce kvmclock difference on migration
The clock move makes the guest aware of the paused time between
the stop and migrate commands. This is an issue in an already-paused
VM because some side effects, like process stalls, could happen
after migration.
So, this patch checks the runstate of the guest in the pre_save handler and
does not re-read the timebase in case of paused state (cold migration).
Signed-off-by: Maxiwell S. Garcia <maxiwell@linux.ibm.com>
Message-Id: <20190711194702.26598-1-maxiwell@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Intel Cooper Lake CPU adds the AVX512_BF16 instruction, enumerated as
CPUID.(EAX=7,ECX=1):EAX[bit 05].
The patch adds a property for setting the subleaf of CPUID leaf 7 in
case people would like to specify it.
The released spec is available at:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A prior patch resets the can_do_io flag at TB entry. Therefore there is no
need to reset this flag at the end of the block.
This patch removes the redundant gen_io_end calls.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <156404429499.18669.13404064982854123855.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
The x86 architecture requires that all conversions from floating
point to integer which raise the 'invalid' exception (infinities of
both signs, NaN, and all values which don't fit in the destination
integer) return what the x86 spec calls the "indefinite integer
value", which is 0x8000_0000 for 32-bits or 0x8000_0000_0000_0000 for
64-bits. The softfloat functions return the more usual behaviour of
positive overflows returning the maximum value that fits in the
destination integer format and negative overflows returning the
minimum value that fits.
Wrap the softfloat functions in x86-specific versions which
detect the 'invalid' condition and return the indefinite integer.
Note that we don't use these wrappers for the 3DNow! pf2id and pf2iw
instructions, which do return the minimum value that fits in
an int32 if the input float is a large negative number.
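The required result can be sketched in plain C for the 32-bit truncating
case. This is an illustration only: the function name is hypothetical, and
instead of wrapping a softfloat conversion and inspecting its 'invalid'
flag, the sketch tests the NaN and range conditions directly:

```c
#include <math.h>
#include <stdint.h>

/*
 * Truncating double -> int32 conversion with x86 semantics: NaN,
 * infinities and out-of-range values all yield the indefinite integer
 * value 0x80000000 (INT32_MIN) rather than a saturated result.
 */
static int32_t sketch_x86_double_to_int32(double x)
{
    if (isnan(x) || x >= 2147483648.0 || x <= -2147483649.0) {
        return INT32_MIN;
    }
    return (int32_t)x;   /* in range, so the C conversion is defined */
}
```

A saturating conversion would instead return INT32_MAX for large positive
inputs, which is the difference the wrappers have to paper over.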
Fixes: https://bugs.launchpad.net/qemu/+bug/1815423
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20190805180332.10185-1-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The structure is not fully initialized before it is passed to KVM.
Reduce the number of Valgrind reports.
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <1564502498-805893-4-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Function 'kvm_get_supported_msrs' is only called once
now, get rid of the static variable 'kvm_supported_msrs'.
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20190725151639.21693-1-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch moves the define of target access alignment earlier from
target/foo/cpu.h to configure.
Suggested in Richard Henderson's reply to "[PATCH 1/4] tcg: TCGMemOp is now
accelerator independent MemOp"
Signed-off-by: Tony Nguyen <tony.nguyen@bt.com>
Message-Id: <11e818d38ebc40e986cfa62dd7d0afdc@tpw09926dag18e.domain1.systemhost.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: tony.nguyen@bt.com <tony.nguyen@bt.com>
Add support for halt poll control MSR: save/restore, migration
and new feature name.
The purpose of this MSR is to allow the guest to disable
host halt poll.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20190603230408.GA7938@amt.cnet>
[Do not enable by default, as pointed out by Mark Kanda. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-aug-20-2019' into staging
MIPS queue for August 20th, 2019
# gpg: Signature made Mon 19 Aug 2019 19:07:18 BST
# gpg: using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65
* remotes/amarkovic/tags/mips-queue-aug-20-2019:
target/mips: tests/tcg: Fix target configurations for MSA tests
target/mips: tests/tcg: Add optional printing of more detailed failure info
target/mips: Style improvements in mips_mipssim.c
target/mips: Style improvements in mips_malta.c
target/mips: Style improvements in mips_int.c
target/mips: Style improvements in mips_fulong2e.c
target/mips: Style improvements in cps.c
target/mips: Style improvements in translate.c
target/mips: Style improvements in machine.c
target/mips: Style improvements in cpu.c
target/mips: Style improvements in cp0_timer.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fixes mostly errors and warnings reported by 'checkpatch.pl -f'.
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1566216496-17375-12-git-send-email-aleksandar.markovic@rt-rk.com>
Fixes mostly errors and warnings reported by 'checkpatch.pl -f'.
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1566216496-17375-10-git-send-email-aleksandar.markovic@rt-rk.com>