mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Giuseppe Musacchio	861f10fd52	target/ppc: Fix load endianness for lxvwsx/lxvdsx TARGET_WORDS_BIGENDIAN may not match the machine endianness if that's a runtime-configurable parameter. Fixes: `bcb0b7b1a1` Fixes: `afae37d98a` Resolves: https://gitlab.com/qemu-project/qemu/-/issues/212 Signed-off-by: Giuseppe Musacchio <thatlemon@gmail.com> Message-Id: <20210518133020.58927-1-thatlemon@gmail.com> Tested-by: Paul A. Clarke <pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-19 10:44:04 +10:00
Giuseppe Musacchio	bcb0b7b1a1	ppc/translate: Rewrite gen_lxvdsx to use gvec primitives Make the implementation match the lxvwsx one. The code is now shorter smaller and potentially faster as the translation will use the host SIMD capabilities if available. No functional change. Signed-off-by: Giuseppe Musacchio <thatlemon@gmail.com> Message-Id: <a463dea379da4cb3a22de49c678932f74fb15dd7.1604912739.git.thatlemon@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-12-14 15:54:09 +11:00
LemonBoy	afae37d98a	ppc/translate: Implement lxvwsx opcode Implement the "Load VSX Vector Word & Splat Indexed" opcode, introduced in Power ISA v3.0. Buglink: https://bugs.launchpad.net/qemu/+bug/1793608 Signed-off-by: Giuseppe Musacchio <thatlemon@gmail.com> Message-Id: <d7d533e18c2bc10d924ee3e09907ff2b41fddb3a.1604912739.git.thatlemon@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-11-24 11:34:18 +11:00
Peter Maydell	dd8014e4e9	ppc patch queue 2020-08-18 Here's my first pull request for qemu-5.2, which has quite a few accumulated things. Highlights are: * Preliminary support for POWER10 (Power ISA 3.1) instruction emulation * Add documentation on the (very confusing) pseries NUMA configuration * Fix some bugs handling edge cases with XICS, XIVE and kernel_irqchip * Fix icount for a number of POWER registers * Many cleanups to error handling in XIVE code * Validate size of -prom-env data -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl87VpwACgkQbDjKyiDZ s5LjIxAAs8YAQe3uDRz1Wb9GftoMmEHdq7JQoO0FbXDQIVXzpTAXmFLSBtCWKl6p O1MEIy/o48b5ORXJqSDSA5LgxbHxYfHdIPEY5Tbn/TGvTvKyCukx9n11milUG8In JxRrOTQBnQAAHkLoyuZyrWKOauC0N1scNrnX9Geuid13GcmqHg1d2alXAUu8jEeC HSiVmtMqqyyqTx2xA4vfhaGuuwTthnKNfbGdg9ksVqBsCW+etn6ZKGImt8hBe3qO 5iqbQZvFbkpzgbjkhDzUDM6tmUAFN55y/Y+y7I8Tz4/IX7d3WbdqpplwrXXVWkpq 2gcBBjQ/9a1hPTBRVN9jn4CvHfhILBfeHIElUiLpSTQZQQALymTnnI2pLCgKoEFX LcchXbjiX+pZ2OJnAijpwBcknjgT2U/ZNyiqHJfSQ6jzlYx1YtUf4xGUsgloSiK8 9QDK8o2k0Cm8Be+lPMBMmTctoi8bq+8SN5UUF710WQL235J58o9+z1vuGO2HVk3x flBtv/+B890wcCDpGU80DPs/LSzR0xTTbA5JsWft2fvO569mda0MoWkJH5w6jvSc ZLYqljCzFCVW+tKiGHzaBalJaMwn0+QMDTsxzP3yTt5LmmEeRXpBELgvrW64IobD xBeryH3nG4SwxFSJq+4ATfvUzjy/Eo58lTTl6c53Ji8/D3aFwsA= =L9Wi -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.2-20200818' into staging ppc patch queue 2020-08-18 Here's my first pull request for qemu-5.2, which has quite a few accumulated things. Highlights are: * Preliminary support for POWER10 (Power ISA 3.1) instruction emulation * Add documentation on the (very confusing) pseries NUMA configuration * Fix some bugs handling edge cases with XICS, XIVE and kernel_irqchip * Fix icount for a number of POWER registers * Many cleanups to error handling in XIVE code * Validate size of -prom-env data # gpg: Signature made Tue 18 Aug 2020 05:18:36 BST # gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full] # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full] # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full] # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown] # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-5.2-20200818: (40 commits) spapr/xive: Use xive_source_esb_len() nvram: Exit QEMU if NVRAM cannot contain all -prom-env data spapr/xive: Simplify error handling of kvmppc_xive_cpu_synchronize_state() ppc/xive: Simplify error handling in xive_tctx_realize() spapr/xive: Simplify error handling in kvmppc_xive_connect() ppc/xive: Fix error handling in vmstate_xive_tctx_*() callbacks spapr/xive: Fix error handling in kvmppc_xive_post_load() spapr/kvm: Fix error handling in kvmppc_xive_pre_save() spapr/xive: Rework error handling of kvmppc_xive_set_source_config() spapr/xive: Rework error handling in kvmppc_xive_get_queues() spapr/xive: Rework error handling of kvmppc_xive_[gs]et_queue_config() spapr/xive: Rework error handling of kvmppc_xive_cpu_[gs]et_state() spapr/xive: Rework error handling of kvmppc_xive_mmap() spapr/xive: Rework error handling of kvmppc_xive_source_reset() spapr/xive: Rework error handling of kvmppc_xive_cpu_connect() spapr: Simplify error handling in spapr_phb_realize() spapr/xive: Convert KVM device fd checks to assert() ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers ppc/xive: Rework setup of XiveSource::esb_mmio target/ppc: Integrate icount to purr, vtb, and tbu40 ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-08-24 09:35:21 +01:00
Paolo Bonzini	139c1837db	meson: rename included C source files to .c.inc With Makefiles that have automatically generated dependencies, you generated includes are set as dependencies of the Makefile, so that they are built before everything else and they are available when first building the .c files. Alternatively you can use a fine-grained dependency, e.g. target/arm/translate.o: target/arm/decode-neon-shared.inc.c With Meson you have only one choice and it is a third option, namely "build at the beginning of the corresponding target"; the way you express it is to list the includes in the sources of that target. The problem is that Meson decides if something is a source vs. a generated include by looking at the extension: '.c', '.cc', '.m', '.C' are sources, while everything else is considered an include---including '.inc.c'. Use '.c.inc' to avoid this, as it is consistent with our other convention of using '.rst.inc' for included reStructuredText files. The editorconfig file is adjusted. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:18:30 -04:00
Matthieu Bucchianeri	8dcdb535d7	target/ppc: Fix SPE unavailable exception triggering When emulating certain floating point instructions or vector instructions on PowerPC machines, QEMU did not properly generate the SPE/Embedded Floating- Point Unavailable interrupt. See the buglink further below for references to the relevant NXP documentation. This patch fixes the behavior of some evfs* instructions that were incorrectly emitting the interrupt. More importantly, this patch fixes the behavior of several efd* and ev* instructions that were not generating the interrupt. Triggering the interrupt for these instructions fixes lazy FPU/vector context switching on some operating systems like Linux. Without this patch, the result of some double-precision arithmetic could be corrupted due to the lack of proper saving and restoring of the upper 32-bit part of the general-purpose registers. Buglink: https://bugs.launchpad.net/qemu/+bug/1888918 Buglink: https://bugs.launchpad.net/qemu/+bug/1611394 Signed-off-by: Matthieu Bucchianeri <matthieu.bucchianeri@leostella.com> Message-Id: <20200727175553.32276-1-matthieu.bucchianeri@leostella.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Lijun Pan	c4b8b49d68	target/ppc: add vmulh{su}d instructions vmulhsd: Vector Multiply High Signed Doubleword vmulhud: Vector Multiply High Unsigned Doubleword Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200724045845.89976-5-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Lijun Pan	f3e0d864ab	target/ppc: add vmulh{su}w instructions vmulhsw: Vector Multiply High Signed Word vmulhuw: Vector Multiply High Unsigned Word Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200724045845.89976-4-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Lijun Pan	adcced8784	target/ppc: add vmulld instruction vmulld: Vector Multiply Low Doubleword. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200701234344.91843-6-ljp@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Lijun Pan	a285ffa680	target/ppc: convert vmuluwm to tcg_gen_gvec_mul Convert the original implementation of vmuluwm to the more generic tcg_gen_gvec_mul. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200701234344.91843-5-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Matthieu Bucchianeri	6d592c557e	target/ppc: Fix TCG leak with the evmwsmiaa instruction Fix double-call to tcg_temp_new_i64(), where a temp is allocated both at declaration time and further down the implementation of gen_evmwsmiaa(). Note that gen_evmwsmia() and gen_evmwsmiaa() are still not implemented correctly, as they invoke gen_evmwsmi() which may return early, but the return is not propagated. This will be fixed in my patch for bug #1888918. Signed-off-by: Matthieu Bucchianeri <matthieu.bucchianeri@leostella.com> Message-Id: <20200727172114.31415-1-matthieu.bucchianeri@leostella.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Richard Henderson	3e114acc91	target/ppc: Use tcg_gen_gvec_rotlv Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Richard Henderson	36af59d062	target/ppc: Use tcg_gen_gvec_dup_imm We can now unify the implementation of the 3 VSPLTI instructions. Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:25:01 -07:00
BALATON Zoltan	92eeb004e8	target/ppc: Fix typo in comments "Deferred" was misspelled as "differed" in some comments, correct this typo, Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Message-Id: <20200214155748.0896B745953@zero.eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-21 09:15:04 +11:00
Stefan Brankovic	8d745875c2	target/ppc: Fix for optimized vsl/vsr instructions In previous implementation, invocation of TCG shift function could request shift of TCG variable by 64 bits when variable 'sh' is 0, which is not supported in TCG (values can be shifted by 0 to 63 bits). This patch fixes this by using two separate invocation of TCG shift functions, with maximum shift amount of 32. Name of variable 'shifted' is changed to 'carry' so variable naming is similar to old helper implementation. Variables 'avrA' and 'avrB' are replaced with variable 'avr'. Fixes: `4e6d0920e7` Reported-by: "Paul A. Clark" <pc@us.ibm.com> Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Suggested-by: Aleksandar Markovic <aleksandar.markovic@rt-rk.com> Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Message-Id: <1570196639-7025-2-git-send-email-stefan.brankovic@rt-rk.com> Tested-by: Paul A. Clarke <pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-24 09:36:55 +11:00
Paul A. Clarke	bc7a45ab88	ppc: Add support for 'mffsce' instruction ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR) instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl. This patch adds support for 'mffsce' instruction. 'mffsce' is identical to 'mffs', except that it also clears the exception enable bits in the FPSCR. On CPUs without support for 'mffsce' (below ISA 3.0), the instruction will execute identically to 'mffs'. Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1568817082-1384-1-git-send-email-pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-04 10:25:23 +10:00
Paul A. Clarke	a2735cf483	ppc: Add support for 'mffscrn','mffscrni' instructions ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR) instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl. This patch adds support for 'mffscrn' and 'mffscrni' instructions. 'mffscrn' and 'mffscrni' are similar to 'mffsl', except they do not return the status bits (FI, FR, FPRF) and they also set the rounding mode in the FPSCR. On CPUs without support for 'mffscrn'/'mffscrni' (below ISA 3.0), the instructions will execute identically to 'mffs'. Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Message-Id: <1568817081-1345-1-git-send-email-pc@us.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-04 10:25:23 +10:00
Stefan Brankovic	897b639789	target/ppc: Refactor emulation of vmrgew and vmrgow instructions Since I found this two instructions implemented with tcg, I refactored them so they are consistent with other similar implementations that I introduced in this patch. Also, a new dual macro GEN_VXFORM_TRANS_DUAL is added. This macro is used if one instruction is realized with direct translation, and second one with a helper. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Message-Id: <1566898663-25858-4-git-send-email-stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-29 09:46:07 +10:00
Paul A. Clarke	256be7d07a	ppc: Fix xsmaddmdp and friends A class of instructions of the form: op Target,A,B which operate like: Target = Target * A + B have a bit set which distinguishes them from instructions that operate as: Target = Target * B + A This bit is not being checked properly (using PPC_BIT macro), so all instructions in this class are operating incorrectly as the second form above. The bit was being checked as if it were part of a 64-bit instruction opcode, rather than a proper 32-bit opcode. Fix by using the macro (PPC_BIT32) which treats the opcode as a 32-bit quantity. Fixes: `c9f4e4d8b6` ("target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro") Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Message-Id: <1566401321-22419-1-git-send-email-pc@us.ibm.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-29 09:46:07 +10:00
Paul A. Clarke	31eb7dddac	ppc: Add support for 'mffsl' instruction ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR) instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl. This patch adds support for 'mffsl'. 'mffsl' is identical to 'mffs', except it only returns mode, status, and enable bits from the FPSCR. On CPUs without support for 'mffsl' (below ISA 3.0), the 'mffsl' instruction will execute identically to 'mffs'. Note: I renamed FPSCR_RN to FPSCR_RN0 so I could create an FPSCR_RN mask which is both bits of the FPSCR rounding mode, as defined in the ISA. I also fixed a typo in the definition of FPSCR_FR. Signed-off-by: Paul A. Clarke <pc@us.ibm.com> v4: - nit: added some braces to resolve a checkpatch complaint. v3: - Changed tcg_gen_and_i64 to tcg_gen_andi_i64, eliminating the need for a temporary, per review from Richard Henderson. v2: - I found that I copied too much of the 'mffs' implementation. The 'Rc' condition code bits are not needed for 'mffsl'. Removed. - I now free the (renamed) 'tmask' temporary. - I now bail early for older ISA to the original 'mffs' implementation. Message-Id: <1565982203-11048-1-git-send-email-pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:39 +10:00
Stefan Brankovic	1872588ede	target/ppc: Optimize emulation of vclzw instruction Optimize Altivec instruction vclzw (Vector Count Leading Zeros Word). This instruction counts the number of leading zeros of each word element in source register and places result in the appropriate word element of destination register. Counting is to be performed in four iterations of for loop(one for each word elemnt of source register vB). Every iteration consists of loading appropriate word element from source register, counting leading zeros with tcg_gen_clzi_i32, and saving the result in appropriate word element of destination register. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-7-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	b8313f0d91	target/ppc: Optimize emulation of vclzd instruction Optimize Altivec instruction vclzd (Vector Count Leading Zeros Doubleword). This instruction counts the number of leading zeros of each doubleword element in source register and places result in the appropriate doubleword element of destination register. Using tcg-s count leading zeros instruction two times(once for each doubleword element of source register vB) and placing result in appropriate doubleword element of destination register vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-6-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	083b3f012f	target/ppc: Optimize emulation of vgbbd instruction Optimize altivec instruction vgbbd (Vector Gather Bits by Bytes by Doubleword) All ith bits (i in range 1 to 8) of each byte of doubleword element in source register are concatenated and placed into ith byte of appropriate doubleword element in destination register. Following solution is done for both doubleword elements of source register in parallel, in order to reduce the number of instructions needed(that's why arrays are used): First, both doubleword elements of source register vB are placed in appropriate element of array avr. Bits are gathered in 2x8 iterations(2 for loops). In first iteration bit 1 of byte 1, bit 2 of byte 2,... bit 8 of byte 8 are in their final spots so avr[i], i={0,1} can be and-ed with tcg_mask. For every following iteration, both avr[i] and tcg_mask variables have to be shifted right for 7 and 8 places, respectively, in order to get bit 1 of byte 2, bit 2 of byte 3.. bit 7 of byte 8 in their final spots so shifted avr values(saved in tmp) can be and-ed with new value of tcg_mask... After first 8 iteration(first loop), all the first bits are in their final places, all second bits but second bit from eight byte are in their places... only 1 eight bit from eight byte is in it's place). In second loop we do all operations symmetrically, in order to get other half of bits in their final spots. Results for first and second doubleword elements are saved in result[0] and result[1] respectively. In the end those results are saved in appropriate doubleword element of destination register vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-5-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	4e6d0920e7	target/ppc: Optimize emulation of vsl and vsr instructions Optimization of altivec instructions vsl and vsr(Vector Shift Left/Rigt). Perform shift operation (left and right respectively) on 128 bit value of register vA by value specified in bits 125-127 of register vB. Lowest 3 bits in each byte element of register vB must be identical or result is undefined. For vsl instruction, the first step is bits 125-127 of register vB have to be saved in variable sh. Then, the highest sh bits of the lower doubleword element of register vA are saved in variable shifted, in order not to lose those bits when shift operation is performed on the lower doubleword element of register vA, which is the next step. After shifting the lower doubleword element shift operation is performed on higher doubleword element of vA, with replacement of the lowest sh bits(that are now 0) with bits saved in shifted. For vsr instruction, firstly, the bits 125-127 of register vB have to be saved in variable sh. Then, the lowest sh bits of the higher doubleword element of register vA are saved in variable shifted, in odred not to lose those bits when the shift operation is performed on the higher doubleword element of register vA, which is the next step. After shifting higher doubleword element, shift operation is performed on lower doubleword element of vA, with replacement of highest sh bits(that are now 0) with bits saved in shifted. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-3-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	1cc792698e	target/ppc: Optimize emulation of lvsl and lvsr instructions Adding simple macro that is calling tcg implementation of appropriate instruction if altivec support is active. Optimization of altivec instruction lvsl (Load Vector for Shift Left). Place bytes sh:sh+15 of value 0x00 \|\| 0x01 \|\| 0x02 \|\| ... \|\| 0x1E \|\| 0x1F in destination register. Sh is calculated by adding 2 source registers and getting bits 60-63 of result. First, the bits [28-31] are placed from EA to variable sh. After that, the bytes are created in the following way: sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101 followed by addition of the result with 0x0001020304050607. Value obtained is placed in higher doubleword element of vD. (sh+8):(sh+15) by adding the result of previous multiplication with 0x08090a0b0c0d0e0f. Value obtained is placed in lower doubleword element of vD. Optimization of altivec instruction lvsr (Load Vector for Shift Right). Place bytes 16-sh:31-sh of value 0x00 \|\| 0x01 \|\| 0x02 \|\| ... \|\| 0x1E \|\| 0x1F in destination register. Sh is calculated by adding 2 source registers and getting bits 60-63 of result. First, the bits [28-31] are placed from EA to variable sh. After that, the bytes are created in the following way: sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101 followed by substraction of the result from 0x1011121314151617. Value obtained is placed in higher doubleword element of vD. (sh+8):(sh+15) by substracting the result of previous multiplication from 0x18191a1b1c1d1e1f. Value obtained is placed in lower doubleword element of vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-2-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Mark Cave-Ayland	c9f4e4d8b6	target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro Introduce a new GEN_VSX_HELPER_VSX_MADD macro for the generator function which enables the source and destination registers to be decoded at translation time. This enables the determination of a or m form to be made at translation time so that a single helper function can now be used for both variants. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-16-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	5ba5335d93	target/ppc: decode target register in VSX_EXTRACT_INSERT at translation time Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-15-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	2aba168e50	target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-14-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	6ae4a57ab0	target/ppc: introduce GEN_VSX_HELPER_R2_AB macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_R2_AB macro which performs the decode based upon rA and rB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-13-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	9922962011	target/ppc: introduce GEN_VSX_HELPER_R2 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_R2 macro which performs the decode based upon rD and rB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-12-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	23d0766bd9	target/ppc: introduce GEN_VSX_HELPER_R3 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_R3 macro which performs the decode based upon rD, rA and rB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-11-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	8d830485fc	target/ppc: introduce GEN_VSX_HELPER_X1 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_X1 macro which performs the decode based upon xB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-10-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	033e1fcd97	target/ppc: introduce GEN_VSX_HELPER_X2_AB macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_X2_AB macro which performs the decode based upon xA and xB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-9-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	75cf84cbee	target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_X2 macro which performs the decode based upon xT and xB at translation time. With the previous change to the xscvqpdp generator and helper functions the opcode parameter is no longer required in the common case and can be removed. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-8-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	e0d6a362be	target/ppc: introduce separate generator and helper for xscvqpdp Rather than perform the VSR register decoding within the helper itself, introduce a new generator and helper function which perform the decode based upon xT and xB at translation time. The xscvqpdp helper is the only 2 parameter xT/xB implementation that requires the opcode to be passed as an additional parameter, so handling this separately allows us to optimise the conversion in the next commit. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-7-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	99125c7499	target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_X3 macro which performs the decode based upon xT, xA and xB at translation time. With the previous changes to the VSX_CMP generator and helper macros the opcode parameter is no longer required in the common case and can be removed. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-6-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	00084a25ad	target/ppc: introduce separate VSX_CMP macro for xvcmp* instructions Rather than perform the VSR register decoding within the helper itself, introduce a new VSX_CMP macro which performs the decode based upon xT, xA and xB at translation time. Subsequent commits will make the same changes for other instructions however the xvcmp* instructions are different in that they return a set of flags to be optionally written back to the crf[6] register. Move this logic from the helper function to the generator function, along with the float_status update. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-5-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Richard Henderson	fe2d169614	target/ppc: Use tcg_gen_gvec_bitsel Replace the target-specific implementation of XXSEL. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190603164927.8336-1-richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-06-12 10:41:50 +10:00
Anton Blanchard	2a12243590	target/ppc: Fix lxvw4x, lxvh8x and lxvb16x During the conversion these instructions were incorrectly treated as stores. We need to use set_cpu_vsr* and not get_cpu_vsr*. Fixes: `8b3b2d75c7` ("introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20190524065345.25591-1-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-06-12 10:41:49 +10:00
Richard Henderson	571fbe6ccd	target/ppc: Use vector variable shifts for VSL, VSR, VSRA The gvec expanders take care of masking the shift amount against the element width. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190518191430.21686-2-richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:45 +10:00
Anton Blanchard	77bd8937c0	target/ppc: Fix xvabs[sd]p, xvnabs[sd]p, xvneg[sd]p, xvcpsgn[sd]p We were using set_cpu_vsr() when we should have used get_cpu_vsr(). Fixes: `8b3b2d75c7` ("introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190509104912.6b754dff@kryten> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:45 +10:00
Anton Blanchard	e04c5dd139	target/ppc: Optimise VSX_LOAD_SCALAR_DS and VSX_VECTOR_LOAD_STORE A few small optimisations: In VSX_LOAD_SCALAR_DS() we can don't need to read the VSR via get_cpu_vsrh(). Split VSX_VECTOR_LOAD_STORE() into two functions. Loads only need to write the VSRs (set_cpu_vsr()) and stores only need to read the VSRs (get_cpu_vsr()) Thanks to Mark Cave-Ayland for the suggestions. Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190509103545.4a7fa71a@kryten> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:45 +10:00
Anton Blanchard	4c406ca734	target/ppc: Fix xxspltib xxspltib raises a VMX or a VSX exception depending on the register set it is operating on. We had a check, but it was backwards. Fixes: `f113283525` ("target-ppc: add xxspltib instruction") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190509061713.69490488@kryten> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:44 +10:00
Anton Blanchard	d47a751ada	target/ppc: Fix xxbrq, xxbrw Fix a typo in xxbrq and xxbrw where we put both results into the lower doubleword. Fixes: `8b3b2d75c7` ("introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190507004811.29968-3-anton@ozlabs.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:44 +10:00
Anton Blanchard	cf4e9363f7	target/ppc: Fix xvxsigdp Fix a typo in xvxsigdp where we put both results into the lower doubleword. Fixes: `dd977e4f45` ("target/ppc: Optimize x[sv]xsigdp using deposit_i64()") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190507004811.29968-1-anton@ozlabs.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:44 +10:00
Philippe Mathieu-Daudé	d577dbaac7	target/ppc: Use tcg_gen_abs_i32 Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20190423102145.14812-2-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	53229a7703	tcg: Specify optional vector requirements with a list Replace the single opcode in .opc with a null-terminated array in .opt_opc. We still require that all opcodes be used with the same .vece. Validate the contents of this list with CONFIG_DEBUG_TCG. All tcg_gen_*_vec functions will check any list active during .fniv expansion. Swap the active list in and out as we expand other opcodes, or take control away from the front-end function. Convert all existing vector aware front ends. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
David Gibson	eb512d15a0	target/ppc: Style fixes for translate/spe-impl.inc.c Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-04-26 11:37:57 +10:00
David Gibson	3255386633	target/ppc: Style fixes for translate/vmx-impl.inc.c Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-04-26 11:37:57 +10:00
David Gibson	34b2300cbb	target/ppc: Style fixes for translate/vsx-impl.inc.c Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-04-26 11:37:57 +10:00

1 2 3

138 Commits