Moved VPRTYBW and VPRTYBD to use gvec and both of them and VPRTYBQ to
decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8,
respectively.
vprtybw:
rept loop master patch
8 12500 0,01198900 0,00703100 (-41.4%)
25 4000 0,01070100 0,00571400 (-46.6%)
100 1000 0,01123300 0,00678200 (-39.6%)
500 200 0,01601500 0,01535600 (-4.1%)
2500 40 0,03872900 0,05562100 (43.6%)
8000 12 0,10047000 0,16643000 (65.7%)
vprtybd:
rept loop master patch
8 12500 0,00757700 0,00788100 (4.0%)
25 4000 0,00652500 0,00669600 (2.6%)
100 1000 0,00714400 0,00825400 (15.5%)
500 200 0,01211000 0,01903700 (57.2%)
2500 40 0,03483800 0,07021200 (101.5%)
8000 12 0,09591800 0,21036200 (119.3%)
vprtybq:
rept loop master patch
8 12500 0,00675600 0,00667200 (-1.2%)
25 4000 0,00619400 0,00643200 (3.8%)
100 1000 0,00707100 0,00751100 (6.2%)
500 200 0,01199300 0,01342000 (11.9%)
2500 40 0,03490900 0,04092900 (17.2%)
8000 12 0,09588200 0,11465100 (19.6%)
I wasn't expecting such a performance lost in both VPRTYBD and VPRTYBQ,
I'm not sure if it's worth to move those instructions. Comparing the
assembly of the helper with the TCGop they are pretty similar, so
I'm not sure why vprtybd took so much more time.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-6-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This patch moves VADDCUW and VSUBCUW to decodtree with gvec using an
implementation based on the helper, with the main difference being
changing the -1 (aka all bits set to 1) result returned by cmp when
true to +1. It also implemented a .fni4 version of those instructions
and dropped the helper.
vaddcuw:
rept loop master patch
8 12500 0,01008200 0,00612400 (-39.3%)
25 4000 0,01091500 0,00471600 (-56.8%)
100 1000 0,01332500 0,00593700 (-55.4%)
500 200 0,01998500 0,01275700 (-36.2%)
2500 40 0,04704300 0,04364300 (-7.2%)
8000 12 0,10748200 0,11241000 (+4.6%)
vsubcuw:
rept loop master patch
8 12500 0,01226200 0,00571600 (-53.4%)
25 4000 0,01493500 0,00462100 (-69.1%)
100 1000 0,01522700 0,00455100 (-70.1%)
500 200 0,02384600 0,01133500 (-52.5%)
2500 40 0,04935200 0,03178100 (-35.6%)
8000 12 0,09039900 0,09440600 (+4.4%)
Overall there was a gain in performance, but the TCGop code was still
slightly bigger in the new version (it went from 4 to 5).
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-4-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
And also move the insn to decodetree and remove the now unused
avr_qw_not, avr_qw_cmpu, and avr_qw_add methods.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20220606150037.338931-8-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
And also move the insns to decodetree.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20220606150037.338931-7-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
And also move the insn to decodetree
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20220606150037.338931-6-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
And also move the insn to decodetree.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20220606150037.338931-5-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
And also move the insns to decodetree and remove the now unused
avr_qw_addc method.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20220606150037.338931-4-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
And also move the insn to decodetree.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20220606150037.338931-3-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Also drop VECTOR_FOR_INORDER_I usage since there is no need to access
the elements in any particular order, and move the instruction to
decodetree.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20220606150037.338931-2-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Move vmsumshm and vmsumshs to decodetree, declare vmsumshm helper with
TCG_CALL_NO_RWG, and drop the unused env argument.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220517123929.284511-13-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Move vmsumuhm and vmsumuhs to decodetree, declare vmsumuhm helper with
TCG_CALL_NO_RWG, and drop the unused env argument.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220517123929.284511-12-matheus.ferst@eldorado.org.br>
[danielhb: added #undef VMSUMUHM to fix ppc64 build]
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Move vmsumubm and vmsummbm to decodetree, declare both helpers with
TCG_CALL_NO_RWG, and drop the unused env argument.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220517123929.284511-11-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Move the following instructions to decodetree:
vextsb2w: Vector Extend Sign Byte To Word
vextsh2w: Vector Extend Sign Halfword To Word
vextsb2d: Vector Extend Sign Byte To Doubleword
vextsh2d: Vector Extend Sign Halfword To Doubleword
vextsw2d: Vector Extend Sign Word To Doubleword
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-8-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Here's my first pull request for qemu-5.2, which has quite a few
accumulated things. Highlights are:
* Preliminary support for POWER10 (Power ISA 3.1) instruction emulation
* Add documentation on the (very confusing) pseries NUMA configuration
* Fix some bugs handling edge cases with XICS, XIVE and kernel_irqchip
* Fix icount for a number of POWER registers
* Many cleanups to error handling in XIVE code
* Validate size of -prom-env data
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl87VpwACgkQbDjKyiDZ
s5LjIxAAs8YAQe3uDRz1Wb9GftoMmEHdq7JQoO0FbXDQIVXzpTAXmFLSBtCWKl6p
O1MEIy/o48b5ORXJqSDSA5LgxbHxYfHdIPEY5Tbn/TGvTvKyCukx9n11milUG8In
JxRrOTQBnQAAHkLoyuZyrWKOauC0N1scNrnX9Geuid13GcmqHg1d2alXAUu8jEeC
HSiVmtMqqyyqTx2xA4vfhaGuuwTthnKNfbGdg9ksVqBsCW+etn6ZKGImt8hBe3qO
5iqbQZvFbkpzgbjkhDzUDM6tmUAFN55y/Y+y7I8Tz4/IX7d3WbdqpplwrXXVWkpq
2gcBBjQ/9a1hPTBRVN9jn4CvHfhILBfeHIElUiLpSTQZQQALymTnnI2pLCgKoEFX
LcchXbjiX+pZ2OJnAijpwBcknjgT2U/ZNyiqHJfSQ6jzlYx1YtUf4xGUsgloSiK8
9QDK8o2k0Cm8Be+lPMBMmTctoi8bq+8SN5UUF710WQL235J58o9+z1vuGO2HVk3x
flBtv/+B890wcCDpGU80DPs/LSzR0xTTbA5JsWft2fvO569mda0MoWkJH5w6jvSc
ZLYqljCzFCVW+tKiGHzaBalJaMwn0+QMDTsxzP3yTt5LmmEeRXpBELgvrW64IobD
xBeryH3nG4SwxFSJq+4ATfvUzjy/Eo58lTTl6c53Ji8/D3aFwsA=
=L9Wi
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.2-20200818' into staging
ppc patch queue 2020-08-18
Here's my first pull request for qemu-5.2, which has quite a few
accumulated things. Highlights are:
* Preliminary support for POWER10 (Power ISA 3.1) instruction emulation
* Add documentation on the (very confusing) pseries NUMA configuration
* Fix some bugs handling edge cases with XICS, XIVE and kernel_irqchip
* Fix icount for a number of POWER registers
* Many cleanups to error handling in XIVE code
* Validate size of -prom-env data
# gpg: Signature made Tue 18 Aug 2020 05:18:36 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-5.2-20200818: (40 commits)
spapr/xive: Use xive_source_esb_len()
nvram: Exit QEMU if NVRAM cannot contain all -prom-env data
spapr/xive: Simplify error handling of kvmppc_xive_cpu_synchronize_state()
ppc/xive: Simplify error handling in xive_tctx_realize()
spapr/xive: Simplify error handling in kvmppc_xive_connect()
ppc/xive: Fix error handling in vmstate_xive_tctx_*() callbacks
spapr/xive: Fix error handling in kvmppc_xive_post_load()
spapr/kvm: Fix error handling in kvmppc_xive_pre_save()
spapr/xive: Rework error handling of kvmppc_xive_set_source_config()
spapr/xive: Rework error handling in kvmppc_xive_get_queues()
spapr/xive: Rework error handling of kvmppc_xive_[gs]et_queue_config()
spapr/xive: Rework error handling of kvmppc_xive_cpu_[gs]et_state()
spapr/xive: Rework error handling of kvmppc_xive_mmap()
spapr/xive: Rework error handling of kvmppc_xive_source_reset()
spapr/xive: Rework error handling of kvmppc_xive_cpu_connect()
spapr: Simplify error handling in spapr_phb_realize()
spapr/xive: Convert KVM device fd checks to assert()
ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers
ppc/xive: Rework setup of XiveSource::esb_mmio
target/ppc: Integrate icount to purr, vtb, and tbu40
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
With Makefiles that have automatically generated dependencies, you
generated includes are set as dependencies of the Makefile, so that they
are built before everything else and they are available when first
building the .c files.
Alternatively you can use a fine-grained dependency, e.g.
target/arm/translate.o: target/arm/decode-neon-shared.inc.c
With Meson you have only one choice and it is a third option, namely
"build at the beginning of the corresponding target"; the way you
express it is to list the includes in the sources of that target.
The problem is that Meson decides if something is a source vs. a
generated include by looking at the extension: '.c', '.cc', '.m', '.C'
are sources, while everything else is considered an include---including
'.inc.c'.
Use '.c.inc' to avoid this, as it is consistent with our other convention
of using '.rst.inc' for included reStructuredText files. The editorconfig
file is adjusted.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>