mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Richard Henderson	398f21412a	tcg: Remove tcg_gen_gvec_dup{8,16,32,64}i These interfaces are now unused. Reviewed-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:25:05 -07:00
Richard Henderson	03ddb6f315	tcg: Use tcg_gen_gvec_dup_imm in logical simplifications Replace the outgoing interface. Reviewed-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:25:04 -07:00
Richard Henderson	44c94677fe	tcg: Add tcg_gen_gvec_dup_imm Add a version of tcg_gen_dup_* that takes both immediate and a vector element size operand. This will replace the set of tcg_gen_gvec_dup{8,16,32,64}i functions that encode the element size within the function name. Reviewed-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:24:58 -07:00
lixinyu	a4e57084c1	tcg/mips: mips sync* encode error OPC_SYNC_WMB, OPC_SYNC_MB, OPC_SYNC_ACQUIRE, OPC_SYNC_RELEASE and OPC_SYNC_RMB have wrong encode. According to the mips manual, their encode should be 'OPC_SYNC \| 0x?? << 6' rather than 'OPC_SYNC \| 0x?? << 5'. Wrong encode can lead illegal instruction errors. These instructions often appear with multi-threaded simulation. Fixes: `6f0b99104a` ("tcg/mips: Add support for fence") Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: lixinyu <precinct@mail.ustc.edu.cn> Message-Id: <20200411124612.12560-1-precinct@mail.ustc.edu.cn> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-04-12 14:07:07 -07:00
Richard Henderson	cce743abbf	tcg/i386: Fix %r12 guest_base initialization When %gs cannot be used, we use register offset addressing. This path is almost never used, so it was clearly not tested. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200406174803.8192-1-richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2020-04-07 16:19:49 +01:00
Richard Henderson	e20cb81d9c	tcg/i386: Fix INDEX_op_dup2_vec We were only constructing the 64-bit element, and not replicating the 64-bit element across the rest of the vector. Cc: qemu-stable@nongnu.org Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-03-30 11:16:17 -07:00
Richard Henderson	312b426fea	tcg/i386: Bound shift count expanding sari_vec A given RISU testcase for SVE can produce tcg-op-vec.c:511: do_shifti: Assertion `i >= 0 && i < (8 << vece)' failed. because expand_vec_sari gave a shift count of 32 to a MO_32 vector shift. In `44f1441dbe`, we changed from direct expansion of vector opcodes to re-use of the tcg expanders. So while the comment correctly notes that the hw will handle such a shift count, we now have to take our own sanity checks into account. Which is easy in this particular case. Fixes: `44f1441dbe` Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-03-17 08:41:07 -07:00
Richard Henderson	fc1bfdfd0c	tcg/arm: Expand epilogue inline It is, after all, just two instructions. Profiling on a cortex-a15, using -d nochain to increase the number of exit_tb that are executed, shows a minor improvement of 0.5%. Signed-off-by: Richard Henderson <rth@twiddle.net>	2020-02-28 10:58:41 -08:00
Richard Henderson	d6fa50f536	tcg/arm: Split out tcg_out_epilogue We will shortly use this function from tcg_out_op as well. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2020-02-28 10:58:41 -08:00
Alex Bennée	fcc54ab5c7	tcg: save vaddr temp for plugin usage While do_gen_mem_cb does copy (via extu_tl_i64) vaddr into a new temp this won't help if the vaddr temp gets clobbered by the actual load/store op. To avoid this clobbering we explicitly copy vaddr before the op to ensure it is live my the time we do the instrumentation. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Cc: qemu-stable@nongnu.org Message-Id: <20200225124710.14152-18-alex.bennee@linaro.org>	2020-02-25 20:20:23 +00:00
Richard Henderson	2445971604	tcg: Add tcg_gen_gvec_5_ptr Extend the vector generator infrastructure to handle 5 vector arguments. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Taylor Simpson <tsimpson@quicinc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-02-12 14:58:36 -08:00
Philippe Mathieu-Daudé	d3582cfd27	tcg: Move TCG headers to include/tcg/ Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-4-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Philippe Mathieu-Daudé	2b434dd127	tcg: Search includes in the parent source directory All the *.inc.c files included by tcg/$TARGET/tcg-target.inc.c are in tcg/, their parent directory. To simplify the preprocessor search path, include the relative parent path: '..'. Patch created mechanically by running: $ for x in tcg-pool.inc.c tcg-ldst.inc.c; do \ sed -i "s,#include \"$x\",#include \"../$x\"," \ $(git grep -l "#include \"$x\""); \ done Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-3-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Philippe Mathieu-Daudé	dcb32f1d8f	tcg: Search includes from the project root source directory We currently search both the root and the tcg/ directories for tcg files: $ git grep '#include "tcg/' \| wc -l 28 $ git grep '#include "tcg[^/]' \| wc -l 94 To simplify the preprocessor search path, unify by expliciting the tcg/ directory. Patch created mechanically by running: $ for x in \ tcg.h tcg-mo.h tcg-op.h tcg-opc.h \ tcg-op-gvec.h tcg-gvec-desc.h; do \ sed -i "s,#include \"$x\",#include \"tcg/$x\"," \ $(git grep -l "#include \"$x\""); \ done Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-2-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Richard Henderson	fc4120a378	cputlb: Rename helper_ret_ld_cmmu to cpu_ld_code There are no uses of the _cmmu names other than the bare wrapping within the _code inlines. Therefore rename the functions so we can drop the inlines. Use abi_ptr instead of target_ulong in preparation for user-only; the two types are identical for softmmu. Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Robert Foley	7606488c0e	Add use of RCU for qemu_logfile. This now allows changing the logfile while logging is active, and also solves the issue of a seg fault while changing the logfile. Any read access to the qemu_logfile handle will use the rcu_read_lock()/unlock() around the use of the handle. To fetch the handle we will use atomic_rcu_read(). We also in many cases do a check for validity of the logfile handle before using it to deal with the case where the file is closed and set to NULL. The cases where we write to the qemu_logfile will use atomic_rcu_set(). Writers will also use call_rcu() with a newly added qemu_logfile_free function for freeing/closing when readers have finished. Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191118211528.3221-6-robert.foley@linaro.org>	2019-12-18 20:18:02 +00:00
Robert Foley	fc59d2d870	qemu_log_lock/unlock now preserves the qemu_logfile handle. qemu_log_lock() now returns a handle and qemu_log_unlock() receives a handle to unlock. This allows for changing the handle during logging and ensures the lock() and unlock() are for the same file. Also in target/tilegx/translate.c removed the qemu_log_lock()/unlock() calls (and the log("\n")), since the translator can longjmp out of the loop if it attempts to translate an instruction in an inaccessible page. Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191118211528.3221-5-robert.foley@linaro.org>	2019-12-18 20:18:02 +00:00
Peter Maydell	cb974c95df	tcg/LICENSE: Remove out of date claim about TCG subdirectory licensing Since 2008 the tcg/LICENSE file has not changed: it claims that everything under tcg/ is BSD-licensed. This is not true and hasn't been true for years: in 2013 we accepted the tcg/aarch64 target code under a GPLv2-or-later license statement. We also have generic vector optimisation code under the LGPL2.1-or-later, and the TCI backend is GPLv2-or-later. Further, many of the files are not BSD licensed but MIT licensed. We don't really consider the tcg subdirectory to be a distinct part of QEMU anyway. Remove the LICENSE file, since claiming false information about the license of the code is confusing. Update the main project LICENSE file also to be clearer about the licenses used by TCG. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20191025155848.17362-5-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-11-11 15:11:21 +01:00
Peter Maydell	2552e30cba	tcg/ppc/tcg-target.opc.h: Add copyright/license Add the copyright/license boilerplate for tcg/i386/tcg-target.opc.h. This file has had only two commits, `4b06c21682` and `d9897efa1f`, both by a Linaro engineer. The license is MIT, since that's what the rest of tcg/ppc/ is. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20191025155848.17362-4-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-11-11 15:11:21 +01:00
Peter Maydell	2029bf7e52	tcg/i386/tcg-target.opc.h: Add copyright/license Add the copyright/license boilerplate for tcg/i386/tcg-target.opc.h. This file has had only one commit, `770c2fc7bb`, by a Linaro engineer. The license is MIT, since that's what the rest of tcg/i386/ is. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20191025155848.17362-3-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-11-11 15:11:21 +01:00
Peter Maydell	97105f2921	tcg/aarch64/tcg-target.opc.h: Add copyright/license Add the copyright/license boilerplate for target/aarch64/tcg-target.opc.h. This file has only had two commits: `14e4c1e235` and `79525dfd08`, both by the same Linaro engineer. The license is GPL-2-or-later, since that's what the rest of tcg/aarch64 uses. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20191025155848.17362-2-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-11-11 15:11:21 +01:00
Peter Maydell	68d8ef4ec5	TCG Plugins initial implementation - use --enable-plugins @ configure - low impact introspection (-plugin empty.so to measure overhead) - plugins cannot alter guest state - example plugins included in source tree (tests/plugins) - -d plugin to enable plugin output in logs - check-tcg runs extra tests when plugins enabled - documentation in docs/devel/plugins.rst -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAl23BZMACgkQ+9DbCVqe KkRPegf/QHygZ4ER2jOaWEookxiOEcik+dzQKVGNqLNXeMLvo5fGjGVpFoFxSgfv ZvCAL4xbW44zsYlVfh59tfn4Tu9qK7s7/qM3WXpHsmuvEuhoWef0Lt2jSe+D46Rs KeG/aX+rHLUR8rr9eCgE+1/MQmxPUj3VUonkUpNkk2ebBbSNoLSOudB4DD9Vcyl7 Pya1kPvA6W9bwI20ZSWihE7flg13o62Pp+LgAFLrsfxXOxOMkPrU8Pp+B0Dvr+hL 5Oh0clZLhiRi75x+KVGZ90TVsoftdjYoOWGMOudS/+NNmqKT1NTLm0K1WJYyRMQ1 V0ne4/OcGNq7x8gcOx/xs09ADu5/VA== =UXR/ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stsquad/tags/pull-tcg-plugins-281019-4' into staging TCG Plugins initial implementation - use --enable-plugins @ configure - low impact introspection (-plugin empty.so to measure overhead) - plugins cannot alter guest state - example plugins included in source tree (tests/plugins) - -d plugin to enable plugin output in logs - check-tcg runs extra tests when plugins enabled - documentation in docs/devel/plugins.rst # gpg: Signature made Mon 28 Oct 2019 15:13:23 GMT # gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full] # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * remotes/stsquad/tags/pull-tcg-plugins-281019-4: (57 commits) travis.yml: enable linux-gcc-debug-tcg cache MAINTAINERS: add me for the TCG plugins code scripts/checkpatch.pl: don't complain about (foo, /* empty */) .travis.yml: add --enable-plugins tests include/exec: wrap cpu_ldst.h in CONFIG_TCG accel/stubs: reduce headers from tcg-stub tests/plugin: add hotpages to analyse memory access patterns tests/plugin: add instruction execution breakdown tests/plugin: add a hotblocks plugin tests/tcg: enable plugin testing tests/tcg: drop test-i386-fprem from TESTS when not SLOW tests/tcg: move "virtual" tests to EXTRA_TESTS tests/tcg: set QEMU_OPTS for all cris runs tests/tcg/Makefile.target: fix path to config-host.mak tests/plugin: add sample plugins linux-user: support -plugin option vl: support -plugin option plugin: add qemu_plugin_outs helper plugin: add qemu_plugin_insn_disas helper plugin: expand the plugin_init function to include an info block ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-10-30 14:10:32 +00:00
Alex Bennée	7dec71d5ff	cputlb: ensure _cmmu helper functions follow the naming standard We document this in docs/devel/load-stores.rst so lets follow it. The 32 bit and 64 bit access functions have historically not included the sign so we leave those as is. We also introduce some signed helpers which are used for loading immediate values in the translator. Fixes: `282dffc8` Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191021150910.23216-1-alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Emilio G. Cota	e6d86bed50	tcg: let plugins instrument virtual memory accesses To capture all memory accesses we need hook into all the various helper functions that are involved in memory operations as well as the injected inline helper calls. A later commit will allow us to resolve the actual guest HW addresses by replaying the lookup. Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: drop haddr handling, just deal in vaddr] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Emilio G. Cota	38b47b19ec	plugin-gen: add module for TCG-related code We first inject empty instrumentation from translator_loop. After translation, we go through the plugins to see what they want to register for, filling in the empty instrumentation. If if turns out that some instrumentation remains unused, we remove it. This approach supports the following features: - Inlining TCG code for simple operations. Note that we do not export TCG ops to plugins. Instead, we give them a C API to insert inlined ops. So far we only support adding an immediate to a u64, e.g. to count events. - "Direct" callbacks. These are callbacks that do not go via a helper. Instead, the helper is defined at run-time, so that the plugin code is directly called from TCG. This makes direct callbacks as efficient as possible; they are therefore used for very frequent events, e.g. memory callbacks. - Passing the host address to memory callbacks. Most of this is implemented in a later patch though. - Instrumentation of memory accesses performed from helpers. See the corresponding comment, as well as a later patch. Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: add alloc_tcg_plugin_context, use glib, rm hwaddr] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Emilio G. Cota	c87fb14fde	tcg: add tcg_gen_st_ptr Will gain a user soon. Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2019-10-28 15:12:38 +00:00
Alex Bennée	504f73f7b3	trace: add mmu_index to mem_info We are going to re-use mem_info later for plugins and will need to track the mmu_idx for softmmu code. Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2019-10-28 15:12:38 +00:00
Alex Bennée	4cef72d042	cputlb: ensure _cmmu helper functions follow the naming standard We document this in docs/devel/load-stores.rst so lets follow it. The 32 bit and 64 bit access functions have historically not included the sign so we leave those as is. We also introduce some signed helpers which are used for loading immediate values in the translator. Fixes: `282dffc8` Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191021150910.23216-1-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 10:26:02 +01:00
Stefan Weil	2f160e0f97	tci: Add implementation for INDEX_op_ld16u_i64 This fixes "make check-tcg" on a Debian x86_64 host. Signed-off-by: Stefan Weil <sw@weilnetz.de> Tested-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190410194838.10123-1-sw@weilnetz.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 10:26:02 +01:00
Richard Henderson	b7ce3cff21	tcg/ppc: Update vector support for v3.00 dup/dupi These new instructions are conditional on MSR.VEC for TX=1, so we can consider these Altivec instructions. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:53 -07:00
Richard Henderson	6e11cde150	tcg/ppc: Update vector support for v3.00 load/store These new instructions are a mix of those like LXSD that are only conditional only on MSR.VEC and those like LXV that are conditional on MSR.VEC for TX=1. Thus, in the end, we can consider all of these as Altivec instructions. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:44 -07:00
Richard Henderson	d7cd6a2f25	tcg/ppc: Update vector support for v3.00 Altivec These new instructions are conditional only on MSR.VEC and are thus part of the Altivec instruction set, and not VSX. This includes negation and compare not equal. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:40 -07:00
Richard Henderson	7097312d37	tcg/ppc: Update vector support for v2.07 FP These new instructions are conditional on MSR.FP when TX=0 and MSR.VEC when TX=1. Since we only care about the Altivec registers, and force TX=1, we can consider these to be Altivec instructions. Since Altivec is true for any use of vector types, we only need test have_isa_2_07. This includes moves to and from the integer registers. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:33 -07:00
Richard Henderson	b2dda6400c	tcg/ppc: Update vector support for v2.07 VSX These new instructions are conditional only on MSR.VSX and are thus part of the VSX instruction set, and not Altivec. This includes double-word loads and stores. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:29 -07:00
Richard Henderson	64ff1c6d21	tcg/ppc: Update vector support for v2.07 Altivec These new instructions are conditional only on MSR.VEC and are thus part of the Altivec instruction set, and not VSX. This includes lots of double-word arithmetic and a few extra logical operations. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:25 -07:00
Richard Henderson	47c906ae6f	tcg/ppc: Update vector support for VSX The VSX instruction set instructions include double-word loads and stores, double-word load and splat, double-word permute, and bit select. All of which require multiple operations in the Altivec instruction set. Because the VSX registers map %vsr32 to %vr0, and we have no current intention or need to use vector registers outside %vr0-%vr19, force on the {ax,bx,cx,tx} bits within the added VSX insns so that we don't have to otherwise modify the VR[TABC] macros. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:20 -07:00
Richard Henderson	68f340d4cd	tcg/ppc: Enable Altivec detection Now that we have implemented the required tcg operations, we can enable detection of host vector support. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> (PPC32) Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:17 -07:00
Richard Henderson	597cf97892	tcg/ppc: Support vector dup2 This is only used for 32-bit hosts. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:13 -07:00
Richard Henderson	d9897efa1f	tcg/ppc: Support vector multiply For Altivec, this is always an expansion. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:10 -07:00
Richard Henderson	dabae0971b	tcg/ppc: Support vector shift by immediate For Altivec, this is done via vector shift by vector, and loading the immediate into a register. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:06 -07:00
Richard Henderson	e9d1a53ae6	tcg/ppc: Add support for vector saturated add/subtract Add support for vector saturated add/subtract using Altivec instructions: VADDSBS, VADDSHS, VADDSWS, VADDUBS, VADDUHS, VADDUWS, and VSUBSBS, VSUBSHS, VSUBSWS, VSUBUBS, VSUBUHS, VSUBUWS. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:02 -07:00
Richard Henderson	d67508117d	tcg/ppc: Add support for vector add/subtract Add support for vector add/subtract using Altivec instructions: VADDUBM, VADDUHM, VADDUWM, VSUBUBM, VSUBUHM, VSUBUWM. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:58 -07:00
Richard Henderson	e238297282	tcg/ppc: Add support for vector maximum/minimum Add support for vector maximum/minimum using Altivec instructions VMAXSB, VMAXSH, VMAXSW, VMAXUB, VMAXUH, VMAXUW, and VMINSB, VMINSH, VMINSW, VMINUB, VMINUH, VMINUW. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:54 -07:00
Richard Henderson	6ef14d7ebe	tcg/ppc: Add support for load/store/logic/comparison Add various bits and peaces related mostly to load and store operations. In that context, logic, compare, and splat Altivec instructions are used, and, therefore, the support for emitting them is included in this patch too. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:50 -07:00
Richard Henderson	4b06c21682	tcg/ppc: Enable tcg backend vector compilation Introduce all of the flags required to enable tcg backend vector support, and a runtime flag to indicate the host supports Altivec instructions. For now, do not actually set have_isa_altivec to true, because we have not yet added all of the code to actually generate all of the required insns. However, we must define these flags in order to disable ifndefs that create stub versions of the functions added here. The change to tcg_out_movi works around a buglet in tcg.c wherein if we do not define tcg_out_dupi_vec we get a declared but not defined Werror, but if we only declare it we get a defined but not used Werror. We need to this change to tcg_out_movi eventually anyway, so it's no biggie. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:44 -07:00
Richard Henderson	63922f467a	tcg/ppc: Replace HAVE_ISEL macro with a variable Previously we've been hard-coding knowledge that Power7 has ISEL, but it was an optional instruction before that. Use the AT_HWCAP2 bit, when present, to properly determine support. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:09:40 -07:00
Richard Henderson	4e33fe0137	tcg/ppc: Replace HAVE_ISA_2_06 This is identical to have_isa_2_06, so replace it. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:09:36 -07:00
Richard Henderson	7d9dae0a10	tcg/ppc: Create TCGPowerISA and have_isa Introduce an enum to hold base < 2.06 < 3.00. Use macros to preserve the existing have_isa_2_06 and have_isa_3_00 predicates. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:09:30 -07:00
Richard Henderson	b82f769cc1	tcg/ppc: Introduce macros VRT(), VRA(), VRB(), VRC() Introduce macros VRT(), VRA(), VRB(), VRC() used for encoding elements of Altivec instructions. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:26 -07:00
Richard Henderson	1838905eb3	tcg/ppc: Introduce macro VX4() Introduce macro VX4() used for encoding Altivec instructions. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:22 -07:00
Richard Henderson	42281ec646	tcg/ppc: Introduce Altivec registers Altivec supports 32 128-bit vector registers, whose names are by convention v0 through v31. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:08:30 -07:00
Peter Maydell	9de65783e1	Allow page table bit to swap endianness. Reorganize watchpoints out of i/o path. Return host address from probe_write / probe_access. -----BEGIN PGP SIGNATURE----- iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAl1uiyYdHHJpY2hhcmQu aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8AuwgAnYLQQbL8kjSqzp7q gRlj0M2SX41ZW3fMkI794RwsljD9Z0QS7YGnpzHolig9XUYrGnip7STrMvlCr/1L CIMWNHlgitgBMszLqg42/TB+6RxXn+DMX/ShUzTagC6xQhinCIpdEjoLaTKSgeP+ foIyJ2uoJLKOBP8cPTQp8evongtoQIljpsZZ0K8a4sreO1d6ytH+olkuoGiROft+ VoJkA+kNHd9cE+LPCva8UFGu1QE6uCySvhepzOpnvOtK+SXKUm2yLOFGu7RWP1pT RkE0oRyRnImtg+cViHfUUFogIffFROdL5tuYMQVuqbINeROPUgJPav+R1Nz1P60a xM2HEw== =bLLU -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190903' into staging Allow page table bit to swap endianness. Reorganize watchpoints out of i/o path. Return host address from probe_write / probe_access. # gpg: Signature made Tue 03 Sep 2019 16:47:50 BST # gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F # gpg: issuer "richard.henderson@linaro.org" # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20190903: (36 commits) tcg: Factor out probe_write() logic into probe_access() tcg: Make probe_write() return a pointer to the host page s390x/tcg: Pass a size to probe_write() in do_csst() hppa/tcg: Call probe_write() also for CONFIG_USER_ONLY mips/tcg: Call probe_write() for CONFIG_USER_ONLY as well tcg: Enforce single page access in probe_write() tcg: Factor out CONFIG_USER_ONLY probe_write() from s390x code s390x/tcg: Fix length calculation in probe_write_access() s390x/tcg: Use guest_addr_valid() instead of h2g_valid() in probe_write_access() tcg: Check for watchpoints in probe_write() cputlb: Handle watchpoints via TLB_WATCHPOINT cputlb: Remove double-alignment in store_helper cputlb: Fix size operand for tlb_fill on unaligned store exec: Factor out cpu_watchpoint_address_matches cputlb: Fold TLB_RECHECK into TLB_INVALID_MASK exec: Factor out core logic of check_watchpoint() exec: Move user-only watchpoint stubs inline target/sparc: sun4u Invert Endian TTE bit target/sparc: Add TLB entry with attributes cputlb: Byte swap memory transaction attribute ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-09-04 16:29:18 +01:00
Tony Nguyen	14776ab5a1	tcg: TCGMemOp is now accelerator independent MemOp Preparation for collapsing the two byte swaps, adjust_endianness and handle_bswap, along the I/O path. Target dependant attributes are conditionalized upon NEED_CPU_H. Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-09-03 08:30:38 -07:00
Emilio G. Cota	2bc89637b7	tcg/README: fix typo s/afterwise/afterwards/ Afterwise is "wise after the fact", as in "hindsight". Here we meant "afterwards" (as in "subsequently"). Fix it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190828165307.18321-7-alex.bennee@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-09-03 16:20:35 +01:00
tony.nguyen@bt.com	52bf9771fd	configure: Define target access alignment in configure This patch moves the define of target access alignment earlier from target/foo/cpu.h to configure. Suggested in Richard Henderson's reply to "[PATCH 1/4] tcg: TCGMemOp is now accelerator independent MemOp" Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Message-Id: <11e818d38ebc40e986cfa62dd7d0afdc@tpw09926dag18e.domain1.systemhost.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: tony.nguyen@bt.com <tony.nguyen@bt.com>	2019-08-20 17:26:19 +02:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Markus Armbruster	6a0acfff99	Clean up inclusion of exec/cpu-common.h migration/qemu-file.h neglects to include it even though it needs ram_addr_t. Fix that. Drop a few superfluous inclusions elsewhere. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-14-armbru@redhat.com>	2019-08-16 13:31:52 +02:00
Richard Henderson	1789d4274b	tcg/aarch64: Fix output of extract2 opcodes This patch fixes two problems: (1) The inputs to the EXTR insn were reversed, (2) The input constraints use rZ, which means that we need to use the REG0 macro in order to supply XZR for a constant 0 input. Fixes: `464c2969d5` Reported-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-07-14 12:19:00 +02:00
Richard Henderson	80f4d7c3ae	tcg: Fix constant folding of INDEX_op_extract2_i32 On a 64-bit host, discard any replications of the 32-bit sign bit when performing the shift and merge. Fixes: https://bugs.launchpad.net/bugs/1834496 Tested-by: Christophe Lyon <christophe.lyon@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-07-14 12:19:00 +02:00
Richard Henderson	11978f6f58	tcg: Fix expansion of INDEX_op_not_vec This operation can always be emitted, even if we need to fall back to xor. Adjust the assertions to match. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-07-09 08:26:11 +02:00
Alistair Francis	7ab7e9c7c7	tcg/riscv: Fix RISC-VH host build failure Commit `269bd5d8` "cpu: Move the softmmu tlb to CPUNegativeOffsetState' broke the RISC-V host build as there are two variables that are used but not defined. This patch renames the undefined variables mask_off and table_off to the existing (but unused) mask_ofs and table_ofs variables. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <79729cc88ca509e08b5c4aa0aa8a52847af70c0f.1561039316.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-07-09 08:26:11 +02:00
Like Xu	5cc8767d05	general: Replace global smp variables with smp machine properties Basically, the context could get the MachineState reference via call chains or unrecommended qdev_get_machine() in !CONFIG_USER_ONLY mode. A local variable of the same name would be introduced in the declaration phase out of less effort OR replace it on the spot if it's only used once in the context. No semantic changes. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190518205428.90532-4-like.xu@linux.intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-07-05 17:07:36 -03:00
Markus Armbruster	f91005e195	Supply missing header guards Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190604181618.19980-5-armbru@redhat.com>	2019-06-12 13:20:21 +02:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Richard Henderson	43b3952dea	tcg/arm: Remove mostly unreachable tlb special case There was nothing armv7 specific about the bic+cmp sequence, however looking at the set of guests more closely shows that the 8-bit immediate operand for the bic can only be satisfied with one guest in tree: baseline m-profile -- 10-bit pages with aligned 4-byte memory ops. Therefore it does not seem useful to keep this path. Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Richard Henderson	057b6e370b	tcg/arm: Use LDRD to load tlb mask+table This changes the code generation for the tlb from e.g. ldr ip, [r6, #-0x10] ldr r2, [r6, #-0xc] and ip, ip, r4, lsr #8 ldrd r0, r1, [r2, ip]! ldr r2, [r2, #0x18] to ldrd r0, r1, [r6, #-0x10] and r0, r0, r4, lsr #8 ldrd r2, r3, [r1, r0]! ldr r1, [r1, #0x18] for armv7 hosts. Rearranging the register allocation in order to avoid overlap between the two ldrd pairs causes the patch to be larger than it ordinarily would be. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Richard Henderson	65b23204d6	tcg/aarch64: Use LDP to load tlb mask+table This changes the code generation for the tlb from e.g. ldur x0, [x19, #0xffffffffffffffe0] ldur x1, [x19, #0xffffffffffffffe8] and x0, x0, x20, lsr #8 add x1, x1, x0 ldr x0, [x1] ldr x1, [x1, #0x18] to ldp x0, x1, [x19, #-0x20] and x0, x0, x20, lsr #8 add x1, x1, x0 ldr x0, [x1] ldr x1, [x1, #0x18] Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Richard Henderson	269bd5d8f6	cpu: Move the softmmu tlb to CPUNegativeOffsetState We have for some time had code within the tcg backends to handle large positive offsets from env. This move makes sure that need not happen. Indeed, we are able to assert at build time that simple offsets suffice for all hosts. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Richard Henderson	a40ec84ee2	tcg: Create struct CPUTLB Move all softmmu tlb data into this structure. Arrange the members so that we are able to place mask+table together and at a smaller absolute offset from ENV. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:34 -07:00
Richard Henderson	11e2bfef79	tcg/i386: Use MOVDQA for TCG_TYPE_V128 load/store This instruction raises #GP, aka SIGSEGV, if the effective address is not aligned to 16-bytes. We have assertions in tcg-op-gvec.c that the offset from ENV is aligned, for vector types <= V128. But the offset itself does not validate that the final pointer is aligned -- one must also remember to use the QEMU_ALIGNED() attribute on the vector member within ENV. PowerPC Altivec has vector load/store instructions that silently discard the low 4 bits of the address, making alignment mistakes difficult to discover. Aid that by making the most popular host visibly signal the error. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	9e27f58b99	tcg/aarch64: Allow immediates for vector ORR and BIC The allows immediates to be used for ORR and BIC, as well as the trivial inversions, ORC and AND. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	02f3a5b474	tcg/aarch64: Build vector immediates with two insns Use MOVI+ORR or MVNI+BIC in order to build some vector constants, as opposed to dropping them to the constant pool. This includes all 16-bit constants and a similar set of 32-bit constants. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	7e308e003e	tcg/aarch64: Use MVNI in tcg_out_dupi_vec The compliment of a subset of immediates can be computed with a single instruction. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	984fdcee34	tcg/aarch64: Split up is_fimm There are several sub-classes of vector immediate, and only MOVI can use them all. This will enable usage of MVNI and ORRI, which use progressively fewer sub-classes. This patch adds no new functionality, merely splits the function and moves part of the logic into tcg_out_dupi_vec. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	a9e434a5dc	tcg/aarch64: Support vector bitwise select value The instruction set has 3 insns that perform the same operation, only varying in which operand must overlap the destination. We can represent the operation without overlap and choose based on the operands seen. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	ebcfb91abe	tcg/i386: Use umin/umax in expanding unsigned compare Using umin(a, b) == a as an expansion for TCG_COND_LEU is a better alternative to (a - INT_MIN) <= (b - INT_MIN). Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	3ec3538a45	tcg/i386: Remove expansion for missing minmax This is now handled by code within tcg-op-vec.c. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	904c5e1967	tcg/i386: Support vector comparison select value We already had backend support for this feature. Expand the new cmpsel opcode using vpblendb. The combination allows us to avoid an extra NOT for some comparison codes. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	25c012b400	tcg: Add TCG_OPF_NOT_PRESENT if TCG_TARGET_HAS_foo is negative If INDEX_op_foo is always expanded by tcg_expand_vec_op, then there may be no reasonable set of constraints to return from tcg_target_op_def for that opcode. Let TCG_TARGET_HAS_foo be specified as -1 in that case. Thus a boolean test for TCG_TARGET_HAS_foo is true, but we will not assert within process_op_defs when no constraints are specified. Compare this with tcg_can_emit_vec_op, which already uses this tri-state indication. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	72b4c792c7	tcg: Expand vector minmax using cmp+cmpsel Provide a generic fallback for the min/max operations. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	17f79944eb	tcg: Introduce do_op3_nofail for vector expansion This makes do_op3 match do_op2 in allowing for failure, and thus fall back expansions. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	f75da2988e	tcg: Add support for vector compare select Perform a per-element conditional move. This combination operation is easier to implement on some host vector units than plain cmp+bitsel. Omit the usual gvec interface, as this is intended to be used by target-specific gvec expansion call-backs. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	38dc12947e	tcg: Add support for vector bitwise select This operation performs d = (b & a) \| (c & ~a), and is present on a majority of host vector units. Include gvec expanders. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	532ba368a1	tcg: Fix missing checks and clears in tcg_gen_gvec_dup_mem The paths through tcg_gen_dup_mem_vec and through MO_128 were missing the check_size_align. The path through MO_128 was also missing the expand_clr. This last was not visible because the only user is ARM SVE, which would set oprsz == maxsz, and not require the clear. Fix by adding the check_size_align and using do_dup directly instead of duplicating the check in tcg_gen_gvec_dup_{i32,i64}. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	7b60ef3264	tcg/i386: Fix dupi/dupm for avx1 and 32-bit hosts The VBROADCASTSD instruction only allows %ymm registers as destination. Rather than forcing VEX.L and writing to the entire 256-bit register, revert to using MOVDDUP with an %xmm register. This is sufficient for an avx1 host since we do not support TCG_TYPE_V256 for that case. Also fix the 32-bit avx2, which should have used VPBROADCASTW. Fixes: `1e262b49b5` Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	a7b6d286cf	tcg/aarch64: Do not advertise minmax for MO_64 The min/max instructions are not available for 64-bit elements. Fixes: `93f332a503` Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	a456394ae5	tcg/aarch64: Support vector absolute value Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	18f9b65f1a	tcg/i386: Support vector absolute value Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	bcefc90208	tcg: Add support for vector absolute value Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	ff1f11f7f8	tcg: Add support for integer absolute value Remove a function of the same name from target/arm/. Use a branchless implementation of abs gleaned from gcc. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	0a8d7a3bf5	tcg/i386: Support vector scalar shift opcodes Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	b4578cd91c	tcg: Add gvec expanders for vector shift by scalar Allow expansion either via shift by scalar or by replicating the scalar for shift by vector. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- v3: Use a private structure for do_gvec_shifts.	2019-05-13 22:52:08 +00:00
Richard Henderson	79525dfd08	tcg/aarch64: Support vector variable shift opcodes Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	a2ce146a06	tcg/i386: Support vector variable shift opcodes Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	5ee5c14cac	tcg: Add gvec expanders for variable shift The gvec expanders perform a modulo on the shift count. If the target requires alternate behaviour, then it cannot use the generic gvec expanders anyway, and will have to have its own custom code. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	37ee55a081	tcg: Add INDEX_op_dupm_vec Allow the backend to expand dup from memory directly, instead of forcing the value into a temp first. This is especially important if integer/vector register moves do not exist. Note that officially tcg_out_dupm_vec is allowed to fail. If it did, we could fix this up relatively easily: VECE == 32/64: Load the value into a vector register, then dup. Both of these must work. VECE == 8/16: If the value happens to be at an offset such that an aligned load would place the desired value in the least significant end of the register, go ahead and load w/garbage in high bits. Load the value w/INDEX_op_ld{8,16}_i32. Attempt a move directly to vector reg, which may fail. Store the value into the backing store for OTS. Load the value into the vector reg w/TCG_TYPE_I32, which must work. Duplicate from the vector reg into itself, which must work. All of which is well and good, except that all supported hosts can support dupm for all vece, so all of the failure paths would be dead code and untestable. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	f23e5e15ed	tcg/aarch64: Implement tcg_out_dupm_vec The LD1R instruction does all the work. Note that the only useful addressing mode is a base register with no offset. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:50:35 +00:00
Richard Henderson	1e262b49b5	tcg/i386: Implement tcg_out_dupm_vec At the same time, improve tcg_out_dupi_vec wrt broadcast from the constant pool. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	d6ecb4a978	tcg: Add tcg_out_dupm_vec to the backend interface Currently stubbed out in all backends that support vectors. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	bab1671f0f	tcg: Manually expand INDEX_op_dup_vec This case is similar to INDEX_op_mov_* in that we need to do different things depending on the current location of the source. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- v3: Added some commentary to the tcg_reg_alloc_* functions.	2019-05-13 14:44:03 -07:00
Richard Henderson	e7632cfa8b	tcg: Promote tcg_out_{dup,dupi}_vec to backend interface The i386 backend already has these functions, and the aarch64 backend could easily split out one. Nothing is done with these functions yet, but this will aid register allocation of INDEX_op_dup_vec in a later patch. Adjust the aarch64 tcg_out_dupi_vec signature to match the new interface. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	240c08d099	tcg: Support cross-class moves without instruction support PowerPC Altivec does not support direct moves between vector registers and general registers. So when tcg_out_mov fails, we can use the backing memory for the temporary to perform the move. Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	78113e83e0	tcg: Return bool success from tcg_out_mov This patch merely changes the interface, aborting on all failures, of which there are currently none. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	c16f52b2c5	tcg/arm: Use tcg_out_mov_reg in tcg_out_mov We have a function that takes an additional condition parameter over the standard backend interface. It already takes care of eliding no-op moves. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	d63e3b6e69	tcg: Assert fixed_reg is read-only The only fixed_reg is cpu_env, and it should not be modified during any TB. Therefore code that tries to special-case moves into a fixed_reg is dead. Remove it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	53229a7703	tcg: Specify optional vector requirements with a list Replace the single opcode in .opc with a null-terminated array in .opt_opc. We still require that all opcodes be used with the same .vece. Validate the contents of this list with CONFIG_DEBUG_TCG. All tcg_gen_*_vec functions will check any list active during .fniv expansion. Swap the active list in and out as we expand other opcodes, or take control away from the front-end function. Convert all existing vector aware front ends. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	ce27c5d1a3	tcg: Allow add_vec, sub_vec, neg_vec, not_vec to be expanded PowerPC Altivec does not support add and subtract of 64-bit elements. Prepare for that configuration by not assuming the operation is universally supported. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	ac383dde33	tcg: Do not recreate INDEX_op_neg_vec unless supported Use tcg_can_emit_vec_op instead of just TCG_TARGET_HAS_neg_vec, so that we check the type and vece for the actual operation. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:26:28 -07:00
David Hildenbrand	e1227bb6e5	tcg: Implement tcg_gen_gvec_3i() Let's add tcg_gen_gvec_3i(), similar to tcg_gen_gvec_2i(), however without introducing "gen_helper_gvec_3i *fnoi", as it isn't needed for now. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190416185301.25344-2-david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:26:28 -07:00
Richard Henderson	b4b82d7e9c	tcg/arm: Restrict constant pool displacement to 12 bits This will not necessarily restrict the size of the TB, since for v7 the majority of constant pool usage is for calls from the out-of-line ldst code, which is already at the end of the TB. But this does allow us to save one insn per reference on the off-chance. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-25 10:39:39 -07:00
Richard Henderson	a7cdaf710f	tcg/ppc: Allow the constant pool to overflow at 32k There is no point in coding for a 2GB offset when the max TB size is already limited to 64k. If we further restrict to 32k then we can eliminate the extra ADDIS instruction. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:05:28 -07:00
Richard Henderson	aeee05f53a	tcg: Restart TB generation after out-of-line ldst overflow This is part c of relocation overflow handling. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:05:28 -07:00
Richard Henderson	1768987b73	tcg: Restart TB generation after constant pool overflow This is part b of relocation overflow handling. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:05:28 -07:00
Richard Henderson	7ecd02a06f	tcg: Restart TB generation after relocation overflow If the TB generates too much code, such that backend relocations overflow, try again with a smaller TB. In support of this, move relocation processing from a random place within tcg_out_op, in the handling of branch opcodes, to a new function at the end of tcg_gen_code. This is not a complete solution, as there are additional relocs generated for out-of-line ldst handling and constant pools. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:05:21 -07:00
Richard Henderson	6e6c4efed9	tcg: Restart after TB code generation overflow If a TB generates too much code, try again with fewer insns. Fixes: https://bugs.launchpad.net/bugs/1824853 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	464c2969d5	tcg/aarch64: Support INDEX_op_extract2_{i32,i64} Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	3b832d67a9	tcg/arm: Support INDEX_op_extract2_i32 Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	c6fb8c0cf7	tcg/i386: Support INDEX_op_extract2_{i32,i64} Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	b0a6056719	tcg: Use extract2 in tcg_gen_deposit_{i32,i64} Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	02616bad6f	tcg: Use deposit and extract2 in tcg_gen_shifti_i64 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	fce1296f13	tcg: Add INDEX_op_extract2_{i32,i64} This will let backends implement the double-word shift operation. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
David Hildenbrand	2089fcc9e7	tcg: Implement tcg_gen_extract2_{i32,i64} Will be helpful for s390x. Input 128 bit and output 64 bit only, which is sufficient for now. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190225154204.26751-1-david@redhat.com> [rth: Add matching tcg_gen_extract2_i32.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Markus Armbruster	3de2faa9a8	tcg: Simplify how dump_exec_info() prints dump_exec_info() takes an fprintf()-like callback and a FILE * to pass to it. Its only caller hmp_info_jit() passes monitor_fprintf() and the current monitor cast to FILE *. monitor_fprintf() casts it right back, and is otherwise identical to monitor_printf(). The type-punning is ugly. Drop the callback, and call qemu_printf() instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190417191805.28198-5-armbru@redhat.com>	2019-04-18 22:18:59 +02:00
Markus Armbruster	d4c51a0af3	tcg: Simplify how dump_opcount_info() prints dump_opcount_info() takes an fprintf()-like callback and a FILE * to pass to it. Its only caller hmp_info_opcount() passes monitor_fprintf() and the current monitor cast to FILE *. monitor_fprintf() casts it right back, and is otherwise identical to monitor_printf(). The type-punning is ugly. Drop the callback, and call qemu_printf() instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190417191805.28198-4-armbru@redhat.com>	2019-04-18 22:18:59 +02:00
Richard Henderson	9e564a1dde	tcg: Remove TODO file The last update to this file was 9 years ago. In the meantime, 4 of the 6 ideas have actually been completed. The lat two do not actually make sense anymore. Suggested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-02-21 10:22:24 -08:00
Mark Cave-Ayland	3115584d39	tcg/i386: fix unsigned vector saturating arithmetic Due to a cut/paste error in the original implementation, the unsigned vector saturating arithmetic was erroneously being calculated as signed vector saturating arithmetic. Fixes: `8ffafbcec2` ("tcg/i386: Implement vector saturating arithmetic") Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20190207224258.426-1-mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-02-11 08:52:44 -08:00
Richard Henderson	bef16ab4e6	tcg: Diagnose referenced labels that have not been emitted Currently, a jump to a label that is not defined anywhere will be emitted not be relocated. This results in a jump to a random jump target. With tcg debugging, print a diagnostic to the -d op file and abort. This could help debug or detect errors like `c2d9644e6d` ("target/arm: Fix crash on conditional instruction in an IT block") Reported-by: Roman Kapl <code@rkapl.cz> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-02-11 08:52:44 -08:00
Thomas Huth	fb0343d5b4	tcg: Fix LGPL version number It's either "GNU Library General Public version 2" or "GNU Lesser General Public version 2.1", but there was no "version 2.0" of the "Lesser" library. So assume that version 2.1 is meant here. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1548252536-6242-5-git-send-email-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-01-30 11:01:52 +01:00
Richard Henderson	e77c89fb08	cputlb: Remove static tlb sizing Now that all tcg backends support TCG_TARGET_IMPLEMENTS_DYN_TLB, remove the define and the old code. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:35 -08:00
Richard Henderson	0a9a83d6bf	tcg/tci: enable dynamic TLB sizing This is automatic due to TCI using the other softtlb macros. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	ac33373e0e	tcg/mips: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	a31aa4ce00	tcg/mips: Fix tcg_out_qemu_ld_slow_path Patch the branch after it has been emitted rather than before it exists. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	cd7d3cb7a2	tcg/arm: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	41b70f220b	tcg/riscv: enable dynamic TLB sizing Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:24 -08:00
Richard Henderson	4f47e338f6	tcg/s390: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	17ff9f7801	tcg/sparc: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	644f591ab0	tcg/ppc: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	f7bcd96669	tcg/aarch64: enable dynamic TLB sizing Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Emilio G. Cota	54eaf40b8f	tcg/i386: enable dynamic TLB sizing As the following experiments show, this series is a net perf gain, particularly for memory-heavy workloads. Experiments are run on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz. 1. System boot + shudown, debian aarch64: - Before (v3.1.0): Performance counter stats for './die.sh v3.1.0' (10 runs): 9019.797015 task-clock (msec) # 0.993 CPUs utilized ( +- 0.23% ) 29,910,312,379 cycles # 3.316 GHz ( +- 0.14% ) 54,699,252,014 instructions # 1.83 insn per cycle ( +- 0.08% ) 10,061,951,686 branches # 1115.541 M/sec ( +- 0.08% ) 172,966,530 branch-misses # 1.72% of all branches ( +- 0.07% ) 9.084039051 seconds time elapsed ( +- 0.23% ) - After: Performance counter stats for './die.sh tlb-dyn-v5' (10 runs): 8624.084842 task-clock (msec) # 0.993 CPUs utilized ( +- 0.23% ) 28,556,123,404 cycles # 3.311 GHz ( +- 0.13% ) 51,755,089,512 instructions # 1.81 insn per cycle ( +- 0.05% ) 9,526,513,946 branches # 1104.641 M/sec ( +- 0.05% ) 166,578,509 branch-misses # 1.75% of all branches ( +- 0.19% ) 8.680540350 seconds time elapsed ( +- 0.24% ) That is, a 4.4% perf increase. 2. System boot + shutdown, ubuntu 18.04 x86_64: - Before (v3.1.0): 56100.574751 task-clock (msec) # 1.016 CPUs utilized ( +- 4.81% ) 200,745,466,128 cycles # 3.578 GHz ( +- 5.24% ) 431,949,100,608 instructions # 2.15 insn per cycle ( +- 5.65% ) 77,502,383,330 branches # 1381.490 M/sec ( +- 6.18% ) 844,681,191 branch-misses # 1.09% of all branches ( +- 3.82% ) 55.221556378 seconds time elapsed ( +- 5.01% ) - After: 56603.419540 task-clock (msec) # 1.019 CPUs utilized ( +- 10.19% ) 202,217,930,479 cycles # 3.573 GHz ( +- 10.69% ) 439,336,291,626 instructions # 2.17 insn per cycle ( +- 14.14% ) 80,538,357,447 branches # 1422.853 M/sec ( +- 16.09% ) 776,321,622 branch-misses # 0.96% of all branches ( +- 3.77% ) 55.549661409 seconds time elapsed ( +- 10.44% ) No improvement (within noise range). Note that for this workload, increasing the time window too much can lead to perf degradation, since it flushes the TLB very frequently. 3. x86_64 SPEC06int: x86_64-softmmu speedup vs. v3.1.0 for SPEC06int (test set) Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz (Skylake) 5.5 +------------------------------------------------------------------------+ \| +-+ \| 5 \|-+.................+-+...............................tlb-dyn-v5.......+-\| \| * * \| 4.5 \|-+..................................................................+-\| \| * * \| 4 \|-+..................................................................+-\| \| * * \| 3.5 \|-+..................................................................+-\| \| * * \| 3 \|-+......+-+........................................................+-\| \| * * * \| 2.5 \|-+.................................................+-+...........+-\| \| * * * * * \| 2 \|-+..............................................................+-\| \| * * * * * * +-+ \| 1.5 \|-+....................................................+-+.+-+.+-\| \| * * +-+ * +-+ +-+ +-+ +-+ * * * * * \| 1 \|++++-++++++++++++++++-+++-+++-++++-++++-++++++++++++++\| \| * * * * * * * * * * * * * * * * * * * * * * * * * * \| 0.5 +------------------------------------------------------------------------+ 400.perlb401.bzip403.g429445.g456.hm462.libq464.h471.omn47483.xalancbgeomean png: https://imgur.com/YRF90f7 That is, a 1.51x average speedup over the baseline, with a max speedup of 5.17x. Here's a different look at the SPEC06int results, using KVM as the baseline: x86_64-softmmu slowdown vs. KVM for SPEC06int (test set) Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz (Skylake) 25 +---------------------------------------------------------------------------+ \| +-+ +-+ \| \| * * +-+ v3.1.0 \| \| * * +-+ tlb-dyn-v5 \| \| * * * * +-+ \| 20 \|-+................................................+-+...............+-\| \| * * # # * * \| \| +-+ * * * # # * * \| \| * * * * * # # * * \| 15 \|-+..............................................#.#.......+-+......+-\| \| * * * * * # # * #\|# \| \| * * * * +-+ * # # * +-+ \| \| * * +-+ * * ++-+ +-+ * # # * # # +-+ \| \| * * +-+ * * * ## \| +-+ # # * # # +-+ \| 10 \|-+..........+-+...........##.......++-+..+-+.#.#.......#.#....+-\| \| * * +-+ * * * ## +-+ # # # #* # # +-+ * # # * * \| \| * * * # # * * +-+ * ## * +-+ # # # #* # # * * * # # +-+ \| \| * * # # * * * +-+ * ## * # # # # # #* # # * * * # # * ## \| 5 \|-+.......+-+.#.#.....#.#..##..#.#.#.#..#.#.#.#.....#.#..##.+-\| \| * # #* # # * +-+* # # * ## * # # # # # #* # # * * * # # * ## \| \| * # #* # # * # #* # # * ## * # # # # # #* # # * +-+* # # * ## \| \| ++-+ * # #* # # * # #* # # * ## * # # # # # #* # # * # #* # # * ## \| \|+++#+#++#+#+#+#++#+#+#+#++##++#+#+#+#++#+#+#+#++#+#+#+#+*+##+++\| 0 +---------------------------------------------------------------------------+ 400.perlbe401.bzi403.gc429445.go456.h462.libqu464.h471.omne4483.xalancbmgeomean png: https://imgur.com/YzAMNEV After this series, we bring down the average SPEC06int slowdown vs KVM from 11.47x to 7.58x. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-4-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Emilio G. Cota	86e1eff8bc	tcg: introduce dynamic TLB sizing Disabled in all TCG backends for now. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-3-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	93f332a503	tcg/aarch64: Implement vector minmax arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	d32648d445	tcg/aarch64: Implement vector saturating arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	bc37faf4cb	tcg/i386: Implement vector minmax arithmetic The avx instruction set does not directly provide MO_64. We can still implement 64-bit with comparison and vpblendvb. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	8ffafbcec2	tcg/i386: Implement vector saturating arithmetic Only MO_8 and MO_16 are implemented, since that's all the instruction set provides. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	44f1441dbe	tcg/i386: Split subroutines out of tcg_expand_vec_op This routine was becoming too large. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	dd0a0fcdd8	tcg: Add opcodes for vector minmax arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	8afaf05066	tcg: Add opcodes for vector saturated arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	5d6acdd4a4	tcg: Add write_aofs to GVecGen4 This allows writing 2 output, 3 input operations. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	f550805d83	tcg: Add gvec expanders for nand, nor, eqv Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	9a9eda78e4	tcg: Add logical simplifications during gvec expand We handle many of these during integer expansion, and the rest of them during integer optimization. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00

1 2 3 4 5 ...

1971 Commits