mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Richard Henderson	324b9d462e	accel/tcg: Inline cpu_gen_init It consists of one function call and has only one caller. Reviewed-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-06-11 09:26:28 -07:00
Richard Henderson	b0a0794a0f	tcg: Re-order tcg_region_init vs tcg_prologue_init Instead of delaying tcg_region_init until after tcg_prologue_init is complete, do tcg_region_init first and let tcg_prologue_init shrink the first region by the size of the generated prologue. Reviewed-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-06-11 09:26:28 -07:00
Philippe Mathieu-Daudé	e5ceadff47	accel/tcg: Keep TranslationBlock headers local to TCG Only the TCG accelerator uses the TranslationBlock API. Move the tb-context.h / tb-hash.h / tb-lookup.h from the global namespace to the TCG one (in accel/tcg). Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210524170453.3791436-3-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-05-26 15:33:59 -07:00
Philippe Mathieu-Daudé	824f4bac9f	accel/tcg: Reduce 'exec/tb-context.h' inclusion Only 2 headers require "exec/tb-context.h". Instead of having all files including "exec/exec-all.h" also including it, directly include it where it is required: - accel/tcg/cpu-exec.c - accel/tcg/translate-all.c For plugins/plugin.h, we were implicitly relying on exec/exec-all.h -> exec/tb-context.h -> qemu/qht.h which is now included directly. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210524170453.3791436-2-f4bug@amsat.org> [rth: Fix plugins/plugin.h compilation] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-05-26 15:31:45 -07:00
Ilya Leoshkevich	0b00b0c1e0	accel/tcg: Assert that tb->size != 0 after translation If arch-specific code generates a translation block of size 0, tb_gen_code() may generate a spurious exception. Add an assertion in order to catch such situations early. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210416154939.32404-5-iii@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-05-20 14:19:30 +02:00
Philippe Mathieu-Daudé	6c6a4a76ee	accel/tcg: Align data dumped at end of TB To better visualize the data dumped at the end of a TB, left-align it (padding it with 0). Print ".long" instead of ".quad" on 32-bit hosts. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210515104202.241504-1-f4bug@amsat.org> [rth: Split the qemu_log and print .long for 32-bit hosts.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-05-16 09:05:14 -05:00
Thomas Huth	2068cabd3f	Do not include cpu.h if it's not really necessary Stop including cpu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-4-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:51 +02:00
Alex Bennée	277aed998a	accel/tcg: avoid re-translating one-shot instructions By definition a single instruction is capable of being an IO instruction. This avoids a problem of triggering a cpu_io_recompile on a non-recorded translation which then fails because it expects tcg_tb_lookup() to succeed unconditionally. The normal use case requires a TB to be able to resolve machine state. The other users of tcg_tb_lookup() are able to tolerate a missing TB if the machine state has been resolved by other means - which in the single-shot case is always true because machine state is synced at the start of a block. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20210415162454.22056-1-alex.bennee@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-04-17 18:51:14 +01:00
Richard Henderson	eb42297a59	accel/tcg: Preserve PAGE_ANON when changing page permissions Using mprotect() to change PROT_* does not change the MAP_ANON previously set with mmap(). Our linux-user version of MTE only works with MAP_ANON pages, so losing PAGE_ANON caused MTE to stop working. Reported-by: Stephen Long <steplong@quicinc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-04-12 11:06:24 +01:00
Richard Henderson	6cc9d67c6f	accel/tcg: Precompute curr_cflags into cpu->tcg_cflags The primary motivation is to remove a dozen insns along the fast-path in tb_lookup. As a byproduct, this allows us to completely remove parallel_cpus. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-03-06 11:53:57 -08:00
Alex Bennée	bf253ac606	accel/tcg: drop the use of CF_HASH_MASK and rename params We don't really deal in cf_mask most of the time. The one time it's relevant is when we want to remove an invalidated TB from the QHT lookup. Everywhere else we should be looking up things without CF_INVALID set. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210224165811.11567-4-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-03-06 11:52:06 -08:00
Alex Bennée	c0ae396a81	accel/tcg: move CF_CLUSTER calculation to curr_cflags There is nothing special about this compile flag that doesn't mean we can't just compute it with curr_cflags() which we should be using when building a new set. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210224165811.11567-3-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-03-06 11:50:50 -08:00
Alex Bennée	cfd405eae6	accel/tcg: allow plugin instrumentation to be disable via cflags When icount is enabled and we recompile an MMIO access we end up double counting the instruction execution. To avoid this we introduce the CF_MEMI cflag which only allows memory instrumentation for the next TB (which won't yet have been counted). As this is part of the hashed compile flags we will only execute the generated TB while coming out of a cpu_io_recompile. While we are at it delete the old TODO. We might as well keep the translation handy as it's likely you will repeatedly hit it on each MMIO access. Reported-by: Aaron Lindsay <aaron@os.amperecomputing.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Aaron Lindsay <aaron@os.amperecomputing.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-21-alex.bennee@linaro.org>	2021-02-18 08:19:23 +00:00
Alex Bennée	c4afb3456c	accel/tcg: remove CF_NOCACHE and special cases Now we no longer generate CF_NOCACHE blocks we can remove a bunch of the special case handling for them. While we are at it we can remove the unused tb->orig_tb field and save a few bytes on the TB structure. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-20-alex.bennee@linaro.org>	2021-02-18 08:19:23 +00:00
Alex Bennée	873d64ac30	accel/tcg: re-factor non-RAM execution code There is no real need to use CF_NOCACHE here. As long as the TB isn't linked to other TBs or included in the QHT or jump cache then it will only get executed once. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-19-alex.bennee@linaro.org>	2021-02-18 08:19:23 +00:00
Richard Henderson	eb56afdb15	target/sh4: Create superh_io_recompile_replay_branch Move the code from accel/tcg/translate-all.c to target/sh4/cpu.c. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210208233906.479571-5-richard.henderson@linaro.org> Message-Id: <20210213130325.14781-14-alex.bennee@linaro.org>	2021-02-18 08:19:20 +00:00
Richard Henderson	95ab7c2291	target/mips: Create mips_io_recompile_replay_branch Move the code from accel/tcg/translate-all.c to target/mips/cpu.c. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210208233906.479571-4-richard.henderson@linaro.org> Message-Id: <20210213130325.14781-13-alex.bennee@linaro.org>	2021-02-18 08:19:15 +00:00
Richard Henderson	d9bcb58a12	accel/tcg: Create io_recompile_replay_branch hook Create a hook in which to split out the mips and sh4 ifdefs from cpu_io_recompile. [AJB: s/stoped/stopped/] Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210208233906.479571-3-richard.henderson@linaro.org> Message-Id: <20210213130325.14781-12-alex.bennee@linaro.org>	2021-02-18 08:19:12 +00:00
Richard Henderson	3e8f1628e8	exec: Use cpu_untagged_addr in g2h; split out g2h_untagged Use g2h_untagged in contexts that have no cpu, e.g. the binary loaders that operate before the primary cpu is created. As a colollary, target_mmap and friends must use untagged addresses, since they are used by the loaders. Use g2h_untagged on values returned from target_mmap, as the kernel never applies a tag itself. Use g2h_untagged on all pc values. The only current user of tags, aarch64, removes tags from code addresses upon branch, so "pc" is always untagged. Use g2h with the cpu context on hand wherever possible. Use g2h_untagged in lock_user, which will be updated soon. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20210212184902.1251044-13-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-02-16 11:04:53 +00:00
Richard Henderson	d9c5858570	tcg: Introduce target-specific page data for user-only This data can be allocated by page_alloc_target_data() and released by page_set_flags(start, end, prot \| PAGE_RESET). This data will be used to hold tag memory for AArch64 MTE. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20210212184902.1251044-2-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-02-16 11:04:52 +00:00
Richard Henderson	ae30e86661	tcg: Restart code generation when we run out of temps Some large translation blocks can generate so many unique constants that we run out of temps to hold them. In this case, longjmp back to the start of code generation and restart with a smaller translation block. Buglink: https://bugs.launchpad.net/bugs/1912065 Tested-by: BALATON Zoltan <balaton@eik.bme.hu> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-24 08:03:27 -10:00
Roman Bolshakov	653b87eb36	tcg: Toggle page execution for Apple Silicon Pages can't be both write and executable at the same time on Apple Silicon. macOS provides public API to switch write protection [1] for JIT applications, like TCG. 1. https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon Tested-by: Alexander Graf <agraf@csgraf.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20210113032806.18220-1-r.bolshakov@yadro.com> [rth: Inline the qemu_thread_jit_* functions; drop the MAP_JIT change for a follow-on patch.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-23 12:13:00 -10:00
Philippe Mathieu-Daudé	c03f041f12	accel/tcg: Restrict tb_gen_code() from other accelerators tb_gen_code() is only called within TCG accelerator, declare it locally. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210117164813.4101761-4-f4bug@amsat.org> [rth: Adjust vs changed tb_flush_jmp_cache patch.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-23 12:12:59 -10:00
Richard Henderson	0f4abea8ef	accel/tcg: Move tb_flush_jmp_cache() to cputlb.c Move and make the function static, as the only users are here in cputlb.c. Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-23 12:12:59 -10:00
Philippe Mathieu-Daudé	d9d699dd7c	accel/tcg: Make cpu_gen_init() static cpu_gen_init() is TCG specific, only used in tcg/translate-all.c. No need to export it to other accelerators, declare it statically. Reviewed-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210117164813.4101761-2-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-23 12:12:59 -10:00
Richard Henderson	de2fac62d2	tcg: Remove TCG_TARGET_SUPPORT_MIRROR Now that all native tcg hosts support splitwx, remove the define. Replace the one use with a test for CONFIG_TCG_INTERPRETER. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	d1861aa409	accel/tcg: Add mips support to alloc_code_gen_buffer_splitwx_memfd Re-use the 256MiB region handling from alloc_code_gen_buffer_anon, and replace that with the shared file mapping. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	6f70ddee19	accel/tcg: Support split-wx for darwin/iOS with vm_remap Cribbed from code posted by Joelle van Dyne <j@getutm.app>, and rearranged to a cleaner structure. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	a8c35b2cad	accel/tcg: Support split-wx for linux with memfd We cannot use a real temp file, because we would need to find a filesystem that does not have noexec enabled. However, a memfd is not associated with any filesystem. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	a35b3e1415	tcg: Add --accel tcg,split-wx property Plumb the value through to alloc_code_gen_buffer. This is not supported by any os or tcg backend, so for now enabling it will result in an error. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	6bc144237a	tcg: Use Error with alloc_code_gen_buffer Report better error messages than just "could not allocate". Let alloc_code_gen_buffer set ctx->code_gen_buffer_size and ctx->code_gen_buffer, and simply return bool. Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	db0c51a380	tcg: Introduce tcg_splitwx_to_{rx,rw} Add two helper functions, using a global variable to hold the displacement. The displacement is currently always 0, so no change in behaviour. Begin using the functions in tcg common code only. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	4846cd37df	tcg: Add in_code_gen_buffer Create a function to determine if a pointer is within the buffer. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Paolo Bonzini	3b9bd3f46b	remove TCG includes from common code Enable removing tcg/$tcg_arch from the include path when TCG is disabled. Move translate-all.h to include/exec, since stubs exist for the functions defined therein. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-02 21:03:36 +01:00
Philippe Mathieu-Daudé	e76f68d3cc	accel/tcg: Remove deprecated '-tb-size' option The '-tb-size' option (replaced by '-accel tcg,tb-size') is deprecated since 5.0 (commit `fe17413247`). Remove it. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201202112714.1223783-1-philmd@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20201210155808.233895-2-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-15 12:52:01 -05:00
Peter Maydell	1d705e8a5b	accel/tcg: Add CPU_LOG_EXEC tracing for cpu_io_recompile() When using -icount, it's useful for the CPU_LOG_EXEC logging to include information about when cpu_io_recompile() was called, because it alerts the reader of the log that the tracing of a previous TB execution may not actually correspond to an actually executed instruction. For instance if you're using -icount and also -singlestep then a guest instruction that makes an IO access appears in two "Trace" lines, once in a TB that triggers the cpu_io_recompile() and then again in the TB that actually executes. (This is a similar reason to why the "Stopped execution of TB chain before..." logging in cpu_tb_exec() is helpful when trying to track execution flow in the logs.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20201013122658.4620-1-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-27 09:48:07 -07:00
Claudio Fontana	740b175973	cpu-timers, icount: new modules refactoring of cpus.c continues with cpu timer state extraction. cpu-timers: responsible for the softmmu cpu timers state, including cpu clocks and ticks. icount: counts the TCG instructions executed. As such it is specific to the TCG accelerator. Therefore, it is built only under CONFIG_TCG. One complication is due to qtest, which uses an icount field to warp time as part of qtest (qtest_clock_warp). In order to solve this problem, provide a separate counter for qtest. This requires fixing assumptions scattered in the code that qtest_enabled() implies icount_enabled(), checking each specific case. Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [remove redundant initialization with qemu_spice_init] Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [fix lingering calls to icount_get] Signed-off-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-05 16:41:22 +02:00
Richard Henderson	4c389f6edf	disas: Move host asm annotations to tb_gen_code Instead of creating GStrings and passing them into log_disas, just print the annotations directly in tb_gen_code. Fix the annotations for the slow paths of the TB, after the part implementing the final guest instruction. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-03 04:25:14 -05:00
Stefan Hajnoczi	d73415a315	qemu/atomic.h: rename atomic_ to qatomic_ clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>	2020-09-23 16:07:44 +01:00
Alex Bennée	c83d628b7f	accel/tcg: better handle memory constrained systems It turns out there are some 64 bit systems that have relatively low amounts of physical memory available to them (typically CI system). Even with swapping available a 1GB translation buffer that fills up can put the machine under increased memory pressure. Detect these low memory situations and reduce tb_size appropriately. Fixes: `600e17b261` ("accel/tcg: increase default code gen buffer size for 64 bit") Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Robert Foley <robert.foley@linaro.org> Cc: BALATON Zoltan <balaton@eik.bme.hu> Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com> Message-Id: <20200724064509.331-7-alex.bennee@linaro.org>	2020-07-27 09:40:16 +01:00
Eric Blake	f9919116b8	osdep: Make MIN/MAX evaluate arguments only once I'm not aware of any immediate bugs in qemu where a second runtime evaluation of the arguments to MIN() or MAX() causes a problem, but proactively preventing such abuse is easier than falling prey to an unintended case down the road. At any rate, here's the conversation that sparked the current patch: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg05718.html Update the MIN/MAX macros to only evaluate their argument once at runtime; this uses typeof(1 ? (a) : (b)) to ensure that we are promoting the temporaries to the same type as the final comparison (we have to trigger type promotion, as typeof(bitfield) won't compile; and we can't use typeof((a) + (b)) or even typeof((a) + 0), as some of our uses of MAX are on void* pointers where such addition is undefined). However, we are unable to work around gcc refusing to compile ({}) in a constant context (such as the array length of a static variable), even when only used in the dead branch of a __builtin_choose_expr(), so we have to provide a second macro pair MIN_CONST and MAX_CONST for use when both arguments are known to be compile-time constants and where the result must also be usable as a constant; this second form evaluates arguments multiple times but that doesn't matter for constants. By using a void expression as the expansion if a non-constant is presented to this second form, we can enlist the compiler to ensure the double evaluation is not attempted on non-constants. Alas, as both macros now rely on compiler intrinsics, they are no longer usable in preprocessor #if conditions; those will just have to be open-coded or the logic rewritten into #define or runtime 'if' conditions (but where the compiler dead-code-elimination will probably still apply). I tested that both gcc 10.1.1 and clang 10.0.0 produce errors for all forms of macro mis-use. As the errors can sometimes be cryptic, I'm demonstrating the gcc output: Use of MIN when MIN_CONST is needed: In file included from /home/eblake/qemu/qemu-img.c:25: /home/eblake/qemu/include/qemu/osdep.h:249:5: error: braced-group within expression allowed only inside a function 249 \| ({ \ \| ^ /home/eblake/qemu/qemu-img.c:92:12: note: in expansion of macro ‘MIN’ 92 \| char array[MIN(1, 2)] = ""; \| ^~~ Use of MIN_CONST when MIN is needed: /home/eblake/qemu/qemu-img.c: In function ‘is_allocated_sectors’: /home/eblake/qemu/qemu-img.c:1225:15: error: void value not ignored as it ought to be 1225 \| i = MIN_CONST(i, n); \| ^ Use of MIN in the preprocessor: In file included from /home/eblake/qemu/accel/tcg/translate-all.c:20: /home/eblake/qemu/accel/tcg/translate-all.c: In function ‘page_check_range’: /home/eblake/qemu/include/qemu/osdep.h:249:6: error: token "{" is not valid in preprocessor expressions 249 \| ({ \ \| ^ Fix the resulting callsites that used #if or computed a compile-time constant min or max to use the new macros. cpu-defs.h is interesting, as CPU_TLB_DYN_MAX_BITS is sometimes used as a constant and sometimes dynamic. It may be worth improving glib's MIN/MAX definitions to be saner, but that is a task for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200625162602.700741-1-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-26 09:39:39 -04:00
Emilio G. Cota	3f640eb881	translate-all: call qemu_spin_destroy for PageDesc The radix tree is append-only, but we can fail to insert a PageDesc if the insertion races with another thread. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200609200738.445-8-robert.foley@linaro.org> Message-Id: <20200612190237.30436-11-alex.bennee@linaro.org>	2020-06-16 14:49:05 +01:00
Emilio G. Cota	938e897a66	tcg: call qemu_spin_destroy for tb->jmp_lock Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Robert Foley <robert.foley@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> [RF: minor changes + remove tb_destroy_func] Message-Id: <20200609200738.445-7-robert.foley@linaro.org> Message-Id: <20200612190237.30436-10-alex.bennee@linaro.org>	2020-06-16 14:49:05 +01:00
Alex Bennée	5f0df0333b	translate-all: include guest address in out_asm output We already have information about where each guest instructions representation starts stored in the tcg_ctx->gen_insn_data so we can rectify the PC for faults. We can re-use this information to annotate the out_asm output with guest instruction address which makes it a bit easier to work out where you are especially with longer blocks. A minor wrinkle is that some instructions get optimised away so we have to scan forward until we find some actual generated code. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200513175134.19619-11-alex.bennee@linaro.org>	2020-05-15 15:25:16 +01:00
Alex Bennée	e5ef4ec28b	disas: include an optional note for the start of disassembly This will become useful shortly for providing more information about output assembly inline. While there fix up the indenting and code formatting in disas(). Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200513175134.19619-9-alex.bennee@linaro.org>	2020-05-15 15:25:16 +01:00
Richard Henderson	7d8cbbabcb	accel/tcg: Relax va restrictions on 64-bit guests We cannot at present limit a 64-bit guest to a virtual address space smaller than the host. It will mostly work to ignore this limitation, except if the guest uses high bits of the address space for tags. But it will certainly work better, as presently we can wind up failing to allocate the guest stack. Widen our user-only page tree to the host or abi pointer width. Remove the workaround for this problem from target/alpha. Always validate guest addresses vs reserved_va, as there we control allocation ourselves. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200513175134.19619-7-alex.bennee@linaro.org>	2020-05-15 15:25:16 +01:00
Richard Henderson	64547a3bb6	tcg: Remove softmmu code_gen_buffer fixed address The commentary talks about "in concert with the addresses assigned in the relevant linker script", except there is no linker script for softmmu, nor has there been for some time. (Do not confuse the user-only linker script editing that was removed in the previous patch, because user-only does not use this code_gen_buffer allocation method.) Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-03-28 14:09:44 -07:00
Alex Bennée	600e17b261	accel/tcg: increase default code gen buffer size for 64 bit While 32mb is certainly usable a full system boot ends up flushing the codegen buffer nearly 100 times. Increase the default on 64 bit hosts to take advantage of all that spare memory. After this change I can boot my tests system without any TB flushes. As we usually run more CONFIG_USER binaries at a time in typical usage we aren't quite as profligate for user-mode code generation usage. We also bring the static code gen defies to the same place to keep all the reasoning in the comments together. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Message-Id: <20200228192415.19867-5-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-02-28 17:43:31 -08:00
Alex Bennée	21f2f447ad	accel/tcg: only USE_STATIC_CODE_GEN_BUFFER on 32 bit hosts There is no particular reason to use a static codegen buffer on 64 bit hosts as we have address space to burn. Allow the common CONFIG_USER case to use the mmap'ed buffers like SoftMMU. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Message-Id: <20200228192415.19867-4-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-02-28 17:43:25 -08:00
Alex Bennée	47a2def453	accel/tcg: remove link between guest ram and TCG cache size Basing the TB cache size on the ram_size was always a little heuristic and was broken by `a1b18df9a4` which caused ram_size not to be fully realised at the time we initialise the TCG translation cache. The current DEFAULT_CODE_GEN_BUFFER_SIZE may still be a little small but follow-up patches will address that. Fixes: `a1b18df9a4` Cc: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Message-Id: <20200228192415.19867-3-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-02-28 17:42:48 -08:00
Alex Bennée	a2fa63a8f5	accel/tcg: use units.h for defining code gen buffer sizes It's easier to read. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200228192415.19867-2-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-02-28 17:42:39 -08:00
Philippe Mathieu-Daudé	dcb32f1d8f	tcg: Search includes from the project root source directory We currently search both the root and the tcg/ directories for tcg files: $ git grep '#include "tcg/' \| wc -l 28 $ git grep '#include "tcg[^/]' \| wc -l 94 To simplify the preprocessor search path, unify by expliciting the tcg/ directory. Patch created mechanically by running: $ for x in \ tcg.h tcg-mo.h tcg-op.h tcg-opc.h \ tcg-op-gvec.h tcg-gvec-desc.h; do \ sed -i "s,#include \"$x\",#include \"tcg/$x\"," \ $(git grep -l "#include \"$x\""); \ done Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-2-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Robert Foley	fc59d2d870	qemu_log_lock/unlock now preserves the qemu_logfile handle. qemu_log_lock() now returns a handle and qemu_log_unlock() receives a handle to unlock. This allows for changing the handle during logging and ensures the lock() and unlock() are for the same file. Also in target/tilegx/translate.c removed the qemu_log_lock()/unlock() calls (and the log("\n")), since the translator can longjmp out of the loop if it attempts to translate an instruction in an inaccessible page. Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191118211528.3221-5-robert.foley@linaro.org>	2019-12-18 20:18:02 +00:00
Peter Maydell	68d8ef4ec5	TCG Plugins initial implementation - use --enable-plugins @ configure - low impact introspection (-plugin empty.so to measure overhead) - plugins cannot alter guest state - example plugins included in source tree (tests/plugins) - -d plugin to enable plugin output in logs - check-tcg runs extra tests when plugins enabled - documentation in docs/devel/plugins.rst -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAl23BZMACgkQ+9DbCVqe KkRPegf/QHygZ4ER2jOaWEookxiOEcik+dzQKVGNqLNXeMLvo5fGjGVpFoFxSgfv ZvCAL4xbW44zsYlVfh59tfn4Tu9qK7s7/qM3WXpHsmuvEuhoWef0Lt2jSe+D46Rs KeG/aX+rHLUR8rr9eCgE+1/MQmxPUj3VUonkUpNkk2ebBbSNoLSOudB4DD9Vcyl7 Pya1kPvA6W9bwI20ZSWihE7flg13o62Pp+LgAFLrsfxXOxOMkPrU8Pp+B0Dvr+hL 5Oh0clZLhiRi75x+KVGZ90TVsoftdjYoOWGMOudS/+NNmqKT1NTLm0K1WJYyRMQ1 V0ne4/OcGNq7x8gcOx/xs09ADu5/VA== =UXR/ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stsquad/tags/pull-tcg-plugins-281019-4' into staging TCG Plugins initial implementation - use --enable-plugins @ configure - low impact introspection (-plugin empty.so to measure overhead) - plugins cannot alter guest state - example plugins included in source tree (tests/plugins) - -d plugin to enable plugin output in logs - check-tcg runs extra tests when plugins enabled - documentation in docs/devel/plugins.rst # gpg: Signature made Mon 28 Oct 2019 15:13:23 GMT # gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full] # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * remotes/stsquad/tags/pull-tcg-plugins-281019-4: (57 commits) travis.yml: enable linux-gcc-debug-tcg cache MAINTAINERS: add me for the TCG plugins code scripts/checkpatch.pl: don't complain about (foo, /* empty */) .travis.yml: add --enable-plugins tests include/exec: wrap cpu_ldst.h in CONFIG_TCG accel/stubs: reduce headers from tcg-stub tests/plugin: add hotpages to analyse memory access patterns tests/plugin: add instruction execution breakdown tests/plugin: add a hotblocks plugin tests/tcg: enable plugin testing tests/tcg: drop test-i386-fprem from TESTS when not SLOW tests/tcg: move "virtual" tests to EXTRA_TESTS tests/tcg: set QEMU_OPTS for all cris runs tests/tcg/Makefile.target: fix path to config-host.mak tests/plugin: add sample plugins linux-user: support -plugin option vl: support -plugin option plugin: add qemu_plugin_outs helper plugin: add qemu_plugin_insn_disas helper plugin: expand the plugin_init function to include an info block ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-10-30 14:10:32 +00:00
Emilio G. Cota	5025bb7fde	translate-all: notify plugin code of tb_flush Plugins might allocate per-TB data that then they get passed each time a TB is executed (via the *userdata pointer). Notify plugin code every time a code cache flush occurs, so that plugins can then reclaim the memory of the per-TB data. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2019-10-28 15:12:38 +00:00
Emilio G. Cota	136094d0b2	translate-all: use cpu_in_exclusive_work_context() in tb_flush tb_flush will be called by the plugin module from a safe work environment. Prepare for that. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Richard Henderson	fe9b676fb3	translate-all: Remove tb_alloc Since `2ac01d6daf`, this function does only two things: assert a lock is held, and call tcg_tb_alloc. It is used exactly once, and its user has already done the assert. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Clement Deschamps <clement.deschamps@greensocs.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 10:35:23 +01:00
Clement Deschamps	1b1940026c	translate-all: fix uninitialized tb->orig_tb This fixes a segmentation fault in icount mode when executing from an IO region. TB is marked as CF_NOCACHE but tb->orig_tb is not initialized (equals previous value in code_gen_buffer). The issue happens in cpu_io_recompile() when it tries to invalidate orig_tb. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Clement Deschamps <clement.deschamps@greensocs.com> Message-Id: <20191022140016.918371-1-clement.deschamps@greensocs.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 10:35:23 +01:00
Richard Henderson	ae57db63ac	cputlb: Pass retaddr to tb_check_watchpoint Fixes the previous TLB_WATCHPOINT patches because we are currently failing to set cpu->mem_io_pc with the call to cpu_check_watchpoint. Pass down the retaddr directly because it's readily available. Fixes: `50b107c5d6` Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-09-25 10:56:28 -07:00
Richard Henderson	5a7c27bb8a	cputlb: Pass retaddr to tb_invalidate_phys_page_fast Rather than rely on cpu->mem_io_pc, pass retaddr down directly. Within tb_invalidate_phys_page_range__locked, the is_cpu_write_access parameter is non-zero exactly when retaddr would be non-zero, so that is a simple replacement. Recognize that current_tb_not_found is true only when mem_io_pc (and now retaddr) are also non-zero, so remove a redundant test. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-09-25 10:56:28 -07:00
Richard Henderson	ce9f5e2792	cputlb: Remove tb_invalidate_phys_page_range is_cpu_write_access All callers pass false to this argument. Remove it and pass the constant on to tb_invalidate_phys_page_range__locked. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-09-25 10:56:28 -07:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Markus Armbruster	14a48c1d0d	qemu-common: Move tcg_enabled() etc. to sysemu/tcg.h Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h, sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() & friends sit in qemu-common.h. This necessitates inclusion of qemu-common.h into headers, which is against the rules spelled out in qemu-common.h's file comment. Move tcg_enabled() & friends into their own header sysemu/tcg.h, and adjust #include directives. Cc: Richard Henderson <rth@twiddle.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-2-armbru@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [Rebased with conflicts resolved automatically, except for accel/tcg/tcg-all.c]	2019-06-11 20:22:09 +02:00
Richard Henderson	5e1401969b	cpu: Move icount_decr to CPUNegativeOffsetState Amusingly, we had already ignored the comment to keep this value at the end of CPUState. This restores the minimum negative offset from TCG_AREG0 for code generation. For the couple of uses within qom/cpu.c, without NEED_CPU_H, add a pointer from the CPUState object to the IcountDecr object within CPUNegativeOffsetState. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Richard Henderson	29a0af618d	cpu: Replace ENV_GET_CPU with env_cpu Now that we have both ArchCPU and CPUArchState, we can define this generically instead of via macro in each target's cpu.h. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:34 -07:00
Richard Henderson	6e6c4efed9	tcg: Restart after TB code generation overflow If a TB generates too much code, try again with fewer insns. Fixes: https://bugs.launchpad.net/bugs/1824853 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	8b86d6d258	tcg: Hoist max_insns computation to tb_gen_code In order to handle TB's that translate to too much code, we need to place the control of the length of the translation in the hands of the code gen master loop. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Markus Armbruster	3de2faa9a8	tcg: Simplify how dump_exec_info() prints dump_exec_info() takes an fprintf()-like callback and a FILE * to pass to it. Its only caller hmp_info_jit() passes monitor_fprintf() and the current monitor cast to FILE *. monitor_fprintf() casts it right back, and is otherwise identical to monitor_printf(). The type-punning is ugly. Drop the callback, and call qemu_printf() instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190417191805.28198-5-armbru@redhat.com>	2019-04-18 22:18:59 +02:00
Markus Armbruster	d4c51a0af3	tcg: Simplify how dump_opcount_info() prints dump_opcount_info() takes an fprintf()-like callback and a FILE * to pass to it. Its only caller hmp_info_opcount() passes monitor_fprintf() and the current monitor cast to FILE *. monitor_fprintf() casts it right back, and is otherwise identical to monitor_printf(). The type-punning is ugly. Drop the callback, and call qemu_printf() instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190417191805.28198-4-armbru@redhat.com>	2019-04-18 22:18:59 +02:00
Thomas Huth	fb0343d5b4	tcg: Fix LGPL version number It's either "GNU Library General Public version 2" or "GNU Lesser General Public version 2.1", but there was no "version 2.0" of the "Lesser" library. So assume that version 2.1 is meant here. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1548252536-6242-5-git-send-email-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-01-30 11:01:52 +01:00
Peter Maydell	f7b78602fd	accel/tcg: Add cluster number to TCG TB hash Include the cluster number in the hash we use to look up TBs. This is important because a TB that is valid for one cluster at a given physical address and set of CPU flags is not necessarily valid for another: the two clusters may have different views of physical memory, or may have different CPU features (eg FPU present or absent). We put the cluster number in the high 8 bits of the TB cflags. This gives us up to 256 clusters, which should be enough for anybody. If we ever need more, or need more bits in cflags for other purposes, we could make tb_hash_func() take more data (and expand qemu_xxhash7() to qemu_xxhash8()). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 20190121152218.9592-4-peter.maydell@linaro.org	2019-01-29 11:46:06 +00:00
Marc-André Lureau	444e20a36f	build-sys: don't include windows.h, osdep.h does it osdep.h will also define the available Windows API version for QEMU. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181122110039.15972-2-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-01-11 13:57:24 +01:00
Richard Henderson	e09de0a20d	cputlb: Count "partial" and "elided" tlb flushes Our only statistic so far was "full" tlb flushes, where all mmu_idx are flushed at the same time. Now count "partial" tlb flushes where sets of mmu_idx are flushed, but the set is not maximal. Account one per mmu_idx flushed, as that is the unit of work performed. We don't actually count elided flushes yet, but go ahead and change the interface presented to the monitor all at once. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:16:30 +00:00
Emilio G. Cota	fff42f183e	tcg: access cpu->icount_decr.u16.high with atomics Consistently access u16.high with atomics to avoid undefined behaviour in MTTCG. Note that icount_decr.u16.low is only used in icount mode, so regular accesses to it are OK. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181010144853.13005-2-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Thomas Huth	dcf6760a64	accel/tcg: Remove dead code The global cpu_single_env variable has been removed more than 5 years ago, so apparently nobody used this dead debug code in that timeframe anymore. Thus let's remove it completely now. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1537204134-15905-1-git-send-email-thuth@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-02 19:09:13 +02:00
Emilio G. Cota	78255ba2cc	qht: drop ht argument from qht iterators Accessing the HT from an iterator results almost always in a deadlock. Given that only one qht-internal function uses this argument, drop it from the interface. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-09-26 08:55:54 -07:00
Peter Maydell	9739e3767a	accel/tcg: tb_gen_code(): Create single-insn TB for execution from non-RAM If get_page_addr_code() returns -1, this indicates that there is no RAM page we can read a full TB from. Instead we must create a TB which contains a single instruction and which we do not cache, so it is executed only once. Since this means we can now have TBs which are not in any page list, we also need to make tb_phys_invalidate() handle them (by not trying to remove them from a nonexistent page list). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Tested-by: Cédric Le Goater <clg@kaod.org> Message-id: 20180710160013.26559-5-peter.maydell@linaro.org	2018-08-14 17:17:19 +01:00
Peter Maydell	c360a0fd71	accel/tcg: Handle get_page_addr_code() returning -1 in tb_check_watchpoint() When we support execution from non-RAM MMIO regions, get_page_addr_code() will return -1 to indicate that there is no RAM at the requested address. Handle this in tb_check_watchpoint() -- if the exception happened for a PC which doesn't correspond to RAM then there is no need to invalidate any TBs, because the one-instruction TB will not have been cached. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Cédric Le Goater <clg@kaod.org> Message-id: 20180710160013.26559-4-peter.maydell@linaro.org	2018-08-14 17:17:19 +01:00
Stefan Weil	696c706642	accel: Fix typo and grammar in comment The typo was found by codespell. Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20180712194454.26765-1-sw@weilnetz.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-07-16 16:58:16 +02:00
Emilio G. Cota	ec7eb2ae77	translate-all: honour CF_NOCACHE in tb_gen_code This fixes a record-replay regression introduced by `95590e2` ("translate-all: discard TB when tb_link_page returns an existing matching TB", 2018-06-15). The problem is that code using CF_NOCACHE assumes that the TB returned from tb_gen_code is always a newly-generated one. This assumption, however, was broken in the aforementioned commit. Fix it by honouring CF_NOCACHE, so that tb_gen_code always returns a newly-generated TB when CF_NOCACHE is passed to it. Do this by avoiding the TB hash table if CF_NOCACHE is set. Reported-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Tested-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 1530806837-5416-1-git-send-email-cota@braap.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-07-09 16:14:36 +01:00
Emilio G. Cota	a688e73ba8	translate-all: fix locking of TBs whose two pages share the same physical page Commit `0b5c91f` ("translate-all: use per-page locking in !user-mode", 2018-06-15) introduced per-page locking. It assumed that the physical pages corresponding to a TB (at most two pages) are always distinct, which is wrong. For instance, an xtensa test provided by Max Filippov is broken by the commit, since the test maps two virtual pages to the same physical page: virt1: 7fff, virt2: 8000 phys1 6000fff, phys2 6000000 Fix it by removing the assumption from page_lock_pair. If the two physical page addresses are equal, we only lock the PageDesc once. Note that the two callers of page_lock_pair, namely page_unlock_tb and tb_link_page, are also updated so that we do not try to unlock the same PageDesc twice. Fixes: `0b5c91f74f` Reported-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1529944302-14186-1-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-07-02 08:02:20 -07:00
Peter Maydell	109b25045b	* "info mtree" improvements (Alexey) * fake VPD block limits for SCSI passthrough (Daniel Barboza) * chardev and main loop fixes (Daniel Berrangé, Sergio, Stefan) * help fixes (Eduardo) * pc-dimm refactoring (David) * tests improvements and fixes (Emilio, Thomas) * SVM emulation fixes (Jan) * MemoryRegionCache fix (Eric) * WHPX improvements (Justin) * ESP cleanup (Mark) * -overcommit option (Michael) * qemu-pr-helper fixes (me) * "info pic" improvements for x86 (Peter) * x86 TCG emulation fixes (Richard) * KVM slot handling fix (Shannon) * Next round of deprecation (Thomas) * Windows dump format support (Viktor) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJbNhHpAAoJEL/70l94x66D4ZoH/22cAWj2SOEVbHt1id1noVpI VxyS+MWXYG3se1HseFpNVlI32f3XyyABGMtJqzNuusD5s5Him8yZcwPxsu1RmEy2 uOk+PIo67qbLhJyZ+f3Q+rWRbFV9W+DvrRBM7RCArWUDCDOBaEVoPrRTWC2y3oId EdcLDc2tP/DvOmXtbNcELCuS3w6G2Nly0WwRI4VLJ2aJT6jAZoSfOONjuRg1gamw 7iUwk6UlCHmIMawnlwe1iQHtleX9KNYv0bA9etDrYIpNoZP935pGybchvztcmgMv QymjNptqse65emcbZ9rp0tqNyJhvP2wOyjQCWlCsooyRSoPQDY2Qc7623cRthqU= =D07Z -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * "info mtree" improvements (Alexey) * fake VPD block limits for SCSI passthrough (Daniel Barboza) * chardev and main loop fixes (Daniel Berrangé, Sergio, Stefan) * help fixes (Eduardo) * pc-dimm refactoring (David) * tests improvements and fixes (Emilio, Thomas) * SVM emulation fixes (Jan) * MemoryRegionCache fix (Eric) * WHPX improvements (Justin) * ESP cleanup (Mark) * -overcommit option (Michael) * qemu-pr-helper fixes (me) * "info pic" improvements for x86 (Peter) * x86 TCG emulation fixes (Richard) * KVM slot handling fix (Shannon) * Next round of deprecation (Thomas) * Windows dump format support (Viktor) # gpg: Signature made Fri 29 Jun 2018 12:03:05 BST # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (60 commits) tests/boot-serial: Do not delete the output file in case of errors hw/scsi: add VPD Block Limits emulation hw/scsi: centralize SG_IO calls into single function hw/scsi: cleanups before VPD BL emulation dump: add Windows live system dump dump: add fallback KDBG using in Windows dump dump: use system context in Windows dump dump: add Windows dump format to dump-guest-memory i386/cpu: make -cpu host support monitor/mwait kvm: support -overcommit cpu-pm=on\|off hmp: obsolete "info ioapic" ioapic: support "info irq" ioapic: some proper indents when dump info ioapic: support "info pic" doc: another fix to "info pic" target-i386: Mark cpu_vmexit noreturn target-i386: Allow interrupt injection after STGI target-i386: Add NMI interception to SVM memory/hmp: Print owners/parents in "info mtree" WHPX: register for unrecognized MSR exits ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-06-29 12:30:29 +01:00
Paolo Bonzini	8bca9a03ec	move public invalidate APIs out of translate-all.{c,h}, clean up Place them in exec.c, exec-all.h and ram_addr.h. This removes knowledge of translate-all.h (which is an internal header) from several files outside accel/tcg and removes knowledge of AddressSpace from translate-all.c (as it only operates on ram_addr_t). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-06-28 19:05:30 +02:00
Stefan Hajnoczi	f18793b096	compiler: add a sizeof_field() macro Determining the size of a field is useful when you don't have a struct variable handy. Open-coding this is ugly. This patch adds the sizeof_field() macro, which is similar to typeof_field(). Existing instances are updated to use the macro. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180614164431.29305-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-27 13:01:40 +01:00
Emilio G. Cota	0ac20318ce	tcg: remove tb_lock Use mmap_lock in user-mode to protect TCG state and the page descriptors. In !user-mode, each vCPU has its own TCG state, so no locks needed. Per-page locks are used to protect the page descriptors. Per-TB locks are used in both modes to protect TB jumps. Some notes: - tb_lock is removed from notdirty_mem_write by passing a locked page_collection to tb_invalidate_phys_page_fast. - tcg_tb_lookup/remove/insert/etc have their own internal lock(s), so there is no need to further serialize access to them. - do_tb_flush is run in a safe async context, meaning no other vCPU threads are running. Therefore acquiring mmap_lock there is just to please tools such as thread sanitizer. - Not visible in the diff, but tb_invalidate_phys_page already has an assert_memory_lock. - cpu_io_recompile is !user-only, so no mmap_lock there. - Added mmap_unlock()'s before all siglongjmp's that could be called in user-mode while mmap_lock is held. + Added an assert for !have_mmap_lock() after returning from the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic. Performance numbers before/after: Host: AMD Opteron(tm) Processor 6376 ubuntu 17.04 ppc64 bootup+shutdown time 700 +-+--+----+------+------------+-----------+--------------+-+ \| + + + + + B \| \| before *B* ** * \| \|tb lock removal ###D### * \| 600 +-+ * +-+ \| ** # \| \| B #D \| \| *** * ## \| 500 +-+ *** ### +-+ \| * *** ### \| \| B # ## \| \| ** * #D# \| 400 +-+ ## +-+ \| ### \| \| ## \| \| # ## \| 300 +-+ * B* #D# +-+ \| B *** ### \| \| * ** #### \| \| * *** ### \| 200 +-+ B B #D# +-+ \| #B * ## # \| \| #* ## \| \| + D##D# + + + + \| 100 +-+--+----+------+------------+-----------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/HwmBHXe debian jessie aarch64 bootup+shutdown time 90 +-+--+-----+-----+------------+------------+------------+--+-+ \| + + + + + + \| \| before *B* B \| 80 +tb lock removal ###D### D +-+ \| ### \| \| ## \| 70 +-+ # +-+ \| ## \| \| # \| 60 +-+ B ## +-+ \| * ## \| \| * #D \| 50 +-+ * ## +-+ \| * ### \| \| B* ### \| 40 +-+ ** # ## +-+ \| #D# \| \| B* ### \| 30 +-+ B*B #### +-+ \| B * * # ### \| \| B ###D# \| 20 +-+ D ##D## +-+ \| D# \| \| + + + + + + \| 10 +-+--+-----+-----+------------+------------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/iGpGFtv The gains are high for 4-8 CPUs. Beyond that point, however, unrelated lock contention significantly hurts scalability. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Emilio G. Cota	705ad1ff0c	translate-all: remove tb_lock mention from cpu_restore_state_from_tb tb_lock was needed when the function did retranslation. However, since `fca8a500d5` ("tcg: Save insn data and use it in cpu_restore_state_from_tb") we don't do retranslation. Get rid of the comment. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Emilio G. Cota	194125e3eb	translate-all: protect TB jumps with a per-destination-TB lock This applies to both user-mode and !user-mode emulation. Instead of relying on a global lock, protect the list of incoming jumps with tb->jmp_lock. This lock also protects tb->cflags, so update all tb->cflags readers outside tb->jmp_lock to use atomic reads via tb_cflags(). In order to find the destination TB (and therefore its jmp_lock) from the origin TB, we introduce tb->jmp_dest[]. I considered not using a linked list of jumps, which simplifies code and makes the struct smaller. However, it unnecessarily increases memory usage, which results in a performance decrease. See for instance these numbers booting+shutting down debian-arm: Time (s) Rel. err (%) Abs. err (s) Rel. slowdown (%) ------------------------------------------------------------------------------ before 20.88 0.74 0.154512 0. after 20.81 0.38 0.079078 -0.33524904 GTree 21.02 0.28 0.058856 0.67049808 GHashTable + xxhash 21.63 1.08 0.233604 3.5919540 Using a hash table or a binary tree to keep track of the jumps doesn't really pay off, not only due to the increased memory usage, but also because most TBs have only 0 or 1 jumps to them. The maximum number of jumps when booting debian-arm that I measured is 35, but as we can see in the histogram below a TB with that many incoming jumps is extremely rare; the average TB has 0.80 incoming jumps. n_jumps: 379208; avg jumps/tb: 0.801099 dist: [0.0,1.0)\|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁ ▁▁▁ ▁\|[34.0,35.0] Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Emilio G. Cota	95590e24af	translate-all: discard TB when tb_link_page returns an existing matching TB Use the recently-gained QHT feature of returning the matching TB if it already exists. This allows us to get rid of the lookup we perform right after acquiring tb_lock. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:42 -10:00
Emilio G. Cota	faa9372c07	translate-all: introduce assert_no_pages_locked The appended adds assertions to make sure we do not longjmp with page locks held. Note that user-mode has nothing to check, since page_locks are !user-mode only. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	6d9abf85d5	translate-all: add page_locked assertions This is only compiled under CONFIG_DEBUG_TCG to avoid bloating the binary. In user-mode, assert_page_locked is equivalent to assert_mmap_lock. Note: There are some tb_lock assertions left that will be removed by later patches. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Suggested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	0b5c91f74f	translate-all: use per-page locking in !user-mode Groundwork for supporting parallel TCG generation. Instead of using a global lock (tb_lock) to protect changes to pages, use fine-grained, per-page locks in !user-mode. User-mode stays with mmap_lock. Sometimes changes need to happen atomically on more than one page (e.g. when a TB that spans across two pages is added/invalidated, or when a range of pages is invalidated). We therefore introduce struct page_collection, which helps us keep track of a set of pages that have been locked in the appropriate locking order (i.e. by ascending page index). This commit first introduces the structs and the function helpers, to then convert the calling code to use per-page locking. Note that tb_lock is not removed yet. While at it, rename tb_alloc_page to tb_page_add, which pairs with tb_page_remove. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	45c73de594	translate-all: move tb_invalidate_phys_page_range up in the file This greatly simplifies next commit's diff. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	ae5486e273	translate-all: work page-by-page in tb_invalidate_phys_range_1 So that we pass a same-page range to tb_invalidate_phys_page_range, instead of always passing an end address that could be on a different page. As discussed with Peter Maydell on the list [1], tb_invalidate_phys_page_range doesn't actually do much with 'end', which explains why we have never hit a bug despite going against what the comment on top of tb_invalidate_phys_page_range requires: > * Invalidate all TBs which intersect with the target physical address range > * [start;end[. NOTE: start and end must refer to the same physical page. The appended honours the comment, which avoids confusion. While at it, rework the loop into a for loop, which is less error prone (e.g. "continue" won't result in an infinite loop). [1] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg09165.html Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	94da9aec2a	translate-all: remove hole in PageDesc Groundwork for supporting parallel TCG generation. Move the hole to the end of the struct, so that a u32 field can be added there without bloating the struct. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	78722ed0b8	translate-all: make l1_map lockless Groundwork for supporting parallel TCG generation. We never remove entries from the radix tree, so we can use cmpxchg to implement lockless insertions. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	1e05197f24	translate-all: iterate over TBs in a page with PAGE_FOR_EACH_TB This commit does several things, but to avoid churn I merged them all into the same commit. To wit: - Use uintptr_t instead of TranslationBlock * for the list of TBs in a page. Just like we did in (`c37e6d7e` "tcg: Use uintptr_t type for jmp_list_{next\|first} fields of TB"), the rationale is the same: these are tagged pointers, not pointers. So use a more appropriate type. - Only check the least significant bit of the tagged pointers. Masking with 3/~3 is unnecessary and confusing. - Introduce the TB_FOR_EACH_TAGGED macro, and use it to define PAGE_FOR_EACH_TB, which improves readability. Note that TB_FOR_EACH_TAGGED will gain another user in a subsequent patch. - Update tb_page_remove to use PAGE_FOR_EACH_TB. In case there is a bug and we attempt to remove a TB that is not in the list, instead of segfaulting (since the list is NULL-terminated) we will reach g_assert_not_reached(). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	128ed2278c	tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctx Thereby making it per-TCGContext. Once we remove tb_lock, this will avoid an atomic increment every time a TB is invalidated. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	be2cdc5e35	tcg: track TBs with per-region BST's This paves the way for enabling scalable parallel generation of TCG code. Instead of tracking TBs with a single binary search tree (BST), use a BST for each TCG region, protecting it with a lock. This is as scalable as it gets, since each TCG thread operates on a separate region. The core of this change is the introduction of struct tcg_region_tree, which contains a pointer to a GTree and an associated lock to serialize accesses to it. We then allocate an array of tcg_region_tree's, adding the appropriate padding to avoid false sharing based on qemu_dcache_linesize. Given a tc_ptr, we first find the corresponding region_tree. This is done by special-casing the first and last regions first, since they might be of size != region.size; otherwise we just divide the offset by region.stride. I was worried about this division (several dozen cycles of latency), but profiling shows that this is not a fast path. Note that region.stride is not required to be a power of two; it is only required to be a multiple of the host's page size. Note that with this design we can also provide consistent snapshots about all region trees at once; for instance, tcg_tb_foreach acquires/releases all region_tree locks before/after iterating over them. For this reason we now drop tb_lock in dump_exec_info(). As an alternative I considered implementing a concurrent BST, but this can be tricky to get right, offers no consistent snapshots of the BST, and performance and scalability-wise I don't think it could ever beat having separate GTrees, given that our workload is insert-mostly (all concurrent BST designs I've seen focus, understandably, on making lookups fast, which comes at the expense of convoluted, non-wait-free insertions/removals). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	32359d529f	qht: return existing entry when qht_insert fails The meaning of "existing" is now changed to "matches in hash and ht->cmp result". This is saner than just checking the pointer value. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	61b8cef1d4	qht: require a default comparison function qht_lookup now uses the default cmp function. qht_lookup_custom is defined to retain the old behaviour, that is a cmp function is explicitly provided. qht_insert will gain use of the default cmp in the next patch. Note that we move qht_lookup_custom's @func to be the last argument, which makes the new qht_lookup as simple as possible. Instead of this (i.e. keeping @func 2nd): 0000000000010750 <qht_lookup>: 10750: 89 d1 mov %edx,%ecx 10752: 48 89 f2 mov %rsi,%rdx 10755: 48 8b 77 08 mov 0x8(%rdi),%rsi 10759: e9 22 ff ff ff jmpq 10680 <qht_lookup_custom> 1075e: 66 90 xchg %ax,%ax We get: 0000000000010740 <qht_lookup>: 10740: 48 8b 4f 08 mov 0x8(%rdi),%rcx 10744: e9 37 ff ff ff jmpq 10680 <qht_lookup_custom> 10749: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00

1 2 3 4

197 Commits