mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Richard Henderson	632cb63d92	accel/tcg: Move tb_lookup to cpu-exec.c Now that we've moved helper_lookup_tb_ptr, the only user of tb-lookup.h is cpu-exec.c; merge the contents in. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-07-09 20:23:19 -07:00
Richard Henderson	4288eb26a0	accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c This will allow additional code sharing. No functional change. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-07-09 20:23:06 -07:00
Philippe Mathieu-Daudé	e5ceadff47	accel/tcg: Keep TranslationBlock headers local to TCG Only the TCG accelerator uses the TranslationBlock API. Move the tb-context.h / tb-hash.h / tb-lookup.h from the global namespace to the TCG one (in accel/tcg). Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210524170453.3791436-3-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-05-26 15:33:59 -07:00
Philippe Mathieu-Daudé	824f4bac9f	accel/tcg: Reduce 'exec/tb-context.h' inclusion Only 2 headers require "exec/tb-context.h". Instead of having all files including "exec/exec-all.h" also including it, directly include it where it is required: - accel/tcg/cpu-exec.c - accel/tcg/translate-all.c For plugins/plugin.h, we were implicitly relying on exec/exec-all.h -> exec/tb-context.h -> qemu/qht.h which is now included directly. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210524170453.3791436-2-f4bug@amsat.org> [rth: Fix plugins/plugin.h compilation] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-05-26 15:31:45 -07:00
Thomas Huth	2068cabd3f	Do not include cpu.h if it's not really necessary Stop including cpu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-4-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:51 +02:00
Peter Maydell	821e7ed167	* Add some missing gitlab-CI job dependencies * Re-enable "make check SPEED=slow" * Improve the gitlab-pipeline-status script * Clean up inclusing of qtest.h headers * Improve libqos/qgraph documentation * Fix downloading problem in the acceptance tests * Remove deprecated target tilegx * Add new bsd-user maintainers -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmBHTZERHHRodXRoQHJl ZGhhdC5jb20ACgkQLtnXdP5wLbW1bA//Xkqv1MXiTllIp9/MaayzrCXE4QI1yosA onaEQWBuchw2oO/riyayskD516J5q0dyMlQAWx2tv9N1wFC+F8gUcP/q0zpckqhu 79DyRj9upDYrCgmGUi+0O9qelv3f7VHB6B1bBZlJzA+W7WskrXYk97dXkezvGDQa a+D95upQiOLu2cxvEWTx+Z4Gz1R4NMM/JaudnMkNy/WECLOrEQr/bEgk60dwomO2 Vdb2t1DLwmjFXXQgBvP5olVk/4vHGcDCMOD3gy8TTt7sNv3VR7re18rUdWnOQcB8 hm3IRGLYZ/JYTqKutJ4QYpOFA1hUyKOLysi3Wj/jhuzV/n028izpPbeCsuWGZ1Ck QmdOdP/g8XZzPWekEEG+pL8gZgVM/HdJAm+Ameiwq2F6ybDXS75EgBzCjFC3p1kF KA6UFUD9tw2ZGIjy5vzJToTn4wtku6n9B9sP3nHeVQYbQtSFQhfQwP02NVM66dua PLSlIPP09jtmGS/LO9j+aw72bNhMJzpEORQvnoAOsbH8cgTpu6auzvKDg2+cMqGb pXBihfvhRvfk3RV8dn2nk929FS6hxybjW3aU9iZAG+Dg0YIPwFOk/w/awgbAjhYe bwywmRZSE1mkqm6brE8J1y6SORlcvontv6PLy5NYLe6gGHJex4j8U0zWKW0qDjNr hot+/3Mstsw= =8JHZ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/thuth-gitlab/tags/pull-request-2021-03-09' into staging * Add some missing gitlab-CI job dependencies * Re-enable "make check SPEED=slow" * Improve the gitlab-pipeline-status script * Clean up inclusing of qtest.h headers * Improve libqos/qgraph documentation * Fix downloading problem in the acceptance tests * Remove deprecated target tilegx * Add new bsd-user maintainers # gpg: Signature made Tue 09 Mar 2021 10:27:29 GMT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * remotes/thuth-gitlab/tags/pull-request-2021-03-09: bsd-user: Add new maintainers Remove deprecated target tilegx Acceptance Tests: restore filtering of tests by target arch Acceptance Tests: restore downloading of VM images docs/devel/qgraph: improve qgraph documentation libqos/qgraph: format qgraph comments for sphinx documentation scripts/ci/gitlab-pipeline-status: give more info when pipeline not found scripts/ci/gitlab-pipeline-status: give more information on failures scripts/ci/gitlab-pipeline-status: split utlity function for HTTP GET meson: Re-enable the possibility to run "make check SPEED=slow" docker: OpenSBI build job depends on OpenSBI container docker: EDK2 build job depends on EDK2 container docker: Alpine build job depends on Alpine container qtest: delete superfluous inclusions of qtest.h Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-03-10 17:22:45 +00:00
Chen Qun	d6eb39b554	qtest: delete superfluous inclusions of qtest.h There are 23 files that include the "sysemu/qtest.h", but they do not use any qtest functions. Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210226081414.205946-1-kuhn.chenqun@huawei.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-03-09 06:03:53 +01:00
Richard Henderson	6cc9d67c6f	accel/tcg: Precompute curr_cflags into cpu->tcg_cflags The primary motivation is to remove a dozen insns along the fast-path in tb_lookup. As a byproduct, this allows us to completely remove parallel_cpus. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-03-06 11:53:57 -08:00
Alex Bennée	bf253ac606	accel/tcg: drop the use of CF_HASH_MASK and rename params We don't really deal in cf_mask most of the time. The one time it's relevant is when we want to remove an invalidated TB from the QHT lookup. Everywhere else we should be looking up things without CF_INVALID set. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210224165811.11567-4-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-03-06 11:52:06 -08:00
Alex Bennée	c0ae396a81	accel/tcg: move CF_CLUSTER calculation to curr_cflags There is nothing special about this compile flag that doesn't mean we can't just compute it with curr_cflags() which we should be using when building a new set. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210224165811.11567-3-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-03-06 11:50:50 -08:00
Alex Bennée	6f04cb1c8f	accel/tcg: rename tb_lookup__cpu_state and hoist state extraction Having a function return either and valid TB and some system state seems excessive. It will make the subsequent re-factoring easier if we lookup the current state where we are. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210224165811.11567-2-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-03-06 11:50:43 -08:00
Alex Bennée	a11bbb6a23	accel/tcg: cache single instruction TB on pending replay exception Again there is no reason to jump through the nocache hoops to execute a single instruction block. We do have to add an additional wrinkle to the cpu_handle_interrupt case to ensure we let through a TB where we have specifically disabled icount for the block. As the last user of cpu_exec_nocache we can now remove the function. Further clean-up will follow in subsequent patches. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-18-alex.bennee@linaro.org>	2021-02-18 08:19:23 +00:00
Alex Bennée	bc662a3351	accel/tcg: actually cache our partial icount TB When we exit a block under icount with instructions left to execute we might need a shorter than normal block to take us to the next deterministic event. Instead of creating a throwaway block on demand we use the existing compile flags mechanism to ensure we fetch (or compile and fetch) a block with exactly the number of instructions we need. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-17-alex.bennee@linaro.org>	2021-02-18 08:19:23 +00:00
Peter Maydell	e6a41a045c	accel/tcg: Add URL of clang bug to comment about our workaround In cpu_exec() we have a longstanding workaround for compilers which do not correctly implement the part of the sigsetjmp()/siglongjmp() spec which requires that local variables which are not changed between the setjmp and the longjmp retain their value. I recently ran across the upstream clang bug report for this; add a link to it to the comment describing the workaround, and generally expand the comment, so that we have a reasonable chance in future of understanding why it's there and determining when we can remove it, assuming clang eventually fixes the bug. Remove the /* buggy compiler / comments on the #else and #endif: they don't add anything to understanding and are somewhat misleading since they're sandwiching the code path for non*-buggy compilers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20210129130330.30820-1-peter.maydell@linaro.org	2021-02-11 11:50:14 +00:00
Claudio Fontana	7827168471	cpu: tcg_ops: move to tcg-cpu-ops.h, keep a pointer in CPUClass we cannot in principle make the TCG Operations field definitions conditional on CONFIG_TCG in code that is included by both common_ss and specific_ss modules. Therefore, what we can do safely to restrict the TCG fields to TCG-only builds, is to move all tcg cpu operations into a separate header file, which is only included by TCG, target-specific code. This leaves just a NULL pointer in the cpu.h for the non-TCG builds. This also tidies up the code in all targets a bit, having all TCG cpu operations neatly contained by a dedicated data struct. Signed-off-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20210204163931.7358-16-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:15 -10:00
Claudio Fontana	0545608056	cpu: move cc->do_interrupt to tcg_ops Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210204163931.7358-10-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Eduardo Habkost	e9ce43e97a	cpu: Move debug_excp_handler to tcg_ops Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210204163931.7358-8-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Eduardo Habkost	48c1a3e303	cpu: Move cpu_exec_* to tcg_ops Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> [claudio: wrapped target code in CONFIG_TCG] Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210204163931.7358-6-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Eduardo Habkost	ec62595bab	cpu: Move synchronize_from_tb() to tcg_ops Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> [claudio: wrapped target code in CONFIG_TCG, reworded comments] Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210204163931.7358-5-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Claudio Fontana	7df5e3d6ad	accel/tcg: split TCG-only code from cpu_exec_realizefn move away TCG-only code, make it compile only on TCG. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [claudio: moved the prototypes from hw/core/cpu.h to exec/cpu-all.h] Signed-off-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20210204163931.7358-4-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Roman Bolshakov	653b87eb36	tcg: Toggle page execution for Apple Silicon Pages can't be both write and executable at the same time on Apple Silicon. macOS provides public API to switch write protection [1] for JIT applications, like TCG. 1. https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon Tested-by: Alexander Graf <agraf@csgraf.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20210113032806.18220-1-r.bolshakov@yadro.com> [rth: Inline the qemu_thread_jit_* functions; drop the MAP_JIT change for a follow-on patch.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-23 12:13:00 -10:00
Philippe Mathieu-Daudé	c03f041f12	accel/tcg: Restrict tb_gen_code() from other accelerators tb_gen_code() is only called within TCG accelerator, declare it locally. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210117164813.4101761-4-f4bug@amsat.org> [rth: Adjust vs changed tb_flush_jmp_cache patch.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-23 12:12:59 -10:00
Douglas Crosher	bfff072c50	tcg: update the cpu running flag in cpu_exec_step_atomic The cpu_exec_step_atomic() function is called with the cpu->running clear and proceeds to run target code without setting this flag. If this target code generates an exception then handle_cpu_signal() will unnecessarily abort. For example if atomic code generates a memory protection fault. This patch at least sets and clears this running flag, and adds some assertions to help detect other cases. Signed-off-by: Douglas Crosher <dtc-ubuntu@scieneer.com> Message-Id: <a272c656-f7c5-019d-1cc0-499b8f80f2fc@scieneer.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-22 12:48:01 -10:00
Richard Henderson	eba40358b4	tcg: Return the TB pointer from the rx region from exit_tb This produces a small pc-relative displacement within the generated code to the TB structure that preceeds it. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	1acbad0f27	tcg: Adjust tb_target_set_jmp_target for split-wx Pass both rx and rw addresses to tb_target_set_jmp_target. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	db0c51a380	tcg: Introduce tcg_splitwx_to_{rx,rw} Add two helper functions, using a global variable to hold the displacement. The displacement is currently always 0, so no change in behaviour. Begin using the functions in tcg common code only. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Daniele Buono	c905a3680d	cfi: Initial support for cfi-icall in QEMU LLVM/Clang, supports runtime checks for forward-edge Control-Flow Integrity (CFI). CFI on indirect function calls (cfi-icall) ensures that, in indirect function calls, the function called is of the right signature for the pointer type defined at compile time. For this check to work, the code must always respect the function signature when using function pointer, the function must be defined at compile time, and be compiled with link-time optimization. This rules out, for example, shared libraries that are dynamically loaded (given that functions are not known at compile time), and code that is dynamically generated at run-time. This patch: 1) Introduces the CONFIG_CFI flag to support cfi in QEMU 2) Introduces a decorator to allow the definition of "sensitive" functions, where a non-instrumented function may be called at runtime through a pointer. The decorator will take care of disabling cfi-icall checks on such functions, when cfi is enabled. 3) Marks functions currently in QEMU that exhibit such behavior, in particular: - The function in TCG that calls pre-compiled TBs - The function in TCI that interprets instructions - Functions in the plugin infrastructures that jump to callbacks - Functions in util that directly call a signal handler Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com> Acked-by: Alex Bennée <alex.bennee@linaro.org Message-Id: <20201204230615.2392-3-dbuono@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-02 21:03:35 +01:00
Eduardo Habkost	710384d042	tcg: Make CPUClass.debug_excp_handler optional Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20201212155530.23098-12-cfontana@suse.de> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-16 15:50:33 -05:00
Eduardo Habkost	80c4750ba8	tcg: make CPUClass.cpu_exec_* optional This will let us simplify the code that initializes CPU class methods, when we move cpu_exec_*() to a separate struct. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20201212155530.23098-11-cfontana@suse.de> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-16 15:50:33 -05:00
Eduardo Habkost	035ba06c2e	tcg: cpu_exec_{enter,exit} helpers Move invocation of CPUClass.cpu_exec_*() to separate helpers, to make it easier to refactor that code later. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20201212155530.23098-10-cfontana@suse.de> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-16 15:50:33 -05:00
Philippe Mathieu-Daudé	19a84318c6	accel/tcg: Remove special case for GCC < 4.6 Since commit `efc6c070ac` ("configure: Add a test for the minimum compiler version") the minimum compiler version required for GCC is 4.8. We can safely remove the special case for GCC 4.6 introduced in commit `0448f5f8b8` ("cpu-exec: Fix compiler warning (-Werror=clobbered)"). No change for Clang as we don't know. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20201210134752.780923-3-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-15 12:52:08 -05:00
Pavel Dovgalyuk	835cbd8d44	icount: improve exec nocache usage cpu-exec tries to execute TB without caching when current icount budget is over. But sometimes refilled budget is big enough to try executing cached blocks. This patch checks that instruction budget is big enough for next block execution instead of just running cpu_exec_nocache. It halves the number of calls of cpu_exec_nocache function during tested OS boot scenario. Signed-off-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> Message-Id: <160741865825.348476.7169239332367828943.stgit@pasha-ThinkPad-X280> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-15 12:52:04 -05:00
Pavel Dovgalyuk	4084893ddc	replay: don't record interrupt poll Interrupt poll is not a real interrupt event. It is needed only for thread safety. This interrupt is used for i386 and converted to hardware interrupt by cpu_handle_interrupt function. Therefore it is not needed to be recorded, because hardware interrupt will be recorded after converting. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> -- v4 changes: - Condition check refactoring (suggested by Alex Bennée) Message-Id: <160174517124.12451.12983410242461131737.stgit@pasha-ThinkPad-X280> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-06 08:34:49 +02:00
Claudio Fontana	8191d36841	icount: rename functions to be consistent with the module name Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-05 16:41:22 +02:00
Claudio Fontana	740b175973	cpu-timers, icount: new modules refactoring of cpus.c continues with cpu timer state extraction. cpu-timers: responsible for the softmmu cpu timers state, including cpu clocks and ticks. icount: counts the TCG instructions executed. As such it is specific to the TCG accelerator. Therefore, it is built only under CONFIG_TCG. One complication is due to qtest, which uses an icount field to warp time as part of qtest (qtest_clock_warp). In order to solve this problem, provide a separate counter for qtest. This requires fixing assumptions scattered in the code that qtest_enabled() implies icount_enabled(), checking each specific case. Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [remove redundant initialization with qemu_spice_init] Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [fix lingering calls to icount_get] Signed-off-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-05 16:41:22 +02:00
Stefan Hajnoczi	d73415a315	qemu/atomic.h: rename atomic_ to qatomic_ clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>	2020-09-23 16:07:44 +01:00
Richard Henderson	ba3c35d9c4	tcg/cpu-exec: precise single-stepping after an interrupt When single-stepping with a debugger attached to QEMU, and when an interrupt is raised, the debugger misses the first instruction after the interrupt. Tested-by: Luc Michel <luc.michel@greensocs.com> Reviewed-by: Luc Michel <luc.michel@greensocs.com> Buglink: https://bugs.launchpad.net/qemu/+bug/757702 Message-Id: <20200717163029.2737546-1-richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-07-17 11:09:34 -07:00
Luc Michel	a7ba744f40	tcg/cpu-exec: precise single-stepping after an exception When single-stepping with a debugger attached to QEMU, and when an exception is raised, the debugger misses the first instruction after the exception: $ qemu-system-aarch64 -M virt -display none -cpu cortex-a53 -s -S $ aarch64-linux-gnu-gdb GNU gdb (GDB) 9.2 [...] (gdb) tar rem :1234 Remote debugging using :1234 warning: No executable has been specified and target does not support determining executable automatically. Try using the "file" command. 0x0000000000000000 in ?? () (gdb) # writing nop insns to 0x200 and 0x204 (gdb) set 0x200 = 0xd503201f (gdb) set 0x204 = 0xd503201f (gdb) # 0x0 address contains 0 which is an invalid opcode. (gdb) # The CPU should raise an exception and jump to 0x200 (gdb) si 0x0000000000000204 in ?? () With this commit, the same run steps correctly on the first instruction of the exception vector: (gdb) si 0x0000000000000200 in ?? () Buglink: https://bugs.launchpad.net/qemu/+bug/757702 Signed-off-by: Luc Michel <luc.michel@greensocs.com> Message-Id: <20200716193947.3058389-1-luc.michel@greensocs.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-07-16 14:08:29 -07:00
Alex Bennée	886cc68943	accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025) The bug describes a race whereby cpu_exec_step_atomic can acquire a TB which is invalidated by a tb_flush before we execute it. This doesn't affect the other cpu_exec modes as a tb_flush by it's nature can only occur on a quiescent system. The race was described as: B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code B3. tcg_tb_alloc obtains a new TB C3. TB obtained with tb_lookup__cpu_state or tb_gen_code (same TB as B2) A3. start_exclusive critical section entered A4. do_tb_flush is called, TB memory freed/re-allocated A5. end_exclusive exits critical section B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code B3. tcg_tb_alloc reallocates TB from B2 C4. start_exclusive critical section entered C5. cpu_tb_exec executes the TB code that was free in A4 The simplest fix is to widen the exclusive period to include the TB lookup. As a result we can drop the complication of checking we are in the exclusive region before we end it. Cc: Yifan <me@yifanlu.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1863025 Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200214144952.15502-1-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-02-28 10:58:41 -08:00
Philippe Mathieu-Daudé	dcb32f1d8f	tcg: Search includes from the project root source directory We currently search both the root and the tcg/ directories for tcg files: $ git grep '#include "tcg/' \| wc -l 28 $ git grep '#include "tcg[^/]' \| wc -l 94 To simplify the preprocessor search path, unify by expliciting the tcg/ directory. Patch created mechanically by running: $ for x in \ tcg.h tcg-mo.h tcg-op.h tcg-opc.h \ tcg-op-gvec.h tcg-gvec-desc.h; do \ sed -i "s,#include \"$x\",#include \"tcg/$x\"," \ $(git grep -l "#include \"$x\""); \ done Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-2-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Robert Foley	fc59d2d870	qemu_log_lock/unlock now preserves the qemu_logfile handle. qemu_log_lock() now returns a handle and qemu_log_unlock() receives a handle to unlock. This allows for changing the handle during logging and ensures the lock() and unlock() are for the same file. Also in target/tilegx/translate.c removed the qemu_log_lock()/unlock() calls (and the log("\n")), since the translator can longjmp out of the loop if it attempts to translate an instruction in an inaccessible page. Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191118211528.3221-5-robert.foley@linaro.org>	2019-12-18 20:18:02 +00:00
Emilio G. Cota	e6d86bed50	tcg: let plugins instrument virtual memory accesses To capture all memory accesses we need hook into all the various helper functions that are involved in memory operations as well as the injected inline helper calls. A later commit will allow us to resolve the actual guest HW addresses by replaying the lookup. Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: drop haddr handling, just deal in vaddr] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Emilio G. Cota	cfbc3c6083	cpu: introduce cpu_in_exclusive_context() Suggested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: moved inside start/end_exclusive fns + cleanup] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Pavel Dovgalyuk	ba3e792669	icount: clean up cpu_can_io at the entry to the block Most of IO instructions can be executed only at the end of the block in icount mode. Therefore translator can set cpu_can_io flag when translating the last instruction. But when the blocks are chained, then this flag is not reset and may remain set at the beginning of the next block. This patch resets the flag at the entry of any translation block, making I/O operations impossible by default. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> -- v2 changes: - reset can_do_io at the start of every TB (suggested by Paolo Bonzini) Message-Id: <156404428943.18669.15747009371169578935.stgit@pasha-Precision-3630-Tower> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-08-20 17:26:22 +02:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Richard Henderson	5e1401969b	cpu: Move icount_decr to CPUNegativeOffsetState Amusingly, we had already ignored the comment to keep this value at the end of CPUState. This restores the minimum negative offset from TCG_AREG0 for code generation. For the couple of uses within qom/cpu.c, without NEED_CPU_H, add a pointer from the CPUState object to the IcountDecr object within CPUNegativeOffsetState. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Peter Maydell	713acc316d	Queued accel/tcg patches -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcWle8AAoJEGTfOOivfiFfYeAIAKAu8KlM6JVHVITkk58lzRmT /cwd2O0yRroROslxrdAC0IRgMAH/oSIRK2W3dAIXLXb2TNuf3ydvdkBs6oVGWYnd 5P6fdFSq4ZqOGY6Y2QbBe1RhIOnmYOVreVePQsT+ofIFZOYrP0hKtcc68nB8nGZM vVdlfotU/6aQuPZnXVBPNsXZCZEWz67JY1pzJTTNJKtPV51jsgxNSp4ktXuqkmvY RnnG347hbQWFs5bCaVXoSp5rzuDxfjlI5/vJw/HuG3h47LSWkiq1GYg39d5Yn802 N7nZURXEI6onmLkTVndV/EkA1yqvUpOh4NM7S1hZVJ2DXu/wXcEr8kAATSeYkos= =Gmjb -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190206' into staging Queued accel/tcg patches # gpg: Signature made Wed 06 Feb 2019 03:42:52 GMT # gpg: using RSA key 64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20190206: accel/tcg: Consider cluster index in tb_lookup__cpu_state() tcg: add early clober modifier in atomic16_cmpxchg on aarch64 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-02-07 11:46:40 +00:00
Peter Maydell	9fd9b7de61	accel/tcg: Consider cluster index in tb_lookup__cpu_state() In commit `f7b78602fd` we added the CPU cluster number to the cflags field of the TB hash; this included adding it to the value kept in tb->cflags, since we pass that field directly into the hash calculation in some places. Unfortunately we forgot to check whether other parts of the code were doing comparisons against tb->cflags that would need to be updated. It turns out that there is exactly one such place: the tb_lookup__cpu_state() function checks whether the TB it has found in the tb_jmp_cache has a tb->cflags matching the cf_mask that is passed in. The tb->cflags has the cluster_index in it but the cf_mask does not. Hoist the "add cluster index to the cf_mask" code up from tb_htable_lookup() to tb_lookup__cpu_state() so it can be considered in the "did this TB match in the jmp cache" condition, as well as when we do the full hash lookup by physical PC, flags, etc. (tb_htable_lookup() is only called from tb_lookup__cpu_state(), so this change doesn't require any further knock-on changes.) Fixes: `f7b78602fd` ("accel/tcg: Add cluster number to TCG TB hash") Tested-by: Cleber Rosa <crosa@redhat.com> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reported-by: Howard Spoelstra <hsp.cat7@gmail.com> Reported-by: Cleber Rosa <crosa@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20190205151810.571-1-peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-02-06 03:39:24 +00:00
Emilio G. Cota	6aaa24f9d4	cpu-exec: reset BQL after longjmp in cpu_exec_step_atomic Just like we do in cpu_exec(). Reported-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-02-05 16:50:16 +01:00
Emilio G. Cota	8fd3a9b81d	cpu-exec: add assert_no_pages_locked() after longjmp We forgot to add this check in `faa9372c07` ("translate-all: introduce assert_no_pages_locked", 2018-06-15); we only added it after returning from a longjmp in cpu_exec_step_atomic. Fix it. Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-02-05 16:50:16 +01:00
Thomas Huth	fb0343d5b4	tcg: Fix LGPL version number It's either "GNU Library General Public version 2" or "GNU Lesser General Public version 2.1", but there was no "version 2.0" of the "Lesser" library. So assume that version 2.1 is meant here. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1548252536-6242-5-git-send-email-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-01-30 11:01:52 +01:00
Peter Maydell	f7b78602fd	accel/tcg: Add cluster number to TCG TB hash Include the cluster number in the hash we use to look up TBs. This is important because a TB that is valid for one cluster at a given physical address and set of CPU flags is not necessarily valid for another: the two clusters may have different views of physical memory, or may have different CPU features (eg FPU present or absent). We put the cluster number in the high 8 bits of the TB cflags. This gives us up to 256 clusters, which should be enough for anybody. If we ever need more, or need more bits in cflags for other purposes, we could make tb_hash_func() take more data (and expand qemu_xxhash7() to qemu_xxhash8()). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 20190121152218.9592-4-peter.maydell@linaro.org	2019-01-29 11:46:06 +00:00
Richard Henderson	d7f425fdea	tcg: Implement CPU_LOG_TB_NOCHAIN during expansion Rather than test NOCHAIN before linking, do not emit the goto_tb opcode at all. We already do this for goto_ptr. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Peter Maydell	7252f2dea9	accel/tcg: Handle get_page_addr_code() returning -1 in hashtable lookups When we support execution from non-RAM MMIO regions, get_page_addr_code() will return -1 to indicate that there is no RAM at the requested address. Handle this in the cpu-exec TB hashtable lookup code, treating it as "no match found". Note that the call to get_page_addr_code() in tb_lookup_cmp() needs no changes -- a return of -1 will already correctly result in the function returning false. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Tested-by: Cédric Le Goater <clg@kaod.org> Message-id: 20180710160013.26559-3-peter.maydell@linaro.org	2018-08-14 17:17:19 +01:00
Emilio G. Cota	0ac20318ce	tcg: remove tb_lock Use mmap_lock in user-mode to protect TCG state and the page descriptors. In !user-mode, each vCPU has its own TCG state, so no locks needed. Per-page locks are used to protect the page descriptors. Per-TB locks are used in both modes to protect TB jumps. Some notes: - tb_lock is removed from notdirty_mem_write by passing a locked page_collection to tb_invalidate_phys_page_fast. - tcg_tb_lookup/remove/insert/etc have their own internal lock(s), so there is no need to further serialize access to them. - do_tb_flush is run in a safe async context, meaning no other vCPU threads are running. Therefore acquiring mmap_lock there is just to please tools such as thread sanitizer. - Not visible in the diff, but tb_invalidate_phys_page already has an assert_memory_lock. - cpu_io_recompile is !user-only, so no mmap_lock there. - Added mmap_unlock()'s before all siglongjmp's that could be called in user-mode while mmap_lock is held. + Added an assert for !have_mmap_lock() after returning from the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic. Performance numbers before/after: Host: AMD Opteron(tm) Processor 6376 ubuntu 17.04 ppc64 bootup+shutdown time 700 +-+--+----+------+------------+-----------+--------------+-+ \| + + + + + B \| \| before *B* ** * \| \|tb lock removal ###D### * \| 600 +-+ * +-+ \| ** # \| \| B #D \| \| *** * ## \| 500 +-+ *** ### +-+ \| * *** ### \| \| B # ## \| \| ** * #D# \| 400 +-+ ## +-+ \| ### \| \| ## \| \| # ## \| 300 +-+ * B* #D# +-+ \| B *** ### \| \| * ** #### \| \| * *** ### \| 200 +-+ B B #D# +-+ \| #B * ## # \| \| #* ## \| \| + D##D# + + + + \| 100 +-+--+----+------+------------+-----------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/HwmBHXe debian jessie aarch64 bootup+shutdown time 90 +-+--+-----+-----+------------+------------+------------+--+-+ \| + + + + + + \| \| before *B* B \| 80 +tb lock removal ###D### D +-+ \| ### \| \| ## \| 70 +-+ # +-+ \| ## \| \| # \| 60 +-+ B ## +-+ \| * ## \| \| * #D \| 50 +-+ * ## +-+ \| * ### \| \| B* ### \| 40 +-+ ** # ## +-+ \| #D# \| \| B* ### \| 30 +-+ B*B #### +-+ \| B * * # ### \| \| B ###D# \| 20 +-+ D ##D## +-+ \| D# \| \| + + + + + + \| 10 +-+--+-----+-----+------------+------------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/iGpGFtv The gains are high for 4-8 CPUs. Beyond that point, however, unrelated lock contention significantly hurts scalability. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Emilio G. Cota	194125e3eb	translate-all: protect TB jumps with a per-destination-TB lock This applies to both user-mode and !user-mode emulation. Instead of relying on a global lock, protect the list of incoming jumps with tb->jmp_lock. This lock also protects tb->cflags, so update all tb->cflags readers outside tb->jmp_lock to use atomic reads via tb_cflags(). In order to find the destination TB (and therefore its jmp_lock) from the origin TB, we introduce tb->jmp_dest[]. I considered not using a linked list of jumps, which simplifies code and makes the struct smaller. However, it unnecessarily increases memory usage, which results in a performance decrease. See for instance these numbers booting+shutting down debian-arm: Time (s) Rel. err (%) Abs. err (s) Rel. slowdown (%) ------------------------------------------------------------------------------ before 20.88 0.74 0.154512 0. after 20.81 0.38 0.079078 -0.33524904 GTree 21.02 0.28 0.058856 0.67049808 GHashTable + xxhash 21.63 1.08 0.233604 3.5919540 Using a hash table or a binary tree to keep track of the jumps doesn't really pay off, not only due to the increased memory usage, but also because most TBs have only 0 or 1 jumps to them. The maximum number of jumps when booting debian-arm that I measured is 35, but as we can see in the histogram below a TB with that many incoming jumps is extremely rare; the average TB has 0.80 incoming jumps. n_jumps: 379208; avg jumps/tb: 0.801099 dist: [0.0,1.0)\|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁ ▁▁▁ ▁\|[34.0,35.0] Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Emilio G. Cota	95590e24af	translate-all: discard TB when tb_link_page returns an existing matching TB Use the recently-gained QHT feature of returning the matching TB if it already exists. This allows us to get rid of the lookup we perform right after acquiring tb_lock. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:42 -10:00
Emilio G. Cota	faa9372c07	translate-all: introduce assert_no_pages_locked The appended adds assertions to make sure we do not longjmp with page locks held. Note that user-mode has nothing to check, since page_locks are !user-mode only. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	be2cdc5e35	tcg: track TBs with per-region BST's This paves the way for enabling scalable parallel generation of TCG code. Instead of tracking TBs with a single binary search tree (BST), use a BST for each TCG region, protecting it with a lock. This is as scalable as it gets, since each TCG thread operates on a separate region. The core of this change is the introduction of struct tcg_region_tree, which contains a pointer to a GTree and an associated lock to serialize accesses to it. We then allocate an array of tcg_region_tree's, adding the appropriate padding to avoid false sharing based on qemu_dcache_linesize. Given a tc_ptr, we first find the corresponding region_tree. This is done by special-casing the first and last regions first, since they might be of size != region.size; otherwise we just divide the offset by region.stride. I was worried about this division (several dozen cycles of latency), but profiling shows that this is not a fast path. Note that region.stride is not required to be a power of two; it is only required to be a multiple of the host's page size. Note that with this design we can also provide consistent snapshots about all region trees at once; for instance, tcg_tb_foreach acquires/releases all region_tree locks before/after iterating over them. For this reason we now drop tb_lock in dump_exec_info(). As an alternative I considered implementing a concurrent BST, but this can be tricky to get right, offers no consistent snapshots of the BST, and performance and scalability-wise I don't think it could ever beat having separate GTrees, given that our workload is insert-mostly (all concurrent BST designs I've seen focus, understandably, on making lookups fast, which comes at the expense of convoluted, non-wait-free insertions/removals). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	61b8cef1d4	qht: require a default comparison function qht_lookup now uses the default cmp function. qht_lookup_custom is defined to retain the old behaviour, that is a cmp function is explicitly provided. qht_insert will gain use of the default cmp in the next patch. Note that we move qht_lookup_custom's @func to be the last argument, which makes the new qht_lookup as simple as possible. Instead of this (i.e. keeping @func 2nd): 0000000000010750 <qht_lookup>: 10750: 89 d1 mov %edx,%ecx 10752: 48 89 f2 mov %rsi,%rdx 10755: 48 8b 77 08 mov 0x8(%rdi),%rsi 10759: e9 22 ff ff ff jmpq 10680 <qht_lookup_custom> 1075e: 66 90 xchg %ax,%ax We get: 0000000000010740 <qht_lookup>: 10740: 48 8b 4f 08 mov 0x8(%rdi),%rcx 10744: e9 37 ff ff ff jmpq 10680 <qht_lookup_custom> 10749: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Philippe Mathieu-Daudé	df924c0643	accel: Do not include "exec/address-spaces.h" if it is not necessary Code change produced with: $ git grep '#include "exec/address-spaces.h"' accel \| \ cut -d: -f-1 \| \ xargs egrep -L "(get_system_\|address_space_)" \| \ xargs sed -i.bak '/#include "exec\/address-spaces.h"/d' Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180528232719.4721-3-f4bug@amsat.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-05-31 19:12:13 +02:00
Peter Maydell	ae76518047	tcg: Optionally log FPU state in TCG -d cpu logging Usually the logging of the CPU state produced by -d cpu is sufficient to diagnose problems, but sometimes you want to see the state of the floating point registers as well. We don't want to enable that by default as it adds a lot of extra data to the log; instead, allow it to be optionally enabled via -d fpu. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180510130024.31678-1-peter.maydell@linaro.org	2018-05-15 14:58:44 +01:00
Pavel Dovgalyuk	afd46fcad2	icount: fix cpu_restore_state_from_tb for non-tb-exit cases In icount mode, instructions that access io memory spaces in the middle of the translation block invoke TB recompilation. After recompilation, such instructions become last in the TB and are allowed to access io memory spaces. When the code includes instruction like i386 'xchg eax, 0xffffd080' which accesses APIC, QEMU goes into an infinite loop of the recompilation. This instruction includes two memory accesses - one read and one write. After the first access, APIC calls cpu_report_tpr_access, which restores the CPU state to get the current eip. But cpu_restore_state_from_tb resets the cpu->can_do_io flag which makes the second memory access invalid. Therefore the second memory access causes a recompilation of the block. Then these operations repeat again and again. This patch moves resetting cpu->can_do_io flag from cpu_restore_state_from_tb to cpu_loop_exit* functions. It also adds a parameter for cpu_restore_state which controls restoring icount. There is no need to restore icount when we only query CPU state without breaking the TB. Restoring it in such cases leads to the incorrect flow of the virtual time. In most cases new parameter is true (icount should be recalculated). But there are two cases in i386 and openrisc when the CPU state is only queried without the need to break the TB. This patch fixes both of these cases. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Message-Id: <20180409091320.12504.35329.stgit@pasha-VirtualBox> [rth: Make can_do_io setting unconditional; move from cpu_exec; make cpu_loop_exit_{noexc,restore} call cpu_loop_exit.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-04-11 09:05:22 +10:00
Pavel Dovgalyuk	5f3bdfd4fa	cpu-exec: fix exception_index handling Function cpu_handle_interrupt calls cc->cpu_exec_interrupt to process pending hardware interrupts. Under the hood cpu_exec_interrupt uses cpu->exception_index to pass information to the internal function which is usually common for exception and interrupt processing. But this value is not reset after return and may be processed again by cpu_handle_exception. This does not happen due to overwriting the exception_index at the end of cpu_handle_interrupt. But this branch may also overwrite the valid exception_index in some cases. Therefore this patch: 1. resets exception_index just after the call to cpu_exec_interrupt 2. prevents overwriting the meaningful value of exception_index Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20180227095140.1060.61357.stgit@pasha-VirtualBox> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>	2018-03-12 16:12:50 +01:00
Paolo Bonzini	4fad446bc9	tcg: add cs_base and flags to -d exec output Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20171217055023.29225-1-pbonzini@redhat.com> [rth: Also change the Chain logging in helper_lookup_tb_ptr.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-12-29 12:43:40 -08:00
David Hildenbrand	d84be02d69	cpu-exec: fix missed CPU kick during interrupt injection The conditional memory barrier not only looks strange but actually is wrong. On s390x, I can reproduce interrupts via cpu_interrupt() not leading to a proper kick out of emulation every now and then. cpu_interrupt() is especially used for inter CPU communication via SIGP (esp. external calls and emergency interrupts). With this patch, I was not able to reproduce. (esp. no stalls or hangs in the guest). My setup is s390x MTTCG with 16 VCPUs on 8 CPU host, running make -j16. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171129191319.11483-1-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-12-21 09:22:44 +01:00
Peter Maydell	b11ce33fe0	Revert "cpu-exec: don't overwrite exception_index" This reverts commit `e01cecabf3`, which breaks booting of aarch64 Linux images. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-11-20 10:58:27 +00:00
Pavel Dovgalyuk	17b50b0c29	cpu-exec: avoid cpu_exec_nocache infinite loop with record/replay This patch ensures that icount_decr.u32.high is clear before calling cpu_exec_nocache when exception is pending. Because the exception is caused by the first instruction in the block and it cannot be executed without resetting the flag. There are two parts in the fix. First, clear icount_decr.u32.high in cpu_handle_interrupt (just before processing the "dependent" request, stored in cpu->interrupt_request or cpu->exit_request) rather than cpu_loop_exec_tb; this ensures that cpu_handle_exception is always reached with zero icount_decr.u32.high unless another interrupt has happened in the meanwhile. Second, try to cause the exception at the beginning of cpu_handle_exception, and exit immediately if the TB cannot execute. With this change, interrupts are processed and cpu_exec_nocache can make process. Signed-off-by: Maria Klimushenkova <maria.klimushenkova@ispras.ru> Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20171114081818.27640.33165.stgit@pasha-VirtualBox> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-11-14 14:46:46 +01:00
Pavel Dovgalyuk	e01cecabf3	cpu-exec: don't overwrite exception_index This patch adds a condition before overwriting exception_index fiels. It is needed when exception_index is already set to some meaningful value. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20171114081812.27640.26372.stgit@pasha-VirtualBox> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-11-14 14:46:46 +01:00
Peter Maydell	426eeecdf5	cpu-exec: Exit exclusive region on longjmp from step_atomic Commit `ac03ee5331` narrowed the scope of the exclusive region so it only covers when we're executing the TB, not when we're generating it. However it missed that there is more than one execution path out of cpu_tb_exec -- if the atomic insn causes an exception then the code will longjmp out, skipping the code to end the exclusive region. This causes QEMU to hang the next time the CPU calls start_exclusive(), waiting for itself to exit the region. Move the "end the region" code out to the end of the function so that it is run for both normal exit and also for exit-via-longjmp. We have to use a volatile bool flag to decide whether we need to end the region, because we can longjump out of the codegen as well as the execution. (For some reason this only reproduces for me with a clang optimized build, not a gcc debug build.) Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Fixes: `ac03ee5331` Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <1509640536-32160-1-git-send-email-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-11-03 09:34:21 +01:00
Emilio G. Cota	44ded3d048	tcg: take tb_ctx out of TCGContext Groundwork for supporting multiple TCG contexts. Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	be1e01171b	exec-all: rename tb_free to tb_remove We don't really free anything in this function anymore; we just remove the TB from the binary search tree. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Richard Henderson	416986d3f9	tcg: Remove CF_IGNORE_ICOUNT Now that we have curr_cflags, we can include CF_USE_ICOUNT early and then remove it as necessary. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	ac03ee5331	cpu-exec: lookup/generate TB outside exclusive region during step_atomic Now that all code generation has been converted to check CF_PARALLEL, we can generate !CF_PARALLEL code without having yet set !parallel_cpus -- and therefore without having to be in the exclusive region during cpu_exec_step_atomic. While at it, merge cpu_exec_step into cpu_exec_step_atomic. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Richard Henderson	9b990ee5a3	tcg: Add CPUState cflags_next_tb We were generating code during tb_invalidate_phys_page_range, check_watchpoint, cpu_io_recompile, and (seemingly) discarding the TB, assuming that it would magically be picked up during the next iteration through the cpu_exec loop. Instead, record the desired cflags in CPUState so that we request the proper TB so that there is no more magic. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:41 -07:00
Emilio G. Cota	4e2ca83e71	tcg: define CF_PARALLEL and use it for TB hashing along with CF_COUNT_MASK This will enable us to decouple code translation from the value of parallel_cpus at any given time. It will also help us minimize TB flushes when generating code via EXCP_ATOMIC. Note that the declaration of parallel_cpus is brought to exec-all.h to be able to define there the "curr_cflags" inline. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:41 -07:00
Emilio G. Cota	e7e168f413	exec-all: extract tb->tc_* into a separate struct tc_tb In preparation for adding tc.size to be able to keep track of TB's using the binary search tree implementation from glib. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Emilio G. Cota	84f1c148da	exec-all: bring tb->invalid into tb->cflags This gets rid of a hole in struct TranslationBlock. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Emilio G. Cota	f6bb84d531	tcg: consolidate TB lookups in tb_lookup__cpu_state This avoids duplicating code. cpu_exec_step will also use the new common function once we integrate parallel_cpus into tb->cflags. Note that in this commit we also fix a race, described by Richard Henderson during review. Think of this scenario with threads A and B: (A) Lookup succeeds for TB in hash without tb_lock (B) Sets the TB's tb->invalid flag (B) Removes the TB from tb_htable (B) Clears all CPU's tb_jmp_cache (A) Store TB into local tb_jmp_cache Given that order of events, (A) will keep executing that invalid TB until another flush of its tb_jmp_cache happens, which in theory might never happen. We can fix this by checking the tb->invalid flag every time we look up a TB from tb_jmp_cache, so that in the above scenario, next time we try to find that TB in tb_jmp_cache, we won't, and will therefore be forced to look it up in tb_htable. Performance-wise, I measured a small improvement when booting debian-arm. Note that inlining pays off: Performance counter stats for 'taskset -c 0 qemu-system-arm \ -machine type=virt -nographic -smp 1 -m 4096 \ -netdev user,id=unet,hostfwd=tcp::2222-:22 \ -device virtio-net-device,netdev=unet \ -drive file=jessie.qcow2,id=myblock,index=0,if=none \ -device virtio-blk-device,drive=myblock \ -kernel kernel.img -append console=ttyAMA0 root=/dev/vda1 \ -name arm,debug-threads=on -smp 1' (10 runs): Before: 18714.917392 task-clock # 0.952 CPUs utilized ( +- 0.95% ) 23,142 context-switches # 0.001 M/sec ( +- 0.50% ) 1 CPU-migrations # 0.000 M/sec 10,558 page-faults # 0.001 M/sec ( +- 0.95% ) 53,957,727,252 cycles # 2.883 GHz ( +- 0.91% ) [83.33%] 24,440,599,852 stalled-cycles-frontend # 45.30% frontend cycles idle ( +- 1.20% ) [83.33%] 16,495,714,424 stalled-cycles-backend # 30.57% backend cycles idle ( +- 0.95% ) [66.66%] 76,267,572,582 instructions # 1.41 insns per cycle # 0.32 stalled cycles per insn ( +- 0.87% ) [83.34%] 12,692,186,323 branches # 678.186 M/sec ( +- 0.92% ) [83.35%] 263,486,879 branch-misses # 2.08% of all branches ( +- 0.73% ) [83.34%] 19.648474449 seconds time elapsed ( +- 0.82% ) After, w/ inline (this patch): 18471.376627 task-clock # 0.955 CPUs utilized ( +- 0.96% ) 23,048 context-switches # 0.001 M/sec ( +- 0.48% ) 1 CPU-migrations # 0.000 M/sec 10,708 page-faults # 0.001 M/sec ( +- 0.81% ) 53,208,990,796 cycles # 2.881 GHz ( +- 0.98% ) [83.34%] 23,941,071,673 stalled-cycles-frontend # 44.99% frontend cycles idle ( +- 0.95% ) [83.34%] 16,161,773,848 stalled-cycles-backend # 30.37% backend cycles idle ( +- 0.76% ) [66.67%] 75,786,269,766 instructions # 1.42 insns per cycle # 0.32 stalled cycles per insn ( +- 1.24% ) [83.34%] 12,573,617,143 branches # 680.708 M/sec ( +- 1.34% ) [83.33%] 260,235,550 branch-misses # 2.07% of all branches ( +- 0.66% ) [83.33%] 19.340502161 seconds time elapsed ( +- 0.56% ) After, w/o inline: 18791.253967 task-clock # 0.954 CPUs utilized ( +- 0.78% ) 23,230 context-switches # 0.001 M/sec ( +- 0.42% ) 1 CPU-migrations # 0.000 M/sec 10,563 page-faults # 0.001 M/sec ( +- 1.27% ) 54,168,674,622 cycles # 2.883 GHz ( +- 0.80% ) [83.34%] 24,244,712,629 stalled-cycles-frontend # 44.76% frontend cycles idle ( +- 1.37% ) [83.33%] 16,288,648,572 stalled-cycles-backend # 30.07% backend cycles idle ( +- 0.95% ) [66.66%] 77,659,755,503 instructions # 1.43 insns per cycle # 0.31 stalled cycles per insn ( +- 0.97% ) [83.34%] 12,922,780,045 branches # 687.702 M/sec ( +- 1.06% ) [83.34%] 261,962,386 branch-misses # 2.03% of all branches ( +- 0.71% ) [83.35%] 19.700174670 seconds time elapsed ( +- 0.56% ) Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Emilio G. Cota	841710c78e	cpu-exec: rename have_tb_lock to acquired_tb_lock in tb_find Reusing the have_tb_lock name, which is also defined in translate-all.c, makes code reviewing unnecessarily harder. Avoid potential confusion by renaming the local have_tb_lock variable to something else. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Richard Henderson	a858339336	tcg: Move USE_DIRECT_JUMP discriminator to tcg/cpu/tcg-target.h Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional function tb_target_set_jmp_target. While we're touching all backends, add a parameter for tb->tc_ptr; we're going to need it shortly for some backends. Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c. This opens the possibility for TCG_TARGET_HAS_direct_jump to be a runtime decision -- based on host cpu capabilities, the size of code_gen_buffer, or a future debugging switch. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:34 -07:00
Lluís Vilanova	61a67f71dd	exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state Every vCPU now uses a separate set of TBs for each set of dynamic tracing event state values. Each set of TBs can be used by any number of vCPUs to maximize TB reuse when vCPUs have the same tracing state. This feature is later used by tracetool to optimize tracing of guest code events. The maximum number of TB sets is defined as 2^E, where E is the number of events that have the 'vcpu' property (their state is stored in CPUState->trace_dstate). For this to work, a change on the dynamic tracing state of a vCPU will force it to flush its virtual TB cache (which is only indexed by address), and fall back to the physical TB cache (which now contains the vCPU's dynamic tracing state as part of the hashing function). Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-id: 149915775266.6295.10060144081246467690.stgit@frigg.lan Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-07-17 13:11:05 +01:00
Yang Zhong	d9bb58e510	tcg: move tcg related files into accel/tcg/ subdirectory move cputlb.c, cpu-exec-common.c and cpu-exec.c related tcg exec file into accel/tcg/ subdirectory. Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <1496383606-18060-3-git-send-email-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-15 11:04:06 +02:00

1 2 3 4

183 Commits