mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Richard Henderson	a858339336	tcg: Move USE_DIRECT_JUMP discriminator to tcg/cpu/tcg-target.h Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional function tb_target_set_jmp_target. While we're touching all backends, add a parameter for tb->tc_ptr; we're going to need it shortly for some backends. Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c. This opens the possibility for TCG_TARGET_HAS_direct_jump to be a runtime decision -- based on host cpu capabilities, the size of code_gen_buffer, or a future debugging switch. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:34 -07:00
Lluís Vilanova	77fc6f5e28	target: [tcg] Use a generic enum for DISAS_ values Used later. An enum makes expected values explicit and bounds the value space of switches. Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <150002049746.22386.2316077281615710615.stgit@frigg.lan> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 08:06:47 -07:00
Richard Henderson	5dc66895b0	tcg: Add generic DISAS_NORETURN This will allow some amount of cleanup to happen before switching the backends over to enum DisasJumpType. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 08:06:47 -07:00
Lluís Vilanova	9c489ea6be	tcg: Pass generic CPUState to gen_intermediate_code() Needed to implement a target-agnostic gen_intermediate_code() in the future. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Benneé <alex.benee@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Message-Id: <150002025498.22386.18051908483085660588.stgit@frigg.lan> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-07-19 14:45:16 -07:00
Peter Maydell	6c4591566d	target-arm queue: * new model of the ARM MPS2/MPS2+ FPGA based development board * clean up DISAS_* exit conditions and fix various regressions since commits `e75449a346` `8a6b28c7b5` (in particular including ones which broke OP-TEE guests) * make Cortex-M3 and M4 correctly default to 8 PMSA regions -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJZbLEBAAoJEDwlJe0UNgzeTqsP/06M2a/rswUKjIGAsXv+TeTl 5N31g9E6Jr57HXK94Q0XtNkLlPwvIn97Dcv6VKg5+E8OgJx7ozldwZVFghWvMbOA mbaikzgTRRUf6ydNTA4DtWYZPkaLNT86Vmb2T1GKS0nmw2ymd+hMLNk5vZd1jhDv krHxwECI5e+u1INpw7erlQ2mqVP1NjvOuMNtjdAgtJ5tnjFRfQaVedePmr5qOuIK xkYMKMNtled/KS0gP4TaSu5S012iYhzrpKISN/g4WHT/8kllr+iEowNAOJSJ6l38 oaBJJJCsLwnnV1nRClp4NNQv0Q/RXyIex5mPkeWERk4QU9adSDHnYJR7xn7JEyzV l9o+av28bXA7l3C8BOi3ahSGh5cDu+hif0Biml/Kke7e4+1Lp3/QWSQ+p/E5PDDq rhk65cg07PxSHeogN8hgu+RYN0gF3WBKASwUIDAkVdBsLlH8LVmoT5DtllL+6PyY cwCp3nWeu0q2YDxGOfCZrUC4YJMl8hqHoWbdVah8vLKV/w/JVUtVEIol0za50dzG ii6wOLqzV8GH0vkVa5x0InlH+t+/LtDRVkgHUT3/64eEEG+SsK/GmZeEtvcmp7GP 7Qx+Dd7hPgh+uis0XZPz37vqyCYhaFNw1+M9EESlQKUKfdY8B5B5bpXVDOBF+0Zl daOoMw8xBd21DXNk9tCk =gVxi -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170717' into staging target-arm queue: * new model of the ARM MPS2/MPS2+ FPGA based development board * clean up DISAS_* exit conditions and fix various regressions since commits `e75449a346` `8a6b28c7b5` (in particular including ones which broke OP-TEE guests) * make Cortex-M3 and M4 correctly default to 8 PMSA regions # gpg: Signature made Mon 17 Jul 2017 13:43:45 BST # gpg: using RSA key 0x3C2525ED14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20170717: MAINTAINERS: Add entries for MPS2 board hw/arm/mps2: Add ethernet hw/arm/mps2: Add SCC hw/misc/mps2_scc: Implement MPS2 Serial Communication Controller hw/arm/mps2: Add timers hw/char/cmsdk-apb-timer: Implement CMSDK APB timer device hw/arm/mps2: Add UARTs hw/char/cmsdk-apb-uart.c: Implement CMSDK APB UART hw/arm/mps2: Implement skeleton mps2-an385 and mps2-an511 board models target/arm: use DISAS_EXIT for eret handling target/arm: use gen_goto_tb for ISB handling target/arm/translate: ensure gen_goto_tb sets exit flags target/arm/translate.h: expand comment on DISAS_EXIT target/arm/translate: make DISAS_UPDATE match declared semantics include/exec/exec-all: document common exit conditions target/arm: Make Cortex-M3 and M4 default to 8 PMSA regions qdev: support properties which don't set a default value qdev-properties.h: Explicitly set the default value for arraylen properties Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-18 10:35:06 +01:00
Alex Bennée	df0311e634	include/exec/exec-all: document common exit conditions As a precursor to later patches attempt to come up with a more concrete wording for what each of the common exit cases would be. CC: Emilio G. Cota <cota@braap.org> CC: Richard Henderson <rth@twiddle.net> CC: Lluís Vilanova <vilanova@ac.upc.edu> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 20170713141928.25419-2-alex.bennee@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-17 13:36:07 +01:00
Lluís Vilanova	61a67f71dd	exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state Every vCPU now uses a separate set of TBs for each set of dynamic tracing event state values. Each set of TBs can be used by any number of vCPUs to maximize TB reuse when vCPUs have the same tracing state. This feature is later used by tracetool to optimize tracing of guest code events. The maximum number of TB sets is defined as 2^E, where E is the number of events that have the 'vcpu' property (their state is stored in CPUState->trace_dstate). For this to work, a change on the dynamic tracing state of a vCPU will force it to flush its virtual TB cache (which is only indexed by address), and fall back to the physical TB cache (which now contains the vCPU's dynamic tracing state as part of the hashing function). Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-id: 149915775266.6295.10060144081246467690.stgit@frigg.lan Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-07-17 13:11:05 +01:00
Pranith Kumar	406bc339b0	Revert "exec.c: Fix breakpoint invalidation race" Now that we have proper locking after MTTCG patches have landed, we can revert the commit. This reverts commit `a9353fe897`. CC: Peter Maydell <peter.maydell@linaro.org> CC: Alex BennÃ©e <alex.bennee@linaro.org> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20170712215143.19594-1-bobby.prani@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 11:05:19 +02:00
Yang Zhong	b11ec7f2e4	tcg: add CONFIG_TCG guards in headers Add CONFIG_TCG around TLB-related functions and structure declarations. Some of these functions are defined in ./accel/tcg/cputlb.c, which will not be linked in if TCG is disabled, and have no stubs; therefore, their callers will also be compiled out for --disable-tcg. Signed-off-by: Yang Zhong <yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-05 09:11:08 +02:00
Paolo Bonzini	beeaef55e4	tcg: move tb_lock out of translate-all.h Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-04 16:01:16 +02:00
Richard Henderson	3fb53fb4d1	tcg/arm: Use indirect branch for goto_tb Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-19 11:10:59 -07:00
Emilio G. Cota	cedbcb0152	tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr Instead of exporting goto_ptr directly to TCG frontends, export tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer returned by the lookup_tb_ptr() helper. This is the only use case we have for goto_ptr and lookup_tb_ptr, so having this function is very convenient. Furthermore, it trivially allows us to avoid calling the lookup helper if goto_ptr is not implemented by the backend. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1493263764-18657-2-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-3-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-4-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-5-git-send-email-cota@braap.org> [rth: Squashed 4 related commits.] Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Alex Bennée	c3b9a07a33	cputlb: introduce tlb_flush_*_all_cpus[_synced] This introduces support to the cputlb API for flushing all CPUs TLBs with one call. This avoids the need for target helpers to iterate through the vCPUs themselves. An additional variant of the API (_synced) will cause the source vCPUs work to be scheduled as "safe work". The result will be all the flush operations will be complete by the time the originating vCPU executes its safe work. The calling implementation can either end the TB straight away (which will then pick up the cpu->exit_request on entering the next block) or defer the exit until the architectural sync point (usually a barrier instruction). Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-02-24 10:32:46 +00:00
Alex Bennée	0336cbf853	cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap While the vargs approach was flexible the original MTTCG ended up having munge the bits to a bitmap so the data could be used in deferred work helpers. Instead of hiding that in cputlb we push the change to the API to make it take a bitmap of MMU indexes instead. For ARM some the resulting flushes end up being quite long so to aid readability I've tended to move the index shifting to a new line so all the bits being or-ed together line up nicely, for example: tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1SE1) \| (1 << ARMMMUIdx_S1SE0)); Signed-off-by: Alex Bennée <alex.bennee@linaro.org> [AT: SPARC parts only] Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com> Reviewed-by: Richard Henderson <rth@twiddle.net> [PM: ARM parts only] Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2017-02-24 10:32:46 +00:00
KONRAD Frederic	e3b9ca8109	cputlb: introduce tlb_flush_* async work. Some architectures allow to flush the tlb of other VCPUs. This is not a problem when we have only one thread for all VCPUs but it definitely needs to be an asynchronous work when we are in true multithreaded work. We take the tb_lock() when doing this to avoid racing with other threads which may be invalidating TB's at the same time. The alternative would be to use proper atomic primitives to clear the tlb entries en-mass. This patch doesn't do anything to protect other cputlb function being called in MTTCG mode making cross vCPU changes. Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com> [AJB: remove need for g_malloc on defer, make check fixes, tb_lock] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-02-24 10:32:46 +00:00
Alex Bennée	e5143e30fb	tcg: remove global exit_request There are now only two uses of the global exit_request left. The first ensures we exit the run_loop when we first start to process pending work and in the kick handler. This is just as easily done by setting the first_cpu->exit_request flag. The second use is in the round robin kick routine. The global exit_request ensured every vCPU would set its local exit_request and cause a full exit of the loop. Now the iothread isn't being held while running we can just rely on the kick handler to push us out as intended. We lightly re-factor the main vCPU thread to ensure cpu->exit_requests cause us to exit the main loop and process any IO requests that might come along. As an cpu->exit_request may legitimately get squashed while processing the EXCP_INTERRUPT exception we also check cpu->queued_work_first to ensure queued work is expedited as soon as possible. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-02-24 10:32:45 +00:00
Alex Bennée	791158d93b	tcg: rename tcg_current_cpu to tcg_current_rr_cpu ..and make the definition local to cpus. In preparation for MTTCG the concept of a global tcg_current_cpu will no longer make sense. However we still need to keep track of it in the single-threaded case to be able to exit quickly when required. qemu_cpu_kick_no_halt() moves and becomes qemu_cpu_kick_rr_cpu() to emphasise its use-case. qemu_cpu_kick now kicks the relevant cpu as well as qemu_kick_rr_cpu() which will become a no-op in MTTCG. For the time being the setting of the global exit_request remains. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>	2017-02-24 10:32:45 +00:00
Paolo Bonzini	43d70ddf9f	cpu-exec: fix icount out-of-bounds access When icount is active, tb_add_jump is surprisingly called with an out of bounds basic block index. I have no idea how that can work, but it does not seem like a good idea. Clear *last_tb for all TB_EXIT_ICOUNT_EXPIRED cases, even when all you have to do is refill icount_extra. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-16 14:06:56 +01:00
Alex Bennée	d10eb08f5d	cputlb: drop flush_global flag from tlb_flush We have never has the concept of global TLB entries which would avoid the flush so we never actually use this flag. Drop it and make clear that tlb_flush is the sledge-hammer it has always been. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> [DG: ppc portions] Acked-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-13 14:24:37 +00:00
Paolo Bonzini	7d7500d998	tcg: comment on which functions have to be called with tb_lock held softmmu requires more functions to be thread-safe, because translation blocks can be invalidated from e.g. notdirty callbacks. Probably the same holds for user-mode emulation, it's just that no one has ever tried to produce a coherent locking there. This patch will guide the introduction of more tb_lock and tb_unlock calls for system emulation. Note that after this patch some (most) of the mentioned functions are still called outside tb_lock/tb_unlock. The next one will rectify this. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <20161027151030.20863-7-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-10-31 10:51:16 +01:00
Alex Bennée	301e40ed80	translate-all: add DEBUG_LOCKING asserts This adds asserts to check the locking on the various translation engines structures. There are two sets of structures that are protected by locks. The first the l1map and PageDesc structures used to track which translation blocks are associated with which physical addresses. In user-mode this is covered by the mmap_lock. The second case are TB context related structures which are protected by tb_lock which is also user-mode only. Currently the asserts do nothing in SoftMMU mode but this will change for MTTCG. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <20161027151030.20863-4-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-10-31 10:24:45 +01:00
Richard Henderson	fdbc2b5722	tcg: Add EXCP_ATOMIC When we cannot emulate an atomic operation within a parallel context, this exception allows us to stop the world and try again in a serial context. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-10-26 08:29:00 -07:00
Laurent Vivier	ce5b1bbf62	exec: move cpu_exec_init() calls to realize functions Modify all CPUs to call it from XXX_cpu_realizefn() function. Remove all the cannot_destroy_with_object_finalize_yet as unsafe references have been moved to cpu_exec_realizefn(). (tested with QOM command provided by commit `4c315c27`) for arm: Setting of cpu->mp_affinity is moved from arm_cpu_initfn() to arm_cpu_realizefn() as setting of cpu_index is now done in cpu_exec_realizefn(). To avoid to overwrite an user defined value, we set it to an invalid value by default, and update it in realize function only if the value is still invalid. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2016-10-24 17:29:16 -02:00
Paolo Bonzini	267f685b8b	cpus-common: move CPU list management to common code Add a mutex for the CPU list to system emulation, as it will be used to manage safe work. Abstract manipulation of the CPU list in new functions cpu_list_add and cpu_list_remove. Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-09-27 11:57:29 +02:00
Richard Henderson	01ecaf438b	tcg: Merge GETPC and GETRA The return address argument to the softmmu template helpers was confused. In the legacy case, we wanted to indicate that there is no return address, and so passed in NULL. However, we then immediately subtracted GETPC_ADJ from NULL, resulting in a non-zero value, indicating the presence of an (invalid) return address. Push the GETPC_ADJ subtraction down to the only point it's required: immediately before use within cpu_restore_state_from_tb, after all NULL pointer checks have been completed. This makes GETPC and GETRA identical. Remove GETRA as the lesser used macro, replacing all uses with GETPC. Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Paolo Bonzini	6d21e4208f	tcg: Prepare TB invalidation for lockless TB lookup When invalidating a translation block, set an invalid flag into the TranslationBlock structure first. It is also necessary to check whether the target TB is still valid after acquiring 'tb_lock' but before calling tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in future. Note that we don't have to check 'last_tb'; an already invalidated TB will not be executed anyway and it is thus safe to patch it. Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-09-13 19:08:43 +02:00
Igor Mammedov	1bc7e522d9	exec: Reduce CONFIG_USER_ONLY ifdeffenery Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2016-07-26 15:31:58 -03:00
Markus Armbruster	2a6a4076e1	Clean up ill-advised or unusual header guards Cleaned up with scripts/clean-header-guards.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2016-07-12 16:20:46 +02:00
Sergey Sorokin	b35399bb4e	Fix confusing argument names in some common functions There are functions tlb_fill(), cpu_unaligned_access() and do_unaligned_access() that are called with access type and mmu index arguments. But these arguments are named 'is_write' and 'is_user' in their declarations. The patches fix the arguments to avoid a confusion. Signed-off-by: Sergey Sorokin <afarallax@yandex.ru> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-id: 1465907177-1399402-1-git-send-email-afarallax@yandex.ru Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-07-12 13:06:08 +01:00
Emilio G. Cota	909eaac9bb	tb hash: track translated blocks with qht Having a fixed-size hash table for keeping track of all translation blocks is suboptimal: some workloads are just too big or too small to get maximum performance from the hash table. The MRU promotion policy helps improve performance when the hash table is a little undersized, but it cannot make up for severely undersized hash tables. Furthermore, frequent MRU promotions result in writes that are a scalability bottleneck. For scalability, lookups should only perform reads, not writes. This is not a big deal for now, but it will become one once MTTCG matures. The appended fixes these issues by using qht as the implementation of the TB hash table. This solution is superior to other alternatives considered, namely: - master: implementation in QEMU before this patchset - xxhash: before this patch, i.e. fixed buckets + xxhash hashing + MRU. - xxhash-rcu: fixed buckets + xxhash + RCU list + MRU. MRU is implemented here by adding an intermediate struct that contains the u32 hash and a pointer to the TB; this allows us, on an MRU promotion, to copy said struct (that is not at the head), and put this new copy at the head. After a grace period, the original non-head struct can be eliminated, and after another grace period, freed. - qht-fixed-nomru: fixed buckets + xxhash + qht without auto-resize + no MRU for lookups; MRU for inserts. The appended solution is the following: - qht-dyn-nomru: dynamic number of buckets + xxhash + qht w/ auto-resize + no MRU for lookups; MRU for inserts. The plots below compare the considered solutions. The Y axis shows the boot time (in seconds) of a debian jessie image with arm-softmmu; the X axis sweeps the number of buckets (or initial number of buckets for qht-autoresize). The plots in PNG format (and with errorbars) can be seen here: http://imgur.com/a/Awgnq Each test runs 5 times, and the entire QEMU process is pinned to a single core for repeatability of results. Host: Intel Xeon E5-2690 28 ++------------+-------------+-------------+-------------+------------++ A*** + + + master A*** + 27 ++ * xxhash ##B###++ \| A****A** xxhash-rcu $$C$$$ \| 26 C$$ A**A**** qht-fixed-nomru%%D%%%++ D%%$$ A***A***Aqht-dyn-mru AE*A 25 ++ %%$$ qht-dyn-nomru &&F&&&++ B#####% \| 24 ++ #C$$$$$ ++ \| B### $ \| \| ## C$$$$$$ \| 23 ++ # C$$$$$$ ++ \| B###### C$$$$$$ %%%D 22 ++ %B###### C$$$$$$C$$$$$$C$$$$$$C$$$$$$C$$$$$$C \| D%%%%%%B###### @E@@@@@@ %%%D%%%@@@E@@@@@@E 21 E@@@@@@E@@@@@@F&&&@@@E@@@&&&D%%%%%%B######B######B######B######B######B + E@@@ F&&& + E@ + F&&& + + 20 ++------------+-------------+-------------+-------------+------------++ 14 16 18 20 22 24 log2 number of buckets Host: Intel i7-4790K 14.5 ++------------+------------+-------------+------------+------------++ A + + + master A* + 14 ++ xxhash ##B###++ 13.5 ++ xxhash-rcu $$C$$$++ \| qht-fixed-nomru %%D%%% \| 13 ++ A**** qht-dyn-mru @@E@@@++ \| A*A**A** qht-dyn-nomru &&F&&& \| 12.5 C$$ A**A**A*A** A 12 ++ $$ A ++ D%%% $$ \| 11.5 ++ %% ++ B### %C$$$$$$ \| 11 ++ ## D%%%%% C$$$$$ ++ \| # % C$$$$$$ \| 10.5 F&&&&&&B######D%%%%% C$$$$$$C$$$$$$C$$$$$$C$$$$$C$$$$$$ $$$C 10 E@@@@@@E@@@@@@B#####B######B######E@@@@@@E@@@%%%D%%%%%D%%%###B######B + F&& D%%%%%%B######B######B#####B###@@@D%%% + 9.5 ++------------+------------+-------------+------------+------------++ 14 16 18 20 22 24 log2 number of buckets Note that the original point before this patch series is X=15 for "master"; the little sensitivity to the increased number of buckets is due to the poor hashing function in master. xxhash-rcu has significant overhead due to the constant churn of allocating and deallocating intermediate structs for implementing MRU. An alternative would be do consider failed lookups as "maybe not there", and then acquire the external lock (tb_lock in this case) to really confirm that there was indeed a failed lookup. This, however, would not be enough to implement dynamic resizing--this is more complex: see "Resizable, Scalable, Concurrent Hash Tables via Relativistic Programming" by Triplett, McKenney and Walpole. This solution was discarded due to the very coarse RCU read critical sections that we have in MTTCG; resizing requires waiting for readers after every pointer update, and resizes require many pointer updates, so this would quickly become prohibitive. qht-fixed-nomru shows that MRU promotion is advisable for undersized hash tables. However, qht-dyn-mru shows that MRU promotion is not important if the hash table is properly sized: there is virtually no difference in performance between qht-dyn-nomru and qht-dyn-mru. Before this patch, we're at X=15 on "xxhash"; after this patch, we're at X=15 @ qht-dyn-nomru. This patch thus matches the best performance that we can achieve with optimum sizing of the hash table, while keeping the hash table scalable for readers. The improvement we get before and after this patch for booting debian jessie with arm-softmmu is: - Intel Xeon E5-2690: 10.5% less time - Intel i7-4790K: 5.2% less time We could get this same improvement _for this particular workload_ by statically increasing the size of the hash table. But this would hurt workloads that do not need a large hash table. The dynamic (upward) resizing allows us to start small and enlarge the hash table as needed. A quick note on downsizing: the table is resized back to 215 buckets on every tb_flush; this makes sense because it is not guaranteed that the table will reach the same number of TBs later on (e.g. most bootup code is thrown away after boot); it makes sense to grow the hash table as more code blocks are translated. This also avoids the complication of having to build downsizing hysteresis logic into qht. Reviewed-by: Sergey Fedorov <serge.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1465412133-3029-15-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-06-11 17:11:16 -07:00
Peter Maydell	6886b98036	cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc() The function cpu_resume_from_signal() is now always called with a NULL puc argument, and is rather misnamed since it is never called from a signal handler. It is essentially forcing an exit to the top level cpu loop but without raising any exception, so rename it to cpu_loop_exit_noexc() and drop the useless unused argument. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Riku Voipio <riku.voipio@linaro.org> Message-id: 1463494687-25947-4-git-send-email-peter.maydell@linaro.org	2016-06-09 15:55:02 +01:00
Paolo Bonzini	63c915526d	cpu: move exec-all.h inclusion out of cpu.h exec-all.h contains TCG-specific definitions. It is not needed outside TCG-specific files such as translate.c, exec.c or *helper.c. One generic function had snuck into include/exec/exec-all.h; move it to include/qom/cpu.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:29 +02:00
Paolo Bonzini	00f6da6a1a	exec: extract exec/tb-context.h TCG backends do not need most of exec-all.h; extract what they actually need to a separate file or move it directly to tcg.h. The next patch will stop including exec-all.h from everywhere. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:29 +02:00
Sergey Fedorov	6f789be56d	tcg: Rework tb_invalidated_flag 'tb_invalidated_flag' was meant to catch two events: * some TB has been invalidated by tb_phys_invalidate(); * the whole translation buffer has been flushed by tb_flush(). Then it was checked: * in cpu_exec() to ensure that the last executed TB can be safely linked to directly call the next one; * in cpu_exec_nocache() to decide if the original TB should be provided for further possible invalidation along with the temporarily generated TB. It is always safe to patch an invalidated TB since it is not going to be used anyway. It is also safe to call tb_phys_invalidate() for an already invalidated TB. Thus, setting this flag in tb_phys_invalidate() is simply unnecessary. Moreover, it can prevent from pretty proper linking of TBs, if any arbitrary TB has been invalidated. So just don't touch it in tb_phys_invalidate(). If this flag is only used to catch whether tb_flush() has been called then rename it to 'tb_flushed'. Declare it as 'bool' and stick to using only 'true' and 'false' to set its value. Also, instead of setting it in tb_gen_code(), just after tb_flush() has been called, do it right inside of tb_flush(). In cpu_exec(), this flag is used to track if tb_flush() has been called and have made 'next_tb' (a reference to the last executed TB) invalid for linking it to directly call the next TB. tb_flush() can be called during the CPU execution loop from tb_gen_code(), during TB execution or by another thread while 'tb_lock' is released. Catch for translation buffer flush reliably by resetting this flag once before first TB lookup and each time we find it set before trying to add a direct jump. Don't touch in in tb_find_physical(). Each vCPU has its own execution loop in multithreaded mode and thus should have its own copy of the flag to be able to reset it with its own 'next_tb' and don't affect any other vCPU execution thread. So make this flag per-vCPU and move it to CPUState. In cpu_exec_nocache(), we only need to check if tb_flush() has been called from tb_gen_code() called by cpu_exec_nocache() itself. To do this reliably, preserve the old value of the flag, reset it before calling tb_gen_code(), check afterwards, and combine the saved value back to the flag. This patch is based on the patch "tcg: move tb_invalidated_flag to CPUState" from Paolo Bonzini <pbonzini@redhat.com>. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:42 -10:00
Sergey Fedorov	9962c478b1	tcg: Clarify thread safety check in tb_add_jump() The check is to make sure that another thread hasn't already done the same while we were outside of tb_lock. Mention this in a comment. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	c37e6d7e35	tcg: Use uintptr_t type for jmp_list_{next\|first} fields of TB These fields do not contain pure pointers to a TranslationBlock structure. So uintptr_t is the most appropriate type for them. Also put some asserts to assure that the two least significant bits of the pointer are always zero before assigning it to jmp_list_first. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	f309101c26	tcg: Clean up direct block chaining data fields Briefly describe in a comment how direct block chaining is done. It should help in understanding of the following data fields. Rename some fields in TranslationBlock and TCGContext structures to better reflect their purpose (dropping excessive 'tb_' prefix in TranslationBlock but keeping it in TCGContext): tb_next_offset => jmp_reset_offset tb_jmp_offset => jmp_insn_offset tb_next => jmp_target_addr jmp_next => jmp_list_next jmp_first => jmp_list_first Avoid using a magic constant as an invalid offset which is used to indicate that there's no n-th jump generated. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	10b4f48555	tcg: Note requirement on atomic direct jump patching Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1461341333-19646-12-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	7d14e0e2d6	tcg/arm: Make direct jump patching thread-safe Ensure direct jump patching in ARM is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-8-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	ed3d51ecd7	tcg/s390: Make direct jump patching thread-safe Ensure direct jump patching in s390 is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-7-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	0d07abf05e	tcg/i386: Make direct jump patching thread-safe Ensure direct jump patching in i386 is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() for code patching. tcg_out_nopn() implementation: Suggested-by: Richard Henderson <rth@twiddle.net>. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-6-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	76442a939e	tci: Make direct jump patching thread-safe Ensure direct jump patching in TCI is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() to load/store the address. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-4-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:40 -10:00
Emilio G. Cota	89fee74a0f	tb: consistently use uint32_t for tb->flags We are inconsistent with the type of tb->flags: usage varies loosely between int and uint64_t. Settle to uint32_t everywhere, which is superior to both: at least one target (aarch64) uses the most significant bit in the u32, and uint64_t is wasteful. Compile-tested for all targets. Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com> Suggested-by: Richard Henderson <rth@twiddle.net> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>	2016-05-12 14:06:40 -10:00
Alex Bennée	d977e1c2db	qemu-log: dfilter-ise exec, out_asm, op and opt_op This ensures the code generation debug code will honour -dfilter if set. For the "exec" tracing I've added a new inline macro for efficiency's sake. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Aurelien Jarno <aurelien@aureL32.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1458052224-9316-8-git-send-email-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:18 +01:00
Peter Maydell	1a83063522	qemu-log: Improve the "exec" TB execution logging Improve the TB execution logging so that it is easier to identify what is happening from trace logs: * move the "Trace" logging of executed TBs into cpu_tb_exec() so that it is emitted if and only if we actually execute a TB, and for consistency for the CPU state logging * log when we link two TBs together via tb_add_jump() * log when cpu_tb_exec() returns early from a chain of TBs The new style logging looks like this: Trace 0x7fb7cc822ca0 [ffffffc0000dce00] Linking TBs 0x7fb7cc822ca0 [ffffffc0000dce00] index 0 -> 0x7fb7cc823110 [ffffffc0000dce10] Trace 0x7fb7cc823110 [ffffffc0000dce10] Trace 0x7fb7cc823420 [ffffffc000302688] Trace 0x7fb7cc8234a0 [ffffffc000302698] Trace 0x7fb7cc823520 [ffffffc0003026a4] Trace 0x7fb7cc823560 [ffffffc0000dce44] Linking TBs 0x7fb7cc823560 [ffffffc0000dce44] index 1 -> 0x7fb7cc8235d0 [ffffffc0000dce70] Trace 0x7fb7cc8235d0 [ffffffc0000dce70] Stopped execution of TB chain before 0x7fb7cc8235d0 [ffffffc0000dce70] Trace 0x7fb7cc8235d0 [ffffffc0000dce70] Trace 0x7fb7cc822fd0 [ffffffc0000dd52c] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> [AJB: reword patch title, Abandoned->Stopped] Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1458052224-9316-6-git-send-email-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:18 +01:00
Peter Maydell	651a5bc037	exec.c: Add cpu_get_address_space() Add a function to return the AddressSpace for a CPU based on its numerical index. (Callers outside exec.c don't have access to the CPUAddressSpace struct so can't just fish it out of the CPUState struct directly.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>	2016-01-21 14:15:05 +00:00
Peter Maydell	a54c87b68a	exec.c: Pass MemTxAttrs to iotlb_to_region so it uses the right AS Pass the MemTxAttrs for the memory access to iotlb_to_region(); this allows it to determine the correct AddressSpace to use for the lookup. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>	2016-01-21 14:15:05 +00:00
Peter Maydell	d7898cda81	cputlb.c: Use correct address space when looking up MemoryRegionSection When looking up the MemoryRegionSection for the new TLB entry in tlb_set_page_with_attrs(), use cpu_asidx_from_attrs() to determine the correct address space index for the lookup, and pass it into address_space_translate_for_iotlb(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>	2016-01-21 14:15:05 +00:00
Peter Maydell	1787cc8ee5	exec-all.h: Document tlb_set_page_with_attrs, tlb_set_page Add documentation comments for tlb_set_page_with_attrs() and tlb_set_page(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>	2016-01-21 14:15:04 +00:00
Peter Maydell	12ebc9a76d	exec.c: Allow target CPUs to define multiple AddressSpaces Allow multiple calls to cpu_address_space_init(); each call adds an entry to the cpu->ases array at the specified index. It is up to the target-specific CPU code to actually use these extra address spaces. Since this multiple AddressSpace support won't work with KVM, add an assertion to avoid confusing failures. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>	2016-01-21 14:15:04 +00:00
Peter Maydell	56943e8cc1	exec.c: Don't set cpu->as until cpu_address_space_init Rather than setting cpu->as unconditionally in cpu_exec_init (and then having target-i386 override this later), don't set it until the first call to cpu_address_space_init. This requires us to initialise the address space for both TCG and KVM (KVM doesn't need the AS listener but it does require cpu->as to be set). For target CPUs which don't set up any address spaces (currently everything except i386), add the default address_space_memory in qemu_init_vcpu(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>	2016-01-21 14:15:04 +00:00
Dr. David Alan Gilbert	87f50caa30	Move page_size_init earlier The HOST_PAGE_ALIGN macros don't work until the page size variables have been set up; later in postcopy I use those macros in the RAM code, and it can be triggered using -object. Fix this by initialising page_size_init() earlier - it's currently initialised inside the accelerators, move it up into vl.c. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2015-11-10 14:51:48 +01:00
Pavel Dovgalyuk	56c0269a9e	cpu-exec: allow temporary disabling icount This patch is required for deterministic replay to generate an exception by trying executing an instruction without changing icount. It adds new flag to TB for disabling icount while translating it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20150917162359.8676.77011.stgit@PASHA-ISP.def.inno> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-11-05 12:19:09 +01:00
Paolo Bonzini	88401cbc5b	exec: remove non-TCG stuff from exec-all.h header. The header is included from basically everywhere, thanks to cpu.h. It should be moved to the (TCG only) files that actually need it. As a start, remove non-TCG stuff. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-10-12 18:29:26 +02:00
Peter Maydell	32857f4d5e	exec.c: Collect AddressSpace related fields into a CPUAddressSpace struct Gather up all the fields currently in CPUState which deal with the CPU's AddressSpace into a separate CPUAddressSpace struct. This paves the way for allowing the CPU to know about more than one AddressSpace. The rearrangement also allows us to make the MemoryListener a directly embedded object in the CPUAddressSpace (it could not be embedded in CPUState because 'struct MemoryListener' isn't defined for the user-only builds). This allows us to resolve the FIXME in tcg_commit() by going directly from the MemoryListener to the CPUAddressSpace. This patch extracts the actual update of the cached dispatch pointer from cpu_reload_memory_map() (which is renamed accordingly to cpu_reloading_memory_map() as it is only responsible for breaking cpu-exec.c's RCU critical section now). This lets us keep the definition of the CPUAddressSpace struct private to exec.c. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <1443709790-25180-4-git-send-email-peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-10-12 18:29:26 +02:00
Richard Henderson	126d89e8cd	tcg: Adjust CODE_GEN_AVG_BLOCK_SIZE At present, the "average" guestimate of TB size is way too small, leading to many unused entries in the pre-allocated TB array. For a guest with 1GB ram, we're currently allocating 256MB for the array. Survey arm, alpha, aarch64, ppc, sparc, i686, x86_64 guests running on x86_64 and ppc64 hosts and select a new average. The size of the array drops to 81MB with no more flushing than before. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:40:00 +11:00
Richard Henderson	b125f9dc7b	tcg: Check for overflow via highwater mark We currently pre-compute an worst case code size for any TB, which works out to be 122kB. Since the average TB size is near 1kB, this wastes quite a lot of storage. Instead, check for overflow in between generating code for each opcode. The overhead of the check isn't measurable and wastage is minimized. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:53 +11:00
Richard Henderson	4e5e121515	tcg: Remove gen_intermediate_code_pc It is no longer used, so tidy up everything reached by it. This includes the gen_opc_* arrays, the search_pc parameter and the inline gen_intermediate_code_internal functions. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:52 +11:00
Richard Henderson	fca8a500d5	tcg: Save insn data and use it in cpu_restore_state_from_tb We can now restore state without retranslation. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:51 +11:00
Richard Henderson	bad729e272	tcg: Pass data argument to restore_state_to_opc The gen_opc_* arrays are already redundant with the data stored in the insn_start arguments. Transition restore_state_to_opc to use data from the latter. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:51 +11:00
Richard Henderson	fec88f64bd	tcg: Merge cpu_gen_code into tb_gen_code As it's only caller, this tidies things a bit. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:49 +11:00
Peter Crosthwaite	dfccc76023	include/exec: Move cputlb exec.c defs out Move the architecture agnostic function prototypes for exec.c out of cputlb.h to exec-all.h. This allows hiding of the arch specific cputlb.h from exec.c which should be getting close to having no architecture specifics. Prepares support for multi-arch, which will have a minimal cpu.h that services exec.c but not cputlb.h. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-Id: <b4fe754c58c860315e35d44430c26b1c967ce2c9.1441614289.git.crosthwaite.peter@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-16 17:33:33 +02:00
Peter Maydell	a2aa09e181	* Support for jemalloc * qemu_mutex_lock_iothread "No such process" fix * cutils: qemu_strto* wrappers * iohandler.c simplification * Many other fixes and misc patches. And some MTTCG work (with Emilio's fixes squashed): * Signal-free TCG kick * Removing spinlock in favor of QemuMutex * User-mode emulation multi-threading fixes/docs -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJV8Tk7AAoJEL/70l94x66Ds3QH/3bi0RRR2NtKIXAQrGo5tfuD NPMu1K5Hy+/26AC6mEVNRh4kh7dPH5E4NnDGbxet1+osvmpjxAjc2JrxEybhHD0j fkpzqynuBN6cA2Gu5GUNoKzxxTmi2RrEYigWDZqCftRXBeO2Hsr1etxJh9UoZw5H dgpU3j/n0Q8s08jUJ1o789knZI/ckwL4oXK4u2KhSC7ZTCWhJT7Qr7c0JmiKReaF JEYAsKkQhICVKRVmC8NxML8U58O8maBjQ62UN6nQpVaQd0Yo/6cstFTZsRrHMHL3 7A2Tyg862cMvp+1DOX3Bk02yXA+nxnzLF8kUe0rYo6llqDBDStzqyn1j9R0qeqA= =nB06 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * Support for jemalloc * qemu_mutex_lock_iothread "No such process" fix * cutils: qemu_strto* wrappers * iohandler.c simplification * Many other fixes and misc patches. And some MTTCG work (with Emilio's fixes squashed): * Signal-free TCG kick * Removing spinlock in favor of QemuMutex * User-mode emulation multi-threading fixes/docs # gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" * remotes/bonzini/tags/for-upstream: (44 commits) cutils: work around platform differences in strto{l,ul,ll,ull} cpu-exec: fix lock hierarchy for user-mode emulation exec: make mmap_lock/mmap_unlock globally available tcg: comment on which functions have to be called with mmap_lock held tcg: add memory barriers in page_find_alloc accesses remove unused spinlock. replace spinlock by QemuMutex. cpus: remove tcg_halt_cond and tcg_cpu_thread globals cpus: protect work list with work_mutex scripts/dump-guest-memory.py: fix after RAMBlock change configure: Add support for jemalloc add macro file for coccinelle configure: factor out adding disas configure vhost-scsi: fix wrong vhost-scsi firmware path checkpatch: remove tests that are not relevant outside the kernel checkpatch: adapt some tests to QEMU CODING_STYLE: update mixed declaration rules qmp: Add example usage of strto*l() qemu wrapper cutils: Add qemu_strtoull() wrapper cutils: Add qemu_strtoll() wrapper ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-09-14 16:13:16 +01:00
Pavel Dovgalyuk	1c3c8af1fb	cpu-exec: introduce loop exit with restore function This patch introduces loop exit function, which also restores guest CPU state according to the value of host program counter. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20150710095702.13280.97477.stgit@PASHA-ISP> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-09-11 08:16:16 -07:00
Paolo Bonzini	8fd19e6cfd	exec: make mmap_lock/mmap_unlock globally available There is some iffy lock hierarchy going on in translate-all.c. To fix it, we need to take the mmap_lock in cpu-exec.c. Make the functions globally available. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-09 15:34:56 +02:00
KONRAD Frederic	677ef6230b	replace spinlock by QemuMutex. spinlock is only used in two cases: * cpu-exec.c: to protect TranslationBlock * mem_helper.c: for lock helper in target-i386 (which seems broken). It's a pthread_mutex_t in user-mode, so we can use QemuMutex directly, with an #ifdef. The #ifdef will be removed when multithreaded TCG will need the mutex as well. Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com> Message-Id: <1439220437-23957-5-git-send-email-fred.konrad@greensocs.com> Signed-off-by: Emilio G. Cota <cota@braap.org> [Merge Emilio G. Cota's patch to remove volatile. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-09 15:34:55 +02:00
Paolo Bonzini	e0c382113f	tcg: signal-free qemu_cpu_kick Signals are slow and do not exist on Win32. The previous patches have done most of the legwork to introduce memory barriers (some of them were even there already for the sake of Windows!) and we can now set the flags directly in the iothread. qemu_cpu_kick_thread is not used anymore on TCG, since the TCG thread is never outside usermode while the CPU is running (not halted). Instead run the content of the signal handler (now in qemu_cpu_kick_no_halt) directly. qemu_cpu_kick_no_halt is also used in qemu_mutex_lock_iothread to avoid the overhead of qemu_cond_broadcast. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-09 15:34:54 +02:00
Paolo Bonzini	9373e63297	tcg: introduce tcg_current_cpu This is already useful on Windows in order to remove tls.h, because accesses to current_cpu are done from a different thread on that platform. It will be used on POSIX platforms as soon TCG stops using signals to interrupt the execution of translated code. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-09 15:34:53 +02:00
Peter Crosthwaite	a17d448274	exec-all: Translate TCI return addresses backwards too This subtraction of return addresses applies directly to TCI as well as host-TCG. This fixes Linux boots for at least Microblaze, CRIS, ARM and SH4 when using TCI. [sw: Removed indentation for preprocessor statement] [sw: The patch also fixes Linux boot for x86_64] Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>	2015-08-26 20:50:46 +02:00
Peter Maydell	d7a74a9d4a	cputlb: Add functions for flushing TLB for a single MMU index Guest CPU TLB maintenance operations may be sufficiently specialized to only need to flush TLB entries corresponding to a particular MMU index. Implement cputlb functions for this, to avoid the inefficiency of flushing TLB entries which we don't need to. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 1439548879-1972-2-git-send-email-peter.maydell@linaro.org	2015-08-25 16:18:33 +01:00
Paolo Bonzini	414b15c909	exec: drop cpu_can_do_io, just read cpu->can_do_io After commit `626cf8f` (icount: set can_do_io outside TB execution, 2014-12-08), can_do_io is set to 1 if not executing code. It is no longer necessary to make this assumption in cpu_can_do_io. It is also possible to remove the use_icount test, simply by never setting cpu->can_do_io to 0 unless use_icount is true. With these changes cpu_can_do_io boils down to a read of cpu->can_do_io. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-08-14 23:40:32 +02:00
Sergey Fedorov	02d57ea115	cpu-exec: Do not invalidate original TB in cpu_exec_nocache() Instead of invalidating an original TB in cpu_exec_nocache() prematurely, just save a link to it in the temporary generated TB. If cpu_io_recompile() is raised subsequently from the temporary TB, invalidate the original one as well. That allows reusing the original TB each time cpu_exec_nocache() is called to handle expired instruction counter in icount mode. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Message-Id: <1435656909-29116-1-git-send-email-serge.fdrv@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-08-06 12:04:08 +02:00
Peter Crosthwaite	4bad9e392e	cpu: Change cpu_exec_init() arg to cpu, not env The callers (most of them in target-foo/cpu.c) to this function all have the cpu pointer handy. Just pass it to avoid an ENV_GET_CPU() from core code (in exec.c). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Michael Walle <michael@walle.cc> Cc: Leon Alrae <leon.alrae@imgtec.com> Cc: Anthony Green <green@moxielogic.com> Cc: Jia Liu <proljc@gmail.com> Cc: Alexander Graf <agraf@suse.de> Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Andreas Färber <afaerber@suse.de>	2015-07-09 15:20:40 +02:00
Peter Crosthwaite	bbd77c180d	translate-all: Change tb_flush() env argument to cpu All of the core-code usages of this API have the cpu pointer handy so pass it in. There are only 3 architecture specific usages (2 of which are commented out) which can just use ENV_GET_CPU() locally to get the cpu pointer. The reduces core code usage of the CPU env, which brings us closer to common-obj'ing these core files. Cc: Riku Voipio <riku.voipio@iki.fi> Cc: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Andreas Färber <afaerber@suse.de>	2015-07-09 15:20:40 +02:00
Bharata B Rao	5a790cc4b9	cpu: Add Error argument to cpu_exec_init() Add an Error argument to cpu_exec_init() to let users collect the error. This is in preparation to change the CPU enumeration logic in cpu_exec_init(). With the new enumeration logic, cpu_exec_init() can fail if cpu_index values corresponding to max_cpus have already been handed out. Since all current callers of cpu_exec_init() are from instance_init, use error_abort Error argument to abort in case of an error. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Andreas Färber <afaerber@suse.de>	2015-07-09 15:20:40 +02:00
Li Zhijian	dd63169766	migration: extend migration_bitmap Prevously, if we hotplug a device(e.g. device_add e1000) during migration is processing in source side, qemu will add a new ram block but migration_bitmap is not extended. In this case, migration_bitmap will overflow and lead qemu abort unexpectedly. Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2015-07-07 14:54:56 +02:00
Peter Crosthwaite	e1b89321ba	include/exec: Move tb hash functions out This is one of very few things in exec-all with a genuine CPU architecture dependency. Move these hashing helpers to a new header to trim exec-all.h down to a near architecture-agnostic header. The defs are only used by cpu-exec and translate-all which are both arch-obj's so the new tb-hash.h has no core code usage. Reviewed-by: Richard Henderson <rth@redhat.com> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-Id: <9d048b96f7cfa64a4d9c0b88e0dd2877fac51d41.1433052532.git.crosthwaite.peter@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-26 16:00:51 +02:00
Yongbok Kim	3b4afc9e75	softmmu: Add probe_write() Probe for whether the specified guest write access is permitted. If it is not permitted then an exception will be taken in the same way as if this were a real write access (and we will not return). Otherwise the function will return, and there will be a valid entry in the TLB for this access. Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com> Reviewed-by: Leon Alrae <leon.alrae@imgtec.com> Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>	2015-06-11 10:13:28 +01:00
Paolo Bonzini	1652b97476	exec: move functions to translate-all.h Remove them from the sundry exec-all.h header, since they are only used by the TCG runtime in exec.c and user-exec.c. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-05 17:09:59 +02:00
Peter Maydell	fadc1cbe85	Add MemTxAttrs to the IOTLB Add a MemTxAttrs field to the IOTLB, and allow target-specific code to set it via a new tlb_set_page_with_attrs() function; pass the attributes through to the device when making IO accesses. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>	2015-04-26 16:49:24 +01:00
Peter Maydell	3b64349539	memory: Replace io_mem_read/write with memory_region_dispatch_read/write Rather than retaining io_mem_read/write as simple wrappers around the memory_region_dispatch_read/write functions, make the latter public and change all the callers to use them, since we need to touch all the callsites anyway to add MemTxAttrs and MemTxResult support. Delete io_mem_read and io_mem_write entirely. (All the callers currently pass MEMTXATTRS_UNSPECIFIED and convert the return value back to bool or ignore it.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>	2015-04-26 16:49:23 +01:00
Paolo Bonzini	79e2b9aecc	exec: RCUify AddressSpaceDispatch Note that even after this patch, most callers of address_space_* functions must still be under the big QEMU lock, otherwise the memory region returned by address_space_translate can disappear as soon as address_space_translate returns. This will be fixed in the next part of this series. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-02-16 17:30:19 +01:00
Paolo Bonzini	9d82b5a792	exec: make iotlb RCU-friendly After the previous patch, TLBs will be flushed on every change to the memory mapping. This patch augments that with synchronization of the MemoryRegionSections referred to in the iotlb array. With this change, it is guaranteed that iotlb_to_region will access the correct memory map, even once the TLB will be accessed outside the BQL. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-02-16 17:30:19 +01:00
Paolo Bonzini	76e5c76f2e	exec: introduce cpu_reload_memory_map This for now is a simple TLB flush. This can change later for two reasons: 1) an AddressSpaceDispatch will be cached in the CPUState object 2) it will not be possible to do tlb_flush once the TCG-generated code runs outside the BQL. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-02-16 17:30:19 +01:00
Paolo Bonzini	0266359e57	cpu-exec: add a new CF_USE_ICOUNT cflag Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-12-23 10:14:53 +01:00
Pavel Dovgalyuk	d8a499f17e	cpu-exec: invalidate nocache translation if they are interrupted In this case, QEMU might longjmp out of cpu-exec.c and miss the final cleanup in cpu_exec_nocache. Do this manually through a new compile flag. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-12-15 12:21:02 +01:00
Igor Mammedov	a2b257d621	memory: expose alignment used for allocating RAM as MemoryRegion API introduce memory_region_get_alignment() that returns underlying memory block alignment or 0 if it's not relevant/implemented for backend. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2014-11-23 12:11:30 +02:00
Peter Maydell	86025ee443	cpu-exec: Make debug_excp_handler a QOM CPU method Make the debug_excp_handler target specific hook into a QOM CPU method. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-09-12 14:06:48 +01:00
Richard Henderson	9171478c95	tcg-ppc: Use uintptr_t in ppc_tb_set_jmp_target Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:29:30 -07:00
Paolo Bonzini	c773828aa9	softmmu: move all load/store functions to cpu_ldst.h Unify pieces of cpu-all.h, exec-all.h, softmmu_exec.h and tcg/tcg.h into a single new header file with all helpers. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-06-05 16:10:33 +02:00
Paolo Bonzini	ca0aa40816	softmmu: move definition of CPU_MMU_INDEX to inclusion site, drop ACCESS_TYPE Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-06-05 16:05:52 +02:00
Paolo Bonzini	859d76120b	softmmu: start introducing SOFTMMU_CODE_ACCESS in softmmu_header.h This preprocessor symbol is already used in softmmu_template.h. We will use it to distinguish the two "fake" ACCESS_TYPEs NB_MMU_MODES and NB_MMU_MODES + 1. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-06-05 16:05:27 +02:00
Richard Henderson	b6bfeea92a	tcg-mips: Enable direct chaining of TBs Now that the code_gen_buffer is constrained to not cross 256mb regions, we are assured that we can use J to reach another TB. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:48:37 -07:00
Richard Henderson	a10c64e0df	tcg-s390: Implement direct chaining of TBs Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 09:22:32 -07:00
Richard Henderson	cb3d83bc49	exec-all.h: Use stl_le_p instead of stl_p patching x86 We got the wrong version of stl_p, the one that bswaps as appropriate for the target. Since x86 is always little-endian, the "_le_" routine will resolve to what we want. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-13 10:16:07 -07:00
Richard Henderson	1813e1758d	tcg: Define tcg_insn_unit for code pointers To be defined by the tcg backend based on the elemental unit of the ISA. During the transition, allow TCG_TARGET_INSN_UNIT_SIZE to be undefined, which allows us to default tcg_insn_unit to the current uint8_t. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Peter Maydell	86360ad71d	exec-all.h: Use stl_p to avoid undefined behaviour patching x86 jumps The code which patches x86 jump instructions assumes it can do an unaligned write of a uint32_t. This is actually safe on x86, but it's still undefined behaviour. We have infrastructure for doing efficient unaligned accesses which doesn't engage in undefined behaviour, so use it. This is technically fractionally less efficient, at least with gcc 4.6; instead of one instruction: 7b2: 89 3e mov %edi,(%rsi) we get an extra spurious store to the stack slot: 7b2: 89 7c 24 64 mov %edi,0x64(%rsp) 7b6: 89 3e mov %edi,(%rsi) Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Peter Maydell	14dcdac82f	exec-all.h: Increase MAX_OP_PER_INSTR for ARM A64 decoder The ARM A64 decoder's worst case number of TCG ops per instruction is 266 (for insn 0x4c800000, a post-indexed ST4 multiple-structures store). Raise the MAX_OP_PER_INSTR define accordingly. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 1394822294-14837-17-git-send-email-peter.maydell@linaro.org	2014-03-17 16:31:51 +00:00
Andreas Färber	0c591eb0a9	cputlb: Change tlb_set_page() argument to CPUState Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-03-13 19:52:47 +01:00
Andreas Färber	00c8cb0a36	cputlb: Change tlb_flush() argument to CPUState Signed-off-by: Andreas Färber <afaerber@suse.de>	2014-03-13 19:52:47 +01:00

1 2 3 4

181 Commits