Since we *might* have user emulation with softmmu,
replace the system emulation check by !user emulation one.
Invert some if() ladders for clarity.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-7-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Added QEMU option 'vpu' to log vector extension registers such as gpr\fpu.
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230410124451.15929-2-ivan.klokov@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
qatomic_mb_read and qatomic_mb_set were the very first atomic primitives
introduced for QEMU; their semantics are unclear and they provide a false
sense of safety.
The last use of qatomic_mb_read() has been removed, so delete it.
qatomic_mb_set() instead can survive as an optimized
qatomic_set()+smp_mb(), similar to Linux's smp_store_mb(), but
rename it to qatomic_set_mb() to match the order of the two
operations.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This makes TranslationBlock agnostic to the address size of the guest.
Use vaddr for pc, since that's always a virtual address.
Use uint64_t for cs_base, since usage varies between guests.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now we no longer have dynamic state affecting things we can remove the
additional fields in cpu.h and simplify the TB hash calculation.
For the benchmark:
hyperfine -w 2 -m 20 \
"./arm-softmmu/qemu-system-arm -cpu cortex-a15 \
-machine type=virt,highmem=off \
-display none -m 2048 \
-serial mon:stdio \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-device virtio-net-pci,netdev=unet \
-device virtio-scsi-pci \
-blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \
-device scsi-hd,drive=hd -smp 4 \
-kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \
-append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \
-snapshot"
It has a marginal effect on runtime, before:
Time (mean ± σ): 26.279 s ± 2.438 s [User: 41.113 s, System: 1.843 s]
Range (min … max): 24.420 s … 32.565 s 20 runs
after:
Time (mean ± σ): 24.440 s ± 2.885 s [User: 34.474 s, System: 2.028 s]
Range (min … max): 21.663 s … 29.937 s 20 runs
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-10-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This had been set since the beginning, is never undefined,
and it would seem to be harmful to debugging to do so.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The only place left that looks at the old 'singlestep' global
variable is the TCG curr_cflags() function. Replace the old global
with a new 'one_insn_per_tb' which is defined in tcg-all.c and
declared in accel/tcg/internal.h. This keeps it restricted to the
TCG code, unlike 'singlestep' which was available to every file in
the system and defined in multiple different places for softmmu vs
linux-user vs bsd-user.
While we're making this change, use qatomic_read() and qatomic_set()
on the accesses to the new global, because TCG will read it without
holding a lock.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230417164041.684562-4-peter.maydell@linaro.org
Assign pc and use store_release to assign tb.
Fixes: 2dd5b7a1b9 ("accel/tcg: Move jmp-cache `CF_PCREL` checks to caller")
Reported-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do this in cpu_tb_exec (normal exit) and cpu_loop_exit (exception),
adjacent to where we reset can_do_io.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1381
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230310195252.210956-2-richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230315174331.2959-12-alex.bennee@linaro.org>
Reviewed-by: Emilio Cota <cota@braap.org>
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230227135202.9710-13-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
tb-jmp-cache.h contains a few small functions that only exist to hide a
CF_PCREL check, however the caller often already performs such a check.
This patch moves CF_PCREL checks from the callee to the caller, and also
removes these functions which now only hide an access of the jmp-cache.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230227135202.9710-12-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230227135202.9710-5-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
replay API is used deeply within TCG common code (common to user
and system emulation). Unfortunately "sysemu/replay.h" requires
some QAPI headers for few system-specific declarations, example:
void replay_input_event(QemuConsole *src, InputEvent *evt);
Since commit c2651c0eaa ("qapi/meson: Restrict UI module to system
emulation and tools") the QAPI header defining the InputEvent is
not generated anymore.
To keep it simple, extract the 'core' replay prototypes to a new
"exec/replay-core.h" header which we include in the TCG code that
doesn't need the rest of the replay API.
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <20221219170806.60580-5-philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Since commit a0e61807a3 ("qapi: Remove QMP events and commands from
user-mode builds") we don't generate the "qapi-commands-machine.h"
header in a user-emulation-only build.
Rename 'hmp.c' as 'monitor.c' and move the QMP functions from
cpu-exec.c (which is always compiled) to monitor.c (which is only
compiled when system-emulation is selected).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221219170806.60580-4-philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221219170806.60580-2-philmd@linaro.org>
This commit was created with scripts/clean-includes.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230202133830.2152150-19-armbru@redhat.com>
Fixes the appended use-after-free. The root cause is that
during tb invalidation we use CPU_FOREACH, and therefore
to safely free a vCPU we must wait for an RCU grace period
to elapse.
$ x86_64-linux-user/qemu-x86_64 tests/tcg/x86_64-linux-user/munmap-pthread
=================================================================
==1800604==ERROR: AddressSanitizer: heap-use-after-free on address 0x62d0005f7418 at pc 0x5593da6704eb bp 0x7f4961a7ac70 sp 0x7f4961a7ac60
READ of size 8 at 0x62d0005f7418 thread T2
#0 0x5593da6704ea in tb_jmp_cache_inval_tb ../accel/tcg/tb-maint.c:244
#1 0x5593da6704ea in do_tb_phys_invalidate ../accel/tcg/tb-maint.c:290
#2 0x5593da670631 in tb_phys_invalidate__locked ../accel/tcg/tb-maint.c:306
#3 0x5593da670631 in tb_invalidate_phys_page_range__locked ../accel/tcg/tb-maint.c:542
#4 0x5593da67106d in tb_invalidate_phys_range ../accel/tcg/tb-maint.c:614
#5 0x5593da6a64d4 in target_munmap ../linux-user/mmap.c:766
#6 0x5593da6dba05 in do_syscall1 ../linux-user/syscall.c:10105
#7 0x5593da6f564c in do_syscall ../linux-user/syscall.c:13329
#8 0x5593da49e80c in cpu_loop ../linux-user/x86_64/../i386/cpu_loop.c:233
#9 0x5593da6be28c in clone_func ../linux-user/syscall.c:6633
#10 0x7f496231cb42 in start_thread nptl/pthread_create.c:442
#11 0x7f49623ae9ff (/lib/x86_64-linux-gnu/libc.so.6+0x1269ff)
0x62d0005f7418 is located 28696 bytes inside of 32768-byte region [0x62d0005f0400,0x62d0005f8400)
freed by thread T148 here:
#0 0x7f49627b6460 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
#1 0x5593da5ac057 in cpu_exec_unrealizefn ../cpu.c:180
#2 0x5593da81f851 (/home/cota/src/qemu/build/qemu-x86_64+0x484851)
Signed-off-by: Emilio Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230111151628.320011-2-cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230124180127.1881110-27-alex.bennee@linaro.org>
We now have the option to generate direct or indirect
goto_tb depending on the dynamic displacement, thus
the define is no longer necessary or completely accurate.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace 'tc_ptr' and 'addr' with 'tb' and 'n'.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Stop overloading jmp_target_arg for both offset and address,
depending on TCG_TARGET_HAS_direct_jump. Instead, add a new
field to hold the jump insn offset and always set the target
address in jmp_target_addr[]. This will allow a tcg backend
to use either direct or indirect depending on displacement.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Recently the g_assert(cpu == current_cpu) test has been
intermittently failing with gcc. Reorg the code around
the setjmp to minimize the lifetime of the cpu variable
affected by the setjmp.
This appears to fix the existing issue with clang as well.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1147
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Delay cpu_list_add until realize is complete, so that cross-cpu
interaction does not happen with incomplete cpu state. For this,
we must delay plugin initialization out of tcg_exec_realizefn,
because no cpu_index has been assigned.
Fixes a problem with cross-cpu jump cache flushing, when the
jump cache has not yet been allocated.
Fixes: a976a99a29 ("include/hw/core: Create struct CPUJumpCache")
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This data structure will be replaced for user-only: add accessors.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Profiling QEMU during Fedora 35 for PPC64 boot revealed that a
considerable amount of time was being spent in
check_for_breakpoints() (0.61% of total time on PPC64 and 2.19% on
amd64), even though it was just checking that its queue was empty
and returning, when no breakpoints were set. It turns out this
function is not inlined by the compiler and it's always called by
helper_lookup_tb_ptr(), one of the most called functions.
By leaving only the check for empty queue in
check_for_breakpoints() and moving the remaining code to
check_for_breakpoints_slow(), called only when the queue is not
empty, it's possible to avoid the call overhead. An improvement of
about 3% in total time was measured on POWER9.
Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221025202424.195984-2-leandro.lupori@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Prepare for targets to be able to produce TBs that can
run in more than one virtual context.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The availability of tb->pc will shortly be conditional.
Introduce accessor functions to minimize ifdefs.
Pass around a known pc to places like tcg_gen_code,
where the caller must already have the value.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Wrap the bare TranslationBlock pointer into a structure.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Let tb->page_addr[0] contain the address of the first byte of the
translated block, rather than the address of the page containing the
start of the translated block. We need to recover this value anyway
at various points, and it is easier to discard a page offset when it
is not needed, which happens naturally via the existing find_page shift.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
It was non-obvious to me why we can raise an exception in
the middle of a comparison function, but it works.
While nearby, use TARGET_PAGE_ALIGN instead of open-coding.
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The function is not used outside of cpu-exec.c. Move it and
its subroutines up in the file, before the first use.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The mmap_lock is held around tb_gen_code. While the comment
is correct that the lock is dropped when tb_gen_code runs out
of memory, the lock is *not* dropped when an exception is
raised reading code for translation.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
dump_opcount_info() is a one-line wrapper around tcg_dump_op_count()
which is also exported. So use the latter directly.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220520180109.8224-10-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Commit 3a841ab53f 'qapi: introduce
x-query-jit QMP command' basically moved the only function using
dump_drift_info() to cpu-exec.c. Therefore, dump_drift_info() doesn't
need to be exported any longer.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220520180109.8224-9-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Inside log_cpu_state, we perform qemu_log_trylock/unlock, which need
not be done if we have already performed the lock beforehand.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220417183019.755276-15-richard.henderson@linaro.org>
Inside qemu_log, we perform qemu_log_trylock/unlock, which need
not be done if we have already performed the lock beforehand.
Always check the result of qemu_log_trylock -- only checking
qemu_loglevel_mask races with the acquisition of the lock on
the logfile.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220417183019.755276-10-richard.henderson@linaro.org>
This function can fail, which makes it more like ftrylockfile
or pthread_mutex_trylock than flockfile or pthread_mutex_lock,
so rename it.
To closer match the other trylock functions, release rcu_read_lock
along the failure path, so that qemu_log_unlock need not be called
on failure.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220417183019.755276-8-richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220305233415.64627-2-philippe.mathieu.daude@gmail.com>
In some cases, cpu->exit_request can be false after handling the
interrupt, leading to another TB being executed instead of returning
to the main loop.
Fix this by returning true unconditionally when in single-step mode.
Fixes: ba3c35d9c4 ("tcg/cpu-exec: precise single-stepping after an interrupt")
Signed-off-by: Luc Michel <lmichel@kalray.eu>
Message-Id: <20220214132656.11397-1-lmichel@kalray.eu>
[rth: Unlock iothread mutex; simplify indentation]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit aff0e204cb introduced CF_NOIRQ usage,
but one case was forgotten. Record/replay uses one special TB which is not
really executed, but used to cause a correct exception in replay mode.
This patch adds CF_NOIRQ flag for such block.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <164362834054.1754532.7678416881159817273.stgit@pasha-ThinkPad-X280>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As --enable-profiler isn't defended in CI we missed this breakage.
Move the qmp handler into accel/tcg so we have access to the helpers
we need. While we are at it ensure we gate the feature on CONFIG_TCG.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 37087fde0e ("qapi: introduce x-query-profile QMP command")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/773
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220105135009.1584676-23-alex.bennee@linaro.org>
When we set cpu->cflags_next_tb it is because we want to carefully
control the execution of the next TB. Currently there is a race that
causes the second stage of watchpoint handling to get ignored if an
IRQ is processed before we finish executing the instruction that
triggers the watchpoint. Use the new CF_NOIRQ facility to avoid the
race.
We also suppress IRQs when handling precise self modifying code to
avoid unnecessary bouncing.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/245
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211129140932.4115115-3-alex.bennee@linaro.org>
This is a counterpart to the HMP "info opcount" command. It is being
added with an "x-" prefix because this QMP command is intended as an
ad hoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is a counterpart to the HMP "info jit" command. It is being
added with an "x-" prefix because this QMP command is intended as an
ad hoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently there are only two places that require we reset this
value before exiting to the main loop, but that will change.
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently the change in cpu_tb_exec is masked by the debug exception
being raised by the translators. But this allows us to remove that code.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit 372579427a ("tcg: enable thread-per-vCPU") added the following
comment describing EXCP_HALTED in qemu_tcg_cpu_thread_fn():
case EXCP_HALTED:
/* during start-up the vCPU is reset and the thread is
* kicked several times. If we don't ensure we go back
* to sleep in the halted state we won't cleanly
* start-up when the vCPU is enabled.
*
* cpu->halted should ensure we sleep in wait_io_event
*/
g_assert(cpu->halted);
break;
qemu_wait_io_event() is sysemu-specific, so we can restrict the
cpu_handle_halt() call in cpu_exec() to system emulation.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210912172731.789788-2-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>