Hvf's permission bitmap during and after dirty logging does not include
the HV_MEMORY_EXEC permission. At least on Apple Silicon, this leads to
instruction faults once dirty logging was enabled.
Add the bit to make it work properly.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210916155404.86958-3-agraf@csgraf.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
x86_64 dotnet/runtime uses cmpxchg for code patching. When running it
under s390x qemu-linux user, cpu_signal_handler() does not recognize
this as a write and does not restore PAGE_WRITE cleared by
tb_page_add(), incorrectly forwarding the signal to the guest code.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210803221606.150103-1-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All targets call TCGCPUOps::cpu_exec_interrupt() from sysemu code.
Move its declaration to restrict it to system emulation.
Extend the code guarded.
Restrict the static inlined need_replay_interrupt() method to
avoid a "defined but not used" warning.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210911165434.531552-24-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
do_interrupt() is sysemu specific. However due to some X86
specific hack, it is also used in user-mode emulation, which
is why it couldn't be restricted to CONFIG_SOFTMMU (see the
comment around added in commit 7827168471: "cpu: tcg_ops:
move to tcg-cpu-ops.h, keep a pointer in CPUClass").
Keep the hack but rename the handler as fake_user_interrupt()
and restrict do_interrupt() to sysemu.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210911165434.531552-6-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The TCG_KICK_PERIOD macro is already defined in tcg-accel-ops-rr.h.
Remove it from tcg-accel-ops-rr.c.
Signed-off-by: Luc Michel <lmichel@kalray.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210811141229.12470-1-lmichel@kalray.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
translate_insn() implementations fetch instruction bytes piecemeal,
which can cause qemu-user to generate inconsistent translations if
another thread modifies them concurrently [1].
Fix by making pages containing translated instruction non-writable
right before loading instruction bytes from them.
[1] https://lists.nongnu.org/archive/html/qemu-devel/2021-08/msg00644.html
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20210805204835.158918-1-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This crept in as either a cut-and-paste error, or rebase error.
Fixes: cfec388518 ("atomic_template: add inline trace/plugin helpers")
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210729004647.282017-24-richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PowerPC has two KVM types (HV, PR) that translate into three kernel
modules:
kvm.ko - common kvm code
kvm_hv.ko - kvm running with MSR_HV=1 or MSR_HV|PR=0 in a nested guest.
kvm_pr.ko - kvm running in usermode MSR_PR=1.
Since the two KVM types can both be running at the same time, this
creates a situation in which it is possible for one or both of the
modules to fail to initialize, leaving the generic one behind. This
leads QEMU to think it can create a guest, but KVM will fail when
calling the type-specific code:
ioctl(KVM_CREATE_VM) failed: 22 Invalid argument
qemu-kvm: failed to initialize KVM: Invalid argument
Ideally this would be solved kernel-side, but it might be a while
until we can get rid of one of the modules. So in the meantime this
patch tries to make this less confusing for the end user by adding a
more elucidative message:
ioctl(KVM_CREATE_VM) failed: 22 Invalid argument
PPC KVM module is not loaded. Try 'modprobe kvm_hv'.
[dwg: Fixed error in #elif which failed compile on !ppc hosts]
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20210722141340.2367905-1-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
From clang-13:
accel/tcg/cpu-exec.c:783:15: error: variable 'cc' set but not used \
[-Werror,-Wunused-but-set-variable]
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In cpu_loop_exec_tb(), we decide whether to look for a TB with
exactly insns_left instructions in it using the condition
(!cpu->icount_extra && insns_left > 0 && insns_left < tb->icount)
The check for icount_extra == 0 is unnecessary, because we just set
insns_left = MIN(0xffff, cpu->icount_budget);
icount_extra = icount_budget - insns_left;
and so icount_extra can only be non-zero if icount_budget > 0xffff
and insns_left == 0xffff. But in that case insns_left >= tb->icount
because 0xffff is much larger than TCG_MAX_INSNS, so the condition
will be false anyway.
Remove the unnecessary check, and instead assert:
* that we are only going to execute a partial TB here if the
icount budget has run out (ie icount_extra == 0)
* that the number of insns we're going to execute does fit into
the CF_COUNT_MASK
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210725174405.24568-3-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In cpu_loop_exec_tb() we were bounding the number of insns we might
try to execute in a TB using CF_COUNT_MASK. This is incorrect,
because we can validly put up to 0xffff into icount_decr.u16.low. In
particular, since commit 78ff82bb1b reduced CF_COUNT_MASK to
511 this meant that we would incorrectly only try to execute 511
instructions in a 512-instruction TB, which could result in QEMU
hanging when in icount mode.
Use the actual maximum value, which is 0xffff. (This brings this code
in to line with the similar logic in icount_prepare_for_run() in
tcg-accel-ops-icount.c.)
Fixes: 78ff82bb1b
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/499
Message-Id: <20210725174405.24568-2-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Set CF_SINGLE_STEP when single-stepping is enabled.
This avoids the need to flush all tb's when turning
single-stepping on or off.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The access internal to tb_cflags() is atomic.
Avoid re-reading it as such for the multiple uses.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Trigger breakpoints before beginning translation of a TB
that would begin with a BP. Thus we never generate code
for the BP at all.
Single-step instructions within a page containing a BP so
that we are sure to check each insn for the BP as above.
We no longer need to flush any TBs when changing BPs.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/286
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/404
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/489
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We are going to want two things:
(1) check for breakpoints will want to break out of the loop here,
(2) cflags can only be calculated with pc in hand.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Request that the one TB returns immediately, so that
we release the exclusive lock as soon as possible.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210717221851.2124573-7-richard.henderson@linaro.org>
Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
and the test in tb_gen_code for setting CF_COUNT_MASK to 1.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210717221851.2124573-6-richard.henderson@linaro.org>
The purpose of suppressing goto_ptr from -d nochain had been
to return to the main loop so that -d cpu would be recognized.
But we now include -d cpu logging in helper_lookup_tb_ptr so
there is no need to exclude goto_ptr.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210717221851.2124573-5-richard.henderson@linaro.org>
Move the -d nochain check to bits on tb->cflags.
These will be used for more than -d nochain shortly.
Set bits during curr_cflags, test them in translator_use_goto_tb,
assert we're not doing anything odd in tcg_gen_goto_tb. The test
in tcg_gen_exit_tb is redundant with the assert for goto_tb_issue_mask.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210717221851.2124573-4-richard.henderson@linaro.org>
We will shortly have more than a simple member read here,
with stuff not necessarily exposed to exec/exec-all.h.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210717221851.2124573-3-richard.henderson@linaro.org>
The space reserved for CF_COUNT_MASK was overly large.
Reduce to free up cflags bits and eliminate an extra test.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210717221851.2124573-2-richard.henderson@linaro.org>
Use trace_mem_get_info instead of trace_mem_build_info,
using the TCGMemOpIdx that we already have. Do this in
the atomic_trace_*_pre function as common subroutines.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Unify the parameters of atomic_mmu_lookup between cputlb.c and
user-exec.c. Call the function directly, and remove the macros.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All definitions are now empty.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All instances of EXTRA_ARGS are now identical.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reduce the amount of code duplication by always passing
the TCGMemOpIdx argument to helper_atomic_*. This is not
currently used for user-only, but it's easy to ignore.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Always provide the atomic interface using TCGMemOpIdx oi
and uintptr_t retaddr. Rename from helper_* to cpu_* so
as to (mostly) match the exec/cpu_ldst.h functions, and
to emphasize that they are not callable from TCG directly.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use it to avoid some clang-12 -Watomic-alignment errors,
forcing some structures to be aligned and as a pointer when
we have ensured that the address is aligned.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The compiler rightly complains when we build on 32 bit that casting
uint64_t into a void is a bad idea. We are really dealing with a host
pointer at this point so treat it as such. This does involve
a uintptr_t cast of the result of the TLB addend as we know that has
to point to the host memory.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210709143005.1554-28-alex.bennee@linaro.org>
Split out CPU_LOG_EXEC and CPU_LOG_TB_CPU logging from
cpu_tb_exec to a new function. Perform only one pc
range check after a combined mask check.
Use the new function in lookup_tb_ptr. This enables
CPU_LOG_TB_CPU between indirectly chained tbs.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we've moved helper_lookup_tb_ptr, the only user
of tb-lookup.h is cpu-exec.c; merge the contents in.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will allow additional code sharing.
No functional change.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a generic version of the common use_goto_tb test.
Various targets avoid the page crossing test for CONFIG_USER_ONLY,
but that is wrong: mmap and mprotect can change page permissions.
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can call do_tb_phys_invalidate from an iocontext, which has
no per-thread tcg_ctx. Move this to tb_ctx, which is global.
The actual update still takes place with a lock held, so only
an atomic set is required, not an atomic increment.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/457
Tested-by: Viktor Ashirov <vashirov@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The function is called only at tcg_gen_code() when duplicated TBs
are translated by different threads, and when the tcg_region_tree
is reset. Bake it into the underlying GTree as its value destroy
function to unite these situations.
Also remove tcg_region_tree_traverse() which now becomes useless.
Signed-off-by: Liren Wei <lrwei@bupt.edu.cn>
Message-Id: <8dc352f08d038c4e7a1f5f56962398cdc700c3aa.1625404483.git.lrwei@bupt.edu.cn>
[rth: Name the new tb_tc_cmp parameter correctly.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
TranslationBlocks not inserted into the corresponding region
tree shall be regarded as partially initialized objects, and
needs to be finalized first before inserting into QHT.
Signed-off-by: Liren Wei <lrwei@bupt.edu.cn>
Message-Id: <f9fc263f71e11b6308d8c1fbc0dd366bf4aeb532.1625404483.git.lrwei@bupt.edu.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
One more little step towards modular tcg ...
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-35-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Build tcg accel ops as module.
Which is only a small fraction of tcg.
Also only x86 for now.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-30-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add module annotations for tcg so autoloading works.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-29-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow building accelerators as module.
Start with qtest as first user.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-28-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add module annotations for qtest so autoloading works.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-27-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Call module_object_class_by_name() instead of object_class_by_name()
for objects possibly implemented as module
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-26-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The trap number for a page fault on BSD systems is T_PAGEFLT
not 0xe -- 0xe is used by Linux and represents the intel hardware
trap vector. The BSD kernels, however, translate this to T_PAGEFLT
in their Xpage, Xtrap0e, Xtrap14, etc fault handlers. This is true
for i386 and x86_64, though the name of the trap hanlder can very
on the flavor of BSD. As far as I can tell, Linux doesn't provide
a define for this value. Invent a new one (PAGE_FAULT_TRAP) and
use it instead to avoid uglier ifdefs.
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Signed-off-by: Juergen Lock <nox@FreeBSD.org>
[ Rework to avoid ifdefs and expand it to i386 ]
Signed-off-by: Warner Losh <imp@bsdimp.com>
Message-Id: <20210625045707.84534-3-imp@bsdimp.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Found this when I wanted to try the per-vcpu dirty rate series out, then I
found that it's not really working and it can quickly hang death a guest. I
found strange errors (e.g. guest crash after migration) happens even without
the per-vcpu dirty rate series.
When merging dirty ring, probably no one notice that the trivial renaming diff
[1] missed two existing references of kvm_dirty_ring_sizes; they do matter
since otherwise we'll mmap() a shorter range of memory after the renaming.
I think it didn't SIGBUS for me easily simply because some other stuff within
qemu mmap()ed right after the dirty rings (e.g. when testing 4096 slots, it
aligned with one small page on x86), so when we access the rings we've been
reading/writting to random memory elsewhere of qemu.
Fix the two sizes when map/unmap the shared dirty gfn memory.
[1] https://lore.kernel.org/qemu-devel/dac5f0c6-1bca-3daf-e5d2-6451dbbaca93@redhat.com/
Cc: Hyman Huang <huangy81@chinatelecom.cn>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210609014355.217110-1-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We had a single ATOMIC_MMU_LOOKUP macro that probed for
read+write on all atomic ops. This is incorrect for
plain atomic load and atomic store.
For user-only, we rely on the host page permissions.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/390
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As noted by qemu-plugins.h, plugins can neither read nor write
guest registers.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As noted by qemu-plugins.h, enum qemu_plugin_cb_flags is
currently unused -- plugins can neither read nor write
guest registers.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Let the compiler decide on inlining.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These variables belong to the jit side, not the user side.
Since tcg_init_ctx is no longer used outside of tcg/, move
the declaration to tcg-internal.h.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Start removing the include of hw/boards.h from tcg/.
Pass down the max_cpus value from tcg_init_machine,
where we have the MachineState already.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is only one caller, and shortly we will need access
to the MachineState, which tcg_init_machine already has.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Perform both tcg_context_init and tcg_region_init.
Do not leave this split to the caller.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We shortly want to use tcg_init for something else.
Since the hook is called init_machine, match that.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Buffer management is integral to tcg. Do not leave the allocation
to code outside of tcg/. This is code movement, with further
cleanups to follow.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
It consists of one function call and has only one caller.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of delaying tcg_region_init until after tcg_prologue_init
is complete, do tcg_region_init first and let tcg_prologue_init
shrink the first region by the size of the generated prologue.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The hooks we have that call us after reset, init and loadvm really all
just want to say "The reference of all register state is in the QEMU
vcpu struct, please push it".
We already have a working pushing mechanism though called cpu->vcpu_dirty,
so we can just reuse that for all of the above, syncing state properly the
next time we actually execute a vCPU.
This fixes PSCI resets on ARM, as they modify CPU state even after the
post init call has completed, but before we execute the vCPU again.
To also make the scheme work for x86, we have to make sure we don't
move stale eflags into our env when the vcpu state is dirty.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Tested-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-13-agraf@csgraf.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We will need more than a single field for hvf going forward. To keep
the global vcpu struct uncluttered, let's allocate a special hvf vcpu
struct, similar to how hax does it.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Tested-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-12-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We can move the definition of hvf_vcpu_exec() into our internal
hvf header, obsoleting the need for hvf-accel-ops.h.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-11-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The hvf accel synchronize functions are only used as input for local
callback functions, so we can make them static.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-10-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There is no reason to call the hvf specific hvf_cpu_synchronize_state()
when we can just use the generic cpu_synchronize_state() instead. This
allows us to have less dependency on internal function definitions and
allows us to make hvf_cpu_synchronize_state() static.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-9-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.
This patch splits the vcpu init and destroy functions into a generic and
an architecture specific portion. This also allows us to move the generic
functions into the generic hvf code, removing exported functions.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-8-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ARM version of Hypervisor.framework no longer defines these two
types, so let's just revert to standard ones.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-7-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The hvf_set_phys_mem() function is only called within the same file.
Make it static.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-6-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.
This patch moves CPU and memory operations over. While at it, make sure
the code is consumable on non-i386 systems.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-4-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.
This patch moves the vCPU thread loop over.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-3-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Until now, Hypervisor.framework has only been available on x86_64 systems.
With Apple Silicon shipping now, it extends its reach to aarch64. To
prepare for support for multiple architectures, let's start moving common
code out into its own accel directory.
This patch moves assert_hvf_ok() and introduces generic build infrastructure.
Signed-off-by: Alexander Graf <agraf@csgraf.de>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20210519202253.76782-2-agraf@csgraf.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.
We still have several references to the old file, so let's fix them
with the following command:
sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Only the TCG accelerator uses the TranslationBlock API.
Move the tb-context.h / tb-hash.h / tb-lookup.h from the
global namespace to the TCG one (in accel/tcg).
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210524170453.3791436-3-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Only 2 headers require "exec/tb-context.h". Instead of having
all files including "exec/exec-all.h" also including it, directly
include it where it is required:
- accel/tcg/cpu-exec.c
- accel/tcg/translate-all.c
For plugins/plugin.h, we were implicitly relying on
exec/exec-all.h -> exec/tb-context.h -> qemu/qht.h
which is now included directly.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210524170453.3791436-2-f4bug@amsat.org>
[rth: Fix plugins/plugin.h compilation]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
KVM dirty ring is a new interface to pass over dirty bits from kernel to the
userspace. Instead of using a bitmap for each memory region, the dirty ring
contains an array of dirtied GPAs to fetch (in the form of offset in slots).
For each vcpu there will be one dirty ring that binds to it.
kvm_dirty_ring_reap() is the major function to collect dirty rings. It can be
called either by a standalone reaper thread that runs in the background,
collecting dirty pages for the whole VM. It can also be called directly by any
thread that has BQL taken.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-11-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only
useful for KVM_GET_DIRTY_LOG. Skip enabling it for kvm dirty ring.
More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages
initially, which is against how kvm dirty ring is used - there's no way for kvm
dirty ring to re-protect a page before it's notified as being written first
with a GFN entry in the ring! So when KVM_DIRTY_LOG_INITIALLY_SET is enabled
with dirty ring, we'll see silent data loss after migration.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-10-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a parameter for dirty gfn count for dirty rings. If zero, dirty ring is
disabled. Otherwise dirty ring will be enabled with the per-vcpu gfn count as
specified. If dirty ring cannot be enabled due to unsupported kernel or
illegal parameter, it'll fallback to dirty logging.
By default, dirty ring is not enabled (dirty-gfn-count default to 0).
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-9-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cache it too because we'll reference it more frequently in the future.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-8-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_physical_sync_dirty_bitmap() on the whole section is inaccurate, because
the section can be a superset of the memslot that we're working on. The result
is that if the section covers multiple kvm memslots, we could be doing the
synchronization for multiple times for each kvmslot in the section.
With the two helpers that we just introduced, it's very easy to do it right now
by calling the helpers.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-7-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_physical_sync_dirty_bitmap() calculates the ramblock offset in an
awkward way from the MemoryRegionSection that passed in from the
caller. The truth is for each KVMSlot the ramblock offset never
change for the lifecycle. Cache the ramblock offset for each KVMSlot
into the structure when the KVMSlot is created.
With that, we can further simplify kvm_physical_sync_dirty_bitmap()
with a helper to sync KVMSlot dirty bitmap to the ramblock dirty
bitmap of a specific KVMSlot.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-6-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Provide a helper kvm_slot_get_dirty_log() to make the function
kvm_physical_sync_dirty_bitmap() clearer. We can even cache the as_id
into KVMSlot when it is created, so that we don't even need to pass it
down every time.
Since at it, remove return value of kvm_physical_sync_dirty_bitmap()
because it should never fail.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-5-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Previously we have two places that will create the per KVMSlot dirty
bitmap:
1. When a newly created KVMSlot has dirty logging enabled,
2. When the first log_sync() happens for a memory slot.
The 2nd case is lazy-init, while the 1st case is not (which is a fix
of what the 2nd case missed).
To do explicit initialization of dirty bitmaps, what we're missing is
to create the dirty bitmap when the slot changed from not-dirty-track
to dirty-track. Do that in kvm_slot_update_flags().
With that, we can safely remove the 2nd lazy-init.
This change will be needed for kvm dirty ring because kvm dirty ring
does not use the log_sync() interface at all.
Also move all the pre-checks into kvm_slot_init_dirty_bitmap().
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-4-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Per-kml slots_lock will bring some trouble if we want to take all slots_lock of
all the KMLs, especially when we're in a context that we could have taken some
of the KML slots_lock, then we even need to figure out what we've taken and
what we need to take.
Make this simple by merging all KML slots_lock into a single slots lock.
Per-kml slots_lock isn't anything that helpful anyway - so far only x86 has two
address spaces (so, two slots_locks). All the rest archs will be having one
address space always, which means there's actually one slots_lock so it will be
the same as before.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210506160549.130416-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename to match tlb_flush_range_locked.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210509151618.2331764-9-f4bug@amsat.org
Message-Id: <20210508201640.1045808-1-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rename to match tlb_flush_range_locked.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210509151618.2331764-8-f4bug@amsat.org
Message-Id: <20210508201640.1045808-1-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Forward tlb_flush_page_bits_by_mmuidx_all_cpus_synced to
tlb_flush_range_by_mmuidx_all_cpus_synced passing TARGET_PAGE_SIZE.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210509151618.2331764-7-f4bug@amsat.org
Message-Id: <20210508201640.1045808-1-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Forward tlb_flush_page_bits_by_mmuidx_all_cpus to
tlb_flush_range_by_mmuidx_all_cpus passing TARGET_PAGE_SIZE.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210509151618.2331764-6-f4bug@amsat.org
Message-Id: <20210508201640.1045808-1-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Forward tlb_flush_page_bits_by_mmuidx to tlb_flush_range_by_mmuidx
passing TARGET_PAGE_SIZE.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210509151618.2331764-5-f4bug@amsat.org
Message-Id: <20210508201640.1045808-1-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We will not be able to fit address + length into a 64-bit packet.
Drop this optimization before re-organizing this code.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210509151618.2331764-10-f4bug@amsat.org
Message-Id: <20210508201640.1045808-1-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMM: Moved patch earlier in the series]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rename the structure to match the rename of tlb_flush_range_locked.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210509151618.2331764-4-f4bug@amsat.org
Message-Id: <20210508201640.1045808-1-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rename tlb_flush_page_bits_locked() -> tlb_flush_range_locked(), and
have callers pass a length argument (currently TARGET_PAGE_SIZE) via
the TLBFlushPageBitsByMMUIdxData structure.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210509151618.2331764-3-f4bug@amsat.org
Message-Id: <20210508201640.1045808-1-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Using g_memdup is a bit more compact than g_new + memcpy.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210509151618.2331764-2-f4bug@amsat.org
Message-Id: <20210508201640.1045808-1-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If arch-specific code generates a translation block of size 0,
tb_gen_code() may generate a spurious exception. Add an assertion in
order to catch such situations early.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210416154939.32404-5-iii@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
To better visualize the data dumped at the end of a TB, left-align it
(padding it with 0). Print ".long" instead of ".quad" on 32-bit hosts.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210515104202.241504-1-f4bug@amsat.org>
[rth: Split the qemu_log and print .long for 32-bit hosts.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Obvious uses of the new functions.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
while on x86 all CPU classes can use the same set of TCGCPUOps,
on ARM the right accel behavior depends on the type of the CPU.
So we need a way to specialize the accel behavior according to
the CPU. Therefore, add a second initialization, after the
accel_cpu->cpu_class_init, that allows to do this.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210322132800.7470-24-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
overall, all devices' realize functions take an Error **errp, but return void.
hw/core/qdev.c code, which realizes devices, therefore does:
local_err = NULL;
dc->realize(dev, &local_err);
if (local_err != NULL) {
goto fail;
}
However, we can improve at least accel_cpu to return a meaningful bool value.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210322132800.7470-9-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
avoid open coding the accesses to cpu->accel_cpu interfaces,
and instead introduce:
accel_cpu_instance_init,
accel_cpu_realizefn
to be used by the targets/ initfn code,
and by cpu_exec_realizefn respectively.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210322132800.7470-7-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stop including exec/address-spaces.h in files that don't need it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-5-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Stop including cpu.h in files that don't need it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-4-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Stop including hw/boards.h in files that don't need it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-3-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Stop including sysemu/sysemu.h in files that don't need it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When dumping the extra exit data provided by KVM, make it clear that
the data is hexadecimal.
At the same time, zero-pad the output.
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210428142431.266879-1-david.edmondson@oracle.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
By definition a single instruction is capable of being an IO
instruction. This avoids a problem of triggering a cpu_io_recompile on
a non-recorded translation which then fails because it expects
tcg_tb_lookup() to succeed unconditionally. The normal use case
requires a TB to be able to resolve machine state.
The other users of tcg_tb_lookup() are able to tolerate a missing TB
if the machine state has been resolved by other means - which in the
single-shot case is always true because machine state is synced at the
start of a block.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210415162454.22056-1-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Using mprotect() to change PROT_* does not change the MAP_ANON
previously set with mmap(). Our linux-user version of MTE only
works with MAP_ANON pages, so losing PAGE_ANON caused MTE to
stop working.
Reported-by: Stephen Long <steplong@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts commit 3920552846.
Thomas Huth reported a failure with CentOS 6 guests:
../../devel/qemu/accel/kvm/kvm-all.c:690: kvm_log_clear_one_slot: Assertion `QEMU_IS_ALIGNED(start | size, psize)' failed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prior to commit f2ce39b4f0 a MachineClass kvm_type method
only needed to be registered to ensure it would be executed.
With commit f2ce39b4f0 a kvm-type machine property must also
be specified. hw/arm/virt relies on the kvm_type method to pass
its selected IPA limit to KVM, but this is not exposed as a
machine property. Restore the previous functionality of invoking
kvm_type when it's present.
Fixes: f2ce39b4f0 ("vl: make qemu_get_machine_opts static")
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20210310135218.255205-2-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The 'running' argument from VMChangeStateHandler does not require
other value than 0 / 1. Make it a plain boolean.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210111152020.1422021-3-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
There are 23 files that include the "sysemu/qtest.h",
but they do not use any qtest functions.
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210226081414.205946-1-kuhn.chenqun@huawei.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The primary motivation is to remove a dozen insns along
the fast-path in tb_lookup. As a byproduct, this allows
us to completely remove parallel_cpus.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We don't really deal in cf_mask most of the time. The one time it's
relevant is when we want to remove an invalidated TB from the QHT
lookup. Everywhere else we should be looking up things without
CF_INVALID set.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210224165811.11567-4-alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is nothing special about this compile flag that doesn't mean we
can't just compute it with curr_cflags() which we should be using when
building a new set.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210224165811.11567-3-alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Having a function return either and valid TB and some system state
seems excessive. It will make the subsequent re-factoring easier if we
lookup the current state where we are.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210224165811.11567-2-alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The parameters start and size are transfered from QEMU memory
emulation layer. It can promise that they are TARGET_PAGE_SIZE
aligned. However, KVM needs they are qemu_real_page_size aligned.
Though no caller breaks this aligned requirement currently, we'd
better add an explicit assert to avoid future breaking.
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20201217014941.22872-3-zhukeqian1@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When handle dirty log, we face qemu_real_host_page_size and
TARGET_PAGE_SIZE. The first one is the granule of KVM dirty
bitmap, and the second one is the granule of QEMU dirty bitmap.
As qemu_real_host_page_size >= TARGET_PAGE_SIZE (kvm_init()
enforced it), misuse TARGET_PAGE_SIZE to init kvmslot dirty_bmap
may waste memory. For example, when qemu_real_host_page_size is
64K and TARGET_PAGE_SIZE is 4K, it wastes 93.75% (15/16) memory.
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201217014941.22872-2-zhukeqian1@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When icount is enabled and we recompile an MMIO access we end up
double counting the instruction execution. To avoid this we introduce
the CF_MEMI cflag which only allows memory instrumentation for the
next TB (which won't yet have been counted). As this is part of the
hashed compile flags we will only execute the generated TB while
coming out of a cpu_io_recompile.
While we are at it delete the old TODO. We might as well keep the
translation handy as it's likely you will repeatedly hit it on each
MMIO access.
Reported-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Aaron Lindsay <aaron@os.amperecomputing.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210213130325.14781-21-alex.bennee@linaro.org>
Now we no longer generate CF_NOCACHE blocks we can remove a bunch of
the special case handling for them. While we are at it we can remove
the unused tb->orig_tb field and save a few bytes on the TB structure.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210213130325.14781-20-alex.bennee@linaro.org>
There is no real need to use CF_NOCACHE here. As long as the TB isn't
linked to other TBs or included in the QHT or jump cache then it will
only get executed once.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210213130325.14781-19-alex.bennee@linaro.org>
Again there is no reason to jump through the nocache hoops to execute
a single instruction block. We do have to add an additional wrinkle to
the cpu_handle_interrupt case to ensure we let through a TB where we
have specifically disabled icount for the block.
As the last user of cpu_exec_nocache we can now remove the function.
Further clean-up will follow in subsequent patches.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210213130325.14781-18-alex.bennee@linaro.org>
When we exit a block under icount with instructions left to execute we
might need a shorter than normal block to take us to the next
deterministic event. Instead of creating a throwaway block on demand
we use the existing compile flags mechanism to ensure we fetch (or
compile and fetch) a block with exactly the number of instructions we
need.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210213130325.14781-17-alex.bennee@linaro.org>
Move the code from accel/tcg/translate-all.c to target/sh4/cpu.c.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210208233906.479571-5-richard.henderson@linaro.org>
Message-Id: <20210213130325.14781-14-alex.bennee@linaro.org>
Move the code from accel/tcg/translate-all.c to target/mips/cpu.c.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210208233906.479571-4-richard.henderson@linaro.org>
Message-Id: <20210213130325.14781-13-alex.bennee@linaro.org>
Create a hook in which to split out the mips and
sh4 ifdefs from cpu_io_recompile.
[AJB: s/stoped/stopped/]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210208233906.479571-3-richard.henderson@linaro.org>
Message-Id: <20210213130325.14781-12-alex.bennee@linaro.org>
A recent change to the handling of constants in TCG changed the
pattern of ops emitted for a constant add. We no longer emit a mov and
the constant can be applied directly to the TCG_op_add arguments. This
was causing SEGVs when running the insn plugin with arg=inline. Fix
this by updating copy_add_i64 to do the right thing while also adding
a comment at the top of the append section as an aide memoir if
something like this happens again.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: Emilio G. Cota <cota@braap.org>
Message-Id: <20210213130325.14781-10-alex.bennee@linaro.org>
An SEV-ES guest does not allow register state to be altered once it has
been measured. When an SEV-ES guest issues a reboot command, Qemu will
reset the vCPU state and resume the guest. This will cause failures under
SEV-ES. Prevent that from occuring by introducing an arch-specific
callback that returns a boolean indicating whether vCPUs are resettable.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <1ac39c441b9a3e970e9556e1cc29d0a0814de6fd.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When SEV-ES is enabled, it is not possible modify the guests register
state after it has been initially created, encrypted and measured.
Normally, an INIT-SIPI-SIPI request is used to boot the AP. However, the
hypervisor cannot emulate this because it cannot update the AP register
state. For the very first boot by an AP, the reset vector CS segment
value and the EIP value must be programmed before the register has been
encrypted and measured. Search the guest firmware for the guest for a
specific GUID that tells Qemu the value of the reset vector to use.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <22db2bfb4d6551aed661a9ae95b4fdbef613ca21.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The places that use these are better off using untagged
addresses, so do not provide a tagged versions. Rename
to make it clear about the address type.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use g2h_untagged in contexts that have no cpu, e.g. the binary
loaders that operate before the primary cpu is created. As a
colollary, target_mmap and friends must use untagged addresses,
since they are used by the loaders.
Use g2h_untagged on values returned from target_mmap, as the
kernel never applies a tag itself.
Use g2h_untagged on all pc values. The only current user of
tags, aarch64, removes tags from code addresses upon branch,
so "pc" is always untagged.
Use g2h with the cpu context on hand wherever possible.
Use g2h_untagged in lock_user, which will be updated soon.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This data can be allocated by page_alloc_target_data() and
released by page_set_flags(start, end, prot | PAGE_RESET).
This data will be used to hold tag memory for AArch64 MTE.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In cpu_exec() we have a longstanding workaround for compilers which
do not correctly implement the part of the sigsetjmp()/siglongjmp()
spec which requires that local variables which are not changed
between the setjmp and the longjmp retain their value.
I recently ran across the upstream clang bug report for this; add a
link to it to the comment describing the workaround, and generally
expand the comment, so that we have a reasonable chance in future of
understanding why it's there and determining when we can remove it,
assuming clang eventually fixes the bug.
Remove the /* buggy compiler */ comments on the #else and #endif:
they don't add anything to understanding and are somewhat misleading
since they're sandwiching the code path for *non*-buggy compilers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20210129130330.30820-1-peter.maydell@linaro.org
Sometimes interrupt event comes at the same time with
the virtual timers. In this case replay tries to proceed
the timers, because deadline for them is zero.
This patch allows processing interrupts and exceptions
by entering the vCPU execution loop, when deadline is zero,
but checkpoint associated with virtual timers is not ready
to be replayed.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <161216312794.2030770.1709657858900983160.stgit@pasha-ThinkPad-X280>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The kvm_vm_ioctl() wrapper already returns -errno if the ioctl itself
returned -1, so the callers of kvm_vm_ioctl() should not check for -1
but for a value < 0 instead.
This problem has been fixed once already in commit b533f658a9
but that commit missed that the ENOENT error code is not fatal for
this ioctl, so the commit has been reverted in commit 50212d6346
since the problem occurred close to a pending release at that point
in time. The plan was to fix it properly after the release, but it
seems like this has been forgotten. So let's do it now finally instead.
Resolves: https://bugs.launchpad.net/qemu/+bug/1294227
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210129084354.42928-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While we've abstracted some (potential) differences between mechanisms for
securing guest memory, the initialization is still specific to SEV. Given
that, move it into x86's kvm_arch_init() code, rather than the generic
kvm_init() code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
This allows failures to be reported richly and idiomatically.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Currently the "memory-encryption" property is only looked at once we
get to kvm_init(). Although protection of guest memory from the
hypervisor isn't something that could really ever work with TCG, it's
not conceptually tied to the KVM accelerator.
In addition, the way the string property is resolved to an object is
almost identical to how a QOM link property is handled.
So, create a new "confidential-guest-support" link property which sets
this QOM interface link directly in the machine. For compatibility we
keep the "memory-encryption" property, but now implemented in terms of
the new property.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
When AMD's SEV memory encryption is in use, flash memory banks (which are
initialed by pc_system_flash_map()) need to be encrypted with the guest's
key, so that the guest can read them.
That's abstracted via the kvm_memcrypt_encrypt_data() callback in the KVM
state.. except, that it doesn't really abstract much at all.
For starters, the only call site is in code specific to the 'pc'
family of machine types, so it's obviously specific to those and to
x86 to begin with. But it makes a bunch of further assumptions that
need not be true about an arbitrary confidential guest system based on
memory encryption, let alone one based on other mechanisms:
* it assumes that the flash memory is defined to be encrypted with the
guest key, rather than being shared with hypervisor
* it assumes that that hypervisor has some mechanism to encrypt data into
the guest, even though it can't decrypt it out, since that's the whole
point
* the interface assumes that this encrypt can be done in place, which
implies that the hypervisor can write into a confidential guests's
memory, even if what it writes isn't meaningful
So really, this "abstraction" is actually pretty specific to the way SEV
works. So, this patch removes it and instead has the PC flash
initialization code call into a SEV specific callback.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
add a new optional interface to CPUClass, which allows accelerators
to extend the CPUClass with additional accelerator-specific
initializations.
This will allow to separate the target cpu code that is specific
to each accelerator, and register it automatically with object
hierarchy lookup depending on accelerator code availability,
as part of the accel_init_interfaces() initialization step.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210204163931.7358-19-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will allow us to centralize the registration of
the cpus.c module accelerator operations (in accel/accel-softmmu.c),
and trigger it automatically using object hierarchy lookup from the
new accel_init_interfaces() initialization step, depending just on
which accelerators are available in the code.
Rename all tcg-cpus.c, kvm-cpus.c, etc to tcg-accel-ops.c,
kvm-accel-ops.c, etc, matching the object type names.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210204163931.7358-18-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
we cannot in principle make the TCG Operations field definitions
conditional on CONFIG_TCG in code that is included by both common_ss
and specific_ss modules.
Therefore, what we can do safely to restrict the TCG fields to TCG-only
builds, is to move all tcg cpu operations into a separate header file,
which is only included by TCG, target-specific code.
This leaves just a NULL pointer in the cpu.h for the non-TCG builds.
This also tidies up the code in all targets a bit, having all TCG cpu
operations neatly contained by a dedicated data struct.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210204163931.7358-16-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>