Xtensa binaries built for call0 ABI don't rotate register window on
function calls and returns. Invocation of signal handlers from the
kernel is therefore different in windowed and call0 ABIs.
There's currently no way to determine xtensa ELF binary ABI from the
binary itself. Add handler for the -xtensa-abi-call0 command line
parameter/QEMU_XTENSA_ABI_CALL0 envitonment variable to the qemu-user
and record ABI choice. Use it to initialize PS.WOE in xtensa_cpu_reset.
Check PS.WOE in setup_rt_frame to determine how a signal should be
delivered.
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Message-Id: <20190906165713.5558-1-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch moves the define of target access alignment earlier from
target/foo/cpu.h to configure.
Suggested in Richard Henderson's reply to "[PATCH 1/4] tcg: TCGMemOp is now
accelerator independent MemOp"
Signed-off-by: Tony Nguyen <tony.nguyen@bt.com>
Message-Id: <11e818d38ebc40e986cfa62dd7d0afdc@tpw09926dag18e.domain1.systemhost.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: tony.nguyen@bt.com <tony.nguyen@bt.com>
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
This macro is now always empty, so remove it. This leaves the
entire contents of CPUArchState under the control of the guest
architecture.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Nothing in there so far, but all of the plumbing done
within the target ArchCPU state.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we have ArchCPU, we can define this generically,
in the one place that needs it.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Cleanup in the boilerplate that each target must define.
Replace xtensa_env_get_cpu with env_archcpu. The combination
CPU(xtensa_env_get_cpu) should have used ENV_GET_CPU to begin;
use env_cpu now.
Move cpu_get_tb_cpu_state below the include of "exec/cpu-all.h"
so that the definition of env_cpu is available.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we have both ArchCPU and CPUArchState, we can define
this generically instead of via macro in each target's cpu.h.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For all targets, do this just before including exec/cpu-all.h.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For all targets, do this just before including exec/cpu-all.h.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For all targets, into this new file move TARGET_LONG_BITS,
TARGET_PAGE_BITS, TARGET_PHYS_ADDR_SPACE_BITS,
TARGET_VIRT_ADDR_SPACE_BITS, and NB_MMU_MODES.
Include this new file from exec/cpu-defs.h.
This now removes the somewhat odd requirement that target/arch/cpu.h
defines TARGET_LONG_BITS before including exec/cpu-defs.h, so push the
bulk of the includes within target/arch/cpu.h to the top.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The Exclusive Instructions provide a general-purpose mechanism for
atomic updates of memory-based synchronization variables that can be
used for exclusion algorithms.
Use cmpxchg-based implementation that is sufficient for the typical use
of exclusive access in atomic operations.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
The Memory Protection Unit Option (MPU) is a combined instruction and
data memory protection unit with more protection flexibility than the
Region Protection Option or the Region Translation Option but without
any translation capability. It does no demand paging and does not
reference a memory-based page table.
Add memory protection unit option, internal state, SRs and opcodes.
Implement MPU entries dumping in dump_mmu.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Add SRs and rsr/wsr/xsr opcodes defined by the parity/ECC xtensa option.
The implementation is trivial since we don't emulate parity/ECC yet.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
IDMA and scatter/gather features introduced new IRQ types that
overlay_tool.h need to initialize Xtensa configuration.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Remove declarations of the internal mmu_helper functions from the cpu.h,
make these functions static and shuffle them.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
SR numbers are not unique: different Xtensa options may reuse SR number
for different purposes. Introduce generic rsr/wsr functions and xsr
template and use them instead of centralized SR access functions. Change
prototypes of specific rsr/wsr functions to match XtensaOpcodeOp and use
them instead of centralized SR access functions. Put xtensa option that
introduces SR into the second opcode description parameter and use it to
test for rsr/wsr/xsr opcode validity. Extract SR and UR names for the
xtensa_cpu_dump_state from libisa. Merge SRs and URs in the dump.
Register names of used SR/UR in init_libisa and use these names for TCG
globals referencing these SR/UR.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
CPUClass method dump_statistics() takes an fprintf()-like callback and
a FILE * to pass to it. Most callers pass fprintf() and stderr.
log_cpu_state() passes fprintf() and qemu_log_file.
hmp_info_registers() passes monitor_fprintf() and the current monitor
cast to FILE *. monitor_fprintf() casts it right back, and is
otherwise identical to monitor_printf().
The callback gets passed around a lot, which is tiresome. The
type-punning around monitor_fprintf() is ugly.
Drop the callback, and call qemu_fprintf() instead. Also gets rid of
the type-punning, since qemu_fprintf() takes NULL instead of the
current monitor cast to FILE *.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-15-armbru@redhat.com>
The various dump_mmu() take an fprintf()-like callback and a FILE * to
pass to it, and so do their helper functions. Passing around callback
and argument is rather tiresome.
Most dump_mmu() are called only by the target's hmp_info_tlb(). These
all pass monitor_printf() cast to fprintf_function and the current
monitor cast to FILE *.
SPARC's dump_mmu() gets also called from target/sparc/ldst_helper.c a
few times #ifdef DEBUG_MMU. These calls pass fprintf() and stdout.
The type-punning is technically undefined behaviour, but works in
practice. Clean up: drop the callback, and call qemu_printf()
instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-11-armbru@redhat.com>
The various TARGET_cpu_list() take an fprintf()-like callback and a
FILE * to pass to it. Their callers (vl.c's main() via list_cpus(),
bsd-user/main.c's main(), linux-user/main.c's main()) all pass
fprintf() and stdout. Thus, the flexibility provided by the (rather
tiresome) indirection isn't actually used.
Drop the callback, and call qemu_printf() instead.
Calling printf() would also work, but would make the code unsuitable
for monitor context without making it simpler.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190417191805.28198-10-armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Load/store opcodes may raise MMU exceptions. Normally exceptions should
be checked in priority order before any actual operations, but since MMU
exceptions are tightly coupled with actual memory access, there's
currently no way to do it.
Approximate this behavior by executing all load, then all store, and
then all other opcodes in the FLIX bundles. Use opcode dependency
mechanism to express ordering. Mark load/store opcodes with
XTENSA_OP_{LOAD,STORE} flags. Newer libisa has classifier functions that
can tell whether opcode is a load or store, but this information is not
available in the existing overlays.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
To support circular register dependencies in FLIX bundles opcode inputs
and outputs must be separate and adjustable. Circular dependencies can
be broken by making temporary copies of opcode inputs and substituting
them into the arguments array instead of the original registers.
E.g. the circular register dependency in the following bundle:
{ mov a2, a3 ; mov a3, a2 }
can be resolved by making copy a2' = a2 and substituting it as input
argument of the second opcode:
{ mov a2, a3 ; mov a3, a2' }
Change opcode translator prototype to accept OpcodeArg array as
argument. For each register argument initialize OpcodeArg::{in,out} with
TCGv_* of the respective register. Don't explicitly use cpu_R in the
opcode translators, use OpcodeArg::{in,out} instead.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Opcodes that modify WINDOW_BASE SR don't have dependency on opcodes that
use windowed registers. If such opcodes are combined in a single
instruction they may not be correctly ordered. Instead of adding said
dependency use temporary register to store changed WINDOW_BASE value and
do actual register window rotation as a postprocessing step.
Not all opcodes that change WINDOW_BASE need this: retw, rfwo and rfwu
are also jump opcodes, so they are guaranteed to be translated last and
thus will not affect other opcodes in the same instruction.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Some opcodes may need additional actions at every exit from the
translated instruction or may need to amend TB exit slots available to
jumps generated for the instruction. Add gen_postprocess function and
call it from the gen_jump_slot and from the disas_xtensa_insn.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Opcodes in different slots may read and write same resources (registers,
states). In the absence of resource dependency loops it must be possible
to sort opcodes to avoid interference.
Record resources used by each opcode in the bundle. Build opcode
dependency graph and use topological sort to order its nodes. In case of
success translate opcodes in sort order. In case of failure report and
raise invalid opcode exception.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
There are opcodes that differ only in encoding or possible range of
immediate arguments. Allow multiple names for single opcode translation
table entry to reduce code duplication in that case.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Requirement for alphabetical opcode sorting in opcode tables is awkward
and does not allow sharing implementation between multiple opcodes.
Use hash tables to find opcodes by name. Move implementation from the
translate.c to the helper.c to its only user and remove declaration from
the cpu.h
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Don't run xtensa_finalize_config at the time of core registration,
instead run it at the CPU class initialization.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Use libisa to extract whether opcode uses windowed registers and
construct mask based on that. This only leaves special case for the
'entry' opcode, as it needs to probe a register dynamically.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Runstall signal looks very much like a level-triggered IRQ line. Provide
xtensa_get_runstall function that returns runstall IRQ.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Replace xtensa_get_extint that returns single external IRQ descriptor
with xtensa_get_extints that returns a vector of all external IRQs.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
It's a one-liner used in a single place, move its implementation there
and remove its declaration.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Don't invalidate TB with the end of zero overhead loop when LBEG or LEND
change. Instead encode the distance from the start of the page where the
TB starts to the LEND in the TB cs_base and generate loopback code when
the next PC matches encoded LEND. Distance to a destination within the
same page and up to a maximum instruction length into the next page is
encoded literally, otherwise it's zero. The distance from LEND to LBEG
is also encoded in the cs_base: it's encoded literally when less than
256 or as 0 otherwise. This allows for TB chaining for the loopback
branch at the end of a loop for the most common loop sizes.
With this change the resulting emulation speed is about 10% higher in
softmmu mode on uClibc-ng and LTP tests. Emulation speed in linux
user mode is a few percent lower because there's no direct TB chaining
between different memory pages. Testing with lower limit on direct TB
chaining range shows gradual slowdown to ~15% for the block size of 64
bytes and ~50% for the block size of 32 bytes.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
- add XtensaOpcodeOps::coprocessor with bitmask of coprocessors used by
the instruction;
- replace coprocessor id parameter of gen_check_cpenable with the
bitmask of used coprocessors;
- collect coprocessor IDs used by an instruction in the disassembly
loop;
- put test for coprocessor disabled exception after the alloca test;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- add ps.callinc to the TB flags, that allows testing all instructions
for window overflow statically;
- drop gen_window_check* functions; replace them with get_window_check
that accepts bitmask of used registers;
- add XtensaOpcodeOps::test_overflow that returns bitmask of implicitly
used registers; use it for entry and call{,x}{4,8,12};
- drop window overflow test from the entry helper;
- drop parameter 0 from translate_[di]cache and use translate_nop for
d/i cache opcodes that don't need memory accessibility check;
- add bitmask XtensaOpcodeOps::windowed_register_op that marks opcode
arguments that refer to windowed registers;
- translate windowed_register_op mask to a mask of actually used
registers in the disassembly loop;
- add check for window overflow right after the check for debug
exception;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- TB flags: add XTENSA_TBFLAG_CWOE that corresponds to the architectural
CWOE state;
- entry: move CWOE check from the helper to the test_ill_entry;
- retw: move CWOE check from the helper to the test_ill_retw;
- separate instruction disassembly loop and translation loop; save
disassembly results in local array;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- move register counting to xtensa/gdbstub.c
- add symbolic names for register types and flags from GDB and use them
in register counting and access functions.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
ISA book documents that the first instruction of zero overhead loop
must fit completely into naturally aligned region of an instruction
fetch unit size. Check that condition and log a message if it's
violated.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
cpu_init(cpu_model) were replaced by cpu_create(cpu_type) so
no users are left, remove it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc)
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1518000027-274608-6-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
it will be used for providing to cpu name resolving class for
parsing cpu model for system and user emulation code.
Along with change add target to null-machine tests, so
that when switch to CPU_RESOLVING_TYPE happens,
it would ensure that null-machine usecase still works.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu> (m68k)
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc)
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> (tricore)
Message-Id: <1518000027-274608-4-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: Added macro to riscv too]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Import list of syscalls from the kernel source. Conditionalize code/data
that is only used with softmmu. Implement exception handlers. Implement
signal hander (only the core registers for now, no coprocessors or TIE).
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- emit TCG barriers for MEMW, EXTW, S32RI and L32AI;
- do atomic_cmpxchg_i32 for S32C1I.
Cc: Emilio G. Cota <cota@braap.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
System emulation should provide access to all registers, userspace
emulation should only provide access to unprivileged registers.
Record register flags from GDB register map definition, calculate both
num_regs and num_core_regs if either is zero. Use num_regs in system
emulation, num_core_regs in userspace emulation gdbstub.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
As cpu.h is another typically widely included file which doesn't need
full access to the softfloat API we can remove the includes from here
as well. Where they do need types it's typically for float_status and
the rounding modes so we move that to softfloat-types.h as well.
As a result of not having softfloat in every cpu.h call we now need to
add it to various helpers that do need the full softfloat.h
definitions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[For PPC parts]
Acked-by: David Gibson <david@gibson.dropbear.id.au>