The code which patches x86 jump instructions assumes it can do an
unaligned write of a uint32_t. This is actually safe on x86, but it's
still undefined behaviour. We have infrastructure for doing efficient
unaligned accesses which doesn't engage in undefined behaviour, so
use it.
This is technically fractionally less efficient, at least with gcc 4.6;
instead of one instruction:
7b2: 89 3e mov %edi,(%rsi)
we get an extra spurious store to the stack slot:
7b2: 89 7c 24 64 mov %edi,0x64(%rsp)
7b6: 89 3e mov %edi,(%rsi)
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The ARM A64 decoder's worst case number of TCG ops per instruction
is 266 (for insn 0x4c800000, a post-indexed ST4 multiple-structures
store). Raise the MAX_OP_PER_INSTR define accordingly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1394822294-14837-17-git-send-email-peter.maydell@linaro.org
There is a HOST_PAGE_ALIGN macro which makes sense for KVM accelerator
but it uses qemu_host_page_size/qemu_host_page_mask which initialized
for TCG only.
This moves qemu_host_page_size/qemu_host_page_mask initialization from
TCG's page_init() and adds a call for it from kvm_init().
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
phys_mem_alloc and its assigned values qemu_anon_ram_alloc and
legacy_s390_alloc must have identical argument lists.
legacy_s390_alloc uses the size parameter to call mmap, so size_t is
good enough for all of them.
This patch fixes compiler errors on i686 Linux hosts:
CC alpha-softmmu/exec.o
exec.c:752:51: error:
initialization from incompatible pointer type [-Werror]
exec.c: In function 'qemu_ram_alloc_from_ptr':
exec.c:1139:32: error:
comparison of distinct pointer types lacks a cast [-Werror]
exec.c: In function 'qemu_ram_remap':
exec.c:1283:21: error:
comparison of distinct pointer types lacks a cast [-Werror]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1380481005-32399-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
A minimal update to use the new helpers with the return address argument.
Tested-by: Claudio Fontana <claudio.fontana@linaro.org>
Reviewed-by: Claudio Fontana <claudio.fontana@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use the new helper_ret_*_mmu routines. Use a conditional call
to arrange for a tail-call from the store path, and to load the
return address for the helper for the load path.
Signed-off-by: Richard Henderson <rth@twiddle.net>
These use a 32-bit load-of-immediate to save a mflr+addi+mtlr sequence.
Tested with a Windows 98 guest (pretty much the most recent thing I
could run on my PPC machine) and kvm-unit-tests's sieve.flat. The
speed up for sieve.flat is as high as 10% for qemu-system-i386, 25%
(no kidding) for qemu-system-x86_64 on my PowerBook G4.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Make it a generic hook rather than a KVM hook. Less code and
ifdeffery.
Since the only user of the hook is old S390 KVM, there's hope we can
get rid of it some day.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
The _cmmu helpers can be moved to exec-all.h. The helpers that are
used from TCG will shortly need access to tcg_target_long so move
their declarations into tcg.h.
This requires minor include adjustments to all TCG backends.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Always define GETRA; use __builtin_extract_return_addr, rather than
having a special case for s390. Split GETPC_ADJ out of GETPC; use 2
universally, rather than having a special case for arm.
Rename GETPC_LDST to GETRA_LDST to indicate that it does not
contain the GETPC_ADJ value. Likewise with GETPC_EXT to GETRA_EXT.
Perform the GETPC_ADJ adjustment inside helper_ret_ld/st. This will
allow backends to pass along the "true" return address rather than
the massaged GETPC value. In the meantime, double application of
GETPC_ADJ does not hurt, since the call insn in all ISAs is at least
4 bytes long.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Indeed, remove it entirely and remove the is_tcg_gen_code check
from GETPC_EXT.
Fixes https://bugs.launchpad.net/qemu/+bug/1218098 wherein a call
to a "normal" helper function performed a sequence of tail calls
all the way into the memory helper functions, leading to a stack
frame in which the memory helper function appeared to be called
directly from tcg.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Discontinue the jump-around-jump-to-jump scheme, trading it for a single
immediate move instruction. The two extra jumps always consume 7 bytes,
whereas the immediate move is either 5 or 7 bytes depending on where the
code_gen_buffer gets located.
Signed-off-by: Richard Henderson <rth@twiddle.net>
add preliminary support for TCG target aarch64.
Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 51A5C596.3090108@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move the slow path out of line, as the TODO's mention.
This allows the fast path to be unconditional, which can
speed up the fast path as well, depending on the core.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Explictly NULL it on CPU reset since it was located before breakpoints.
Change vapic_report_tpr_access() argument to CPUState. This also
resolves the use of void* for cpu.h independence.
Change vAPIC patch_instruction() argument to X86CPU.
Signed-off-by: Andreas Färber <afaerber@suse.de>
It's worth to clean-up translation blocks variables and move them
into one context as was suggested by Swirl.
Also if we use this context directly inside tcg_ctx, then it
speeds up code generation a bit.
Signed-off-by: Evgeny Voevodin <evgenyvoevodin@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
s390x-linux-user now also uses GETPC. Instead of adding it to the list of
targets which use GETPC, the macro is now defined unconditionally.
This avoids future build regressions like this one:
CC s390x-linux-user/target-s390x/int_helper.o
cc1: warnings being treated as errors
qemu/target-s390x/int_helper.c: In function ‘helper_divs32’:
qemu/target-s390x/int_helper.c:47: error: implicit declaration of function ‘GETPC’
qemu/target-s390x/int_helper.c:47: error: nested extern declaration of ‘GETPC’
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>