Restrict the size of code_gen_buffer to 2GB on ppc64, which
lets us assert that everything is reachable with addis+addi
from tb_ret_addr. This lets us use a max of 4 insns for goto_tb
instead of 7.
Emit the indirect branch portion of goto_tb up front, which
means we only have to update two insns to update any link.
With a 64-bit store, we can update the link atomically, which
may be required in future.
Signed-off-by: Richard Henderson <rth@twiddle.net>
We currently pre-compute an worst case code size for any TB, which
works out to be 122kB. Since the average TB size is near 1kB, this
wastes quite a lot of storage.
Instead, check for overflow in between generating code for each opcode.
The overhead of the check isn't measurable and wastage is minimized.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This will catch any overflow of the buffer.
Add a native win32 alternative for alloc_code_gen_buffer;
remove the malloc alternative.
Signed-off-by: Richard Henderson <rth@twiddle.net>
By putting the prologue at the end, we risk overwriting the
prologue should our estimate of maximum TB size. Given the
two different placements of the call to tcg_prologue_init,
move the high water mark computation into tcg_prologue_init.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We can now restore state without retranslation.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The gen_opc_* arrays are already redundant with the data stored in
the insn_start arguments. Transition restore_state_to_opc to use
data from the latter.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
As it's only caller, this tidies things a bit.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Move the size and mask globals for the "real" host page size to
translate-common. This is to allow system-level code to use
REAL_HOST_PAGE_ALIGN and friends in builds which hide translate-all
behind arch-obj.
Cc: dgilbert@redhat.com
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <b437638691f044bc690a7f03b1240c8b0f34ab57.1441614289.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move this function to common code. It has no arch specific
dependencies. Prepares support for multi-arch where the translate-all
interface needs to be virtualised. One less thing to virtualise.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <44a7c73604ed2552af47ed02b047b6a772b683e0.1441614289.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJV8Tk7AAoJEL/70l94x66Ds3QH/3bi0RRR2NtKIXAQrGo5tfuD
NPMu1K5Hy+/26AC6mEVNRh4kh7dPH5E4NnDGbxet1+osvmpjxAjc2JrxEybhHD0j
fkpzqynuBN6cA2Gu5GUNoKzxxTmi2RrEYigWDZqCftRXBeO2Hsr1etxJh9UoZw5H
dgpU3j/n0Q8s08jUJ1o789knZI/ckwL4oXK4u2KhSC7ZTCWhJT7Qr7c0JmiKReaF
JEYAsKkQhICVKRVmC8NxML8U58O8maBjQ62UN6nQpVaQd0Yo/6cstFTZsRrHMHL3
7A2Tyg862cMvp+1DOX3Bk02yXA+nxnzLF8kUe0rYo6llqDBDStzqyn1j9R0qeqA=
=nB06
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Support for jemalloc
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
# gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream: (44 commits)
cutils: work around platform differences in strto{l,ul,ll,ull}
cpu-exec: fix lock hierarchy for user-mode emulation
exec: make mmap_lock/mmap_unlock globally available
tcg: comment on which functions have to be called with mmap_lock held
tcg: add memory barriers in page_find_alloc accesses
remove unused spinlock.
replace spinlock by QemuMutex.
cpus: remove tcg_halt_cond and tcg_cpu_thread globals
cpus: protect work list with work_mutex
scripts/dump-guest-memory.py: fix after RAMBlock change
configure: Add support for jemalloc
add macro file for coccinelle
configure: factor out adding disas configure
vhost-scsi: fix wrong vhost-scsi firmware path
checkpatch: remove tests that are not relevant outside the kernel
checkpatch: adapt some tests to QEMU
CODING_STYLE: update mixed declaration rules
qmp: Add example usage of strto*l() qemu wrapper
cutils: Add qemu_strtoull() wrapper
cutils: Add qemu_strtoll() wrapper
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
My Coccinelle semantic patch finds a few more, because it also fixes up
the equally pointless conditional
if (foo) {
free(foo);
foo = NULL;
}
Result (feel free to squash it into your patch):
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
tb_lock has to be taken inside the mmap_lock (example:
tb_invalidate_phys_range is called by target_mmap), but
tb_link_page is taking the mmap_lock and it is called
with the tb_lock held.
To fix this, take the mmap_lock in tb_find_slow, not
in tb_link_page.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is some iffy lock hierarchy going on in translate-all.c. To
fix it, we need to take the mmap_lock in cpu-exec.c. Make the
functions globally available.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
page_find is reading the radix tree outside all locks, so it has to
use the RCU primitives. It does not need RCU critical sections
because the PageDescs are never removed, so there is never a need
to wait for the end of code sections that use a PageDesc.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
spinlock is only used in two cases:
* cpu-exec.c: to protect TranslationBlock
* mem_helper.c: for lock helper in target-i386 (which seems broken).
It's a pthread_mutex_t in user-mode, so we can use QemuMutex directly,
with an #ifdef. The #ifdef will be removed when multithreaded TCG
will need the mutex as well.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-Id: <1439220437-23957-5-git-send-email-fred.konrad@greensocs.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
[Merge Emilio G. Cota's patch to remove volatile. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
l1_map is based on physical addresses in full-system mode, as pointed
out in an earlier comment. Said comment also mentions that virtual
addresses are only used in l1_map in user-only mode.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1440375847-17603-11-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All tcg host architectures now support the guest base and as
there is no real performance lost, it can be always enabled.
Anyway, guest base use can be disabled lively by setting guest
base to 0.
CONFIG_USE_GUEST_BASE is defined as (USE_GUEST_BASE && USER_ONLY),
it should have to be replaced by CONFIG_USER_ONLY in non CONFIG_USER_ONLY
parts, but as some other parts are using !CONFIG_SOFTMMU I have chosen to
use !CONFIG_SOFTMMU instead.
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1440373328-9788-2-git-send-email-laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
After commit 626cf8f (icount: set can_do_io outside TB execution,
2014-12-08), can_do_io is set to 1 if not executing code. It is
no longer necessary to make this assumption in cpu_can_do_io.
It is also possible to remove the use_icount test, simply by
never setting cpu->can_do_io to 0 unless use_icount is true.
With these changes cpu_can_do_io boils down to a read of
cpu->can_do_io.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of invalidating an original TB in cpu_exec_nocache()
prematurely, just save a link to it in the temporary generated TB. If
cpu_io_recompile() is raised subsequently from the temporary TB,
invalidate the original one as well. That allows reusing the original TB
each time cpu_exec_nocache() is called to handle expired instruction
counter in icount mode.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-Id: <1435656909-29116-1-git-send-email-serge.fdrv@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All of the core-code usages of this API have the cpu pointer handy so
pass it in. There are only 3 architecture specific usages (2 of which
are commented out) which can just use ENV_GET_CPU() locally to get the
cpu pointer. The reduces core code usage of the CPU env, which brings
us closer to common-obj'ing these core files.
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Currently the "host" page size alignment API is really aligning to both
host and target page sizes. There is the qemu_real_page_size which can
be used for the actual host page size but it's missing a mask and ALIGN
macro as provided for qemu_page_size. Complete the API. This allows
system level code that cares about the host page size to use a
consistent alignment interface without having to un-needingly align to
the target page size. This also reduces system level code dependency
on the cpu specific TARGET_PAGE_SIZE.
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This is one of very few things in exec-all with a genuine CPU
architecture dependency. Move these hashing helpers to a new
header to trim exec-all.h down to a near architecture-agnostic
header.
The defs are only used by cpu-exec and translate-all which are both
arch-obj's so the new tb-hash.h has no core code usage.
Reviewed-by: Richard Henderson <rth@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <9d048b96f7cfa64a4d9c0b88e0dd2877fac51d41.1433052532.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The tb_check_watchpoint function currently assumes that all memory
access is done either directly through the TCG code or through an
helper which knows its return address. This is obviously wrong as the
helpers use cpu_ldxx/stxx_data functions to access the memory.
Instead of aborting in that case, don't try to retranslate the code, but
assume that the CPU state (and especially the program counter) has been
saved before calling the helper. Then invalidate the TB based on this
address.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
is_cpu_write_access is only set if tb_invalidate_phys_page_range is called
from tb_invalidate_phys_page_fast, and hence from notdirty_mem_write.
However:
- the code bitmap can be built directly in tb_invalidate_phys_page_fast
(unconditionally, since is_cpu_write_access would always be passed as 1);
- the virtual address is not needed to mark the page as "not containing
code" (dirty code bitmap = 1), so we can also remove that use of
is_cpu_write_access. For calls of tb_invalidate_phys_page_range
that do not come from notdirty_mem_write, the next call to
notdirty_mem_write will notice that the page does not contain code
anymore, and will fix up the TLB entry.
The parameter needs to remain in order to guard accesses to cpu->mem_io_pc.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These days modification of the TLB is done in notdirty_mem_write,
so the virtual address and env pointer as unnecessary.
The new name of the function, tlb_unprotect_code, is consistent with
tlb_protect_code.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Once address_space_translate will be called outside the BQL, the returned
MemoryRegion might disappear as soon as the RCU read-side critical section
ends. Avoid this by moving the critical section to the callers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1426684909-95030-3-git-send-email-pbonzini@redhat.com>
Here we have an open-coded byte-based bitmap implementation.
Get rid of it since there's a ulong-based implementation to be
used by all code.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since commit
b7b5233a "bsd-user/mmap.c: Don't try to override g_malloc/g_free"
the exception we make here for usermode has been unnecessary.
Get rid of it.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1428610053-26148-1-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The USE_MMAP code can fail, and the caller handles the failure
already. Let the !USE_MMAP code fail as well, for consistency.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The TARGET_HAS_ICE #define is intended to indicate whether a target-*
guest CPU implementation supports the breakpoint handling. However,
all our guest CPUs have that support (the only two which do not
define TARGET_HAS_ICE are unicore32 and openrisc, and in both those
cases the bp support is present and the lack of the #define is just
a bug). So remove the #define entirely: all new guest CPU support
should include breakpoint handling as part of the basic implementation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1420484960-32365-1-git-send-email-peter.maydell@linaro.org
Mark map_exec() with the 'unused' attribute to avoid '-Wunused-function'
warnings on clang 3.4 or later. This means we don't need to mark it
'inline', which is what we were previously using to suppress the warning
(a trick which only works with gcc, not clang).
Signed-off-by: SeokYeon Hwang <syeon.hwang@samsung.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[PMM: tweaked comment message a little]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently 'info jit' outputs half of the information to monitor and the
rest to qemu log. Dumping opcode counts to monitor as a part of 'info
jit' command doesn't sound useful. Add new monitor command 'info
opcount' that only dumps opcode counters.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Correct MIPS16/microMIPS branch size calculation in PC adjustment
needed:
- to set the value of CP0.ErrorEPC at the entry to the reset exception,
- for the purpose of branch reexecution in the context of device I/O.
Follow the approach taken in `exception_resume_pc' for ordinary, Debug
and NMI exceptions.
MIPS16 and microMIPS branches can be 2 or 4 bytes in size and that has
to be reflected in calculation. Original MIPS ISA branches, which is
where this code originates from, are always 4 bytes long, just as all
original MIPS ISA instructions.
Signed-off-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
In this case, QEMU might longjmp out of cpu-exec.c and miss the final
cleanup in cpu_exec_nocache. Do this manually through a new compile
flag.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The initial base address is miscalculated in walk_memory_regions().
It has to be shifted TARGET_PAGE_BITS more. Holder variables are
extended to target_ulong size otherwise they don't fit for MIPS N32
(a 32-bit ABI with a 64-bit address space) and qemu won't compile.
The issue led to incorrect debug output of memory maps and a
mis-formed coredumped file.
Signed-off-by: Mikhail Ilyin <m.ilin@samsung.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This adds a couple of tcg specific trace-events which are useful for
tracing execution though tcg generated blocks. It's been tested with
lttng user space tracing but is generic enough for all systems. The tcg
events are:
* translate_block - when a subject block is translated
* exec_tb - when a translated block is entered
* exec_tb_exit - when we exit the translated code
* exec_tb_nocache - special case translations
Of course we can only trace the entrance to the first block of a chain
as each block will jump directly to the next when it can. See the -d
nochain patch to allow more complete tracing at the expense of
performance.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
So that backends can use it.
Since we need the page size for efficiency, move code to compute it
out of translate-all.c and into util/oslib-win32.c.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This assures us use of J for exit_tb and goto_tb, and JAL for calling
into the generated bswap helpers.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Choosing good addresses for them means we can use JAL for helper calls.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
To be defined by the tcg backend based on the elemental unit of the ISA.
During the transition, allow TCG_TARGET_INSN_UNIT_SIZE to be undefined,
which allows us to default tcg_insn_unit to the current uint8_t.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
When checking a page range, if we found that a page was
made read-only by QEMU because it contained translated code,
we were incorrectly returning immediately after unprotecting
that page, rather than continuing to check the entire range,
so we might fail to unprotect pages later in the range, or
might incorrectly return a "success" result even if later
pages were not writable.
In particular, this could cause segfaults in a case where
signals are delivered back to back on a target architecture
which uses trampoline code in the stack frame (as AArch64
currently does). The second signal causes a segfault because
the frame cannot be written to (it was protected because
we translated and executed the restorer trampoline, and the
unprotect logic did not unprotect the whole range).
Signed-off-by: Andrei Warkentin <andrey.warkentin@gmail.com
[PMM: expanded commit message a bit]
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>