Use mmap_lock in user-mode to protect TCG state and the page descriptors.
In !user-mode, each vCPU has its own TCG state, so no locks needed.
Per-page locks are used to protect the page descriptors.
Per-TB locks are used in both modes to protect TB jumps.
Some notes:
- tb_lock is removed from notdirty_mem_write by passing a
locked page_collection to tb_invalidate_phys_page_fast.
- tcg_tb_lookup/remove/insert/etc have their own internal lock(s),
so there is no need to further serialize access to them.
- do_tb_flush is run in a safe async context, meaning no other
vCPU threads are running. Therefore acquiring mmap_lock there
is just to please tools such as thread sanitizer.
- Not visible in the diff, but tb_invalidate_phys_page already
has an assert_memory_lock.
- cpu_io_recompile is !user-only, so no mmap_lock there.
- Added mmap_unlock()'s before all siglongjmp's that could
be called in user-mode while mmap_lock is held.
+ Added an assert for !have_mmap_lock() after returning from
the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic.
Performance numbers before/after:
Host: AMD Opteron(tm) Processor 6376
ubuntu 17.04 ppc64 bootup+shutdown time
700 +-+--+----+------+------------+-----------+------------*--+-+
| + + + + + *B |
| before ***B*** ** * |
|tb lock removal ###D### *** |
600 +-+ *** +-+
| ** # |
| *B* #D |
| *** * ## |
500 +-+ *** ### +-+
| * *** ### |
| *B* # ## |
| ** * #D# |
400 +-+ ** ## +-+
| ** ### |
| ** ## |
| ** # ## |
300 +-+ * B* #D# +-+
| B *** ### |
| * ** #### |
| * *** ### |
200 +-+ B *B #D# +-+
| #B* * ## # |
| #* ## |
| + D##D# + + + + |
100 +-+--+----+------+------------+-----------+------------+--+-+
1 8 16 Guest CPUs 48 64
png: https://imgur.com/HwmBHXe
debian jessie aarch64 bootup+shutdown time
90 +-+--+-----+-----+------------+------------+------------+--+-+
| + + + + + + |
| before ***B*** B |
80 +tb lock removal ###D### **D +-+
| **### |
| **## |
70 +-+ ** # +-+
| ** ## |
| ** # |
60 +-+ *B ## +-+
| ** ## |
| *** #D |
50 +-+ *** ## +-+
| * ** ### |
| **B* ### |
40 +-+ **** # ## +-+
| **** #D# |
| ***B** ### |
30 +-+ B***B** #### +-+
| B * * # ### |
| B ###D# |
20 +-+ D ##D## +-+
| D# |
| + + + + + + |
10 +-+--+-----+-----+------------+------------+------------+--+-+
1 8 16 Guest CPUs 48 64
png: https://imgur.com/iGpGFtv
The gains are high for 4-8 CPUs. Beyond that point, however, unrelated
lock contention significantly hurts scalability.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These were named incorrectly, going so far as to invade strace.list.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180607184844.30126-2-richard.henderson@linaro.org>
[lv: replace tabs by spaces]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Remove a "#if defined(XX) || defined(YY) || ..." with all available
targets
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180529194207.31503-16-laurent@vivier.eu>
add a per target target_fcntl.h and include the generic one from them
No code change.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180529194207.31503-2-laurent@vivier.eu>
linux-user/syscall.c:9860:17: warning: Call to function 'strcpy' is insecure as it does not provide bounding of the memory buffer. Replace unbounded copy functions with analogous functions that support length arguments such as 'strlcpy'. CWE-119
strcpy (buf->machine, cpu_to_uname_machine(cpu_env));
^~~~~~
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20170724182751.18261-32-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Extend special registers to 64-bits. This is in preparation for
MFSE/MTSE, moves to and from extended special registers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Values defined for sparc are not correct.
Copy the content of "arch/sparc/include/uapi/asm/socket.h"
to fix them.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180519092956.15134-8-laurent@vivier.eu>
to be like in the kernel and rename it TARGET_ARCH_HAS_SOCKET_TYPES
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180519092956.15134-7-laurent@vivier.eu>
Change conditional #ifdef part by #undef of the symbols
redefined for PPC relative to generic/socket.h
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180519092956.15134-6-laurent@vivier.eu>
and include the file from architectures without specific definitions
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180519092956.15134-5-laurent@vivier.eu>
Coverity points out that there's a missing break in the switch in
host_to_target_cmsg() where we update tgt_len for
cmsg_level/cmsg_type combinations which require a different length
for host and target (CID 1385425). To avoid duplicating the default
case (target length same as host) in both switches, set that before
the switch so that only the cases which want to override it need any
code.
This fixes a bug where we would have used the wrong length
for SOL_SOCKET/SO_TIMESTAMP messages where the target and
host have differently sized 'struct timeval' (ie one is 32
bit and the other is 64 bit).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180518184715.29833-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
cpu_init() was replaced by cpu_create() since 2.12 but comments
weren't updated. So update stale comments to point that page
sizes arei actually initialized by tcg_exec_init(). Also move
another qemu_host_page_size related comment before tcg_exec_init()
where it belongs.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1526557877-293151-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
"sun4" is not recognized by config.guess.
linux defines sparc and sparc64 in arch/sparc/Makefile.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Message-Id: <20180509231123.20864-7-laurent@vivier.eu>
As l_type values (F_RDLCK, F_WRLCK, F_UNLCK, F_EXLCK, F_SHLCK)
are not bitmasks, we can't use target_to_host_bitmask() and
host_to_target_bitmask() to convert them.
Introduce target_to_host_flock() and host_to_target_flock()
to convert values between host and target.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Message-Id: <20180509231123.20864-5-laurent@vivier.eu>
And kill sys_aplib, add sys_sync_file_range:
on sparc, since linux 2.6.17, aplib syscall has been replaced
by sync_file_range syscall.
(289eee6fa78e ["SPARC]: Wire up sys_sync_file_range() into syscall tables.")
The syscall has been removed in linux v2.5.71
(6196166fad "[SPARC64]: Kill sys_aplib.")
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Message-Id: <20180509231123.20864-4-laurent@vivier.eu>
include/uapi/asm-generic/fcntl.h insert a padding macro at
the end of the structures flock and flock64.
This macro is defined to "short __unused;" on sparc,
and "long pad[4]" on mips.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Message-Id: <20180509231123.20864-3-laurent@vivier.eu>
The insns in the ARMv8.1-Atomics are added to the existing
load/store exclusive and load/store reg opcode spaces.
Rearrange the top-level decoders for these to accomodate.
The Atomics insns themselves still generate Unallocated.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180508151437.4232-8-richard.henderson@linaro.org
[PMM: Drop the ARM_FEATURE_V8_1 feature flag]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since commit 8efb2ed5ec ("linux-user: Correct signedness of
target_flock l_start and l_len fields"), flock64 structure uses
abi_llong for l_start and l_len in place of "unsigned long long"
this should force them to be aligned accordingly to the target
rules. So we can remove the padding field and the QEMU_PACKED
attribute.
I have compared the result of the following program before and
after the change:
cat -> flock64_dump <<EOF
p/d sizeof(struct target_flock64)
p/d &((struct target_flock64 *)0)->l_type
p/d &((struct target_flock64 *)0)->l_whence
p/d &((struct target_flock64 *)0)->l_start
p/d &((struct target_flock64 *)0)->l_len
p/d &((struct target_flock64 *)0)->l_pid
quit
EOF
for file in build/all/*-linux-user/qemu-* ; do
echo $file
gdb -batch -nx -x flock64_dump $file 2> /dev/null
done
The sizeof() changes because we remove the QEMU_PACKED.
The new size is 32 (except for i386 and m68k) and this is
the real size of "struct flock64" on the target architecture.
The following architectures differ:
aarch64_be, aarch64, alpha, armeb, arm, cris, hppa, nios2, or1k,
riscv32, riscv64, s390x.
For a subset of these architectures, I have checked with the following
program the new structure is the correct one:
#include <stdio.h>
#define __USE_LARGEFILE64
#include <fcntl.h>
int main(void)
{
printf("struct flock64 %d\n", sizeof(struct flock64));
printf("l_type %d\n", &((struct flock64 *)0)->l_type);
printf("l_whence %d\n", &((struct flock64 *)0)->l_whence);
printf("l_start %d\n", &((struct flock64 *)0)->l_start);
printf("l_len %d\n", &((struct flock64 *)0)->l_len);
printf("l_pid %d\n", &((struct flock64 *)0)->l_pid);
}
[I have checked aarch64, alpha, hppa, s390x]
For ARM, the target_flock64 becomes the EABI definition, so we need to
define the OABI one in place of the EABI one and use it when it is
needed.
I have also fixed the alignment value for sh4 (to align llong on 4 bytes)
(see c2e3dee6e0 "linux-user: Define target alignment size")
[We should check alignment properties for cris, nios2 and or1k]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180502215730.28162-1-laurent@vivier.eu>
The FDPIC restorer needs to deal with a function descriptor, hence we
have to extend 'retcode' such that it can hold the instructions needed
to perform this.
The restorer sequence uses the same thumbness as the exception
handler (mainly to support Thumb-only architectures).
Co-Authored-By: Mickaël Guêné <mickael.guene@st.com>
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180430080404.7323-5-christophe.lyon@st.com>
[lv: moved the change to linux-user/arm/signal.c]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Add FDPIC info into image_info structure since interpreter info is on
stack and needs to be saved to be accessed later on.
Co-Authored-By: Mickaël Guêné <mickael.guene@st.com>
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180430080404.7323-4-christophe.lyon@st.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
We want to avoid code disabled by default, because it ends up less
tested. This patch removes all instances of #ifdef CONFIG_USE_FDPIC,
most of which can be safely kept. For the ones that should be
conditionally executed, we define elf_is_fdpic(). Without this patch,
defining CONFIG_USE_FDPIC would prevent QEMU from building precisely
because elf_is_fdpic is not defined.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180430080404.7323-2-christophe.lyon@st.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>