Commit Graph

88 Commits

Author SHA1 Message Date
Peter Maydell
e469b22ffd Make CPU iotlb a structure rather than a plain hwaddr
Make the CPU iotlb a structure rather than a plain hwaddr;
this will allow us to add transaction attributes to it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2015-04-26 16:49:23 +01:00
Xin Tong
88e89a57f9 implementing victim TLB for QEMU system emulated TLB
QEMU system mode page table walks are expensive. Taken by running QEMU
qemu-system-x86_64 system mode on Intel PIN , a TLB miss and walking a
4-level page tables in guest Linux OS takes ~450 X86 instructions on
average.

QEMU system mode TLB is implemented using a directly-mapped hashtable.
This structure suffers from conflict misses. Increasing the
associativity of the TLB may not be the solution to conflict misses as
all the ways may have to be walked in serial.

A victim TLB is a TLB used to hold translations evicted from the
primary TLB upon replacement. The victim TLB lies between the main TLB
and its refill path. Victim TLB is of greater associativity (fully
associative in this patch). It takes longer to lookup the victim TLB,
but its likely better than a full page table walk. The memory
translation path is changed as follows :

Before Victim TLB:
1. Inline TLB lookup
2. Exit code cache on TLB miss.
3. Check for unaligned, IO accesses
4. TLB refill.
5. Do the memory access.
6. Return to code cache.

After Victim TLB:
1. Inline TLB lookup
2. Exit code cache on TLB miss.
3. Check for unaligned, IO accesses
4. Victim TLB lookup.
5. If victim TLB misses, TLB refill
6. Do the memory access.
7. Return to code cache

The advantage is that victim TLB can offer more associativity to a
directly mapped TLB and thus potentially fewer page table walks while
still keeping the time taken to flush within reasonable limits.
However, placing a victim TLB before the refill path increase TLB
refill path as the victim TLB is consulted before the TLB refill. The
performance results demonstrate that the pros outweigh the cons.

some performance results taken on SPECINT2006 train
datasets and kernel boot and qemu configure script on an
Intel(R) Xeon(R) CPU  E5620  @ 2.40GHz Linux machine are shown in the
Google Doc link below.

https://docs.google.com/spreadsheets/d/1eiItzekZwNQOal_h-5iJmC4tMDi051m9qidi5_nwvH4/edit?usp=sharing

In summary, victim TLB improves the performance of qemu-system-x86_64 by
11% on average on SPECINT2006, kernelboot and qemu configscript and with
highest improvement of in 26% in 456.hmmer. And victim TLB does not result
in any performance degradation in any of the measured benchmarks. Furthermore,
the implemented victim TLB is architecture independent and is expected to
benefit other architectures in QEMU as well.

Although there are measurement fluctuations, the performance
improvement is very significant and by no means in the range of
noises.

Signed-off-by: Xin Tong <trent.tong@gmail.com>
Message-id: 1407202523-23553-1-git-send-email-trent.tong@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-01 17:43:06 +01:00
Andreas Färber
f0c3c505a8 cpu: Move breakpoints field from CPU_COMMON to CPUState
Most targets were using offsetof(CPUFooState, breakpoints) to determine
how much of CPUFooState to clear on reset. Use the next field after
CPU_COMMON instead, if any, or sizeof(CPUFooState) otherwise.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:47 +01:00
Andreas Färber
ff4700b05c cpu: Move watchpoint fields from CPU_COMMON to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:47 +01:00
Andreas Färber
0429a97195 cpu: Move opaque field from CPU_COMMON to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:47 +01:00
Andreas Färber
27103424c4 cpu: Move exception_index field from CPU_COMMON to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:46 +01:00
Andreas Färber
6f03bef0ff cpu: Move jmp_env field from CPU_COMMON to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:46 +01:00
Andreas Färber
8cd70437f3 cpu: Move tb_jmp_cache field from CPU_COMMON to CPUState
Clear it on reset.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:46 +01:00
Andreas Färber
28ecfd7a62 cpu: Move icount_decr field from CPU_COMMON to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:46 +01:00
Andreas Färber
efee734004 cpu: Move icount_extra field from CPU_COMMON to CPUState
Reset it.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:46 +01:00
Andreas Färber
99df7dce8a cpu: Move can_do_io field from CPU_COMMON to CPUState
Rename can_do_io() to cpu_can_do_io() and change argument to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:46 +01:00
Andreas Färber
93afeade09 cpu: Move mem_io_{pc,vaddr} fields from CPU_COMMON to CPUState
Reset them.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:46 +01:00
Peter Maydell
72c1d3af6e target-arm: Implement WFE as a yield operation
Implement WFE to yield our timeslice to the next CPU.
This avoids slowdowns in multicore configurations caused
by one core busy-waiting on a spinlock which can't possibly
be unlocked until the other core has an opportunity to run.
This speeds up my test case A15 dual-core boot by a factor
of three (though it is still four or five times slower than
a single-core boot).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1393339545-22111-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Rob Herring <rob.herring@linaro.org>
2014-03-10 14:56:30 +00:00
Andreas Färber
51fb256ab5 cpu: Drop cpu_model_str from CPU_COMMON
Since this is only read in cpu_copy() and linux-user has a global
cpu_model, drop the field from generic code.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 11:48:47 +02:00
Anthony Liguori
f0ef1cf4d6 Merge remote-tracking branch 'rth/tcg-next' into staging
# By Claudio Fontana (1) and others
# Via Richard Henderson
* rth/tcg-next:
  tcg: Remove temp_buf
  tcg/aarch64: Implement tlb lookup fast path
  tcg/aarch64: implement ldst 12bit scaled uimm offset

Message-id: 1373919944-8521-1-git-send-email-rth@twiddle.net
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-07-26 13:04:21 -05:00
Andreas Färber
eac8b355f0 cpu: Move gdb_regs field from CPU_COMMON to CPUState
Prepares for changing gdb_register_coprocessor() argument to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 02:41:33 +02:00
Andreas Färber
ed2803da58 cpu: Move singlestep_enabled field from CPU_COMMON to CPUState
Prepares for changing cpu_single_step() argument to CPUState.

Acked-by: Michael Walle <michael@walle.cc> (for lm32)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 02:41:32 +02:00
Richard Henderson
a28177820a tcg: Remove temp_buf
All targets have been converted to allocating space for temporaries
on the stack.  No need to allocate space within the CPU_COMMON block.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-07-15 13:16:20 -07:00
Andreas Färber
182735efaf cpu: Make first_cpu and next_cpu CPUState
Move next_cpu from CPU_COMMON to CPUState.
Move first_cpu variable to qom/cpu.h.

gdbstub needs to use CPUState::env_ptr for now.
cpu_copy() no longer needs to save and restore cpu_next.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Rebased, simplified cpu_copy()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-09 21:32:54 +02:00
Andreas Färber
ce927ed9e4 hwaddr: Make hwaddr type usable beyond softmmu
While not normally needed for *-user, it can safely be used there since
always based on uint64_t, to avoid ifdeffery.

To avoid accidental uses, move the guards from exec/hwaddr.h to its
inclusion sites.  No need for them in include/hw/.

Prepares for hwaddr use in qom/cpu.h.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:13 +02:00
Richard Henderson
e85ef5381a tcg: Use QEMU_BUILD_BUG_ON for CPU_TLB_ENTRY_BITS
Rather than a hand-coded version of the same thing.

Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-06-05 05:54:00 -07:00
Paolo Bonzini
918fc54caf elfload: use abi_llong/ullong instead of target_llong/ullong
The alignment is a characteristic of the ABI, not the CPU.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-04-18 14:12:31 +02:00
Paolo Bonzini
6cfd9b5251 elfload: only give abi_long/ulong the alignment specified by the target
Previously, this was done for target_long/ulong, and propagated to
abi_long/ulong via a typedef.  But target_long/ulong should not
have any specific alignment, it is never used to access guest
memory.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-04-18 14:12:31 +02:00
Paolo Bonzini
f8fd4fc4cd elfload: use abi_int/uint instead of target_int/uint
The alignment is a characteristic of the ABI, not the CPU.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-04-18 14:12:31 +02:00
Paolo Bonzini
1ddd592fd3 elfload: use abi_short/ushort instead of target_short/ushort
The alignment is a characteristic of the ABI, not the CPU.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-04-18 14:12:31 +02:00
Andreas Färber
259186a7d2 cpu: Move halted and interrupt_request fields to CPUState
Both fields are used in VMState, thus need to be moved together.
Explicitly zero them on reset since they were located before
breakpoints.

Pass PowerPCCPU to kvmppc_handle_halt().

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12 10:35:55 +01:00
Peter Maydell
6ab7e5465a Replace all setjmp()/longjmp() with sigsetjmp()/siglongjmp()
The setjmp() function doesn't specify whether signal masks are saved and
restored; on Linux they are not, but on BSD (including MacOSX) they are.
We want to have consistent behaviour across platforms, so we should
always use "don't save/restore signal mask" (this is also generally
going to be faster). This also works around a bug in MacOSX where the
signal-restoration on longjmp() affects the signal mask for a completely
different thread, not just the mask for the thread which did the longjmp.
The most visible effect of this was that ctrl-C was ignored on MacOSX
because the CPU thread did a longjmp which resulted in its signal mask
being applied to every thread, so that all threads had SIGINT and SIGTERM
blocked.

The POSIX-sanctioned portable way to do a jump without affecting signal
masks is to siglongjmp() to a sigjmp_buf which was created by calling
sigsetjmp() with a zero savemask parameter, so change all uses of
setjmp()/longjmp() accordingly. [Technically POSIX allows sigsetjmp(buf, 0)
to save the signal mask; however the following siglongjmp() must not
restore the signal mask, so the pair can be effectively considered as
"sigjmp/longjmp which don't touch the mask".]

For Windows we provide a trivial sigsetjmp/siglongjmp in terms of
setjmp/longjmp -- this is OK because no user will ever pass a non-zero
savemask.

The setjmp() uses in tests/tcg/test-i386.c and tests/tcg/linux-test.c
are left untouched because these are self-contained singlethreaded
test programs intended to be run under QEMU's Linux emulation, so they
have neither the portability nor the multithreading issues to deal with.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23 16:11:19 +00:00
Andreas Färber
d77953b94f cpu: Move current_tb field to CPUState
Explictly NULL it on CPU reset since it was located before breakpoints.

Change vapic_report_tpr_access() argument to CPUState. This also
resolves the use of void* for cpu.h independence.
Change vAPIC patch_instruction() argument to X86CPU.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-02-16 14:51:00 +01:00
Andreas Färber
fcd7d0034b cpu: Move exit_request field to CPUState
Since it was located before breakpoints field, it needs to be reset.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-02-16 14:51:00 +01:00
Andreas Färber
0315c31cda cpu: Move running field to CPUState
Pass CPUState to cpu_exec_{start,end}() functions.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-02-16 14:51:00 +01:00
Andreas Färber
0d34282fdd cpu: Move host_tid field to CPUState
Change gdbstub's cpu_index() argument to CPUState now that CPUArchState
is no longer used.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-02-16 14:50:59 +01:00
Andreas Färber
249fe3f3e9 cpu-defs.h: Drop qemu_work_item prototype
Commit c64ca8140e (cpu: Move
queued_work_{first,last} to CPUState) moved the qemu_work_item fields
away. Clean up the now unused prototype.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-19 10:29:27 +00:00
Andreas Färber
55e5c28502 cpu: Move cpu_index field to CPUState
Note that target-alpha accesses this field from TCG, now using a
negative offset. Therefore the field is placed last in CPUState.

Pass PowerPCCPU to [kvm]ppc_fixup_cpu() to facilitate this change.

Move common parts of mips cpu_state_reset() to mips_cpu_reset().

Acked-by: Richard Henderson <rth@twiddle.net> (for alpha)
[AF: Rebased onto ppc CPU subclasses and openpic changes]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-01-15 04:09:13 +01:00
Andreas Färber
1b1ed8dc40 cpu: Move numa_node field to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-01-15 04:09:13 +01:00
Andreas Färber
ce3960ebe5 cpu: Move nr_{cores,threads} fields to CPUState
To facilitate the field movements, pass MIPSCPU to malta_mips_config();
avoid that for mips_cpu_map_tc() since callers only access MIPS Thread
Contexts, inside TCG helpers.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-01-15 04:09:13 +01:00
Andreas Färber
501a7ce727 Merge branch 'master' of git://git.qemu.org/qemu into qom-cpu
Adapt header include paths.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-12-23 00:40:49 +01:00
Paolo Bonzini
1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini
022c62cbbc exec: move include files to include/exec/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00