Commit Graph

40728 Commits

Author SHA1 Message Date
Peter Maydell
cea66e9121 target-arm: Implement AArch64 TLBI operations on IPAs
Implement the AArch64 TLBI operations which take an intermediate
physical address and invalidate stage 2 translations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1439548879-1972-7-git-send-email-peter.maydell@linaro.org
2015-08-25 16:18:33 +01:00
Peter Maydell
43efaa33fa target-arm: Implement missing EL3 TLB invalidate operations
Implement the remaining stage 1 TLB invalidate operations
visible from EL3.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1439548879-1972-6-git-send-email-peter.maydell@linaro.org
2015-08-25 16:18:33 +01:00
Peter Maydell
2bfb9d75d3 target-arm: Implement missing EL2 TLBI operations
Implement the missing TLBI operations that exist only
if EL2 is implemented.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1439548879-1972-5-git-send-email-peter.maydell@linaro.org
2015-08-25 16:18:33 +01:00
Peter Maydell
fd3ed96922 target-arm: Restrict AArch64 TLB flushes to the MMU indexes they must touch
Now we have the ability to flush the TLB only for specific MMU indexes,
update the AArch64 TLB maintenance instruction implementations to only
flush the parts of the TLB they need to, rather than doing full flushes.

We take the opportunity to remove some duplicate functions (the per-asid
tlb ops work like the non-per-asid ones because we don't support
flushing a TLB only by ASID) and to bring the function names in line
with the architectural TLBI operation names.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1439548879-1972-4-git-send-email-peter.maydell@linaro.org
2015-08-25 16:18:33 +01:00
Peter Maydell
83ddf97577 target-arm: Move TLBI ALLE1/ALLE1IS definitions into numeric order
Move the two regdefs for TLBI ALLE1 and TLBI ALLE1IS down so that the
whole set of AArch64 TLBI regdefs is arranged in numeric order.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1439548879-1972-3-git-send-email-peter.maydell@linaro.org
2015-08-25 16:18:33 +01:00
Peter Maydell
d7a74a9d4a cputlb: Add functions for flushing TLB for a single MMU index
Guest CPU TLB maintenance operations may be sufficiently
specialized to only need to flush TLB entries corresponding
to a particular MMU index. Implement cputlb functions for
this, to avoid the inefficiency of flushing TLB entries
which we don't need to.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1439548879-1972-2-git-send-email-peter.maydell@linaro.org
2015-08-25 16:18:33 +01:00
Peter Maydell
14db7fe09a target-arm: Implement AArch32 ATS1H* operations
Implement the AArch32 ATS1H* operations which perform
Hyp mode stage 1 translations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1437751263-21913-6-git-send-email-peter.maydell@linaro.org
2015-08-25 15:45:08 +01:00
Peter Maydell
87562e4f4a target-arm: Enable the AArch32 ATS12NSO ops
Apply the correct conditions in the ats_access() function for
the ATS12NSO* address translation operations:
 * succeed at EL2 or EL3
 * normal UNDEF trap from NS EL1
 * trap to EL3 from S EL1 (only possible if EL3 is AArch64)

(This change means they're now available in our EL3-supporting
CPUs when they would previously always UNDEF.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1437751263-21913-5-git-send-email-peter.maydell@linaro.org
2015-08-25 15:45:08 +01:00
Peter Maydell
e76157264d target-arm: Add CP_ACCESS_TRAP_UNCATEGORIZED_EL2, 3
Some coprocessor register access functions need to be able
to report "trap to EL3 with an 'uncategorized' syndrome";
add the necessary CPAccessResult enum and handling for it.

I don't currently know of any registers that need to trap
to EL2 with the 'uncategorized' syndrome, but adding the
_EL2 enum as well is trivial and fills in what would
otherwise be an odd gap in the handling.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1437751263-21913-4-git-send-email-peter.maydell@linaro.org
2015-08-25 15:45:08 +01:00
Peter Maydell
2a47df9532 target-arm: Wire up AArch64 EL2 and EL3 address translation ops
Wire up the AArch64 EL2 and EL3 address translation operations
(AT S12E1*, AT S12E0*, AT S1E2*, AT S1E3*), and correct some
errors in the ats_write64() function in previously unused code
that would have done the wrong kind of lookup for accesses from
EL3 when SCR.NS==0.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1437751263-21913-3-git-send-email-peter.maydell@linaro.org
2015-08-25 15:45:08 +01:00
Peter Maydell
d0a2cbceb2 target-arm: there is no TTBR1 for 32-bit EL2 stage 1 translations
For EL2 stage 1 translations, there is no TTBR1. We were already
handling this for 64-bit EL2; add the code to take the 'no TTBR1'
code path for 64-bit EL2 as well.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1437751263-21913-2-git-send-email-peter.maydell@linaro.org
2015-08-25 15:45:08 +01:00
Peter Maydell
834a6c6920 target-arm: Implement missing ACTLR registers
We already implemented ACTLR_EL1; add the missing ACTLR_EL2 and
ACTLR_EL3, for consistency.

Since we don't currently have any CPUs that need the EL2/EL3
versions to reset to non-zero values, implement as RAZ/WI.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1438281398-18746-5-git-send-email-peter.maydell@linaro.org
2015-08-25 15:45:07 +01:00
Peter Maydell
37cd6c2478 target-arm: Implement missing AFSR registers
The AFSR registers are implementation dependent auxiliary fault
status registers. We already implemented a RAZ/WI AFSR0_EL1 and
AFSR_EL1; add the missing AFSR{0,1}_EL{2,3} for consistency.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1438281398-18746-4-git-send-email-peter.maydell@linaro.org
2015-08-25 15:45:07 +01:00
Peter Maydell
2179ef958c target-arm: Implement missing AMAIR registers
The AMAIR registers are for providing auxiliary implementation
defined memory attributes. We already implemented a RAZ/WI
AMAIR_EL1; add the EL2 and EL3 versions for consistency.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1438281398-18746-3-git-send-email-peter.maydell@linaro.org
2015-08-25 15:45:07 +01:00
Peter Maydell
4cfb8ad896 target-arm: Add missing MAIR_EL3 and TPIDR_EL3 registers
Add the AArch64 registers MAIR_EL3 and TPIDR_EL3, which are the only
two which we had implemented the 32-bit Secure equivalents of but
not the 64-bit Secure versions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1438281398-18746-2-git-send-email-peter.maydell@linaro.org
2015-08-25 15:45:07 +01:00
Alistair Francis
137805f5d8 MAINTAINERS: Add ZynqMP to MAINTAINERS file
Add the Xilinx ZynqMP SoC and EP108 machine to the maintainers
file.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: fed078103a0b02cfb3adadbe8e80e4420d554505.1436486024.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-25 15:45:07 +01:00
Alistair Francis
4b46ba6145 MAINTAINERS: Update Xilinx Maintainership
Peter C is leaving Xilinx, so update the maintainer list
to point to Alistair and Edgar from Xilinx and Peter's
personal email address.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 54b4c070452bac05aa3a9c1d75899bc097fef831.1436486024.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-25 15:45:06 +01:00
Alistair Francis
6675d71915 xlnx-zynqmp: Connect the four OCM banks
The Xilinx EP108 has four separate OCM banks which are located
adjacent to each other. This patch adds the four banks to
the ZynqMP SoC.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: afa6ba31163a5d541a0bef4b0dc11f2597e0c495.1436813543.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-25 15:45:06 +01:00
Peter Maydell
34a4450434 queued tcg patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV22RbAAoJEK0ScMxN0CebzgkIAICUc/zbs+jSMa3vBH3nH0QP
 v8//Ek73dH14B7BnKGIHA/9VWSpGq8HEHcRQfEOiu5pHZK/9XpwuxLTs2u7O9BVZ
 n+XcAazeF7eikMScsEaMpFkPpKOcIv5QkwZaX90/sBMsGw3+nAR0+nBpqX2rIn2J
 C+3YMMd4WgCG0SvV4rdxfZtFEh9NsHUddZKznyX4zzDYcGYSq0e6plKEhTt+u5zB
 GJrfqkRWYg6bHe0p18u9o8kL3BSyLCWDuj3rXw8vSlkoybPHN/XsEMZ3LzZ6NVrV
 Omx5ubAe2EwxOlHXD/N8wc+euQl3g0ZR2nd9j/KGcGDuNfziGgRo0l4sSF0LNx8=
 =AI/e
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20150824' into staging

queued tcg patches

# gpg: Signature made Mon 24 Aug 2015 19:37:15 BST using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-tcg-20150824:
  linux-user: remove useless macros GUEST_BASE and RESERVED_VA
  linux-user: remove --enable-guest-base/--disable-guest-base
  tcg/aarch64: Use softmmu fast path for unaligned accesses
  tcg/s390: Use softmmu fast path for unaligned accesses
  tcg/ppc: Improve unaligned load/store handling on 64-bit backend
  tcg/i386: use softmmu fast path for unaligned accesses
  tcg: Remove tcg_gen_trunc_i64_i32
  tcg: Split trunc_shr_i32 opcode into extr[lh]_i64_i32
  tcg: update README about size changing ops
  tcg/optimize: add optimizations for ext_i32_i64 and extu_i32_i64 ops
  tcg: implement real ext_i32_i64 and extu_i32_i64 ops
  tcg: don't abuse TCG type in tcg_gen_trunc_shr_i64_i32
  tcg: rename trunc_shr_i32 into trunc_shr_i64_i32
  tcg/optimize: allow constant to have copies
  tcg/optimize: track const/copy status separately
  tcg/optimize: add temp_is_const and temp_is_copy functions
  tcg/optimize: optimize temps tracking
  tcg/optimize: fix constant signedness

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-25 13:34:57 +01:00
Laurent Vivier
b76f21a707 linux-user: remove useless macros GUEST_BASE and RESERVED_VA
As we have removed CONFIG_USE_GUEST_BASE, we always use a guest base
and the macros GUEST_BASE and RESERVED_VA become useless: replace
them by their values.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1440420834-8388-1-git-send-email-laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:14:30 -07:00
Laurent Vivier
4cbea59869 linux-user: remove --enable-guest-base/--disable-guest-base
All tcg host architectures now support the guest base and as
there is no real performance lost, it can be always enabled.

Anyway, guest base use can be disabled lively by setting guest
base to 0.

CONFIG_USE_GUEST_BASE is defined as (USE_GUEST_BASE && USER_ONLY),
it should have to be replaced by CONFIG_USER_ONLY in non CONFIG_USER_ONLY
parts, but as some other parts are using !CONFIG_SOFTMMU I have chosen to
use !CONFIG_SOFTMMU instead.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1440373328-9788-2-git-send-email-laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:14:17 -07:00
Richard Henderson
9ee14902bf tcg/aarch64: Use softmmu fast path for unaligned accesses
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Richard Henderson
a5e39810b9 tcg/s390: Use softmmu fast path for unaligned accesses
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Benjamin Herrenschmidt
68d45bb61c tcg/ppc: Improve unaligned load/store handling on 64-bit backend
Currently, we get to the slow path for any unaligned access in the
backend, because we effectively preserve the bottom address bits
below the alignment requirement when comparing with the TLB entry,
so any non-0 bit there will cause the compare to fail.

For the same number of instructions, we can instead add the access
size - 1 to the address and stick to clearing all the bottom bits.

That means that normal unaligned accesses will not fallback (the HW
will handle them fine). Only when crossing a page boundary well we
end up having a mismatch because we'll end up pointing to the next
page which cannot possibly be in that same TLB entry.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Message-Id: <1437455978.5809.2.camel@kernel.crashing.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Aurelien Jarno
8cc580f6a0 tcg/i386: use softmmu fast path for unaligned accesses
Softmmu unaligned load/stores currently goes through through the slow
path for two reasons:
  - to support unaligned access on host with strict alignement
  - to correctly handle accesses crossing pages

x86 is only concerned by the second reason. Unaligned accesses are
avoided by compilers, but are not uncommon. We therefore would like
to see them going through the fast path, if they don't cross pages.

For that we can use the fact that two adjacent TLB entries can't contain
the same page. Therefore accessing the TLB entry corresponding to the
first byte, but comparing its content to page address of the last byte
ensures that we don't cross pages. We can do this check without adding
more instructions in the TLB code (but increasing its length by one
byte) by using the LEA instruction to combine the existing move with the
size addition.

On an x86-64 host, this gives a 3% boot time improvement for a powerpc
guest and 4% for an x86-64 guest.

[rth: Tidied calculation of the offset mask]

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <1436467197-2183-1-git-send-email-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Richard Henderson
ecc7b3aa71 tcg: Remove tcg_gen_trunc_i64_i32
Replacing it with tcg_gen_extrl_i64_i32.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Richard Henderson
609ad70562 tcg: Split trunc_shr_i32 opcode into extr[lh]_i64_i32
Rather than allow arbitrary shift+trunc, only concern ourselves
with low and high parts.  This is all that was being used anyway.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Aurelien Jarno
870ad1547a tcg: update README about size changing ops
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Aurelien Jarno
8bcb5c8f34 tcg/optimize: add optimizations for ext_i32_i64 and extu_i32_i64 ops
They behave the same as ext32s_i64 and ext32u_i64 from the constant
folding and zero propagation point of view, except that they can't
be replaced by a mov, so we don't compute the affected value.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Aurelien Jarno
4f2331e5b6 tcg: implement real ext_i32_i64 and extu_i32_i64 ops
Implement real ext_i32_i64 and extu_i32_i64 ops. They ensure that a
32-bit value is always converted to a 64-bit value and not propagated
through the register allocator or the optimizer.

Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Stefan Weil <sw@weilnetz.de>
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Aurelien Jarno
6acd2558fd tcg: don't abuse TCG type in tcg_gen_trunc_shr_i64_i32
The tcg_gen_trunc_shr_i64_i32 function takes a 64-bit argument and
returns a 32-bit value. Directly call tcg_gen_op3 with the correct
types instead of calling tcg_gen_op3i_i32 and abusing the TCG types.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Aurelien Jarno
0632e555fc tcg: rename trunc_shr_i32 into trunc_shr_i64_i32
The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32,
and the name in the README doesn't match the name offered to the
frontends.

Always use the long name to make it clear it is a size changing op.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Aurelien Jarno
299f801304 tcg/optimize: allow constant to have copies
Now that copies and constants are tracked separately, we can allow
constant to have copies, deferring the choice to use a register or a
constant to the register allocation pass. This prevent this kind of
regular constant reloading:

-OUT: [size=338]
+OUT: [size=298]
   mov    -0x4(%r14),%ebp
   test   %ebp,%ebp
   jne    0x7ffbe9cb0ed6
   mov    $0x40002219f8,%rbp
   mov    %rbp,(%r14)
-  mov    $0x40002219f8,%rbp
   mov    $0x4000221a20,%rbx
   mov    %rbp,(%rbx)
   mov    $0x4000000000,%rbp
   mov    %rbp,(%r14)
-  mov    $0x4000000000,%rbp
   mov    $0x4000221d38,%rbx
   mov    %rbp,(%rbx)
   mov    $0x40002221a8,%rbp
   mov    %rbp,(%r14)
-  mov    $0x40002221a8,%rbp
   mov    $0x4000221d40,%rbx
   mov    %rbp,(%rbx)
   mov    $0x4000019170,%rbp
   mov    %rbp,(%r14)
-  mov    $0x4000019170,%rbp
   mov    $0x4000221d48,%rbx
   mov    %rbp,(%rbx)
   mov    $0x40000049ee,%rbp
   mov    %rbp,0x80(%r14)
   mov    %r14,%rdi
   callq  0x7ffbe99924d0
   mov    $0x4000001680,%rbp
   mov    %rbp,0x30(%r14)
   mov    0x10(%r14),%rbp
   mov    $0x4000001680,%rbp
   mov    %rbp,0x30(%r14)
   mov    0x10(%r14),%rbp
   shl    $0x20,%rbp
   mov    (%r14),%rbx
   mov    %ebx,%ebx
   mov    %rbx,(%r14)
   or     %rbx,%rbp
   mov    %rbp,0x10(%r14)
   mov    %rbp,0x90(%r14)
   mov    0x60(%r14),%rbx
   mov    %rbx,0x38(%r14)
   mov    0x28(%r14),%rbx
   mov    $0x4000220e60,%r12
   mov    %rbx,(%r12)
   mov    $0x40002219c8,%rbx
   mov    %rbp,(%rbx)
   mov    0x20(%r14),%rbp
   sub    $0x8,%rbp
   mov    $0x4000004a16,%rbx
   mov    %rbx,0x0(%rbp)
   mov    %rbp,0x20(%r14)
   mov    $0x19,%ebp
   mov    %ebp,0xa8(%r14)
   mov    $0x4000015110,%rbp
   mov    %rbp,0x80(%r14)
   xor    %eax,%eax
   jmpq   0x7ffbebcae426
   lea    -0x5f6d72a(%rip),%rax        # 0x7ffbe3d437b3
   jmpq   0x7ffbebcae426

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:54 -07:00
Aurelien Jarno
b41059dd9d tcg/optimize: track const/copy status separately
Instead of using an enum which could be either a copy or a const, track
them separately. This will be used in the next patch.

Constants are tracked through a bool. Copies are tracked by initializing
temp's next_copy and prev_copy to itself, allowing to simplify the code
a bit.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:53 -07:00
Aurelien Jarno
d9c769c609 tcg/optimize: add temp_is_const and temp_is_copy functions
Add two accessor functions temp_is_const and temp_is_copy, to make the
code more readable and make code change easier.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:53 -07:00
Aurelien Jarno
1208d7dd5f tcg/optimize: optimize temps tracking
The tcg_temp_info structure uses 24 bytes per temp. Now that we emulate
vector registers on most guests, it's not uncommon to have more than 100
used temps. This means we have initialize more than 2kB at least twice
per TB, often more when there is a few goto_tb.

Instead used a TCGTempSet bit array to track which temps are in used in
the current basic block. This means there are only around 16 bytes to
initialize.

This improves the boot time of a MIPS guest on an x86-64 host by around
7% and moves out tcg_optimize from the the top of the profiler list.

[rth: Handle TCG_CALL_DUMMY_ARG]

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:30 -07:00
Aurelien Jarno
29f3ff8d6c tcg/optimize: fix constant signedness
By convention, on a 64-bit host TCG internally stores 32-bit constants
as sign-extended. This is not the case in the optimizer when a 32-bit
constant is folded.

This doesn't seem to have more consequences than suboptimal code
generation. For instance the x86 backend assumes sign-extended constants,
and in some rare cases uses a 32-bit unsigned immediate 0xffffffff
instead of a 8-bit signed immediate 0xff for the constant -1. This is
with a ppc guest:

before
------

 ---- 0x9f29cc
 movi_i32 tmp1,$0xffffffff
 movi_i32 tmp2,$0x0
 add2_i32 tmp0,CA,CA,tmp2,r6,tmp2
 add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2
 mov_i32 r10,tmp0

0x7fd8c7dfe90c:  xor    %ebp,%ebp
0x7fd8c7dfe90e:  mov    %ebp,%r11d
0x7fd8c7dfe911:  mov    0x18(%r14),%r9d
0x7fd8c7dfe915:  add    %r9d,%r10d
0x7fd8c7dfe918:  adc    %ebp,%r11d
0x7fd8c7dfe91b:  add    $0xffffffff,%r10d
0x7fd8c7dfe922:  adc    %ebp,%r11d
0x7fd8c7dfe925:  mov    %r11d,0x134(%r14)
0x7fd8c7dfe92c:  mov    %r10d,0x28(%r14)

after
-----

 ---- 0x9f29cc
 movi_i32 tmp1,$0xffffffffffffffff
 movi_i32 tmp2,$0x0
 add2_i32 tmp0,CA,CA,tmp2,r6,tmp2
 add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2
 mov_i32 r10,tmp0

0x7f37010d490c:  xor    %ebp,%ebp
0x7f37010d490e:  mov    %ebp,%r11d
0x7f37010d4911:  mov    0x18(%r14),%r9d
0x7f37010d4915:  add    %r9d,%r10d
0x7f37010d4918:  adc    %ebp,%r11d
0x7f37010d491b:  add    $0xffffffffffffffff,%r10d
0x7f37010d491f:  adc    %ebp,%r11d
0x7f37010d4922:  mov    %r11d,0x134(%r14)
0x7f37010d4929:  mov    %r10d,0x28(%r14)

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <1436544211-2769-2-git-send-email-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-24 11:10:08 -07:00
Peter Maydell
a30878e708 configure: Don't permit SDL or GTK on OSX
The cocoa GUI frontend assumes it is the only GUI (it redefines
main() so it always gets control before the rest of QEMU), so
it does not play well with other UIs like SDL or GTK. (Mostly
people building QEMU on OSX don't have the necessary dependencies
available for configure to build those other front ends, so
mostly this problem goes unnoticed.)

Make configure automatically disable the SDL and GTK front ends
if the cocoa front end is enabled. (We were sort of attempting
to do this for SDL before, but not in a way that worked very well.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: John Arbuckle <programmingkidx@gmail.com>
Message-id: 1439565052-3457-1-git-send-email-peter.maydell@linaro.org
2015-08-19 20:29:30 +01:00
Peter Maydell
20fbcfdd58 apic_internal.h: Include cpu.h directly
apic_internal.h relies on cpu.h having been included (for the
X86CPU type); include it directly rather than relying on it
being pulled in via one of the other includes like timer.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-08-19 16:29:53 +01:00
Peter Maydell
49caffe0cc qemu-common.h: Move muldiv64() to host-utils.h
Move the muldiv64() function from qemu-common.h to host-utils.h.
This puts it together with all the other arithmetic functions
where we provide a version with __int128_t and a fallback
without, and allows headers which need muldiv64() to avoid
including qemu-common.h.

We don't include host-utils from qemu-common.h, to avoid dragging
more things into qemu-common.h than it already has; in practice
everywhere that needs muldiv64() can get it via qemu/timer.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-08-19 16:29:53 +01:00
Peter Maydell
03557b9aba osdep.h: Add header comment
Add a header comment to osdep.h, explaining what the header is for
and some rules to avoid circular-include difficulties.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-08-19 16:29:53 +01:00
Peter Maydell
bfe7e449f1 osdep.h: Move some OS header includes and fixups from qemu-common.h
qemu-common.h has some system header includes and fixups for
things that might be missing. This is really an OS dependency
and belongs in osdep.h, so move it across.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-08-19 16:29:53 +01:00
Peter Maydell
1aad8104f3 qemu-common.h: Move Win32 fixups into os-win32.h
qemu-common.h includes some fixups for things the Win32
headers don't define or define weirdly. These really
belong in os-win32.h, so move them there.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-08-19 16:29:53 +01:00
Peter Maydell
24134c4e91 compiler.h: Use glue() in QEMU_BUILD_BUG_ON define
Rather than rolling custom concatenate-strings macros for the
QEMU_BUILD_BUG_ON macro to use, use the glue() macro we already
have (since it's now available to us in this header).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-08-19 16:29:53 +01:00
Peter Maydell
4912086865 osdep.h: Move some compiler-specific things to compiler.h
osdep.h has a few things which are really compiler specific;
move them to compiler.h, and include compiler.h from osdep.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-08-19 16:29:53 +01:00
Peter Maydell
71baf787d8 osdep.h: Remove qemu_printf
qemu_printf is an ancient remnant which has been a simple #define to
printf for over a decade, and is used in only a few places. Expand
it out in those places and remove the #define.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-08-19 16:29:53 +01:00
Peter Maydell
38e20cac66 qapi/qmp-event.c: Don't manually include os-win32.h/os-posix.h
qmp-event.c already includes qemu-common.h, so manually including
os-win32.h/os-posix.h is unnecessary (and potentially fragile,
since it's duplicating the #ifdef logic that chooses which of the
two we need). Remove the unnecessary include logic.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
2015-08-19 16:29:53 +01:00
Peter Maydell
4c4a29cb68 Alpha shadow register optimization
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV03TlAAoJEK0ScMxN0CebCDUH/jBAivio/oh0MEVOy12uQ+Dm
 JWyTnde8PHId50ysHwI3+EiE0Jzk1PQv1RV+7RRSy767KOOc+WqRDGaVHN4vjK6Z
 CgocDUygM79vInaGB34KuvQ0Z+tu4rCIQxBfkCm9OsMF5h/8ZO16IA7RwUQ9GvDj
 1KGFL309D4Uw9/oqtx1SQeXwiZR7eI5buaJFfit3uZeaFwyZkkwLXfgvgoPIlvQ+
 GMMv5Ejxd5pT/2WwqsxwI0/Y2kxRqyJd0u+28S969Vg0sZ4GlMIsFvjuKqB3kSeZ
 TpOyNa7ulcywjZotp6OIlr483Sn6SGjzGej/8Qvg5mQ3fxoHTGeQa3NWfvM2+34=
 =Mg6S
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-axp-201508018' into staging

Alpha shadow register optimization

# gpg: Signature made Tue 18 Aug 2015 19:09:41 BST using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-axp-201508018:
  target-alpha: Inline hw_ret
  target-alpha: Inline call_pal
  target-alpha: Use separate TCGv temporaries for the shadow registers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-19 00:25:52 +01:00
Richard Henderson
6c05d3ded7 target-alpha: Inline hw_ret
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-18 11:08:59 -07:00
Richard Henderson
2f458b7c31 target-alpha: Inline call_pal
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-08-18 11:08:54 -07:00