We can directly test bits of a 32-bit comparison without
zero or sign-extending an intermediate result.
We can directly test bit 0 for odd/even.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use 'v' for a variable that needs copying, 't' for a temp that
doesn't need copying, and 'i' for an immediate, and use this
naming for both arguments of the comparison. So:
cond_make_tmp -> cond_make_tt
cond_make_0_tmp -> cond_make_ti
cond_make_0 -> cond_make_vi
cond_make -> cond_make_vv
Pass 0 explictly, rather than implicitly in the function name.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is a first step in enabling CF_PCREL, but for now
we regenerate the absolute address before writeback.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Wrap offset and space together in one structure, ensuring
that they're copied together as required.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This simplifies callers, which might otherwise have
to make another copy.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Using umax is clearer than the same operation using movcond.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This allows unification of BE, BLR, BV, BVE with a common helper.
Since we can now track space with IAQ_Next, we can now let the
TranslationBlock continue across the delay slot with BE, BVE.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move space assighments to a central location.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add variable to track space changes to IAQ. So far, no such changes
are introduced, but the new checks vs ctx->iasq_b may eliminate an
unnecessary copy to cpu_iasq_f with e.g. BLR.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Minimize the amount of code in hppa_tr_translate_insn advancing the
insn queue for the next insn. Move the goto_tb path to hppa_tr_tb_stop.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We no longer have to allocate a temp and perform an
addition before translation of the rest of the insn.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a common routine for writing the return address.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of two separate cpu_iaoq_entry calls, use one call to update
both IAQ_Front and IAQ_Back. Simplify with an argument combination
that automatically handles a simple increment from Front to Back.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The generic tcg driver will have already checked for breakpoints.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Simplify the function by not attempting a conditional move
on the branch destination -- just use nullify_over normally.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass a displacement instead of an absolute value.
In trans_be, remove the user-only do_dbranch case. The branch we are
attempting to optimize is to the zero page, which is perforce on a
different page than the code currently executing, which means that
we will *not* use a goto_tb. Use a plain indirect branch instead,
which is what we got out of the attempted direct branch anyway.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Share this check between gen_goto_tb and hppa_tr_translate_insn.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This function is for log_pc(), which needs to produce a
similar result to cpu_get_tb_cpu_state().
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that boards are enabled by default and the "CONFIG_FOO=y"
entries are gone from configs/devices/, there cannot be any more
a conflicts between the default contents of configs/devices/
and a failed "depends on" clause.
With this change, each individual board or target can express
whether it needs FDT. It can then include the common code in the
build via "select DEVICE_TREE", which will also as tell meson to link
with libfdt.
This allows building non-microvm x86 emulators without having
libfdt available.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The local APIC is a part of the CPU and has callbacks that are invoked
from multiple accelerators.
The IOAPIC on the other hand is optional, but ioapic_eoi_broadcast is
used by common x86 code to implement the IOAPIC's implicit EOI mode.
Add a stub in case the IOAPIC device is not included but the APIC is.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-13-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240509170044.190795-5-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This function has no dependency on the virtio-ccw machine type, though it
assumes that the CPU address corresponds to the core_id and the index.
If there is any need of something different or more fancy (unlikely)
S390 can include a MachineClass subclass and implement it there. For
now, move it to sigp.c for simplicity.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240509170044.190795-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The VMX feature bit depends on general availability of WAITPKG,
not the other way round.
Fixes: 33cc88261c ("target/i386: add support for VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE", 2023-08-28)
Cc: qemu-stable@nongnu.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are trivial to add, and moving them to the new decoder fixes some
corner cases: raising #UD instead of an instruction fetch page fault for
the undefined opcodes, and incorrectly rejecting 0F 18 prefetches with
register operands (which are treated as reserved NOPs).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reject 0x66/0xf3/0xf2 in front of them.
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to the manual, 32-bit vs 64-bit is governed by REX.W
and REX ignores the 0x66 prefix. This can be confirmed with this
program:
#include <stdio.h>
int main()
{
int x = 0x12340000;
int y;
asm("popcntl %1, %0" : "=r" (y) : "r" (x)); printf("%x\n", y);
asm("mov $-1, %0; .byte 0x66; popcntl %1, %0" : "+r" (y) : "r" (x)); printf("%x\n", y);
asm("mov $-1, %0; .byte 0x66; popcntq %q1, %q0" : "+r" (y) : "r" (x)); printf("%x\n", y);
}
which prints 5/ffff0000/5 on real hardware and 5/ffff0000/ffff0000
on QEMU.
Cc: qemu-stable@nongnu.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The PCOMMIT instruction was never included in any physical processor.
TCG implements it as a no-op instruction, but its utility is debatable
to say the least. Drop it from the decoder since it is only available
with "-cpu max", which does not guarantee migration compatibility
across versions, and deprecate the property just in case someone is
using it as "pcommit=off".
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the CONSTRAINT_TRANSACTIONAL_EXE (cte) and TRANSACTIONAL_EXE (te)
to the list of deprecated features.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240429191059.11806-3-walling@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Retain a list of deprecated features disjoint from any particular
CPU model. A query-cpu-model-expansion reply will now provide a list of
properties (i.e. features) that are flagged as deprecated. Example:
{
"return": {
"model": {
"name": "z14.2-base",
"deprecated-props": [
"bpb",
"csske"
],
"props": {
"pfmfi": false,
"exrl": true,
...a lot more props...
"skey": false,
"vxpdeh2": false
}
}
}
}
It is recommended that s390 guests operate with these features
explicitly disabled to ensure compatibility with future hardware.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240429191059.11806-2-walling@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
-----BEGIN PGP SIGNATURE-----
iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZjyDAgAKCRBAov/yOSY+
33cfA/4jE0x+eLAT161caSwM3wBOfZRClfUhXdkxLP6GvWbACVQ8l0rEZiw2PuI8
DFReU2gqs7wAfYKt7Yy62xXlCw1B3aSUzE45gS2TGIP1GqKBwigvpW4i1SgiOoMX
4TA+GG16KgR9zaxO48bjjyJ1epc7S3SxdAL09p2U08D9EdSwCA==
=RLFu
-----END PGP SIGNATURE-----
Merge tag 'pull-loongarch-20240509' of https://gitlab.com/gaosong/qemu into staging
pull-loongarch-20240509
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZjyDAgAKCRBAov/yOSY+
# 33cfA/4jE0x+eLAT161caSwM3wBOfZRClfUhXdkxLP6GvWbACVQ8l0rEZiw2PuI8
# DFReU2gqs7wAfYKt7Yy62xXlCw1B3aSUzE45gS2TGIP1GqKBwigvpW4i1SgiOoMX
# 4TA+GG16KgR9zaxO48bjjyJ1epc7S3SxdAL09p2U08D9EdSwCA==
# =RLFu
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 09 May 2024 10:02:10 AM CEST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240509' of https://gitlab.com/gaosong/qemu:
target/loongarch: Put cpucfg operation before CSR register
target/loongarch: Add TCG macro in structure CPUArchState
hw/loongarch: Refine default numa id calculation
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
On Loongarch, cpucfg is register for cpu feature, some other registers
depend on cpucfg feature such as perf CSR registers. Here put cpucfg
read/write operations before CSR register, so that KVM knows how many
perf CSR registers are valid from pre-set cpucfg feature information.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240428031651.1354587-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
In structure CPUArchState some struct elements are only used in TCG
mode, and it is not used in KVM mode. Macro CONFIG_TCG is added to
make it simpiler in KVM mode, also there is the same modification
in c code when these structure elements are used.
When VM runs in KVM mode, TLB entries are not used and do not need
migrate. It is only useful when it runs in TCG mode.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240506011912.2108842-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
QEMU headers are relative to the include/ directory,
not to the project root directory. Remove "include/".
See also:
https://www.qemu.org/docs/master/devel/style.html#include-directives
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240507142737.95735-1-philmd@linaro.org>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Trivially safe because the argument was directly from sizeof.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210903174510.751630-27-philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmY5z/QUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroP1YQf/WMAoB/lR31fzu/Uh36hF1Ke/NHNU
gefqKRAol6xJXxavKH8ym9QMlCTzrCLVt0e8RalZH76gLqYOjRhSLSSL+gUo5HEo
lsGSfkDAH2pHO0ZjQUkXcjJQQKkH+4+Et8xtyPc0qmq4uT1pqQZRgOeI/X/DIFNb
sMoKaRKfj+dB7TSp3qCSOp77RqL13f4QTP8mUQ4XIfzDDXdTX5n8WNLnyEIKjoar
ge4U6/KHjM35hAjCG9Av/zYQx0E084r2N2OEy0ESYNwswFZ8XYzTuL4SatN/Otf3
F6eQZ7Q7n6lQbTA+k3J/jR9dxiSqVzFQnL1ePGoe9483UnxVavoWd0PSgw==
=jCyB
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* target/i386/tcg: conversion of one byte opcodes to table-based decoder
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmY5z/QUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroP1YQf/WMAoB/lR31fzu/Uh36hF1Ke/NHNU
# gefqKRAol6xJXxavKH8ym9QMlCTzrCLVt0e8RalZH76gLqYOjRhSLSSL+gUo5HEo
# lsGSfkDAH2pHO0ZjQUkXcjJQQKkH+4+Et8xtyPc0qmq4uT1pqQZRgOeI/X/DIFNb
# sMoKaRKfj+dB7TSp3qCSOp77RqL13f4QTP8mUQ4XIfzDDXdTX5n8WNLnyEIKjoar
# ge4U6/KHjM35hAjCG9Av/zYQx0E084r2N2OEy0ESYNwswFZ8XYzTuL4SatN/Otf3
# F6eQZ7Q7n6lQbTA+k3J/jR9dxiSqVzFQnL1ePGoe9483UnxVavoWd0PSgw==
# =jCyB
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 May 2024 11:53:40 PM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (26 commits)
target/i386: remove duplicate prefix decoding
target/i386: split legacy decoder into a separate function
target/i386: decode x87 instructions in a separate function
target/i386: remove now-converted opcodes from old decoder
target/i386: port extensions of one-byte opcodes to new decoder
target/i386: move BSWAP to new decoder
target/i386: move remaining conditional operations to new decoder
target/i386: merge and enlarge a few ranges for call to disas_insn_new
target/i386: move C0-FF opcodes to new decoder (except for x87)
target/i386: generalize gen_movl_seg_T0
target/i386: move 60-BF opcodes to new decoder
target/i386: allow instructions with more than one immediate
target/i386: extract gen_far_call/jmp, reordering temporaries
target/i386: move 00-5F opcodes to new decoder
target/i386: reintroduce debugging mechanism
target/i386: cleanup *gen_eob*
target/i386: clarify the "reg" argument of functions returning CCPrepare
target/i386: do not use s->T0 and s->T1 as scratch registers for CCPrepare
target/i386: extend cc_* when using them to compute flags
target/i386: pull cc_op update to callers of gen_jmp_rel{,_csize}
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that a bulk of opcodes go through the new decoder, it is sensible
to do some cleanup. Go immediately through disas_insn_new and only jump
back after parsing the prefixes.
disas_insn() now only contains the three sigsetjmp cases, and they
are more easily managed if they are inlined into i386_tr_translate_insn.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Split the bits that have some duplication with disas_insn_new, from
those that should be the main topic of the conversion. This is the
first step towards removing duplicate decoding of prefixes between
disas_insn and disas_insn_new.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are unlikely to be converted to the table-based decoding
soon (perhaps there could be generic ESC decoding in decode-new.c.inc
for the Mod/RM byte, but not operand decoding), so keep them separate
from the remaining legacy-decoded instructions.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Send all converted opcodes to disas_insn_new() directly from the big
decoding switch statement; once more, the debugging/bisecting logic
disappears.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A few two-byte opcodes are simple extensions of existing one-byte opcodes;
they are easy to decode and need no change to emit.c.inc. Port them to
the new decoder.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move long-displacement Jcc, SETcc and CMOVcc to the new decoder.
While filling in the tables makes the code seem longer, the new
emitters are all just one line of code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since new opcodes are not going to be added in translate.c, round the
case labels that call to disas_insn_new(), including whole sets of
eight opcodes when possible.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The shift instructions are rewritten instead of reusing code from the old
decoder. Rotates use CC_OP_ADCOX more extensively and generally rely
more on the optimizer, so that the code generators are shared between
the immediate-count and variable-count cases.
In particular, this makes gen_RCL and gen_RCR pretty efficient for the
count == 1 case, which becomes (apart from a few extra movs) something like:
(compute_cc_all if needed)
// save old value for OF calculation
mov cc_src2, T0
// the bulk of RCL is just this!
deposit T0, cc_src, T0, 1, TARGET_LONG_BITS - 1
// compute carry
shr cc_dst, cc_src2, length - 1
and cc_dst, cc_dst, 1
// compute overflow
xor cc_src2, cc_src2, T0
extract cc_src2, cc_src2, length - 1, 1
32-bit MUL and IMUL are also slightly more efficient on 64-bit hosts.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the new decoder it is sometimes easier to put the segment
in T1 instead of T0, usually because another operand was loaded
by common code in T0. Genrealize gen_movl_seg_T0 to allow
using any source.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>