Each slot in a packet can be assigned to at most one instruction.
Although the assembler generally ought to enforce this rule, we better
be safe than sorry and also do some check to properly throw an "invalid
packet" exception on wrong slot assignments.
This should also make it easier to debug possible future errors caused
by missing updates to `find_iclass_slots()` rules in
target/hexagon/iclass.c.
Co-authored-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <f8b829443523568823d062adf8bf6659bc6d4a3f.1683552984.git.quic_mathbern@quicinc.com>
The Hexagon PRM says that "The assembler automatically encodes
instructions in the packet in the proper order. In the binary encoding
of a packet, the instructions must be ordered from Slot 3 down to
Slot 0."
Prior to the architecture version v73, the slot constraints from
instruction "hintjr" only allowed it to be executed at slot 2.
With that in mind, consider the packet:
{
hintjr(r0)
nop
nop
if (!p0) memd(r1+#0) = r1:0
}
To satisfy the ordering rule quoted from the PRM, the assembler would,
thus, move one of the nops to the first position, so that it can be
assigned to slot 3 and the subsequent hintjr to slot 2.
However, since v73, hintjr can be executed at either slot 2 or 3. So
there is no need to reorder that packet and the assembler will encode it
as is. When QEMU tries to execute it, however, we end up hitting a
"misaliged store" exception because both the store and the hintjr will
be assigned to store 0, and some functions like `slot_is_predicated()`
expect the decode machinery to assign only one instruction per slot. In
particular, the mentioned function will traverse the packet until it
finds the first instruction at the desired slot which, for slot 0, will
be hintjr. Since hintjr is not predicated, the result is that we try to
execute the store regardless of the predicate. And because the predicate
is false, we had not previously loaded hex_store_addr[0] or
hex_store_width[0]. As a result, the store will decide de width based on
trash memory, causing it to be misaligned.
Update the slot constraints for hintjr so that QEMU can properly handle
such encodings.
Note: to avoid similar-but-not-identical issues in the future, we should
look for multiple instructions at the same slot during decoding time and
throw an invalid packet exception. That will be done in the subsequent
commit.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <0fcd8293642c6324119fbbab44741164bcbd04fb.1673616964.git.quic_mathbern@quicinc.com>
Currently, qemu-hexagon only models the v67 cpu. Nonetheless if we try
to get this information with `-cpu help`, qemu just exists with an error
code and no output. Let's correct that.
The code is basically a copy from target/alpha/cpu.h, but we strip the
"-hexagon-cpu" suffix before printing. This is to avoid confusing
situations like the following:
$ qemu-hexagon -cpu help
Available CPUs:
v67-hexagon-cpu
$ qemu-hexagon -cpu v67-hexagon-cpu ./prog
qemu-hexagon: unable to find CPU model 'v67-hexagon-cpu'
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <b946e17c7e17eed9095700b54c5ead36e5d55dfa.1683225804.git.quic_mathbern@quicinc.com>
Currently, the python scripts used for the hexagon building will not
abort the compilation when there is an error parsing a register. Let's
make the compilation properly fail in such cases by rasing an exception
instead of just printing a warning message, which might get lost in the
output.
This patch was generated with:
git grep -l "Bad register" *hexagon* | \
xargs sed -i "" -e 's/print("Bad register parse: "[, ]*\([^)]*\))/hex_common.bad_register(\1)/g'
Plus the bad_register() helper added to hex_common.py.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <1f5dbd92f68fdd89e2647e4ba527a2c32cf0f070.1683217043.git.quic_mathbern@quicinc.com>
**** Changes in v2 ****
Fix yyassert's for sign and zero extends
Coverity reports a parameter that is "set but never used". This is caused
by an assignment operator being used instead of equality.
Co-authored-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230428204411.1400931-1-tsimpson@quicinc.com>
**** Changes in v3 ****
Fix bugs exposed by dpmpyss_rnd_s0 instruction
Set correct size/signedness for constants
Test cases added to tests/tcg/hexagon/misc.c
**** Changes in v2 ****
Fix bug in imm_print identified in clang build
Currently, idef-parser skips all floating point instructions. However,
there are some floating point instructions that can be handled.
The following instructions are now parsed
F2_sfimm_p
F2_sfimm_n
F2_dfimm_p
F2_dfimm_n
F2_dfmpyll
F2_dfmpylh
To make these instructions work, we fix some bugs in parser-helpers.c
gen_rvalue_extend
gen_cast_op
imm_print
lexer properly sets size/signedness of constants
Test cases added to tests/tcg/hexagon/fpstuff.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Tested-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230501203125.4025991-1-tsimpson@quicinc.com>
The following items in the CPUHexagonState are only used for bookkeeping
within the translation of a packet. With recent changes that eliminate
the need to free TCGv variables, these make more sense to be transient
and kept in DisasContext.
The following items are moved
dczero_addr
branch_taken
this_PC
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-22-tsimpson@quicinc.com>
The pkt_has_store_s1 field is only used for bookkeeping helpers with
a load. With recent changes that eliminate the need to free TCGv
variables, it makes more sense to make this transient.
These helpers already take the instruction slot as an argument. We
combine the slot and pkt_has_store_s1 into a single argument called
slotval.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-21-tsimpson@quicinc.com>
The pred_written variable in the CPUHexagonState is only used for
bookkeeping within the translation of a packet. With recent changes
that eliminate the need to free TCGv variables, these make more sense
to be transient and kept in DisasContext.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-20-tsimpson@quicinc.com>
The new_pred_value array in the CPUHexagonState is only used for
bookkeeping within the translation of a packet. With recent changes
that eliminate the need to free TCGv variables, these make more sense
to be transient and kept in DisasContext.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-19-tsimpson@quicinc.com>
The new_value array in the CPUHexagonState is only used for bookkeeping
within the translation of a packet. With recent changes that eliminate
the need to free TCGv variables, these make more sense to be transient
and kept in DisasContext.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-18-tsimpson@quicinc.com>
Precursor to moving new_value from the global state to DisasContext
USR will need to stay in the global state because some helpers will
set it's value
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-17-tsimpson@quicinc.com>
The following have overrides
S2_insert
S2_insert_rp
S2_asr_r_svw_trun
A2_swiz
These instructions have semantics that write to the destination
before all the operand reads have been completed. Therefore,
the idef-parser versions were disabled with the short-circuit patch.
Test cases added to tests/tcg/hexagon/read_write_overlap.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-16-tsimpson@quicinc.com>
The generated helpers for HVX use pass-by-reference, so they can't
short-circuit when the reads/writes overlap. The instructions with
overrides are OK because they use tcg_gen_gvec_*.
We add a flag has_hvx_helper to DisasContext and extend gen_analyze_funcs
to set the flag when the instruction is an HVX instruction with a
generated helper.
We add an override for V6_vcombine so that it can be short-circuited
along with a test case in tests/tcg/hexagon/hvx_misc.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-15-tsimpson@quicinc.com>
In certain cases, we can avoid the overhead of writing to future_VRegs
and write directly to VRegs. We consider HVX reads/writes when computing
ctx->need_commit. Then, we can early-exit from gen_commit_hvx.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-14-tsimpson@quicinc.com>
In certain cases, we can avoid the overhead of writing to hex_new_pred_value
and write directly to hex_pred. We consider predicate reads/writes when
computing ctx->need_commit. The get_result_pred() function uses this
field to decide between hex_new_pred_value and hex_pred. Then, we can
early-exit from gen_pred_writes.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-13-tsimpson@quicinc.com>
In certain cases, we can avoid the overhead of writing to hex_new_value
and write directly to hex_gpr. We add need_commit field to DisasContext
indicating if the end-of-packet commit is needed. If it is not needed,
get_result_gpr() and get_result_gpr_pair() can return hex_gpr.
We pass the ctx->need_commit to helpers when needed.
Finally, we can early-exit from gen_reg_writes during packet commit.
There are a few instructions whose semantics write to the result before
reading all the inputs. Therefore, the idef-parser generated code is
incompatible with short-circuit. We tell idef-parser to skip them.
For debugging purposes, we add a cpu property to turn off short-circuit.
When the short-circuit property is false, we skip the analysis and force
the end-of-packet commit.
Here's a simple example of the TCG generated for
0x004000b4: 0x7800c020 { R0 = #0x1 }
BEFORE:
---- 004000b4
movi_i32 new_r0,$0x1
mov_i32 r0,new_r0
AFTER:
---- 004000b4
movi_i32 r0,$0x1
This patch reintroduces a use of check_for_attrib, so we remove the
G_GNUC_UNUSED added earlier in this series.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20230427230012.3800327-12-tsimpson@quicinc.com>
Have gen_analyze_funcs mark the registers that are read by the
instruction. We also mark the implicit reads using instruction
attributes.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-11-tsimpson@quicinc.com>
When generating TCG, make sure we have read all the operand registers
before writing to the destination registers.
This is a prerequesite for short-circuiting where the source and dest
operands could be the same.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-10-tsimpson@quicinc.com>
Only endloop instructions will conditionally write to a predicate.
When there is an endloop instruction, we preload the values into
new_pred_value.
The only place pred_written is needed is when HEX_DEBUG is on.
We remove the last use of check_for_attrib. However, new uses will be
introduced later in this series, so we mark it with G_GNUC_UNUSED.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-9-tsimpson@quicinc.com>
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
The following instructions are overriden
S2_cabacdecbin
SA1_cmpeqi
Remove the log_pred_write function from op_helper.c
Remove references in macros.h
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-8-tsimpson@quicinc.com>
With the overrides added in prior commits, this function is not used
Remove references in macros.h
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-7-tsimpson@quicinc.com>
These instructions have implicit reads from p0, so we don't want
them in helpers when idef-parser is off.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-6-tsimpson@quicinc.com>
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-5-tsimpson@quicinc.com>
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-4-tsimpson@quicinc.com>
Add DisasContext arg to gen_log_reg_write_pair also
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-3-tsimpson@quicinc.com>
The following instructions are added
J2_callrh
J2_junprh
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-9-tsimpson@quicinc.com>
The following instructions are added
V6_vasrvuhubrndsat
V6_vasrvuhubsat
V6_vasrvwuhrndsat
V6_vasrvwuhsat
V6_vassign_tmp
V6_vcombine_tmp
V6_vmpyuhvs
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-7-tsimpson@quicinc.com>
The following instructions are added
V6_v6mpyvubs10_vxx
V6_v6mpyhubs10_vxx
V6_v6mpyvubs10
V6_v6mpyhubs10
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-5-tsimpson@quicinc.com>
The following instructions are added
L2_loadw_aq
L4_loadd_aq
R6_release_at_vi
R6_release_st_vi
S2_storew_rl_at_vi
S4_stored_rl_at_vi
S2_storew_rl_st_vi
S4_stored_rl_st_vi
The release instructions are nop's in qemu. The others behave as
loads/stores.
The encodings for these instructions changed some "don't care" bits
L2_loadw_locked
L4_loadd_locked
S2_storew_locked
S4_stored_locked
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-3-tsimpson@quicinc.com>
Add support for the ELF flags
Move target/hexagon/cpu.[ch] to be v73
Change the compiler flag used by "make check-tcg"
The decbin instruction is removed in Hexagon v73, so check the
version before trying to compile the instruction.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-2-tsimpson@quicinc.com>
This header is supposed to be private to tcg and in fact
does not need to be included here at all.
Reviewed-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Mark all memory operations that are not already marked with UNALIGN.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In gen_ldx/gen_stx, the only two locations for memory operations,
mark the operation as either aligned (softmmu) or unaligned
(user-only, as if emulated by the kernel).
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The opposite of MO_UNALN is MO_ALIGN.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Memory operations that are not already aligned, or otherwise
marked up, require addition of ctx->default_tcg_memop_mask.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Case was accidentally dropped in b7a94da955.
Tested-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This list requires a NULL terminator.
Fixes: 16f5396cec ("target/loongarch: Add LSX data type VReg")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230510062405.127260-1-richard.henderson@linaro.org>
Adds the support for AMD EPYC Genoa generation processors. The model
display for the new processor will be EPYC-Genoa.
Adds the following new feature bits on top of the feature bits from
the previous generation EPYC models.
avx512f : AVX-512 Foundation instruction
avx512dq : AVX-512 Doubleword & Quadword Instruction
avx512ifma : AVX-512 Integer Fused Multiply Add instruction
avx512cd : AVX-512 Conflict Detection instruction
avx512bw : AVX-512 Byte and Word Instructions
avx512vl : AVX-512 Vector Length Extension Instructions
avx512vbmi : AVX-512 Vector Byte Manipulation Instruction
avx512_vbmi2 : AVX-512 Additional Vector Byte Manipulation Instruction
gfni : AVX-512 Galois Field New Instructions
avx512_vnni : AVX-512 Vector Neural Network Instructions
avx512_bitalg : AVX-512 Bit Algorithms, add bit algorithms Instructions
avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
Quadword Instructions
avx512_bf16 : AVX-512 BFLOAT16 instructions
la57 : 57-bit virtual address support (5-level Page Tables)
vnmi : Virtual NMI (VNMI) allows the hypervisor to inject the NMI
into the guest without using Event Injection mechanism
meaning not required to track the guest NMI and intercepting
the IRET.
auto-ibrs : The AMD Zen4 core supports a new feature called Automatic IBRS.
It is a "set-and-forget" feature that means that, unlike e.g.,
s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS mitigation
resources automatically across CPL transitions.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <20230504205313.225073-8-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following featute bits.
vnmi: Virtual NMI (VNMI) allows the hypervisor to inject the NMI into the
guest without using Event Injection mechanism meaning not required to
track the guest NMI and intercepting the IRET.
The presence of this feature is indicated via the CPUID function
0x8000000A_EDX[25].
automatic-ibrs :
The AMD Zen4 core supports a new feature called Automatic IBRS.
It is a "set-and-forget" feature that means that, unlike e.g.,
s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS mitigation
resources automatically across CPL transitions.
The presence of this feature is indicated via the CPUID function
0x80000021_EAX[8].
The documention for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. AMD64 Architecture Programmer’s Manual Volumes 1–5 Publication No. Revision
40332 4.05 Date October 2022
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
Message-Id: <20230504205313.225073-7-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following feature bits for EPYC-Milan model and bump the version.
vaes : Vector VAES(ENC|DEC), VAES(ENC|DEC)LAST instruction support
vpclmulqdq : Vector VPCLMULQDQ instruction support
stibp-always-on : Single Thread Indirect Branch Prediction Mode has enhanced
performance and may be left Always on
amd-psfd : Predictive Store Forward Disable
no-nested-data-bp : Processor ignores nested data breakpoints
lfence-always-serializing : LFENCE instruction is always serializing
null-sel-clr-base : Null Selector Clears Base. When this bit is
set, a null segment load clears the segment base
These new features will be added in EPYC-Milan-v2. The "-cpu help" output
after the change will be.
x86 EPYC-Milan (alias configured by machine type)
x86 EPYC-Milan-v1 AMD EPYC-Milan Processor
x86 EPYC-Milan-v2 AMD EPYC-Milan Processor
The documentation for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. SECURITY ANALYSIS OF AMD PREDICTIVE STORE FORWARDING
c. AMD64 Architecture Programmer’s Manual Volumes 1–5 Publication No. Revision
40332 4.05 Date October 2022
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
Message-Id: <20230504205313.225073-6-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following feature bits.
no-nested-data-bp : Processor ignores nested data breakpoints.
lfence-always-serializing : LFENCE instruction is always serializing.
null-sel-cls-base : Null Selector Clears Base. When this bit is
set, a null segment load clears the segment base.
The documentation for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. AMD64 Architecture Programmer’s Manual Volumes 1–5 Publication No. Revision
40332 4.05 Date October 2022
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
Message-Id: <20230504205313.225073-5-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following feature bits.
amd-psfd : Predictive Store Forwarding Disable:
PSF is a hardware-based micro-architectural optimization
designed to improve the performance of code execution by
predicting address dependencies between loads and stores.
While SSBD (Speculative Store Bypass Disable) disables both
PSF and speculative store bypass, PSFD only disables PSF.
PSFD may be desirable for the software which is concerned
with the speculative behavior of PSF but desires a smaller
performance impact than setting SSBD.
Depends on the following kernel commit:
b73a54321ad8 ("KVM: x86: Expose Predictive Store Forwarding Disable")
stibp-always-on :
Single Thread Indirect Branch Prediction mode has enhanced
performance and may be left always on.
The documentation for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. SECURITY ANALYSIS OF AMD PREDICTIVE STORE FORWARDING
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Message-Id: <20230504205313.225073-4-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce new EPYC cpu versions: EPYC-v4 and EPYC-Rome-v3.
The only difference vs. older models is an updated cache_info with
the 'complex_indexing' bit unset, since this bit is not currently
defined for AMD and may cause problems should it be used for
something else in the future. Setting this bit will also cause
CPUID validation failures when running SEV-SNP guests.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230504205313.225073-3-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
New EPYC CPUs versions require small changes to their cache_info's.
Because current QEMU x86 CPU definition does not support versioned
cach_info, we would have to declare a new CPU type for each such case.
To avoid the dup work, add "cache_info" in X86CPUVersionDefinition",
to allow new cache_info pointers to be specified for a new CPU version.
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230504205313.225073-2-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230504122810.4094787-45-gaosong@loongson.cn>
Introduce set_fpr() and get_fpr() and remove cpu_fpr.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230504122810.4094787-44-gaosong@loongson.cn>