The result of (8 - 3 - vlmul) is negative when vlmul >= 6,
and results in wrong vill.
Signed-off-by: demin.han <demin.han@starfivetech.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20240225174114.5298-1-demin.han@starfivetech.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
We'll re-use the logic froim vext_get_vlmax() in 2 other occurrences in
the next patch, but first we need to make it independent of both 'cpu'
and 'vtype'. To do that, add 'vlenb', 'vsew' and 'lmul' as parameters
instead.
Adapt the two existing callers. In cpu_get_tb_cpu_state(), rename 'sew'
to 'vsew' to be less ambiguous about what we're encoding into *pflags.
In HELPER(vsetvl) the following changes were made:
- add a 'vsew' var to store vsew. Use it in the shift to get 'sew';
- the existing 'lmul' var was renamed to 'vlmul';
- add a new 'lmul' var to store 'lmul' encoded like DisasContext:lmul.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122161107.26737-12-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Use the new 'vlenb' CPU config to validate fractional LMUL. The original
comparison is done with 'vlen' and 'sew', both in bits. Adjust the shift
to use vlenb.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122161107.26737-9-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Use 'cpu->cfg.vlenb' instead of 'cpu->cfg.vlen >> 3'.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20240122161107.26737-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Use the target-specific function name in preference
to the generic name.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The operator (fwmacc16) of vfwmaccbf16.vf helper function should be
replaced by fwmaccbf16.
Fixes: adf772b0f7 ("target/riscv: Add support for Zvfbfwma extension")
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231005095734.567575-1-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-6-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Local variables shadowing other local variables or parameters make the
code needlessly hard to understand. Bugs love to hide in such code.
Evidence: "[PATCH v3 1/7] migration/rdma: Fix save_page method to fail
on polling error".
This patch removes the local variable shadowing. Tested by adding:
--extra-cflags='-Wshadow=local -Wno-error=shadow=local -Wno-error=shadow=compatible-local'
To configure
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230925043023.71448-4-alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Move some macros out of `vector_helper` and into `vector_internals`.
This ensures they can be used by both vector and vector-crypto helpers
(latter implemented in proceeding commits).
Signed-off-by: Kiran Ostrolenk <kiran.ostrolenk@codethink.co.uk>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Max Chou <max.chou@sifive.com>
Message-ID: <20230711165917.2629866-8-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Take some functions/macros out of `vector_helper` and put them in a new
module called `vector_internals`. This ensures they can be used by both
vector and vector-crypto helpers (latter implemented in proceeding
commits).
Signed-off-by: Kiran Ostrolenk <kiran.ostrolenk@codethink.co.uk>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Max Chou <max.chou@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230711165917.2629866-2-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit bef6f008b98(accel/tcg: Return bool from page_check_range) converts
integer return value to bool type. However, it wrongly converted the use
of the API in riscv fault-only-first, where page_check_range < = 0, should
be converted to !page_check_range.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20230729031618.821-1-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
All these files access the CPU LD/ST API declared in "exec/cpu_ldst.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230828221314.18435-4-philmd@linaro.org>
Changes the address type of the guest memory read/write functions from
target_ulong to abi_ptr. (abi_ptr is currently typedef'd to target_ulong
but that will change in a following commit.) This will reduce the
coupling between accel/ and target/.
Note: Function pointers that point to cpu_[st|ld]*() in target/riscv and
target/rx are also updated in this commit.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230807155706.9580-6-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The previous check was failing with:
VLEN=128 ELEN = 64 SEW = 16 and LMUL = 1/8 which is a
valid combination.
Fix the check to allow valid combinations when VLEN is a multiple of
ELEN.
From the specification:
"In general, the requirement is to support LMUL ≥ SEWMIN/ELEN, where
SEWMIN is the narrowest supported SEW value and ELEN is the widest
supported SEW value. In the standard extensions, SEWMIN=8. For standard
vector extensions with ELEN=32, fractional LMULs of 1/2 and 1/4 must be
supported. For standard vector extensions with ELEN=64, fractional LMULs
of 1/2, 1/4, and 1/8 must be supported." Elsewhere in the specification
it makes clear that VLEN>=ELEN.
From inspection this new check allows:
VLEN=ELEN=64 1/2, 1/4, 1/8 for SEW >=8
VLEN=ELEN=32 1/2, 1/4 for SEW >=8
Fixes: d9b7609a1f ("target/riscv: rvv-1.0: configure instructions")
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-Id: <20230718131316.12283-2-rbradford@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Replace the 0/-1 result with true/false.
Invert the sense of the test of all callers.
Document the function.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-25-richard.henderson@linaro.org>
Add trans_* and helper function for Zvfbfwma instructions.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230615063302.102409-5-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Add trans_* and helper function for Zvfbfmin instructions.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230615063302.102409-4-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Commit 752614cab8 ("target/riscv: rvv: Add tail agnostic for vector
load / store instructions") added an extra check for LMUL fragmentation,
intended for setting the "rest tail elements" in the last register for a
segment load insn.
Actually, the max_elements derived in vext_ld*() won't be a fraction of
vector register size, since the lmul encoded in desc is emul, which has
already been adjusted to 1 for LMUL fragmentation case by vext_get_emul()
in trans_rvv.c.inc, for ld_stride(), ld_us(), ld_index() and ldff().
Besides, vext_get_emul() has also taken EEW/SEW into consideration, so no
need to call vext_get_total_elems() which would base on the emul to derive
another emul, the second emul would be incorrect when esz differs from sew.
Thus this patch removes the check for extra tail elements.
Fixes: 752614cab8 ("target/riscv: rvv: Add tail agnostic for vector load / store instructions")
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-Id: <20230607091646.4049428-1-xiao.w.wang@intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
There's no code using MTYPE, which was a concept used in older vector
implementation.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-Id: <20230608053517.4102648-1-xiao.w.wang@intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The function is a no-op if 'vta' is zero but we're still doing a lot of
stuff in this function regardless. vext_set_elems_1s() will ignore every
single time (since vta is zero) and we just wasted time.
Skip it altogether in this case. Aside from the code simplification
there's a noticeable emulation performance gain by doing it. For a
regular C binary that does a vectors operation like this:
=======
#define SZ 10000000
int main ()
{
int *a = malloc (SZ * sizeof (int));
int *b = malloc (SZ * sizeof (int));
int *c = malloc (SZ * sizeof (int));
for (int i = 0; i < SZ; i++)
c[i] = a[i] + b[i];
return c[SZ - 1];
}
=======
Emulating it with qemu-riscv64 and RVV takes ~0.3 sec:
$ time ~/work/qemu/build/qemu-riscv64 \
-cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out
real 0m0.303s
user 0m0.281s
sys 0m0.023s
With this skip we take ~0.275 sec:
$ time ~/work/qemu/build/qemu-riscv64 \
-cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out
real 0m0.274s
user 0m0.252s
sys 0m0.019s
This performance gain adds up fast when executing heavy benchmarks like
SPEC.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Message-Id: <20230427205708.246679-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Fix lines with over 80 characters for both code and comments.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230405085813.40643-5-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Fix formats for multi-lines comments.
Add spaces around single line comments(after "/*" and before "*/").
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230405085813.40643-4-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Fix identation problems, and try to use the same indentation strategy
in the same file.
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230405085813.40643-3-liweiwei@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
This file has several uses of env_archcpu() that are used solely to read
cfg->vlen. Use the new riscv_cpu_cfg() inline instead.
Suggested-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Weiwei Li<liweiwei@iscas.ac.cn>
Message-ID: <20230226170514.588071-3-dbarboza@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Commit 752614cab8 ("target/riscv: rvv: Add tail agnostic for vector
load / store instructions") added code to set the tail elements to 1 in
the end of vext_ldst_stride(), vext_ldst_us(), vext_ldst_index() and
vext_ldff(). Aside from a env->vl versus an evl value being used in the
first loop, the code is being repeated 4 times.
Create a helper to avoid code repetition in all those functions.
Arguments that are used in the callers (nf, esz and max_elems) are
passed as arguments. All other values are being derived inside the
helper.
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20230226170514.588071-2-dbarboza@ventanamicro.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
vslide1up_##BITWIDTH is used by the vslide1up.vx and vslide1up.vf. So its
scalar input should be uint64_t to hold the 64 bits float register.And the
same for vslide1down_##BITWIDTH.
This bug is caught when run these instructions on qemu-riscv32.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-ID: <20230213094550.29621-1-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Tweak the semantic patch to drop redundant parenthesis around the
return expression.
Coccinelle drops a comment in hw/rdma/vmw/pvrdma_cmd.c; restored
manually.
Coccinelle messes up vmdk_co_create(), not sure why. Change dropped,
will be done manually in the next commit.
Line breaks in target/avr/cpu.h and hw/rdma/vmw/pvrdma_cmd.c tidied up
manually.
Whitespace in tools/virtiofsd/fuse_lowlevel.c tidied up manually.
checkpatch.pl complains "return of an errno should typically be -ve"
two times for hw/9pfs/9p-synth.c. Preexisting, the patch merely makes
it visible to checkpatch.pl.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221122134917.1217307-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Starting with RVV1.0, the original vf[w]redsum_vs instruction was renamed
to vf[w]redusum_vs. The distinction between ordered and unordered is also
more consistent with other instructions, although there is no difference
in implementation between the two for QEMU.
Signed-off-by: Yang Liu <liuyang22@iscas.ac.cn>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-Id: <20220817074802.20765-2-liuyang22@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Remove duplicate code by wrapping vfwredsum_vs's OP function.
Signed-off-by: Yang Liu <liuyang22@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-Id: <20220817074802.20765-1-liuyang22@iscas.ac.cn>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
According to v-spec, mask agnostic behavior can be either kept as
undisturbed or set elements' bits to all 1s. To distinguish the
difference of mask policies, QEMU should be able to simulate the mask
agnostic behavior as "set mask elements' bits to all 1s".
There are multiple possibility for agnostic elements according to
v-spec. The main intent of this patch-set tries to add option that
can distinguish between mask policies. Setting agnostic elements to
all 1s allows QEMU to express this.
This is the first commit regarding the optional mask agnostic
behavior. Follow-up commits will add this optional behavior
for all rvv instructions.
Signed-off-by: eop Chen <eop.chen@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <165570784143.17634.35095816584573691-1@git.sr.ht>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The tail elements in the destination mask register are updated under
a tail-agnostic policy.
Signed-off-by: eop Chen <eop.chen@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <165449614532.19704.7000832880482980398-14@git.sr.ht>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Compares write mask registers, and so always operate under a tail-
agnostic policy.
Signed-off-by: eop Chen <eop.chen@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <165449614532.19704.7000832880482980398-12@git.sr.ht>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Compares write mask registers, and so always operate under a tail-
agnostic policy.
Signed-off-by: eop Chen <eop.chen@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <165449614532.19704.7000832880482980398-9@git.sr.ht>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
`vmadc` and `vmsbc` produces a mask value, they always operate with
a tail agnostic policy.
Signed-off-by: eop Chen <eop.chen@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <165449614532.19704.7000832880482980398-7@git.sr.ht>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Destination register of unit-stride mask load and store instructions are
always written with a tail-agnostic policy.
A vector segment load / store instruction may contain fractional lmul
with nf * lmul > 1. The rest of the elements in the last register should
be treated as tail elements.
Signed-off-by: eop Chen <eop.chen@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <165449614532.19704.7000832880482980398-6@git.sr.ht>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>