Commit Graph

76615 Commits

Author SHA1 Message Date
Peter Maydell
d27e82f7d0 target/arm: Convert VFM[AS]L (scalar) to decodetree
Convert the VFM[AS]L (scalar) insns in the 2reg-scalar-ext group
to decodetree. These are the last ones in the group so we can remove
all the legacy decode for the group.

Note that in disas_thumb2_insn() the parts of this encoding space
where the decodetree decoder returns false will correctly be directed
to illegal_op by the "(insn & (1 << 28))" check so they won't fall
into disas_coproc_insn() by mistake.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-11-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Peter Maydell
35f5d4d174 target/arm: Convert V[US]DOT (scalar) to decodetree
Convert the V[US]DOT (scalar) insns in the 2reg-scalar-ext group
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-10-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Peter Maydell
7e1b5d6153 target/arm: Convert VCMLA (scalar) to decodetree
Convert VCMLA (scalar) in the 2reg-scalar-ext group to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-9-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Peter Maydell
9a107e7b8a target/arm: Convert VFM[AS]L (vector) to decodetree
Convert the VFM[AS]L (vector) insns to decodetree.  This is the last
insn in the legacy decoder for the 3same_ext group, so we can
delete the legacy decoder function for the group entirely.

Note that in disas_thumb2_insn() the parts of this encoding space
where the decodetree decoder returns false will correctly be directed
to illegal_op by the "(insn & (1 << 28))" check so they won't fall
into disas_coproc_insn() by mistake.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-8-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Peter Maydell
32da0e330d target/arm: Convert V[US]DOT (vector) to decodetree
Convert the V[US]DOT (vector) insns to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-7-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Peter Maydell
94d5eb7b3f target/arm: Convert VCADD (vector) to decodetree
Convert the VCADD (vector) insns to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-6-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Peter Maydell
afff8de0d4 target/arm: Convert VCMLA (vector) to decodetree
Convert the VCMLA (vector) insns in the 3same extension group to
decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-5-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Peter Maydell
625e3dd44a target/arm: Add stubs for AArch32 Neon decodetree
Add the infrastructure for building and invoking a decodetree decoder
for the AArch32 Neon encodings.  At the moment the new decoder covers
nothing, so we always fall back to the existing hand-written decode.

We follow the same pattern we did for the VFP decodetree conversion
(commit 78e138bc1f and following): code that deals
with Neon will be moving gradually out to translate-neon.vfp.inc,
which we #include into translate.c.

In order to share the decode files between A32 and T32, we
split Neon into 3 parts:
 * data-processing
 * load-store
 * 'shared' encodings

The first two groups of instructions have similar but not identical
A32 and T32 encodings, so we need to manually transform the T32
encoding into the A32 one before calling the decoder; the third group
covers the Neon instructions which are identical in A32 and T32.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-4-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Peter Maydell
d1a6d3b594 target/arm: Don't allow Thumb Neon insns without FEATURE_NEON
We were accidentally permitting decode of Thumb Neon insns even if
the CPU didn't have the FEATURE_NEON bit set, because the feature
check was being done before the call to disas_neon_data_insn() and
disas_neon_ls_insn() in the Arm decoder but was omitted from the
Thumb decoder.  Push the feature bit check down into the called
functions so it is done for both Arm and Thumb encodings.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200430181003.21682-3-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Peter Maydell
0d787cf1f3 target/arm/translate-vfp.inc.c: Remove duplicate simd_r32 check
Somewhere along theline we accidentally added a duplicate
"using D16-D31 when they don't exist" check to do_vfm_dp()
(probably an artifact of a patchseries rebase). Remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200430181003.21682-2-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Edgar E. Iglesias
2aca5284b1 hw/arm: versal-virt: Add support for the RTC
Add support for the RTC.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-12-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:11:28 +01:00
Edgar E. Iglesias
3afec85c2e hw/arm: versal-virt: Add support for SD
Add support for SD.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-11-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:11:28 +01:00
Edgar E. Iglesias
eb1221c52d hw/arm: versal: Add support for the RTC
hw/arm: versal: Add support for the RTC.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-10-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:11:28 +01:00
Edgar E. Iglesias
724c6e12dd hw/arm: versal: Add support for SD
Add support for SD.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-9-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:11:23 +01:00
Edgar E. Iglesias
ced18d5e50 hw/arm: versal: Embed the APUs into the SoC type
Embed the APUs into the SoC type.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-8-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:11:17 +01:00
Edgar E. Iglesias
f4e3fa3726 hw/arm: versal: Embed the ADMAs into the SoC type
Embed the ADMAs into the SoC type.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-7-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:11:12 +01:00
Edgar E. Iglesias
4bd9b59c05 hw/arm: versal: Embed the GEMs into the SoC type
Embed the GEMs into the SoC type.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-6-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:11:06 +01:00
Edgar E. Iglesias
88052ffdd1 hw/arm: versal: Embed the UARTs into the SoC type
Embed the UARTs into the SoC type.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-5-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:08:20 +01:00
Edgar E. Iglesias
0b79d1baee hw/arm: versal-virt: Fix typo xlnx-ve -> xlnx-versal
Fix typo xlnx-ve -> xlnx-versal.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-4-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:08:20 +01:00
Edgar E. Iglesias
c07c0c37ad hw/arm: versal: Move misplaced comment
Move misplaced comment.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-3-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:08:20 +01:00
Edgar E. Iglesias
5995a02511 hw/arm: versal: Remove inclusion of arm_gicv3_common.h
Remove inclusion of arm_gicv3_common.h, this already gets
included via xlnx-versal.h.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Luc Michel <luc.michel@greensocs.com>
Message-id: 20200427181649.26851-2-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 11:08:20 +01:00
Philippe Mathieu-Daudé
e544f80030 target/arm: Use uint64_t for midr field in CPU state struct
MIDR_EL1 is a 64-bit system register with the top 32-bit being RES0.
Represent it in QEMU's ARMCPU struct with a uint64_t, not a
uint32_t.

This fixes an error when compiling with -Werror=conversion
because we were manipulating the register value using a
local uint64_t variable:

  target/arm/cpu64.c: In function ‘aarch64_max_initfn’:
  target/arm/cpu64.c:628:21: error: conversion from ‘uint64_t’ {aka ‘long unsigned int’} to ‘uint32_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
    628 |         cpu->midr = t;
        |                     ^

and future-proofs us against a possible future architecture
change using some of the top 32 bits.

Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20200428172634.29707-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 10:32:46 +01:00
Peter Maydell
5a89dd2385 target/arm: Use correct variable for setting 'max' cpu's ID_AA64DFR0
In aarch64_max_initfn() we update both 32-bit and 64-bit ID
registers.  The intended pattern is that for 64-bit ID registers we
use FIELD_DP64 and the uint64_t 't' register, while 32-bit ID
registers use FIELD_DP32 and the uint32_t 'u' register.  For
ID_AA64DFR0 we accidentally used 'u', meaning that the top 32 bits of
this 64-bit ID register would end up always zero.  Luckily at the
moment that's what they should be anyway, so this bug has no visible
effects.

Use the right-sized variable.

Fixes: 3bec78447a
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200423110915.10527-1-peter.maydell@linaro.org
2020-05-04 10:32:46 +01:00
Peter Maydell
ce3125bed9 target/arm: Implement ARMv8.2-TTS2UXN
The ARMv8.2-TTS2UXN feature extends the XN field in stage 2
translation table descriptors from just bit [54] to bits [54:53],
allowing stage 2 to control execution permissions separately for EL0
and EL1. Implement the new semantics of the XN field and enable
the feature for our 'max' CPU.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-5-peter.maydell@linaro.org
2020-05-04 10:32:46 +01:00
Peter Maydell
ff7de2fc2c target/arm: Add new 's1_is_el0' argument to get_phys_addr_lpae()
For ARMv8.2-TTS2UXN, the stage 2 page table walk wants to know
whether the stage 1 access is for EL0 or not, because whether
exec permission is given can depend on whether this is an EL0
or EL1 access. Add a new argument to get_phys_addr_lpae() so
the call sites can pass this information in.

Since get_phys_addr_lpae() doesn't already have a doc comment,
add one so we have a place to put the documentation of the
semantics of the new s1_is_el0 argument.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-4-peter.maydell@linaro.org
2020-05-04 10:32:46 +01:00
Peter Maydell
59dff859cd target/arm: Use enum constant in get_phys_addr_lpae() call
The access_type argument to get_phys_addr_lpae() is an MMUAccessType;
use the enum constant MMU_DATA_LOAD rather than a literal 0 when we
call it in S1_ptw_translate().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-3-peter.maydell@linaro.org
2020-05-04 10:32:46 +01:00
Peter Maydell
bf05340cb6 target/arm: Don't use a TLB for ARMMMUIdx_Stage2
We define ARMMMUIdx_Stage2 as being an MMU index which uses a QEMU
TLB.  However we never actually use the TLB -- all stage 2 lookups
are done by direct calls to get_phys_addr_lpae() followed by a
physical address load via address_space_ld*().

Remove Stage2 from the list of ARM MMU indexes which correspond to
real core MMU indexes, and instead put it in the set of "NOTLB" ARM
MMU indexes.

This allows us to drop NB_MMU_MODES to 11.  It also means we can
safely add support for the ARMv8.3-TTS2UXN extension, which adds
permission bits to the stage 2 descriptors which define execute
permission separatel for EL0 and EL1; supporting that while keeping
Stage2 in a QEMU TLB would require us to use separate TLBs for
"Stage2 for an EL0 access" and "Stage2 for an EL1 access", which is a
lot of extra complication given we aren't even using the QEMU TLB.

In the process of updating the comment on our MMU index use,
fix a couple of other minor errors:
 * NS EL2 EL2&0 was missing from the list in the comment
 * some text hadn't been updated from when we bumped NB_MMU_MODES
   above 8

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200330210400.11724-2-peter.maydell@linaro.org
2020-05-04 10:32:46 +01:00
Philippe Mathieu-Daudé
2e256c04c1 hw/arm/mps2-tz: Use TYPE_IOTKIT instead of hardcoded string
By using the TYPE_* definitions for devices, we can:
 - quickly find where devices are used with 'git-grep'
 - easily rename a device (one-line change).

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200428154650.21991-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 10:32:46 +01:00
Fredrik Strupe
ab553ef74e target/arm: Make VQDMULL undefined when U=1
According to Arm ARM, VQDMULL is only valid when U=0, while having
U=1 is unallocated.

Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
Fixes: 695272dcb9 ("target-arm: Handle UNDEF cases for Neon 3-regs-different-widths")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-04 10:32:44 +01:00
Peter Maydell
2ef486e76d RDMA queue
* hw/rdma: Destroy list mutex when list is destroyed
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJerb8qAAoJEDbUwPDPL+RtxXMIAIuvcJSM1UNjzLTZpSsR3jPc
 qGH5QgFdLRajtoiv5+WLs9ZBRQ7GxJJe7Cuzy6TsY9g//Jd6kpwtite8wXR/dKr1
 wjG86P6HTXhoD8pKrQzdrn95+vbqnnKgn2SYYwwkCfdkNPnN85xOFIbfXVAKmwDq
 PteYkXQn3iOvpHjvZwF48ej2ddBHvdS1MhckPggEvEGz3/0equPQyo7l+P5xc2C6
 mY2aJAto6R3uwjc7zBk/BUyNBSs7UmL1eVptiUjEGzNOCIs8j2ui7JYC0BjJtSRH
 xPt5Dao7HDlBFOlnCnzx0MWA9gceMfAcZ29vTQogYj/JAvbFU2SkKNL9xc2+tFQ=
 =WwAT
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into staging

RDMA queue

* hw/rdma: Destroy list mutex when list is destroyed

# gpg: Signature made Sat 02 May 2020 19:42:50 BST
# gpg:                using RSA key 36D4C0F0CF2FE46D
# gpg: Good signature from "Marcel Apfelbaum <marcel.apfelbaum@zoho.com>" [unknown]
# gpg:                 aka "Marcel Apfelbaum <marcel@redhat.com>" [marginal]
# gpg:                 aka "Marcel Apfelbaum <marcel.apfelbaum@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B1C6 3A57 F92E 08F2 640F  31F5 36D4 C0F0 CF2F E46D

* remotes/marcel/tags/rdma-pull-request:
  hw/rdma: Destroy list mutex when list is destroyed

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-03 14:12:56 +01:00
Yuval Shaia
a5cde048e8 hw/rdma: Destroy list mutex when list is destroyed
List mutex should be destroyed when gs list gets destroyed.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
Message-Id: <20200413085738.11145-1-yuval.shaia.ml@gmail.com>
Reviewed-by: Marcel Apfelbaum<marcel.apfelbaum@gmail.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
2020-05-02 21:31:17 +03:00
Peter Maydell
6897541d90 virtiofsd: Pull 2020-05-01 (includes CVE fix)
This set includes a security fix, other fixes and improvements.
 
 Security fix:
 The security fix is for CVE-2020-10717 where, on low RAM hosts,
 the guest can potentially exceed the maximum fd limit.
 This fix adds some more configuration so that the user
 can explicitly set the limit.
 
 Fixes:
 
 Recursive mounting of the exported directory is now used in
 the sandbox, such that if there was a mount underneath present at
 the time the virtiofsd was started, that mount is also
 visible to the guest; in the existing code, only mounts that
 happened after startup were visible.
 
 Security improvements:
 
 The jailing for /proc/self/fd is improved - but it's something
 that shouldn't be accessible anyway.
 
 Most capabilities are now dropped at startup; again this shouldn't
 change any behaviour but is extra protection.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAl6sc0YACgkQBRYzHrxb
 /eeP2Q/+KK//vrNblyKsAFV3G9fPi5VTLz6j/fwiA1JduyX3X8w7kGGwfxSuwvJP
 K7t07W/bkyVfFe+8kMVOkFFU63XPMIR0s6NGrnmLMJOY8eLXNtqFodE6yBzOPplA
 +ugxKRirHegx3mEtEvWO0EJU0EWcJ8QuaF0NT35ef3I/ufmyggapZ9cdYfPCgow/
 QHMQjvEl2CwC7kYwm4s0bkS/kP67m2SorVx/fr6Q0kSLUssqYhmbkK3KLdv+eUIo
 K1AFpoPBcpxdh6TxhtMM4tJ/2OycZOvmdKzEdzciq1MRTH1AgVU7ZBQPv/76bsJ8
 yOsaNUqz0K2aNERF+0uBJ6POMcbfqg9IMwpgbM+vv2Yp7UjjUpdZNLplQIil1QvP
 9Eun2KNJ9sF3ZM2DcusQuxOcs4cVPVui1k1+KqFRBV1oxoj5ikosrmkI3OOjjdZd
 lLJRmQQZ451zMb/aKndu+jG08ciWh+2l3NGZ2e7Sllo+X5lE5yhToXlRgkK5kR+U
 1aS+CDj/kWQ/PIP9KYgss6nxHmsPb7P7KnR42MnrU8GSroayzqdLn2hQYx9CZQAB
 /2QpvZGAX95urKTz9tpCRXzq2JMAY+qe2/YYgiNPhABWqIsu63YYuYZhahg+UFeq
 d29aUGxUOyLsuQWlyB3aFoOIIjNp89n3V5cgXIWHM2vbMVq8P4E=
 =tou1
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20200501' into staging

virtiofsd: Pull 2020-05-01 (includes CVE fix)

This set includes a security fix, other fixes and improvements.

Security fix:
The security fix is for CVE-2020-10717 where, on low RAM hosts,
the guest can potentially exceed the maximum fd limit.
This fix adds some more configuration so that the user
can explicitly set the limit.

Fixes:

Recursive mounting of the exported directory is now used in
the sandbox, such that if there was a mount underneath present at
the time the virtiofsd was started, that mount is also
visible to the guest; in the existing code, only mounts that
happened after startup were visible.

Security improvements:

The jailing for /proc/self/fd is improved - but it's something
that shouldn't be accessible anyway.

Most capabilities are now dropped at startup; again this shouldn't
change any behaviour but is extra protection.

# gpg: Signature made Fri 01 May 2020 20:06:46 BST
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert-gitlab/tags/pull-virtiofs-20200501:
  virtiofsd: drop all capabilities in the wait parent process
  virtiofsd: only retain file system capabilities
  virtiofsd: Show submounts
  virtiofsd: jail lo->proc_self_fd
  virtiofsd: stay below fs.file-max sysctl value (CVE-2020-10717)
  virtiofsd: add --rlimit-nofile=NUM option

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-01 23:10:22 +01:00
Stefan Hajnoczi
66502bbca3 virtiofsd: drop all capabilities in the wait parent process
All this process does is wait for its child.  No capabilities are
needed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-01 20:05:37 +01:00
Stefan Hajnoczi
a59feb483b virtiofsd: only retain file system capabilities
virtiofsd runs as root but only needs a subset of root's Linux
capabilities(7).  As a file server its purpose is to create and access
files on behalf of a client.  It needs to be able to access files with
arbitrary uid/gid owners.  It also needs to be create device nodes.

Introduce a Linux capabilities(7) whitelist and drop all capabilities
that we don't need, making the virtiofsd process less powerful than a
regular uid root process.

  # cat /proc/PID/status
  ...
          Before           After
  CapInh: 0000000000000000 0000000000000000
  CapPrm: 0000003fffffffff 00000000880000df
  CapEff: 0000003fffffffff 00000000880000df
  CapBnd: 0000003fffffffff 0000000000000000
  CapAmb: 0000000000000000 0000000000000000

Note that file capabilities cannot be used to achieve the same effect on
the virtiofsd executable because mount is used during sandbox setup.
Therefore we drop capabilities programmatically at the right point
during startup.

This patch only affects the sandboxed child process.  The parent process
that sits in waitpid(2) still has full root capabilities and will be
addressed in the next patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200416164907.244868-2-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-01 18:57:31 +01:00
Max Reitz
ace0829c0d virtiofsd: Show submounts
Currently, setup_mounts() bind-mounts the shared directory without
MS_REC.  This makes all submounts disappear.

Pass MS_REC so that the guest can see submounts again.

Fixes: 5baa3b8e95
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200424133516.73077-1-mreitz@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Changed Fixes to point to the commit with the problem rather than
          the commit that turned it on
2020-05-01 18:52:17 +01:00
Miklos Szeredi
397ae982f4 virtiofsd: jail lo->proc_self_fd
While it's not possible to escape the proc filesystem through
lo->proc_self_fd, it is possible to escape to the root of the proc
filesystem itself through "../..".

Use a temporary mount for opening lo->proc_self_fd, that has it's root at
/proc/self/fd/, preventing access to the ancestor directories.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Message-Id: <20200429124733.22488-1-mszeredi@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-01 18:46:54 +01:00
Stefan Hajnoczi
8c1d353d10 virtiofsd: stay below fs.file-max sysctl value (CVE-2020-10717)
The system-wide fs.file-max sysctl value determines how many files can
be open.  It defaults to a value calculated based on the machine's RAM
size.  Previously virtiofsd would try to set RLIMIT_NOFILE to 1,000,000
and this allowed the FUSE client to exhaust the number of open files
system-wide on Linux hosts with less than 10 GB of RAM!

Take fs.file-max into account when choosing the default RLIMIT_NOFILE
value.

Fixes: CVE-2020-10717
Reported-by: Yuval Avrahami <yavrahami@paloaltonetworks.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200501140644.220940-3-stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-01 18:41:56 +01:00
Stefan Hajnoczi
6dbb716877 virtiofsd: add --rlimit-nofile=NUM option
Make it possible to specify the RLIMIT_NOFILE on the command-line.
Users running multiple virtiofsd processes should allocate a certain
number to each process so that the system-wide limit can never be
exhausted.

When this option is set to 0 the rlimit is left at its current value.
This is useful when a management tool wants to configure the rlimit
itself.

The default behavior remains unchanged: try to set the limit to
1,000,000 file descriptors if the current rlimit is lower.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200501140644.220940-2-stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-05-01 18:41:55 +01:00
Peter Maydell
1c47613588 Block layer patches:
- Fix resize (extending) of short overlays
 - nvme: introduce PMR support from NVMe 1.4 spec
 - qemu-storage-daemon: Fix non-string --object properties
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAl6q9BERHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9bgag//WlwtXgTRj2N/Gijav7d9ihoaUXs4Kza1
 UpP6MA4WHCPCNJlhkGyYFO4PeQP+BFoGt9+TxGfwVIP8Sv2FYXFY/ZdgrdCw5oHV
 6Zh7xpEcOvDnR5psEvaKHBwIjEWqHMOowm0rgpzf0AmJo8C419oO6Met7zkwKUgL
 gVNtB8RkKWXEoht+JrkhRd15gSAC5bFH1Ragh2ywYhwZuMk4lSZVASk1q/Zd2lX/
 e5a8uFEwC8rNbHJxtUMPZfgCjN0H/t3h1rovxpSp9Z9YL0SQ67qZcDQy9sbOXokN
 2UwIZZsFAZPeDsBoWn+Vy0GtNsYphcG6UOn4UG6404hxMzINZB3p4f6dRDJmssvT
 whTbtfczs5rXyeoQSUHpg6RyVo8sgNWxqfNjiXvSuWy5PtW/qVYPBeO7Bexb/rxl
 +yi0oU72o5AaYhS8FV2Uj1dEMLJSJkdF7558q+eKTL5sUnBOSC4xi2UMKBwhZ46q
 IdRwOrqGI0S23BQkymQR4hXrTsrPrYZAZJHAfzqckq1qqZHg85+RP/wo/Nq1ZykC
 xTJgzzIwBOAQUoqIgARtJY5rE6tmcww/T6nuVIIFglew80nzdf3Ga4kVYWHZk9Dz
 syCTQib7R3bxz0NSZrCkeynkzNrvAt+iGQrYs+RsHDWjSEzIzlYrqVuYxWLuEBSk
 CNbdkLcHi/w=
 =ad71
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- Fix resize (extending) of short overlays
- nvme: introduce PMR support from NVMe 1.4 spec
- qemu-storage-daemon: Fix non-string --object properties

# gpg: Signature made Thu 30 Apr 2020 16:51:45 BST
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  qemu-storage-daemon: Fix non-string --object properties
  qom: Factor out user_creatable_add_dict()
  nvme: introduce PMR support from NVMe 1.4 spec
  qcow2: Forward ZERO_WRITE flag for full preallocation
  iotests: Test committing to short backing file
  iotests: Filter testfiles out in filter_img_info()
  block: truncate: Don't make backing file data visible
  file-posix: Support BDRV_REQ_ZERO_WRITE for truncate
  raw-format: Support BDRV_REQ_ZERO_WRITE for truncate
  qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
  block-backend: Add flags to blk_truncate()
  block: Add flags to bdrv(_co)_truncate()
  block: Add flags to BlockDriver.bdrv_co_truncate()
  qemu-iotests: allow qcow2 external discarded clusters to contain stale data
  qcow2: Add incompatibility note between backing files and raw external data files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-04-30 19:25:41 +01:00
Kevin Wolf
eaae29ef89 qemu-storage-daemon: Fix non-string --object properties
After processing the option string with the keyval parser, we get a
QDict that contains only strings. This QDict must be fed to a keyval
visitor which converts the strings into the right data types.

qmp_object_add(), however, uses the normal QObject input visitor, which
expects a QDict where all properties already have the QType that matches
the data type required by the QOM object type.

Change the --object implementation in qemu-storage-daemon so that it
doesn't call qmp_object_add(), but calls user_creatable_add_dict()
directly instead and pass it a new keyval boolean that decides which
visitor must be used.

Reported-by: Coiby Xu <coiby.xu@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Kevin Wolf
d6a5beeb2b qom: Factor out user_creatable_add_dict()
The QMP handler qmp_object_add() and the implementation of --object in
qemu-storage-daemon can share most of the code. Currently,
qemu-storage-daemon calls qmp_object_add(), but this is not correct
because different visitors need to be used.

As a first step towards a fix, make qmp_object_add() a wrapper around a
new function user_creatable_add_dict() that can get an additional
parameter. The handling of "props" is only required for compatibility
and not required for the qemu-storage-daemon command line, so it stays
in qmp_object_add().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Andrzej Jakowski
6cf9413229 nvme: introduce PMR support from NVMe 1.4 spec
This patch introduces support for PMR that has been defined as part of NVMe 1.4
spec. User can now specify a pmrdev option that should point to HostMemoryBackend.
pmrdev memory region will subsequently be exposed as PCI BAR 2 in emulated NVMe
device. Guest OS can perform mmio read and writes to the PMR region that will stay
persistent across system reboot.

Signed-off-by: Andrzej Jakowski <andrzej.jakowski@linux.intel.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200330164656.9348-1-andrzej.jakowski@linux.intel.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Kevin Wolf
eb8a0cf3ba qcow2: Forward ZERO_WRITE flag for full preallocation
The BDRV_REQ_ZERO_WRITE is currently implemented in a way that first the
image is possibly preallocated and then the zero flag is added to all
clusters. This means that a copy-on-write operation may be needed when
writing to these clusters, despite having used preallocation, negating
one of the major benefits of preallocation.

Instead, try to forward the BDRV_REQ_ZERO_WRITE to the protocol driver,
and if the protocol driver can ensure that the new area reads as zeros,
we can skip setting the zero flag in the qcow2 layer.

Unfortunately, the same approach doesn't work for metadata
preallocation, so we'll still set the zero flag there.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200424142701.67053-1-kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Kevin Wolf
bf03dede47 iotests: Test committing to short backing file
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200424125448.63318-10-kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Kevin Wolf
fd586ce8be iotests: Filter testfiles out in filter_img_info()
We want to keep TEST_IMG for the full path of the main test image, but
filter_testfiles() must be called for other test images before replacing
other things like the image format because the test directory path could
contain the format as a substring.

Insert a filter_testfiles() call between both.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200424125448.63318-9-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Kevin Wolf
955c7d6687 block: truncate: Don't make backing file data visible
When extending the size of an image that has a backing file larger than
its old size, make sure that the backing file data doesn't become
visible in the guest, but the added area is properly zeroed out.

Consider the following scenario where the overlay is shorter than its
backing file:

    base.qcow2:     AAAAAAAA
    overlay.qcow2:  BBBB

When resizing (extending) overlay.qcow2, the new blocks should not stay
unallocated and make the additional As from base.qcow2 visible like
before this patch, but zeros should be read.

A similar case happens with the various variants of a commit job when an
intermediate file is short (- for unallocated):

    base.qcow2:     A-A-AAAA
    mid.qcow2:      BB-B
    top.qcow2:      C--C--C-

After commit top.qcow2 to mid.qcow2, the following happens:

    mid.qcow2:      CB-C00C0 (correct result)
    mid.qcow2:      CB-C--C- (before this fix)

Without the fix, blocks that previously read as zeros on top.qcow2
suddenly turn into A.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200424125448.63318-8-kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Kevin Wolf
2f0c6e7a65 file-posix: Support BDRV_REQ_ZERO_WRITE for truncate
For regular files, we always get BDRV_REQ_ZERO_WRITE behaviour from the
OS, so we can advertise the flag and just ignore it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200424125448.63318-7-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Kevin Wolf
1ddaabaecb raw-format: Support BDRV_REQ_ZERO_WRITE for truncate
The raw format driver can simply forward the flag and let its bs->file
child take care of actually providing the zeros.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200424125448.63318-6-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Kevin Wolf
f01643fb8b qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling
qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't
undo any previous preallocation, but just adds the zero flag to all
relevant L2 entries. If an external data file is in use, a write_zeroes
request to the data file is made instead.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200424125448.63318-5-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00
Kevin Wolf
8c6242b6f3 block-backend: Add flags to blk_truncate()
Now that node level interface bdrv_truncate() supports passing request
flags to the block driver, expose this on the BlockBackend level, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200424125448.63318-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-30 17:51:07 +02:00