Commit Graph

93818 Commits

Author SHA1 Message Date
Peter Maydell
242f2cae78 V3: virtiofs pull 2022-02-17
Security label improvements from Vivek
   - includes a fix for building against new kernel headers
   [V3: checkpatch style fixes]
   [V2: Fix building on old Linux]
 Blocking flock disable from Sebastian
 SYNCFS support from Greg
 
 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAmIOhMkACgkQBRYzHrxb
 /eddnQ/+JM0l6bAA1f08BwjRinGEfmsEZoYsOSkPOD9LOd5+n4v5KuHctu8PHp1+
 G5Acop7wC6knNDvJMYeAhAkR6w0/PmxFDoEHbDNss82LFxzeY0ham1pBNudepkIK
 6EmL6bgeRDkA6advlnUVE3eKtErUvtrfQwuIHT36upXXaWa7mImb+enJYh7P2tZC
 nCSO7FLdtZh/zGUJ7pPY3EENJwZ48t/DR+UyXob2ljJio9RuqZP/cXK/3ru/8j/S
 GdFDj3V42ZrNWdC83r5XhV/PGa5gtqRKDg6MkumSIMV1/PslAhJtiPirGIth5PDY
 q6Gv9xPwkUXgmL7Zwvn+Xj+6oOpWoKvi8pTMG41BiBOYqeHzHfxPxiRNMq3zxEc1
 UDcLlBgHWQU6aqp3QFwVu6Vg/WtKMK1GLR4/CJSYEKQ/U6Vn5nTWpVSHgDo56OdM
 NXkHeoQwjbdXYGOr5tlqP8L31IceFlxaKDr1DwLEcEHjbzUeIaiegHjrmO96BHqw
 lNTgLkkRGs2/utXG+zNAPAHyF0AeTTLVMzwgO/AltTCJbDNRtX7/1vzzFqjk6S6T
 nvlXhvqx2Kl0H29J0ZRho9Ammr9pXd8fcIWw1xy7+2HoSpkOICZ8EnyzQykUUq6C
 Mb7eDuSOWKy/GjTlBgvpQ9SDj8pmFeEb0M6NAxfnVCukkMk3SnQ=
 =K93e
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20220217b' into staging

V3: virtiofs pull 2022-02-17

Security label improvements from Vivek
  - includes a fix for building against new kernel headers
  [V3: checkpatch style fixes]
  [V2: Fix building on old Linux]
Blocking flock disable from Sebastian
SYNCFS support from Greg

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

# gpg: Signature made Thu 17 Feb 2022 17:24:25 GMT
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert-gitlab/tags/pull-virtiofs-20220217b:
  virtiofsd: Add basic support for FUSE_SYNCFS request
  virtiofsd: Add an option to enable/disable security label
  virtiofsd: Create new file using O_TMPFILE and set security context
  virtiofsd: Create new file with security context
  virtiofsd: Add helpers to work with /proc/self/task/tid/attr/fscreate
  virtiofsd: Move core file creation code in separate function
  virtiofsd, fuse_lowlevel.c: Add capability to parse security context
  virtiofsd: Extend size of fuse_conn_info->capable and ->want fields
  virtiofsd: Parse extended "struct fuse_init_in"
  linux-headers: Update headers to v5.17-rc1
  virtiofsd: Fix breakage due to fuse_init_in size change
  virtiofsd: Do not support blocking flock

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-19 15:24:12 +00:00
Peter Maydell
439346ce8f 9pfs: fixes and cleanup
* Fifth patch fixes a 9pfs server crash that happened on some systems due
   to incorrect (system dependant) handling of struct dirent size.
 
 * Tests: Second patch fixes a test error that happened on some systems due
   mkdir() being called twice for creating the test directory for the 9p
   'local' tests.
 
 * Tests: Third patch fixes a memory leak.
 
 * Tests: The remaining two patches are code cleanup.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEltjREM96+AhPiFkBNMK1h2Wkc5UFAmIOdY0XHHFlbXVfb3Nz
 QGNydWRlYnl0ZS5jb20ACgkQNMK1h2Wkc5Xygg/9G0Ag4LIQsFxcAdTlC1hsnfer
 5GOCpWQsDqUpWScfe+FiaHNdNen328m/2VxF8+V+nSgIJ3qKcGRswesYOPh45//4
 v6dVBJG20c6QtRcHrDj75RNwsoRXMIbTs9bg9rYZc1NU9GxdMS8/JZ/2jWxuxvSb
 4yhuEdP9VVNGFqqzb0jLcHvvUY5jejlhzYlh4TkTrS/sYiG3m/4Dt0vJw+aO2+cw
 HJcLnd7z8I9Oz4IOo/P/CkfLaPc+m97xcVTZk5NZaApHBqRA6Mpz4qkDM00TJ7vr
 dmfOfzZtU+ssYk0YatHltYJ4g/G2kG7YAz/M8IMePOQFzQ8rDKc77N+8gI+45fxU
 GbSDXoPThRGTQUxIGmOJRF0IuYMuM282xHXwhR9nvZvKcRo/SEFzwgMEubz4V9iR
 1SpoPHV8cJsKxy1aQqofOEDHIllBeznkSbRMfJP5icFSnxWMoOZh4hqNi5GcVQ4w
 17tCATb6GXOmJnzewKWiyrUooWjCxUBF5De077VbNxdg53dQyUkhZbRaS0LY3GII
 K6MZFlODr2XtFpiBTZKhV5c0WDLMET0J12LAHSNDKANW3EplOVTVrtTFSJvKQbaY
 2CWye+4cqYNwCTcRZzqEA2UmRjYqFd2eztdasVa6p4rafHt5d4QmqfoYugTi01/2
 D3cwreDyCS3TkWehKKs=
 =qfMH
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cschoenebeck/tags/pull-9p-20220217' into staging

9pfs: fixes and cleanup

* Fifth patch fixes a 9pfs server crash that happened on some systems due
  to incorrect (system dependant) handling of struct dirent size.

* Tests: Second patch fixes a test error that happened on some systems due
  mkdir() being called twice for creating the test directory for the 9p
  'local' tests.

* Tests: Third patch fixes a memory leak.

* Tests: The remaining two patches are code cleanup.

# gpg: Signature made Thu 17 Feb 2022 16:19:25 GMT
# gpg:                using RSA key 96D8D110CF7AF8084F88590134C2B58765A47395
# gpg:                issuer "qemu_oss@crudebyte.com"
# gpg: Good signature from "Christian Schoenebeck <qemu_oss@crudebyte.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: ECAB 1A45 4014 1413 BA38  4926 30DB 47C3 A012 D5F4
#      Subkey fingerprint: 96D8 D110 CF7A F808 4F88  5901 34C2 B587 65A4 7395

* remotes/cschoenebeck/tags/pull-9p-20220217:
  9pfs: Fix segfault in do_readdir_many caused by struct dirent overread
  tests/9pfs: Use g_autofree and g_autoptr where possible
  tests/9pfs: Fix leak of local_test_path
  tests/9pfs: fix mkdir() being called twice
  tests/9pfs: use g_autofree where possible

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-19 12:59:38 +00:00
Fabiano Rosas
65e0446c86 target/ppc: Move common SPR functions out of cpu_init
Let's leave cpu_init with just generic CPU initialization and
QOM-related functions.

The rest of the SPR registration functions will be moved in the
following patches along with the code that uses them. These are only
the commonly used ones.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-28-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
b58fd0c39b target/ppc: cpu_init: Move check_pow and QOM macros to a header
These will need to be accessed from other files once we move the CPUs
code to separate files.

The check_pow_hid0 and check_pow_hid0_74xx are too specific to be
moved to a header so I'll deal with them later when splitting this
code between the multiple CPU families.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-27-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
565873b380 target/ppc: cpu_init: Move SPR registration macros to a header
Put the SPR registration macros in a header that is accessible outside
of cpu_init.c. The following patches will move CPU-specific code to
separate files and will need to access it.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-26-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
917ea4381a target/ppc: cpu_init: Expose some SPR registration helpers
The following patches will move CPU-specific code into separate files,
so expose the most used SPR registration functions:

register_sdr1_sprs         | 22 callers
register_low_BATs          | 20 callers
register_non_embedded_sprs | 19 callers
register_high_BATs         | 10 callers
register_thrm_sprs         | 8 callers
register_usprgh_sprs       | 6 callers
register_6xx_7xx_soft_tlb  | only 3 callers, but it helps to
                             keep the soft TLB code consistent.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-25-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
99e964ef95 target/ppc: Rename spr_tcg.h to spr_common.h
Initial intent for the spr_tcg header was to expose the spr_read|write
callbacks that are only used by TCG code. However, although these
routines are TCG-specific, the KVM code needs access to env->sprs
which creation is currently coupled to the callback registration.

We are probably not going to decouple SPR creation and TCG callback
registration any time soon, so let's rename the header to spr_common
to accomodate the register_*_sprs functions that will be moved out of
cpu_init.c in the following patches.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-24-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
2a48d83dfd target/ppc: cpu_init: Remove register_usprg3_sprs
This function registers just one SPR and has only two callers, so open
code it.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-23-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
217781afde target/ppc: cpu_init: Rename register_ne_601_sprs
The important part of this function is that it applies to non-embedded
CPUs, not that it also applies to the 601. We removed support for the
601 anyway, so rename this function.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-22-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
c1f2157728 target/ppc: cpu_init: Reuse init_proc_745 for the 755
The init_proc_755 function is identical to the 745 one except for the
755-specific registers. I think it is worth it to make them share
code.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-21-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
0df0ca16b4 target/ppc: cpu_init: Reuse init_proc_604 for the 604e
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-20-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
9f33f3d876 target/ppc: cpu_init: Reuse init_proc_603 for the e300
init_proc_603 is defined after init_proc_e300, so I had to move some
code around to make it work.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-19-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
3b18ec7687 target/ppc: cpu_init: Move 604e SPR registration into a function
This is done to improve init_proc readability and to make subsequent
patches that touch this code a bit cleaner.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-18-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
a3a2767488 target/ppc: cpu_init: Move e300 SPR registration into a function
This is done to improve init_proc readability and to make subsequent
patches that touch this code a bit cleaner.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-17-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
28930245a8 target/ppc: cpu_init: Move 755 L2 cache SPRs into a function
This is just to have 755-specific registers contained into a function,
intead of leaving them open-coded in init_proc_755. It makes init_proc
easier to read and keeps later patches that touch this code a bit
cleaner.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-16-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
0301b39c78 target/ppc: cpu_init: Deduplicate 7xx SPR registration
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-15-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
a5d1120b1d target/ppc: cpu_init: Deduplicate 745/755 SPR registration
The 745 and 755 can share the HID registration, so move it all into
register_755_sprs, which applies for both CPUs.

Also rename that function to register_745_sprs, since the 745 is the
earliest of the two. This will help with separating 755-specific
registers in a subsequent patch.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-14-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
20f6fb99b2 target/ppc: cpu_init: Deduplicate 604 SPR registration
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-13-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
d2b29d0ade target/ppc: cpu_init: Deduplicate 603 SPR registration
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-12-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
49ed82b29a target/ppc: cpu_init: Deduplicate 440 SPR registration
Move some of the 440 registers that are being repeated in the 440*
CPUs to register_440_sprs.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-11-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:15 +01:00
Fabiano Rosas
674f45096f target/ppc: cpu_init: Decouple 74xx SPR registration from 7xx
We're considering these two to be from different CPU families, so
duplicate some code to keep them separate.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-10-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Fabiano Rosas
1a71c5d158 target/ppc: cpu_init: Decouple G2 SPR registration from 755
We're considering these two to be in different CPU families (6xx and
7xx), so keep their SPR registration separate.

The code was copied into register_G2_sprs and the common function was
renamed to apply only to the 755.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-9-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Fabiano Rosas
e599bcedf9 target/ppc: cpu_init: Move G2 SPRs into register_G2_sprs
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-8-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Fabiano Rosas
acd1f78870 target/ppc: cpu_init: Move 405 SPRs into register_405_sprs
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-7-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Fabiano Rosas
4ffb8c5e43 target/ppc: cpu_init: Avoid nested SPR register functions
Make sure that every register_*_sprs function only has calls to
spr_register* to register individual SPRs. Do not allow nesting. This
makes the code easier to follow and a look at init_proc_* should
suffice to know what SPRs a CPU has.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-6-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Fabiano Rosas
024b40e0ae target/ppc: cpu_init: Move Timebase registration into the common function
Now that the 601 was removed, all of our CPUs have a timebase, so that
can be moved into the common function.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-5-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Fabiano Rosas
e78280a237 target/ppc: cpu_init: Group registration of generic SPRs
The top level init_proc calls register_generic_sprs but also registers
some other SPRs outside of that function. Let's group everything into
a single place.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-4-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Fabiano Rosas
363bd7d0d5 target/ppc: cpu_init: Remove G2LE init code
The G2LE CPU initialization code is the same as the G2. Use the latter
for both.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-3-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Fabiano Rosas
acf629eb7a target/ppc: cpu_init: Remove not implemented comments
The /* XXX : not implemented */ comments all over cpu_init are
confusing and ambiguous.

Do they mean not implemented by QEMU, not implemented in a specific
access mode? Not implemented by the CPU? Do they apply to just the
register right after or to a whole block? Do they mean we have an
action to take in the future to implement these?  Are they only
informative?

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20220216162426.1885923-2-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
120f738a46 spapr: implement nested-hv capability for the virtual hypervisor
This implements the Nested KVM HV hcall API for spapr under TCG.

The L2 is switched in when the H_ENTER_NESTED hcall is made, and the
L1 is switched back in returned from the hcall when a HV exception
is sent to the vhyp. Register state is copied in and out according to
the nested KVM HV hcall API specification.

The hdecr timer is started when the L2 is switched in, and it provides
the HDEC / 0x980 return to L1.

The MMU re-uses the bare metal radix 2-level page table walker by
using the get_pate method to point the MMU to the nested partition
table entry. MMU faults due to partition scope errors raise HV
exceptions and accordingly are routed back to the L1.

The MMU does not tag translations for the L1 (direct) vs L2 (nested)
guests, so the TLB is flushed on any L1<->L2 transition (hcall entry
and exit).

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[ clg: checkpatch fixes ]
Message-Id: <20220216102545.1808018-10-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
7cebc5db2e target/ppc: Introduce a vhyp framework for nested HV support
Introduce virtual hypervisor methods that can support a "Nested KVM HV"
implementation using the bare metal 2-level radix MMU, and using HV
exceptions to return from H_ENTER_NESTED (rather than cause interrupts).

HV exceptions can now be raised in the TCG spapr machine when running a
nested KVM HV guest. The main ones are the lev==1 syscall, the hdecr,
hdsi and hisi, hv fu, and hv emu, and h_virt external interrupts.

HV exceptions are intercepted in the exception handler code and instead
of causing interrupts in the guest and switching the machine to HV mode,
they go to the vhyp where it may exit the H_ENTER_NESTED hcall with the
interrupt vector numer as return value as required by the hcall API.

Address translation is provided by the 2-level page table walker that is
implemented for the bare metal radix MMU. The partition scope page table
is pointed to the L1's partition scope by the get_pate vhc method.

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220216102545.1808018-9-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
3680e99461 target/ppc: Add powerpc_reset_excp_state helper
This moves the logic to reset the QEMU exception state into its own
function.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: checkpatch fixes ]
Message-Id: <20220216102545.1808018-8-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
4c6cf6b295 target/ppc: add helper for books vhyp hypercall handler
The virtual hypervisor currently always intercepts and handles
hypercalls but with a future change this will not always be the case.

Add a helper for the test so the logic is abstracted from the mechanism.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20220216102545.1808018-7-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
f32d4ab41c target/ppc: make vhyp get_pate method take lpid and return success
In prepartion for implementing a full partition table option for
vhyp, update the get_pate method to take an lpid and return a
success/fail indicator.

The spapr implementation currently just asserts lpid is always 0
and always return success.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: checkpatch fixes ]
Message-Id: <20220216102545.1808018-6-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
4dce0bde30 target/ppc: add vhyp addressing mode helper for radix MMU
The radix on vhyp MMU uses a single-level radix table walk, with the
partition scope mapping provided by the flat QEMU machine memory.

A subsequent change will use the two-level radix walk on vhyp in some
situations, so provide a helper which can abstract that logic.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20220216102545.1808018-5-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
93aeb70210 ppc: allow the hdecr timer to be created/destroyed
Machines which don't emulate the HDEC facility are able to use the
timer for something else. Provide functions to start and stop the
hdecr timer.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: checkpatch fixes ]
Message-Id: <20220216102545.1808018-4-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
5ff40b0124 spapr: prevent hdec timer being set up under virtual hypervisor
The spapr virtual hypervisor does not require the hdecr timer.
Remove it.

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20220216102545.1808018-3-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Nicholas Piggin
4ffcef2a88 target/ppc: raise HV interrupts for partition table entry problems
Invalid or missing partition table entry exceptions should cause HV
interrupts. HDSISR is set to bad MMU config, which is consistent with
the ISA and experimentally matches what POWER9 generates.

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[ clg: checkpatch fixes ]
Message-Id: <20220216102545.1808018-2-npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Shivaprasad G Bhat
8601b4f11d spapr: nvdimm: Introduce spapr-nvdimm device
If the device backend is not persistent memory for the nvdimm, there is
need for explicit IO flushes on the backend to ensure persistence.

On SPAPR, the issue is addressed by adding a new hcall to request for
an explicit flush from the guest when the backend is not pmem. So, the
approach here is to convey when the hcall flush is required in a device
tree property. The guest once it knows the device backend is not pmem,
makes the hcall whenever flush is required.

To set the device tree property, a new PAPR specific device type inheriting
the nvdimm device is implemented. When the backend doesn't have pmem=on
the device tree property "ibm,hcall-flush-required" is set, and the guest
makes hcall H_SCM_FLUSH requesting for an explicit flush. The new device
has boolean property pmem-override which when "on" advertises the device
tree property even when pmem=on for the backend. The flush function
invokes the fdatasync or pmem_persist() based on the type of backend.

The vmstate structures are made part of the spapr-nvdimm device object.
The patch attempts to keep the migration compatibility between source and
destination while rejecting the incompatibles ones with failures.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <164396256092.109112.17933240273840803354.stgit@ltczzess4.aus.stglabs.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Shivaprasad G Bhat
b5513584a0 spapr: nvdimm: Implement H_SCM_FLUSH hcall
The patch adds support for the SCM flush hcall for the nvdimm devices.
To be available for exploitation by guest through the next patch. The
hcall is applicable only for new SPAPR specific device class which is
also introduced in this patch.

The hcall expects the semantics such that the flush to return with
H_LONG_BUSY_ORDER_10_MSEC when the operation is expected to take longer
time along with a continue_token. The hcall to be called again by providing
the continue_token to get the status. So, all fresh requests are put into
a 'pending' list and flush worker is submitted to the thread pool. The
thread pool completion callbacks move the requests to 'completed' list,
which are cleaned up after collecting the return status for the guest
in subsequent hcall from the guest.

The semantics makes it necessary to preserve the continue_tokens and
their return status across migrations. So, the completed flush states
are forwarded to the destination and the pending ones are restarted
at the destination in post_load. The necessary nvdimm flush specific
vmstate structures are also introduced in this patch which are to be
saved in the new SPAPR specific nvdimm device to be introduced in the
following patch.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <164396254862.109112.16675611182159105748.stgit@ltczzess4.aus.stglabs.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:14 +01:00
Shivaprasad G Bhat
3e35960bf1 nvdimm: Add realize, unrealize callbacks to NVDIMMDevice class
A new subclass inheriting NVDIMMDevice is going to be introduced in
subsequent patches. The new subclass uses the realize and unrealize
callbacks. Add them on NVDIMMClass to appropriately call them as part
of plug-unplug.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Acked-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <164396253158.109112.1926755104259023743.stgit@ltczzess4.aus.stglabs.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-18 08:34:13 +01:00
Greg Kurz
45b04ef48d virtiofsd: Add basic support for FUSE_SYNCFS request
Honor the expected behavior of syncfs() to synchronously flush all data
and metadata to disk on linux systems.

If virtiofsd is started with '-o announce_submounts', the client is
expected to send a FUSE_SYNCFS request for each individual submount.
In this case, we just create a new file descriptor on the submount
inode with lo_inode_open(), call syncfs() on it and close it. The
intermediary file is needed because O_PATH descriptors aren't
backed by an actual file and syncfs() would fail with EBADF.

If virtiofsd is started without '-o announce_submounts' or if the
client doesn't have the FUSE_CAP_SUBMOUNTS capability, the client
only sends a single FUSE_SYNCFS request for the root inode. The
server would thus need to track submounts internally and call
syncfs() on each of them. This will be implemented later.

Note that syncfs() might suffer from a time penalty if the submounts
are being hammered by some unrelated workload on the host. The only
solution to prevent that is to avoid shared mounts.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20220215181529.164070-2-groug@kaod.org>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-02-17 17:22:26 +00:00
Vivek Goyal
963061dc11 virtiofsd: Add an option to enable/disable security label
Provide an option "-o security_label/no_security_label" to enable/disable
security label functionality. By default these are turned off.

If enabled, server will indicate to client that it is capable of handling
one security label during file creation. Typically this is expected to
be a SELinux label. File server will set this label on the file. It will
try to set it atomically wherever possible. But its not possible in
all the cases.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20220208204813.682906-11-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-02-17 17:22:26 +00:00
Vivek Goyal
a675c9a600 virtiofsd: Create new file using O_TMPFILE and set security context
If guest and host policies can't work with each other, then guest security
context (selinux label) needs to be set into an xattr. Say remap guest
security.selinux xattr to trusted.virtiofs.security.selinux.

That means setting "fscreate" is not going to help as that's ony useful
for security.selinux xattr on host.

So we need another method which is atomic. Use O_TMPFILE to create new
file, set xattr and then linkat() to proper place.

But this works only for regular files. So dir, symlinks will continue
to be non-atomic.

Also if host filesystem does not support O_TMPFILE, we fallback to
non-atomic behavior.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20220208204813.682906-10-vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-02-17 17:22:26 +00:00
Vivek Goyal
0c3f81e131 virtiofsd: Create new file with security context
This patch adds support for creating new file with security context
as sent by client. It basically takes three paths.

- If no security context enabled, then it continues to create files without
  security context.

- If security context is enabled and but security.selinux has not been
  remapped, then it uses /proc/thread-self/attr/fscreate knob to set
  security context and then create the file. This will make sure that
  newly created file gets the security context as set in "fscreate" and
  this is atomic w.r.t file creation.

  This is useful and host and guest SELinux policies don't conflict and
  can work with each other. In that case, guest security.selinux xattr
  is not remapped and it is passthrough as "security.selinux" xattr
  on host.

- If security context is enabled but security.selinux xattr has been
  remapped to something else, then it first creates the file and then
  uses setxattr() to set the remapped xattr with the security context.
  This is a non-atomic operation w.r.t file creation.

  This mode will be most versatile and allow host and guest to have their
  own separate SELinux xattrs and have their own separate SELinux policies.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20220208204813.682906-9-vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-02-17 17:22:26 +00:00
Vivek Goyal
cb282e556a virtiofsd: Add helpers to work with /proc/self/task/tid/attr/fscreate
Soon we will be able to create and also set security context on the file
atomically using /proc/self/task/tid/attr/fscreate knob. If this knob
is available on the system, first set the knob with the desired context
and then create the file. It will be created with the context set in
fscreate. This works basically for SELinux and its per thread.

This patch just introduces the helper functions. Subsequent patches will
make use of these helpers.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20220208204813.682906-8-vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: Manually merged gettid syscall number fixup from Vivek
2022-02-17 17:22:26 +00:00
Vivek Goyal
81489726ad virtiofsd: Move core file creation code in separate function
Move core file creation bits in a separate function. Soon this is going
to get more complex as file creation need to set security context also.
And there will be multiple modes of file creation in next patch.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20220208204813.682906-7-vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-02-17 17:22:26 +00:00
Vivek Goyal
36cfab870e virtiofsd, fuse_lowlevel.c: Add capability to parse security context
Add capability to enable and parse security context as sent by client
and put into fuse_req. Filesystems now can get security context from
request and set it on files during creation.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20220208204813.682906-6-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-02-17 17:22:26 +00:00
Vivek Goyal
4c7c393c7b virtiofsd: Extend size of fuse_conn_info->capable and ->want fields
->capable keeps track of what capabilities kernel supports and ->wants keep
track of what capabilities filesytem wants.

Right now these fields are 32bit in size. But now fuse has run out of
bits and capabilities can now have bit number which are higher than 31.

That means 32 bit fields are not suffcient anymore. Increase size to 64
bit so that we can add newer capabilities and still be able to use existing
code to check and set the capabilities.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20220208204813.682906-5-vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-02-17 17:22:26 +00:00
Vivek Goyal
776dc4b165 virtiofsd: Parse extended "struct fuse_init_in"
Add some code to parse extended "struct fuse_init_in". And use a local
variable "flag" to represent 64 bit flags. This will make it easier
to add more features without having to worry about two 32bit flags (->flags
and ->flags2) in "fuse_struct_in".

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20220208204813.682906-4-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: Fixed up long line
2022-02-17 17:22:19 +00:00