Commit Graph

88600 Commits

Author SHA1 Message Date
Paolo Bonzini
d8fb7d0969 vl: switch -M parsing to keyval
Switch from QemuOpts to keyval.  This enables the introduction
of non-scalar machine properties, and JSON syntax in the future.

For JSON syntax to be supported right now, we would have to
consider what would happen if string-based dictionaries (produced by
-M key=val) were to be merged with strongly-typed dictionaries
(produced by -M {'key': 123}).

The simplest way out is to never enter the situation, and only allow one
-M option when JSON syntax is in use.  However, we want options such as
-smp to become syntactic sugar for -M, and this is a problem; as soon
as -smp becomes a shortcut for -M, QEMU would forbid using -M '{....}'
together with -smp.  Therefore, allowing JSON syntax right now for -M
would be a forward-compatibility nightmare and it would be impossible
anyway to introduce -M incrementally in tools.

Instead, support for JSON syntax is delayed until after the main
options are converted to QOM compound properties.  These include -boot,
-acpitable, -smbios, -m, -semihosting-config, -rtc and -fw_cfg.  Once JSON
syntax is introduced, these options will _also_ be forbidden together
with -M '{...}'.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
c445909e1f keyval: introduce keyval_parse_into
Allow parsing multiple keyval sequences into the same dictionary.
This will be used to simplify the parsing of the -M command line
option, which is currently a .merge_lists = true QemuOpts group.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
9176e800db keyval: introduce keyval_merge
This patch introduces a function that merges two keyval-produced
(or keyval-like) QDicts.  It can be used to emulate the behavior of
.merge_lists = true QemuOpts groups, merging -readconfig sections and
command-line options in a single QDict, and also to implement -set.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
3bb6944585 qom: export more functions for use with non-UserCreatable objects
Machines and accelerators are not user-creatable but they are going
to share similar command-line parsing machinery.  Export functions
that will be used with -machine and -accel in softmmu/vl.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
d47a8b3b69 configure: convert compiler tests to meson, part 6
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
a620fbe9ac configure: convert compiler tests to meson, part 5
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
e1fbd2c4ed configure: convert compiler tests to meson, part 4
And remove them from the summary, since now their outcome is verbosely
included in the meson output.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
be7e89f63f configure: convert compiler tests to meson, part 3
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
ed3b3f1764 configure: convert compiler tests to meson, part 2
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
e66420ac6d configure: convert compiler tests to meson, part 1
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
e46bd55d9c configure: convert HAVE_BROKEN_SIZE_MAX to meson
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
ccd250aa2d configure, meson: move CONFIG_IVSHMEM to meson
This is a duplicate of CONFIG_EVENTFD, handle it directly in meson.build.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
6d7c7c2d1d meson: store dependency('threads') in a variable
It can be useful for has_function checks.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
69d8de7a2d meson: sort existing compiler tests
The next patches will add more compiler tests.  Sort and group the
existing tests, keeping similar cc.has_* tests together and sorting them
alphabetically by macro name.  This should make it easier to look for
examples when adding new tests to meson.build.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
c5b36c25c2 configure, meson: convert libxml2 detection to meson
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
53c22b68e3 configure, meson: convert liburing detection to meson
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
e36e8c70f6 configure, meson: convert libpmem detection to meson
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
83ef16821a configure, meson: convert libdaxctl detection to meson
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
587d59d6cc configure, meson: convert virgl detection to meson
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
c23d7b4e57 configure, meson: convert vte detection to meson
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini
f08b65b651 configure: drop vte-2.90 check
All currently supported distros have vte 0.37 or newer, which is where the
ABI changed from 2.90 to 2.91.  So drop support for the older ABI.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
David Edmondson
48e5c98a38 target/i386: Move X86XSaveArea into TCG
Given that TCG is now the only consumer of X86XSaveArea, move the
structure definition and associated offset declarations and checks to a
TCG specific header.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210705104632.2902400-9-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
David Edmondson
fea4500841 target/i386: Populate x86_ext_save_areas offsets using cpuid where possible
Rather than relying on the X86XSaveArea structure definition,
determine the offset of XSAVE state areas using CPUID leaf 0xd where
possible (KVM and HVF).

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210705104632.2902400-8-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:48 +02:00
David Edmondson
3568987f78 target/i386: Observe XSAVE state area offsets
Rather than relying on the X86XSaveArea structure definition directly,
the routines that manipulate the XSAVE state area should observe the
offsets declared in the x86_ext_save_areas array.

Currently the offsets declared in the array are derived from the
structure definition, resulting in no functional change.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210705104632.2902400-7-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 07:54:53 +02:00
David Edmondson
5aa10ab1a0 target/i386: Make x86_ext_save_areas visible outside cpu.c
Provide visibility of the x86_ext_save_areas array and associated type
outside of cpu.c.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210705104632.2902400-6-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 07:54:53 +02:00
David Edmondson
c0198c5f87 target/i386: Pass buffer and length to XSAVE helper
In preparation for removing assumptions about XSAVE area offsets, pass
a buffer pointer and buffer length to the XSAVE helper functions.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210705104632.2902400-5-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 07:54:53 +02:00
David Edmondson
fde7482100 target/i386: Clarify the padding requirements of X86XSaveArea
Replace the hard-coded size of offsets or structure elements with
defined constants or sizeof().

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210705104632.2902400-4-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 07:54:53 +02:00
David Edmondson
436463b84b target/i386: Consolidate the X86XSaveArea offset checks
Rather than having similar but different checks in cpu.h and kvm.c,
move them all to cpu.h.
Message-Id: <20210705104632.2902400-3-david.edmondson@oracle.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 07:54:53 +02:00
David Edmondson
ac7b7cae4e target/i386: Declare constants for XSAVE offsets
Declare and use manifest constants for the XSAVE state component
offsets.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210705104632.2902400-2-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 07:54:53 +02:00
Paolo Bonzini
dd52af17ec coverity-scan: switch to vpath build
This is the patch that has been running on the coverity cronjob
for a few weeks now.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 07:54:50 +02:00
Philippe Mathieu-Daudé
dff5f68224 coverity-scan: Remove lm32 / unicore32 targets
lm32 has been removed in commit 9d49bcf699 ("Drop the deprecated
lm32 target"), and unicore32 in 4369223902 ("Drop the deprecated
unicore32 target").

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210619091342.3660495-2-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 07:49:41 +02:00
Thomas Huth
95f439bd11 qemu-options: Improve the documentation of the -display options
The sdl and gtk display options support more parameters than currently
documented. Also the "vnc" option got lost during a recent commit,
add it again.

Fixes: ddc717581c ("Add display suboptions to man pages")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210630163231.467987-5-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-05 19:21:07 +02:00
Thomas Huth
b6ddc6a2b2 ui: Mark the '-no-quit' option as deprecated
It's just a wrapper around the -display ...,window-close=off parameter,
and the name "no-quit" is rather confusing compared to "window-close"
(since there are still other means to quit the emulator), so we should
rather tell our users to use the "window-close" parameter instead.

While we're at it, update the documentation to state that
"-no-quit" is available for GTK, too, not only for SDL.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210630163231.467987-4-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-05 19:21:07 +02:00
Thomas Huth
bb20b86db9 ui: Fix the "-display sdl,window_close=..." parameter
According to the QAPI schema, there is a "-" and not a "_" between
"window" and "close", and we're also talking about "window-close"
in the long parameter description in qemu-options.hx, so we should
make sure that we rather use the variant with the "-" by default
instead of only allowing the one with the "_" here. The old way
still stays enabled for compatibility, but we deprecate it, so that
we can switch to a QAPIfied parameter one day more easily.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210630163231.467987-3-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-05 19:21:05 +02:00
Thomas Huth
f6b560bbc1 softmmu/vl: Remove obsolete comment about the "frame" parameter
The frame parameter has been removed along with the support for
SDL 1.2.

Fixes: 09bd7ba9f5 ("Remove deprecated -no-frame option")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210630163231.467987-2-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-05 19:20:46 +02:00
Thomas Huth
bc05439334 Makefile: Remove /usr/bin/env wrapper from the SHELL variable
The wrapper should not be needed here (it's not the shebang line of
a shell script), and it is causing trouble on Haiku where "env"
resides in a different directory.

Reported-by: Richard Zak <richard.j.zak@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210705082542.936856-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-05 19:20:46 +02:00
Peter Maydell
715167a36c Migration and virtiofs pull 2021-07-01 v2
Dropped Peter Xu's migration-test fix to reenable
 most of the migration tests when uffd isn't available;
 we're seeing at least one seg in github CI (on qemu-system-i386)
 and Peter Maydell is reporting a hang on Openbsd.
 
 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAmDi2H8ACgkQBRYzHrxb
 /edgRw//cr5xtzaTP2G3zP/R4T06Hp8xVgGNz8lXuqXOctLJNxMgj2uaty1fq4AL
 vaRHhQ/DFgh5aQU2FbnoMFeaOdsGLbAla+p7bKuDswMGbQnl1sy/qRgmeJyN7McS
 2Zo/pc4xkCv3YkUpVJnvysn4vyA7xyQ6a4ndmmsI3EyqX5d3IWaTrK6vRbFpi/tI
 en6QVH1x5zXkygmFQddyhBNyS1OQGane3AhaetZs5y8FSFUsBcuixsgmSZZIhN5Y
 Y1Yk+qRTQfdXVCYaBjbxoWk7yIiw/X4xy0wwrZUDc6zScv64nZDu+V1olq0DBGsg
 Zo5lFfNx/mzq5FwjD3eU27PSaa9zHOnBmkvIhToGU+od+Ct4sB2Xbq4e2FtVYgX1
 cOhptfE2OewKbNsSHWvKL6UqgxJ4tT+6COclwrdO7bqnIuWpJWPORWYmR5fKvN42
 xnPZYopTpFJkEv2jKZAVif2tWnD+l806ebNtkAlldFwbAudzq4+qMCbMeDopyBP2
 ksrUTchmyFq8nm7cuRvEm2z9EHDLHP/RtwsUZmTiY1A+d1B5HUO2ndByWO/buoc0
 N6Y+UwTldi7ZK2bWphEpNsEqtT2vZs5X5CB3nWLOqPGoPYWnl6H58T/3kh9W/fvG
 ycVoxZ3iH5rr3lsilQcFDwQyG3nprrWj2CO0V7b+oI5BBCdpOnQ=
 =PBi+
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-migration-20210705a' into staging

Migration and virtiofs pull 2021-07-01 v2

Dropped Peter Xu's migration-test fix to reenable
most of the migration tests when uffd isn't available;
we're seeing at least one seg in github CI (on qemu-system-i386)
and Peter Maydell is reporting a hang on Openbsd.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

# gpg: Signature made Mon 05 Jul 2021 11:01:35 BST
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert-gitlab/tags/pull-migration-20210705a:
  migration/rdma: Use error_report to suppress errno message
  tests/migration: fix "downtime_limit" type when "migrate-set-parameters"
  tests/migration: parse the thread-id key of CpuInfoFast
  virtiofsd: Add an option to enable/disable posix acls
  virtiofsd: Switch creds, drop FSETID for system.posix_acl_access xattr
  virtiofsd: Add capability to change/restore umask
  virtiofsd: Add umask to seccom allow list
  virtiofsd: Add support for extended setxattr
  virtiofsd: Fix xattr operations overwriting errno
  virtiofsd: Fix fuse setxattr() API change issue
  virtiofsd: Don't allow file creation with FUSE_OPEN
  docs: describe the security considerations with virtiofsd xattr mapping
  virtiofsd: use GDateTime for formatting timestamp for debug messages
  migration: failover: continue to wait card unplug on error
  migration: move wait-unplug loop to its own function
  migration: Allow reset of postcopy_recover_triggered when failed
  migration: Move yank outside qemu_start_incoming_migration()
  migration: fix the memory overwriting risk in add_to_iovec
  tests: migration-test: Add dirty ring test

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-07-05 12:45:24 +01:00
Li Zhijian
e5f607913c migration/rdma: Use error_report to suppress errno message
Since the prior calls are successful, in this case a errno doesn't
indicate a real error which would just make us confused.

before:
(qemu) migrate -d rdma:192.168.22.23:8888
source_resolve_host RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
rdma_get_cm_event != EVENT_ESTABLISHED after rdma_connect: No space left on device

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210628071959.23455-1-lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Hyman Huang(黄勇)
fa264f4266 tests/migration: fix "downtime_limit" type when "migrate-set-parameters"
migrate-set-parameters parse "downtime_limit" as integer type when
execute "migrate-set-parameters" before migration, and, the unit
dowtime_limit is milliseconds, fix this two so that test can go
smoothly.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <31d82df24cc0c468dbe4d2d86730158ebf248071.1622729934.git.huangy81@chinatelecom.cn>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Hyman Huang(黄勇)
c99fb3a50d tests/migration: parse the thread-id key of CpuInfoFast
thread_id in CpuInfoFast is deprecated, parse thread-id instead
after execute qmp query-cpus-fast. fix this so that test can
go smoothly.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <584578c0a0dd781cee45f72ddf517f6e6a41c504.1622729934.git.huangy81@chinatelecom.cn>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Vivek Goyal
65a820d292 virtiofsd: Add an option to enable/disable posix acls
fuse has an option FUSE_POSIX_ACL which needs to be opted in by fuse
server to enable posix acls. As of now we are not opting in for this,
so posix acls are disabled on virtiofs by default.

Add virtiofsd option "-o posix_acl/no_posix_acl" to let users enable/disable
posix acl support. By default it is disabled as of now due to performance
concerns with cache=none.

Currently even if file server has not opted in for FUSE_POSIX_ACL, user can
still query acl and set acl, and system.posix_acl_access and
system.posix_acl_default xattrs show up listxattr response.

Miklos said this is confusing. So he said lets block and filter
system.posix_acl_access and system.posix_acl_default xattrs in
getxattr/setxattr/listxattr if user has explicitly disabled
posix acls using -o no_posix_acl.

As of now continuing to keeping the existing behavior if user did not
specify any option to disable acl support due to concerns about backward
compatibility.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20210622150852.1507204-8-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Vivek Goyal
f1aa1774df virtiofsd: Switch creds, drop FSETID for system.posix_acl_access xattr
When posix access acls are set on a file, it can lead to adjusting file
permissions (mode) as well. If caller does not have CAP_FSETID and it
also does not have membership of owner group, this will lead to clearing
SGID bit in mode.

Current fuse code is written in such a way that it expects file server
to take care of chaning file mode (permission), if there is a need.
Right now, host kernel does not clear SGID bit because virtiofsd is
running as root and has CAP_FSETID. For host kernel to clear SGID,
virtiofsd need to switch to gid of caller in guest and also drop
CAP_FSETID (if caller did not have it to begin with).

If SGID needs to be cleared, client will set the flag
FUSE_SETXATTR_ACL_KILL_SGID in setxattr request. In that case server
should kill sgid.

Currently just switch to uid/gid of the caller and drop CAP_FSETID
and that should do it.

This should fix the xfstest generic/375 test case.

We don't have to switch uid for this to work. That could be one optimization
that pass a parameter to lo_change_cred() to only switch gid and not uid.

Also this will not work whenever (if ever) we support idmapped mounts. In
that case it is possible that uid/gid in request are 0/0 but still we
need to clear SGID. So we will have to pick a non-root sgid and switch
to that instead. That's an TODO item for future when idmapped mount
support is introduced.

This patch only adds the capability to switch creds and drop FSETID
when acl xattr is set. This does not take affect yet. It can take
affect when next patch adds the capability to enable posix_acl.

Reported-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20210622150852.1507204-7-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Vivek Goyal
227e5d7fd5 virtiofsd: Add capability to change/restore umask
When parent directory has default acl and a file is created in that
directory, then umask is ignored and final file permissions are
determined using default acl instead. (man 2 umask).

Currently, fuse applies the umask and sends modified mode in create
request accordingly. fuse server can set FUSE_DONT_MASK and tell
fuse client to not apply umask and fuse server will take care of
it as needed.

With posix acls enabled, requirement will be that we want umask
to determine final file mode if parent directory does not have
default acl.

So if posix acls are enabled, opt in for FUSE_DONT_MASK. virtiofsd
will set umask of the thread doing file creation. And host kernel
should use that umask if parent directory does not have default
acls, otherwise umask does not take affect.

Miklos mentioned that we already call unshare(CLONE_FS) for
every thread. That means umask has now become property of per
thread and it should be ok to manipulate it in file creation path.

This patch only adds capability to change umask and restore it. It
does not enable it yet. Next few patches will add capability to enable it
based on if user enabled posix_acl or not.

This should fix fstest generic/099.

Reported-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210622150852.1507204-6-vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Vivek Goyal
6d0028b947 virtiofsd: Add umask to seccom allow list
Patches in this series  are going to make use of "umask" syscall.
So allow it.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210622150852.1507204-5-vgoyal@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Vivek Goyal
c46ef954fa virtiofsd: Add support for extended setxattr
Add the bits to enable support for setxattr_ext if fuse offers it. Do not
enable it by default yet. Let passthrough_ll opt-in. Enabling it by deafult
kind of automatically means that you are taking responsibility of clearing
SGID if ACL is set.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20210622150852.1507204-4-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Fixed up double def in fuse_common.h
2021-07-05 10:51:26 +01:00
Vivek Goyal
5290fb625d virtiofsd: Fix xattr operations overwriting errno
getxattr/setxattr/removexattr/listxattr operations handle regualar
and non-regular files differently. For the case of non-regular files
we do fchdir(/proc/self/fd) and the xattr operation and then revert
back to original working directory. After this we are saving errno
and that's buggy because fchdir() will overwrite the errno.

FCHDIR_NOFAIL(lo->proc_self_fd);
ret = getxattr(procname, name, value, size);
FCHDIR_NOFAIL(lo->root.fd);

if (ret == -1)
    saverr = errno

In above example, if getxattr() failed, we will still return 0 to caller
as errno must have been written by FCHDIR_NOFAIL(lo->root.fd) call.
Fix all such instances and capture "errno" early and save in "saverr"
variable.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20210622150852.1507204-3-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Vivek Goyal
061624455f virtiofsd: Fix fuse setxattr() API change issue
With kernel header updates fuse_setxattr_in struct has grown in size.
But this new struct size only takes affect if user has opted in
for fuse feature FUSE_SETXATTR_EXT otherwise fuse continues to
send "fuse_setxattr_in" of older size. Older size is determined
by FUSE_COMPAT_SETXATTR_IN_SIZE.

Fix this. If we have not opted in for FUSE_SETXATTR_EXT, then
expect that we will get fuse_setxattr_in of size FUSE_COMPAT_SETXATTR_IN_SIZE
and not sizeof(struct fuse_sexattr_in).

Fixes: 278f064e45 ("Update Linux headers to 5.13-rc4")
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20210622150852.1507204-2-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Greg Kurz
1d03e56607 virtiofsd: Don't allow file creation with FUSE_OPEN
A well behaved FUSE client uses FUSE_CREATE to create files. It isn't
supposed to pass O_CREAT along a FUSE_OPEN request, as documented in
the "fuse_lowlevel.h" header :

    /**
     * Open a file
     *
     * Open flags are available in fi->flags. The following rules
     * apply.
     *
     *  - Creation (O_CREAT, O_EXCL, O_NOCTTY) flags will be
     *    filtered out / handled by the kernel.

But if the client happens to do it anyway, the server ends up passing
this flag to open() without the mandatory mode_t 4th argument. Since
open() is a variadic function, glibc will happily pass whatever it
finds on the stack to the syscall. If this file is compiled with
-D_FORTIFY_SOURCE=2, glibc will even detect that and abort:

*** invalid openat64 call: O_CREAT or O_TMPFILE without mode ***: terminated

Specifying O_CREAT with FUSE_OPEN is a protocol violation. Check this
in do_open(), print out a message and return an error to the client,
EINVAL like we already do when fuse_mbuf_iter_advance() fails.

The FUSE filesystem doesn't currently support O_TMPFILE, but the very
same would happen if O_TMPFILE was passed in a FUSE_OPEN request. Check
that as well.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210624101809.48032-1-groug@kaod.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Daniel P. Berrangé
3399bca451 docs: describe the security considerations with virtiofsd xattr mapping
Different guest xattr prefixes have distinct access control rules applied
by the guest. When remapping a guest xattr care must be taken that the
remapping does not allow the a guest user to bypass guest kernel access
control rules.

For example if 'trusted.*' which requires CAP_SYS_ADMIN is remapped
to 'user.virtiofs.trusted.*', an unprivileged guest user which can
write to 'user.*' can bypass the CAP_SYS_ADMIN control. Thus the
target of any remapping must be explicitly blocked from read/writes
by the guest, to prevent access control bypass.

The examples shown in the virtiofsd man page already do the right
thing and ensure safety, but the security implications of getting
this wrong were not made explicit. This could lead to host admins
and apps unwittingly creating insecure configurations.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210611120427.49736-1-berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Daniel P. Berrangé
d9a801f7e9 virtiofsd: use GDateTime for formatting timestamp for debug messages
The GDateTime APIs provided by GLib avoid portability pitfalls, such
as some platforms where 'struct timeval.tv_sec' field is still 'long'
instead of 'time_t'. When combined with automatic cleanup, GDateTime
often results in simpler code too.

Localtime is changed to UTC to avoid the need to grant extra seccomp
permissions for GLib's access of the timezone database.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210611164319.67762-1-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00