Upon failure, a libseccomp API returns actual errno value very
rarely. Fortunately, after its commit 34bf78ab (contained in
2.5.0 release), the SCMP_FLTATR_API_SYSRAWRC attribute can be set
which makes subsequent APIs return true errno on failure.
This is especially critical when seccomp_load() fails, because
generic -ECANCELED says nothing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
setns/unshare are used to change namespaces which is not something QEMU
needs to be able todo.
execveat is a new variant of execve so should be blocked just like
execve already is.
Acked-by: Eduardo Otubo <otubo@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Modern glibc will use clone3 instead of clone, when it detects that it
is available. We need to compare flags in order to decide whether to
allow clone (thread create vs process fork), but in clone3 the flags
are hidden inside a struct. Seccomp can't currently match on data inside
a struct, so our only option is to block clone3 entirely. If we use
ENOSYS to block it, then glibc transparently falls back to clone.
This may need to be revisited if Linux adds a new architecture in
future and only provides clone3, without clone.
Acked-by: Eduardo Otubo <otubo@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When '-sandbox on,spawn=deny' is given, we are supposed to block the
ability to spawn processes. We naively blocked the 'fork' syscall,
forgetting that any modern libc will use the 'clone' syscall instead.
We can't simply block the 'clone' syscall though, as that will break
thread creation. We thus list the set of flags used to create threads
and block anything that doesn't match this exactly.
Acked-by: Eduardo Otubo <otubo@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We're currently tailoring whether to use kill process or return EPERM
based on the syscall set. This is not flexible enough for future
requirements where we also need to be able to return a variety of
actions on a per-syscall granularity.
Acked-by: Eduardo Otubo <otubo@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Recent GLibC calls sched_getaffinity in code paths related to malloc and
when QEMU blocks access, it sends it off into a bad codepath resulting
in stack exhaustion[1]. The GLibC bug is being fixed[2], but none the
less, GLibC has valid reasons to want to use sched_getaffinity.
It is not unreasonable for code to want to run many resource syscalls
for information gathering, so it is a bit too harsh for QEMU to block
them.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1975693
[2] https://sourceware.org/pipermail/libc-alpha/2021-June/128271.html
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Follow the inclusive terminology from the "Conscious Language in your
Open Source Projects" guidelines [*] and replace the word "blacklist"
appropriately.
[*] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210303184644.1639691-4-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>