This protects devices from bh->mmio reentrancy issues.
Thanks: Thomas Huth <thuth@redhat.com> for diagnosing OS X test failure.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-5-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit f63192b054)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The 9p protocol does not specifically define how server shall behave when
client tries to open a special file, however from security POV it does
make sense for 9p server to prohibit opening any special file on host side
in general. A sane Linux 9p client for instance would never attempt to
open a special file on host side, it would always handle those exclusively
on its guest side. A malicious client however could potentially escape
from the exported 9p tree by creating and opening a device file on host
side.
With QEMU this could only be exploited in the following unsafe setups:
- Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
security model.
or
- Using 9p 'proxy' fs driver (which is running its helper daemon as
root).
These setups were already discouraged for safety reasons before,
however for obvious reasons we are now tightening behaviour on this.
Fixes: CVE-2023-2861
Reported-by: Yanwu Shen <ywsPlz@gmail.com>
Reported-by: Jietao Xiao <shawtao1125@gmail.com>
Reported-by: Jinku Li <jkli@xidian.edu.cn>
Reported-by: Wenbo Shen <shenwenbo@zju.edu.cn>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <E1q6w7r-0000Q0-NM@lizzy.crudebyte.com>
(cherry picked from commit f6b0de53fb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: drop adding qemu_fstat wrapper for 7.2 where wrappers aren't used)
xen_9pfs_free can't use gnttabdev since it is already closed and NULL-ed
out when free is called. Do the teardown in _disconnect(). This
matches the setup done in _connect().
trace-events are also added for the XenDevOps functions.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <20230502143722.15613-1-jandryuk@gmail.com>
[C.S.: - Remove redundant return in xen_9pfs_free().
- Add comment to trace-events. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
(cherry picked from commit 92e667f6fd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: minor context conflict in hw/9pfs/xen-9p-backend.c)
The previous implementation would iterate over the fid table for
lookup operations, resulting in an operation with O(n) complexity on
the number of open files and poor cache locality -- for every open,
stat, read, write, etc operation.
This change uses a hashtable for this instead, significantly improving
the performance of the 9p filesystem. The runtime of NixOS's simple
installer test, which copies ~122k files totalling ~1.8GiB from 9p,
decreased by a factor of about 10.
Signed-off-by: Linus Heckemann <git@sphalerite.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
[CS: - Retain BUG_ON(f->clunked) in get_fid().
- Add TODO comment in clunk_fid(). ]
Message-Id: <20221004104121.713689-1-git@sphalerite.org>
[CS: - Drop unnecessary goto and out: label. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-25-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'namespace' is misspelled in a bunch of places.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Message-Id: <20220614104045.85728-3-dgilbert@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Current implementation of 'Twalk' request handling always sends an 'Rerror'
response if any error occured. The 9p2000 protocol spec says though:
"
If the first element cannot be walked for any reason, Rerror is returned.
Otherwise, the walk will return an Rwalk message containing nwqid qids
corresponding, in order, to the files that are visited by the nwqid
successful elementwise walks; nwqid is therefore either nwname or the index
of the first elementwise walk that failed.
"
http://ericvh.github.io/9p-rfc/rfc9p2000.html#anchor33
For that reason we are no longer leaving from an error path in function
v9fs_walk(), unless really no path component could be walked successfully or
if the request has been interrupted.
Local variable 'nwalked' counts and reflects the number of path components
successfully processed by background I/O thread, whereas local variable
'name_idx' subsequently counts and reflects the number of path components
eventually accepted successfully by 9p server controller portion.
New local variable 'any_err' is an aggregate variable reflecting whether any
error occurred at all, while already existing variable 'err' only reflects
the last error.
Despite QIDs being delivered to client in a more relaxed way now, it is
important to note though that fid still must remain unaffected if any error
occurred.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <bc73e24258a75dc29458024c7936c8a036c3eac5.1647339025.git.qemu_oss@crudebyte.com>
The local variable 'name_idx' is used in two loops in function v9fs_walk().
Let the first loop use its own variable 'nwalked' instead, which we will
use in subsequent patch as the number of (requested) path components
successfully walked by background I/O thread.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <d506308e7e343023c4db95d0e6053dd2627ed3c1.1647339025.git.qemu_oss@crudebyte.com>
This patch drops the name parameter for the virtio_init function.
The pair between the numeric device ID and the string device ID
(name) of a virtio device already exists, but not in a way that
lets us map between them.
This patch lets us do this and removes the need for the name
parameter in the virtio_init function.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <1648819405-25696-2-git-send-email-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qemu_mknodat() is expected to behave according to its POSIX API, and
therefore should always return exactly -1 on any error, and errno
should be set for the actual error code.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-Id: <c714b5e1cae225ab7575242c45ee0fe4945eb6ad.1651228001.git.qemu_oss@crudebyte.com>
When mapped POSIX ACL is used, we are ignoring errors when trying
to remove a POSIX ACL xattr that does not exist. On Linux hosts we
would get ENODATA in such cases, on macOS hosts however we get
ENOATTR instead.
As we can be sure that ENOATTR is defined as being identical on Linux
hosts (at least by qemu/xattr.h), it is safe to fix this issue by
simply comparing against ENOATTR instead of ENODATA.
This patch fixes e.g. a command on Linux guest like:
cp --preserve=mode old new
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Link: https://lore.kernel.org/qemu-devel/2866993.yOYK24bMf6@silver/
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-Id: <34f81e9bffd7a3e65fb7aab5b56c107bd0aac960.1651228001.git.qemu_oss@crudebyte.com>
Linux and macOS only share some errno definitions with equal macro
name and value. In fact most mappings for errno are completely
different on the two systems.
This patch converts some important errno values from macOS host to
corresponding Linux errno values before eventually sending such error
codes along with 'Rlerror' replies (if 9p2000.L is used that is). Not
having translated errnos before violated the 9p2000.L protocol spec,
which says:
"
size[4] Rlerror tag[2] ecode[4]
... ecode is a numerical Linux errno.
"
https://github.com/chaos/diod/wiki/protocol#lerror----return-error-code
This patch fixes a bunch of misbehaviours when running a Linux client
on macOS host. For instance this patch fixes:
mount -t 9p -o posixacl ...
on Linux guest if security_mode=mapped was used for 9p server, which
refused to mount successfully, because macOS returned ENOATTR==93
when client tried to retrieve POSIX ACL xattrs, because errno 93
is defined as EPROTONOSUPPORT==93 on Linux, so Linux client believed
that xattrs were not supported by filesystem on host in general.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Link: https://lore.kernel.org/qemu-devel/20220421124835.3e664669@bahia/
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-Id: <b322ab298a62069e527d2b032028bdc9115afacd.1651228001.git.qemu_oss@crudebyte.com>
The 'rdev' field in 9p reponse 'Rgetattr' is of type dev_t,
which is actually a system dependant type and therefore both the
size and encoding of dev_t differ between macOS and Linux.
So far we have sent 'rdev' to guest in host's dev_t format as-is,
which caused devices to appear with wrong device numbers on
guests running on macOS hosts, eventually leading to various
misbehaviours on guest in conjunction with device files.
This patch fixes this issue by converting the device number from
host's dev_t format to Linux dev_t format. As 9p request
'Tgettattr' is exclusive to protocol version 9p2000.L, it should
be fair to assume that 'rdev' field is assumed to be in Linux dev_t
format by client as well.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Link: https://lore.kernel.org/qemu-devel/20220421093056.5ab1e7ed@bahia/
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-Id: <b3a430c2c382ba69a7405e04c0b090ab0d86f17e.1651228001.git.qemu_oss@crudebyte.com>
mknod() on macOS does not support creating sockets, so divert to
call sequence socket(), bind() and fchmodat() respectively if S_IFSOCK
was passed with mode argument.
Link: https://lore.kernel.org/qemu-devel/17933734.zYzKuhC07K@silver/
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-Id: <2e7b5ecd7a6d83a538db4e8a22d8fb03e9e0f06e.1651228001.git.qemu_oss@crudebyte.com>
[C.S. - Use AT_SYMLINK_NOFOLLOW instead of AT_SYMLINK_NOFOLLOW_ANY. ]
Link: https://lore.kernel.org/qemu-devel/3704033.BMyLRrx2Jx@silver/
mknod() on macOS does not support creating regular files, so
divert to openat_file() if S_IFREG is passed with mode argument.
Furthermore, 'man 2 mknodat' on Linux says: "Zero file type is
equivalent to type S_IFREG".
Link: https://lore.kernel.org/qemu-devel/17933734.zYzKuhC07K@silver/
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Will Cohen <wwcohen@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-Id: <3102ca936f88bc1f79d2a325e5bc68f48f54e6e3.1651228000.git.qemu_oss@crudebyte.com>
The 'synth' driver's root node and the 'synth' driver's first
subdirectory node falsely share the same inode number (zero), which
makes it impossible for 9p clients (i.e. 9p test cases) to distinguish
root node and first subdirectory from each other by comparing their QIDs
(which are derived by 9p server from driver's inode numbers).
Fix this issue by using prefix-increment instead of postfix-increment
operator while generating new inode numbers for subdirectories and files.
Link: https://lore.kernel.org/qemu-devel/3859307.hTDP4D0zbi@silver/
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1nTpyU-0000yR-9o@lizzy.crudebyte.com>
The patch set adding 9p functionality to darwin introduced an issue
where limits.h, which defines XATTR_SIZE_MAX, is included in 9p.c,
though the referenced constant is needed in 9p.h. This commit fixes that
issue by moving the definition of P9_XATTR_SIZE_MAX, which uses
XATTR_SIZE_MAX, to also be in 9p.c.
Additionally, this commit moves the location of the system headers
include in 9p.c to occur before the project headers (except osdep.h).
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/950
Fixes: 38d7fd68b0 ("9p: darwin: Move XATTR_SIZE_MAX->P9_XATTR_SIZE_MAX")
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220331182651.887-1-wwcohen@gmail.com>
[thuth: Adjusted placement of osdep.h]
Signed-off-by: Thomas Huth <thuth@redhat.com>
One less qemu-specific macro. It also helps to make some headers/units
only depend on glib, and thus moved in standalone projects eventually.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
This commit only touches allocations with size arguments of the form
sizeof(T).
Initial patch created mechanically with:
$ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \
--macro-file scripts/cocci-macro-file.h FILES...
This uncovers a typing error:
../hw/9pfs/9p.c: In function ‘qid_path_fullmap’:
../hw/9pfs/9p.c:855:13: error: assignment to ‘QpfEntry *’ from incompatible pointer type ‘QppEntry *’ [-Werror=incompatible-pointer-types]
855 | val = g_new0(QppEntry, 1);
| ^
Harmless, because QppEntry is larger than QpfEntry. Manually fixed to
allocate a QpfEntry instead.
Cc: Greg Kurz <groug@kaod.org>
Cc: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20220315144156.1595462-3-armbru@redhat.com>
API doc comments in QEMU are supposed to be in kerneldoc format, so
drop Doxygen format used on v9fs_co_run_in_worker() macro.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <a8fdf0290d1e40a68f5577f29aeae12298b70733.1646314856.git.qemu_oss@crudebyte.com>
API doc comments in QEMU are supposed to be in kerneldoc format, so
convert API doc comments from Doxygen format to kerneldoc format.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <dc1c4a85e233f5884ee5f6ec96b87db286083df7.1646314856.git.qemu_oss@crudebyte.com>
API doc comments in QEMU are supposed to be in kerneldoc format, so
convert API doc comments from Doxygen format to kerneldoc format.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <4ece6ffa4465c271c6a7c42a3040f42780fcce87.1646314856.git.qemu_oss@crudebyte.com>
API doc comments in QEMU are supposed to be in kerneldoc format, so
convert API doc comments from Doxygen format to kerneldoc format.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <c76be7d38ea448c6417b2ffb5ccd6b711519a878.1646314856.git.qemu_oss@crudebyte.com>
API doc comments in QEMU are supposed to be in kerneldoc format, so
convert API doc comments from Doxygen format to kerneldoc format.
Based-on: <E1nPTwO-0006pl-Np@lizzy.crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <2b8f91de7bac3d3bc85d60eb08830a35a394be75.1646314856.git.qemu_oss@crudebyte.com>
API doc comments in QEMU are supposed to be in kerneldoc format, so drop
occurrences of "@c" which is Doxygen format for fixed-width text.
Link: https://lore.kernel.org/qemu-devel/CAFEAcA89+ENOM6x19OEF53Kd2DWkhN5SN21Va0D7yepJSa3Jyg@mail.gmail.com/
Based-on: <E1nP9Oz-00043L-KJ@lizzy.crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1nPTwO-0006pl-Np@lizzy.crudebyte.com>
Function qemu_dirent_dup() is currently only used by 9pfs server, so move
it from project global header osdep.h to 9pfs specific header 9p-util.h.
Link: https://lore.kernel.org/qemu-devel/CAFEAcA_=HAUNomKD2wurSVaAHa5mrk22A1oHKLWUDjk7v6Khmg@mail.gmail.com/
Based-on: <20220227223522.91937-12-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <E1nP9Oz-00043L-KJ@lizzy.crudebyte.com>
Darwin does not support mknodat. However, to avoid race conditions
with later setting the permissions, we must avoid using mknod on
the full path instead. We could try to fchdir, but that would cause
problems if multiple threads try to call mknodat at the same time.
However, luckily there is a solution: Darwin includes a function
that sets the cwd for the current thread only.
This should suffice to use mknod safely.
This function (pthread_fchdir_np) is protected by a check in
meson in a patch later in this series.
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
[Will Cohen: - Adjust coding style
- Replace clang references with gcc
- Note radar filed with Apple for missing syscall
- Replace direct syscall with pthread_fchdir_np and
adjust patch notes accordingly
- Declare pthread_fchdir_np with
- __attribute__((weak_import)) to allow checking for
its presence before usage
- Move declarations above cplusplus guard
- Add CONFIG_PTHREAD_FCHDIR_NP to meson and check for
presence in 9p-util
- Rebase to apply cleanly on top of the 2022-02-10
changes to 9pfs
- Fix line over 90 characters formatting error]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-10-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
On darwin `fgetxattr` takes two extra optional arguments,
and the l* variants are not defined (in favor of an extra
flag to the regular variants.
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-9-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
This implements the darwin equivalent of the functions that were
moved to 9p-util(-linux) earlier in this series in the new
9p-util-darwin file.
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-8-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
Because XATTR_SIZE_MAX is not defined on Darwin,
create a cross-platform P9_XATTR_SIZE_MAX instead.
[Will Cohen: - Adjust coding style
- Lower XATTR_SIZE_MAX to 64k
- Add explanatory context related to XATTR_SIZE_MAX]
[Fabian Franz: - Move XATTR_SIZE_MAX reference from 9p.c to
P9_XATTR_SIZE_MAX in 9p.h]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Signed-off-by: Fabian Franz <fabianfranz.oss@gmail.com>
[Will Cohen: - For P9_XATTR_MAX, ensure that Linux uses
XATTR_SIZE_MAX, Darwin uses 64k, and error
out for undefined hosts]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-7-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Darwin doesn't have either of these flags. Darwin does have
F_NOCACHE, which is similar to O_DIRECT, but has different
enough semantics that other projects don't generally map
them automatically. In any case, we don't support O_DIRECT
on Linux at the moment either.
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
[Will Cohen: - Adjust coding style]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-6-wwcohen@gmail.com>
[C.S.: - Fix compiler warning "unused label 'again'". ]
Link: https://lore.kernel.org/qemu-devel/11201492.CjeqJxXfGd@silver/
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
On darwin d_seekoff exists, but is optional and does not seem to
be commonly used by file systems. Use `telldir` instead to obtain
the seek offset and inject it into d_seekoff, and create a
qemu_dirent_off helper to call it appropriately when appropriate.
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
[Will Cohen: - Adjust to pass testing
- Ensure that d_seekoff is filled using telldir
on darwin, and create qemu_dirent_off helper
to decide which to access]
[Fabian Franz: - Add telldir error handling for darwin]
Signed-off-by: Fabian Franz <fabianfranz.oss@gmail.com>
[Will Cohen: - Ensure that telldir error handling uses
signed int
- Cleanup of telldir error handling
- Remove superfluous error handling for
qemu_dirent_off
- Adjust formatting
- Use qemu_dirent_off in codir.c
- Declare qemu_dirent_off as static to prevent
linker error
- Move qemu_dirent_off above the end-of-file
endif to fix compilation]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-5-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
[Will Cohen: - Note lack of f_namelen and f_frsize on Darwin
- Ensure that tv_sec and tv_nsec are both
initialized for Darwin and non-Darwin]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-4-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
The current file only has the Linux versions of these functions.
Rename the file accordingly and update the Makefile to only build
it on Linux. A Darwin version of these will follow later in the
series.
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220227223522.91937-3-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
- Guard Linux only headers.
- Add qemu/statfs.h header to abstract over the which
headers are needed for struct statfs
- Define `ENOATTR` only if not only defined
(it's defined in system headers on Darwin).
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
While it might at first appear that fsdev/virtfs-proxy-header.c would
need similar adjustment for darwin as file-op-9p here, a later patch in
this series disables virtfs-proxy-helper for non-Linux. Allowing
virtfs-proxy-helper on darwin could potentially be an additional
optimization later.
[Will Cohen: - Fix headers for Alpine
- Integrate statfs.h back into file-op-9p.h
- Remove superfluous header guards from file-opt-9p
- Add note about virtfs-proxy-helper being disabled
on non-Linux for this patch series]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20220227223522.91937-2-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Use QEMU_ALIGN_DOWN() macro to reduce code and to make it
more human readable.
Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <b84eb324d2ebdcc6f9c442c97b5b4d01eecb4f43.1632758315.git.qemu_oss@crudebyte.com>
When client sent a 9p Tgetattr request then the wrong I/O block
size value was returned by 9p server; instead of host file
system's I/O block size it should rather return an I/O block
size according to 9p session's 'msize' value, because the value
returned to client should be an "optimum" block size for I/O
(i.e. to maximize performance), it should not reflect the actual
physical block size of the underlying storage media.
The I/O block size of a host filesystem is typically 4k, so the
value returned was far too low for good 9p I/O performance.
This patch adds stat_to_iounit() with a similar approach as the
existing get_iounit() function.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1mT2Js-0000DW-OH@lizzy.crudebyte.com>
v9fs_walk() utilizes the v9fs_co_run_in_worker({...}) macro to run the
supplied fs driver code block on a background worker thread.
When either the 'Twalk' client request was interrupted or if the client
requested fid for that 'Twalk' request caused a stat error then that
fs driver code block was left by 'break' keyword, with the intention to
return from worker thread back to main thread as well:
v9fs_co_run_in_worker({
if (v9fs_request_cancelled(pdu)) {
err = -EINTR;
break;
}
err = s->ops->lstat(&s->ctx, &dpath, &fidst);
if (err < 0) {
err = -errno;
break;
}
...
});
However that 'break;' statement also skipped the v9fs_co_run_in_worker()
macro's final and mandatory
/* re-enter back to qemu thread */
qemu_coroutine_yield();
call and thus caused the rest of v9fs_walk() to be continued being
executed on the worker thread instead of main thread, eventually
leading to a crash in the transport virtio transport driver.
To fix this issue and to prevent the same error from happening again by
other users of v9fs_co_run_in_worker() in future, auto wrap the supplied
code block into its own
do { } while (0);
loop inside the 'v9fs_co_run_in_worker' macro definition.
Full discussion and backtrace:
https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg05209.htmlhttps://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00174.html
Fixes: 8d6cb10073
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1mLTBg-0002Bh-2D@lizzy.crudebyte.com>
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <b51670d2a39399535a035f6bc77c3cbeed85edae.1629208359.git.qemu_oss@crudebyte.com>
The v9fs_walk() function resolves all client submitted path nodes to the
local 'pathes' array. Using a separate string scalar variable 'path'
inside the background worker thread loop and copying that local 'path'
string scalar variable subsequently to the 'pathes' array (at the end of
each loop iteration) is not necessary.
Instead simply resolve each path directly to the 'pathes' array and
don't use the string scalar variable 'path' inside the fs worker thread
loop at all.
The only advantage of the 'path' scalar was that in case of an error
the respective 'pathes' element would not be filled. Right now this is
not an issue as the v9fs_walk() function returns as soon as any error
occurs.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <7dacbecf25b2c9b4a0ce12d689a8a535f09a31e3.1629208359.git.qemu_oss@crudebyte.com>
As with previous performance optimization on Treaddir handling;
reduce the overall latency, i.e. overall time spent on processing
a Twalk request by reducing the amount of thread hops between the
9p server's main thread and fs worker thread(s).
In fact this patch even reduces the thread hops for Twalk handling
to its theoritical minimum of exactly 2 thread hops:
main thread -> fs worker thread -> main thread
This is achieved by doing all the required fs driver tasks altogether
in a single v9fs_co_run_in_worker({ ... }); code block.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <1a6701674afc4f08d40396e3aa2631e18a4dbb33.1622821729.git.qemu_oss@crudebyte.com>
There is no longer a user of root_qid, so drop it.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <6896dd161d3257db6b0513842a14f87ca191fdf6.1622821729.git.qemu_oss@crudebyte.com>
As we are actually only comparing the filesystem ID (i.e. device number
and inode number pair) let's use the POSIX stat buffer instead of QIDs,
because resolving QIDs requires to be done on 9p server's main thread
only as it might mutate the server state if inode remapping is enabled.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <26aa465ff9cc9c07e053331554a02fdae3994417.1622821729.git.qemu_oss@crudebyte.com>