Commit Graph

94352 Commits

Author SHA1 Message Date
Peter Maydell
1a11265d7e util: Put qemu_vfree() in memalign.c
qemu_vfree() is the companion free function to qemu_memalign(); put
it in memalign.c so the allocation and free functions are together.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220226180723.1706285-9-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-07 13:16:24 +00:00
Peter Maydell
88454f844e util: Use meson checks for valloc() and memalign() presence
Instead of assuming that all CONFIG_BSD have valloc() and anything
else is memalign(), explicitly check for those functions in
meson.build and use the "is the function present" define.  Tests for
specific functionality are better than which-OS checks; this also
lets us give a helpful error message if somehow there's no usable
function present.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220226180723.1706285-8-peter.maydell@linaro.org
2022-03-07 13:16:05 +00:00
Peter Maydell
5c8c714a0a util: Share qemu_try_memalign() implementation between POSIX and Windows
The qemu_try_memalign() functions for POSIX and Windows used to be
significantly different, but these days they are identical except for
the actual allocation function called, and the POSIX version already
has to have ifdeffery for different allocation functions.

Move to a single implementation in memalign.c, which uses the Windows
_aligned_malloc if we detect that function in meson.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220226180723.1706285-7-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-07 13:15:24 +00:00
Peter Maydell
8698343b86 meson.build: Don't misdetect posix_memalign() on Windows
Currently we incorrectly think that posix_memalign() exists on
Windows.  This is because of a combination of:

 * the msys2/mingw toolchain/libc claim to have a
   __builtin_posix_memalign when there isn't a builtin of that name
 * meson will assume that if you have a __builtin_foo that
   counts for has_function('foo')

Specifying a specific include file via prefix: causes meson to not
treat builtins as sufficient and actually look for the function
itself; see this meson pull request which added that as the official
way to get the right answer:
  https://github.com/mesonbuild/meson/pull/1150

Currently this misdectection doesn't cause problems because we only
use CONFIG_POSIX_MEMALIGN in oslib-posix.c; however that will change
in a following commit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220226180723.1706285-6-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-07 13:14:59 +00:00
Peter Maydell
bc0fecc1c2 util: Return valid allocation for qemu_try_memalign() with zero size
Currently qemu_try_memalign()'s behaviour if asked to allocate
0 bytes is rather variable:
 * on Windows, we will assert
 * on POSIX platforms, we get the underlying behaviour of
   the posix_memalign() or equivalent function, which may be
   either "return a valid non-NULL pointer" or "return NULL"

Explictly check for 0 byte allocations, so we get consistent
behaviour across platforms.  We handle them by incrementing the size
so that we return a valid non-NULL pointer that can later be passed
to qemu_vfree().  This is permitted behaviour for the
posix_memalign() API and is the most usual way that underlying
malloc() etc implementations handle a zero-sized allocation request,
because it won't trip up calling code that assumes NULL means an
error.  (This includes our own qemu_memalign(), which will abort on
NULL.)

This change is a preparation for sharing the qemu_try_memalign() code
between Windows and POSIX.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-07 13:14:07 +00:00
Peter Maydell
ac8057a11b util: Unify implementations of qemu_memalign()
We implement qemu_memalign() in both oslib-posix.c and oslib-win32.c,
but the two versions are essentially the same: they call
qemu_try_memalign(), and abort() after printing an error message if
it fails.  The only difference is that the win32 version prints the
GetLastError() value whereas the POSIX version prints
strerror(errno).  However, this is a bug in the win32 version: in
commit dfbd0b873a in 2020 we changed the implementation of
qemu_try_memalign() from using VirtualAlloc() (which sets the
GetLastError() value) to using _aligned_malloc() (which sets errno),
but didn't update the error message to match.

Replace the two separate functions with a single version in a
new memalign.c file, which drops the unnecessary extra qemu_oom_check()
function and instead prints a more useful message including the
requested size and alignment as well as the errno string.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220226180723.1706285-4-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-03-07 13:09:20 +00:00
Peter Maydell
1c6c3b764d util: Make qemu_oom_check() a static function
The qemu_oom_check() function, which we define in both oslib-posix.c
and oslib-win32.c, is now used only locally in that file; make it
static.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220226180723.1706285-3-peter.maydell@linaro.org
2022-03-07 13:09:20 +00:00
Christian Schoenebeck
35b6466459 fsdev/p9array.h: convert Doxygen -> kerneldoc format
API doc comments in QEMU are supposed to be in kerneldoc format, so
convert API doc comments from Doxygen format to kerneldoc format.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <2e2d46a402560f155de322d95789ba107d728885.1646314856.git.qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Christian Schoenebeck
09d19d5807 9pfs/coth.h: drop Doxygen format on v9fs_co_run_in_worker()
API doc comments in QEMU are supposed to be in kerneldoc format, so
drop Doxygen format used on v9fs_co_run_in_worker() macro.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <a8fdf0290d1e40a68f5577f29aeae12298b70733.1646314856.git.qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Christian Schoenebeck
041b0945f9 9pfs/9p-util.h: convert Doxygen -> kerneldoc format
API doc comments in QEMU are supposed to be in kerneldoc format, so
convert API doc comments from Doxygen format to kerneldoc format.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <dc1c4a85e233f5884ee5f6ec96b87db286083df7.1646314856.git.qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Christian Schoenebeck
e16fea4156 9pfs/9p.c: convert Doxygen -> kerneldoc format
API doc comments in QEMU are supposed to be in kerneldoc format, so
convert API doc comments from Doxygen format to kerneldoc format.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <4ece6ffa4465c271c6a7c42a3040f42780fcce87.1646314856.git.qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Christian Schoenebeck
1a7f240014 9pfs/codir.c: convert Doxygen -> kerneldoc format
API doc comments in QEMU are supposed to be in kerneldoc format, so
convert API doc comments from Doxygen format to kerneldoc format.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <c76be7d38ea448c6417b2ffb5ccd6b711519a878.1646314856.git.qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Christian Schoenebeck
39db334719 9pfs/9p.h: convert Doxygen -> kerneldoc format
API doc comments in QEMU are supposed to be in kerneldoc format, so
convert API doc comments from Doxygen format to kerneldoc format.

Based-on: <E1nPTwO-0006pl-Np@lizzy.crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <2b8f91de7bac3d3bc85d60eb08830a35a394be75.1646314856.git.qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Christian Schoenebeck
63ce31c35d 9pfs: drop Doxygen format from qemu_dirent_dup() API comment
API doc comments in QEMU are supposed to be in kerneldoc format, so drop
occurrences of "@c" which is Doxygen format for fixed-width text.

Link: https://lore.kernel.org/qemu-devel/CAFEAcA89+ENOM6x19OEF53Kd2DWkhN5SN21Va0D7yepJSa3Jyg@mail.gmail.com/
Based-on: <E1nP9Oz-00043L-KJ@lizzy.crudebyte.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1nPTwO-0006pl-Np@lizzy.crudebyte.com>
2022-03-07 11:49:31 +01:00
Christian Schoenebeck
1983d8b0d6 9pfs: move qemu_dirent_dup() from osdep -> 9p-util
Function qemu_dirent_dup() is currently only used by 9pfs server, so move
it from project global header osdep.h to 9pfs specific header 9p-util.h.

Link: https://lore.kernel.org/qemu-devel/CAFEAcA_=HAUNomKD2wurSVaAHa5mrk22A1oHKLWUDjk7v6Khmg@mail.gmail.com/
Based-on: <20220227223522.91937-12-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <E1nP9Oz-00043L-KJ@lizzy.crudebyte.com>
2022-03-07 11:49:31 +01:00
Keno Fischer
0fb1e19d78 9p: darwin: meson: Allow VirtFS on Darwin
To allow VirtFS on darwin, we need to check that pthread_fchdir_np is
available, which has only been available since macOS 10.12.

Additionally, virtfs_proxy_helper is disabled on Darwin. This patch
series does not currently provide an implementation of the proxy-helper,
but this functionality could be implemented later on.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
[Will Cohen: - Rebase to master]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[Will Cohen: - Add check for pthread_fchdir_np to virtfs
             - Add comments to patch commit
             - Note that virtfs_proxy_helper does not work
               on macOS
             - Fully adjust meson virtfs error note to specify
               macOS
             - Rebase to master]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-12-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Will Cohen
d3671fd972 9p: darwin: Adjust assumption on virtio-9p-test
The previous test depended on the assumption that P9_DOTL_AT_REMOVEDIR
and AT_REMOVEDIR have the same value.

While this is true on Linux, it is not true everywhere, and leads to an
incorrect test failure on unlink_at, noticed when adding 9p to darwin:

Received response 7 (RLERROR) instead of 77 (RUNLINKAT)
Rlerror has errno 22 (Invalid argument)
**

ERROR:../tests/qtest/virtio-9p-test.c:305:v9fs_req_recv: assertion
failed (hdr.id == id): (7 == 77) Bail out!

ERROR:../tests/qtest/virtio-9p-test.c:305:v9fs_req_recv: assertion
failed (hdr.id == id): (7 == 77)

Signed-off-by: Fabian Franz <fabianfranz.oss@gmail.com>
[Will Cohen: - Add explanation of patch and description
               of pre-patch test failure]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Acked-by: Thomas Huth <thuth@redhat.com>
[Will Cohen: - Move this patch before 9p: darwin: meson
               patch to avoid qtest breakage during
               bisecting]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20220227223522.91937-11-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Keno Fischer
029ed1bd9d 9p: darwin: Implement compatibility for mknodat
Darwin does not support mknodat. However, to avoid race conditions
with later setting the permissions, we must avoid using mknod on
the full path instead. We could try to fchdir, but that would cause
problems if multiple threads try to call mknodat at the same time.
However, luckily there is a solution: Darwin includes a function
that sets the cwd for the current thread only.
This should suffice to use mknod safely.

This function (pthread_fchdir_np) is protected by a check in
meson in a patch later in this series.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
[Will Cohen: - Adjust coding style
             - Replace clang references with gcc
             - Note radar filed with Apple for missing syscall
             - Replace direct syscall with pthread_fchdir_np and
               adjust patch notes accordingly
             - Declare pthread_fchdir_np with
             - __attribute__((weak_import)) to allow checking for
               its presence before usage
             - Move declarations above cplusplus guard
             - Add CONFIG_PTHREAD_FCHDIR_NP to meson and check for
               presence in 9p-util
             - Rebase to apply cleanly on top of the 2022-02-10
               changes to 9pfs
             - Fix line over 90 characters formatting error]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-10-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Keno Fischer
b5989326f5 9p: darwin: Compatibility for f/l*xattr
On darwin `fgetxattr` takes two extra optional arguments,
and the l* variants are not defined (in favor of an extra
flag to the regular variants.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-9-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Keno Fischer
57b3910bc3 9p: darwin: *xattr_nofollow implementations
This implements the darwin equivalent of the functions that were
moved to 9p-util(-linux) earlier in this series in the new
9p-util-darwin file.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-8-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Keno Fischer
38d7fd68b0 9p: darwin: Move XATTR_SIZE_MAX->P9_XATTR_SIZE_MAX
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>

Because XATTR_SIZE_MAX is not defined on Darwin,
create a cross-platform P9_XATTR_SIZE_MAX instead.

[Will Cohen: - Adjust coding style
             - Lower XATTR_SIZE_MAX to 64k
             - Add explanatory context related to XATTR_SIZE_MAX]
[Fabian Franz: - Move XATTR_SIZE_MAX reference from 9p.c to
                 P9_XATTR_SIZE_MAX in 9p.h]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Signed-off-by: Fabian Franz <fabianfranz.oss@gmail.com>
[Will Cohen: - For P9_XATTR_MAX, ensure that Linux uses
               XATTR_SIZE_MAX, Darwin uses 64k, and error
               out for undefined hosts]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-7-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Keno Fischer
67a71e3b71 9p: darwin: Ignore O_{NOATIME, DIRECT}
Darwin doesn't have either of these flags. Darwin does have
F_NOCACHE, which is similar to O_DIRECT, but has different
enough semantics that other projects don't generally map
them automatically. In any case, we don't support O_DIRECT
on Linux at the moment either.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
[Will Cohen: - Adjust coding style]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-6-wwcohen@gmail.com>
[C.S.: - Fix compiler warning "unused label 'again'". ]
Link: https://lore.kernel.org/qemu-devel/11201492.CjeqJxXfGd@silver/
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Keno Fischer
6b3b279bd6 9p: darwin: Handle struct dirent differences
On darwin d_seekoff exists, but is optional and does not seem to
be commonly used by file systems. Use `telldir` instead to obtain
the seek offset and inject it into d_seekoff, and create a
qemu_dirent_off helper to call it appropriately when appropriate.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
[Will Cohen: - Adjust to pass testing
             - Ensure that d_seekoff is filled using telldir
               on darwin, and create qemu_dirent_off helper
               to decide which to access]
[Fabian Franz: - Add telldir error handling for darwin]
Signed-off-by: Fabian Franz <fabianfranz.oss@gmail.com>
[Will Cohen: - Ensure that telldir error handling uses
               signed int
             - Cleanup of telldir error handling
             - Remove superfluous error handling for
               qemu_dirent_off
             - Adjust formatting
             - Use qemu_dirent_off in codir.c
             - Declare qemu_dirent_off as static to prevent
               linker error
             - Move qemu_dirent_off above the end-of-file
               endif to fix compilation]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-5-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:31 +01:00
Keno Fischer
f41db099c7 9p: darwin: Handle struct stat(fs) differences
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
[Will Cohen: - Note lack of f_namelen and f_frsize on Darwin
             - Ensure that tv_sec and tv_nsec are both
               initialized for Darwin and non-Darwin]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Message-Id: <20220227223522.91937-4-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:30 +01:00
Keno Fischer
6450084a66 9p: Rename 9p-util -> 9p-util-linux
The current file only has the Linux versions of these functions.
Rename the file accordingly and update the Makefile to only build
it on Linux. A Darwin version of these will follow later in the
series.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220227223522.91937-3-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:30 +01:00
Keno Fischer
e0bd743bb2 9p: linux: Fix a couple Linux assumptions
- Guard Linux only headers.
 - Add qemu/statfs.h header to abstract over the which
   headers are needed for struct statfs
 - Define `ENOATTR` only if not only defined
   (it's defined in system headers on Darwin).

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[Michael Roitzsch: - Rebase for NixOS]
Signed-off-by: Michael Roitzsch <reactorcontrol@icloud.com>

While it might at first appear that fsdev/virtfs-proxy-header.c would
need similar adjustment for darwin as file-op-9p here, a later patch in
this series disables virtfs-proxy-helper for non-Linux. Allowing
virtfs-proxy-helper on darwin could potentially be an additional
optimization later.

[Will Cohen: - Fix headers for Alpine
             - Integrate statfs.h back into file-op-9p.h
             - Remove superfluous header guards from file-opt-9p
             - Add note about virtfs-proxy-helper being disabled
               on non-Linux for this patch series]
Signed-off-by: Will Cohen <wwcohen@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20220227223522.91937-2-wwcohen@gmail.com>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-03-07 11:49:30 +01:00
Gerd Hoffmann
d877ada1b8 update seabios binaries to 1.16.0
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-03-07 11:20:21 +01:00
Gerd Hoffmann
f682abc83b update seabios submodule to 1.16.0
git shortlog 6a62e0cb0dfe..rel-1.16.0
-------------------------------------

Florian Larysch (1):
      nvme: fix LBA format data structure

Jan Beulich via SeaBIOS (1):
      nvme: avoid use-after-free in nvme_controller_enable()

Kevin O'Connor (9):
      smm: Suppress gcc array-bounds warnings
      nvme: Rework nvme_io_readwrite() to return -1 on error
      nvme: Add nvme_bounce_xfer() helper function
      nvme: Convert nvme_build_prpl() to nvme_prpl_xfer()
      nvme: Pass prp1 and prp2 directly to nvme_io_xfer()
      nvme: Build the page list in the existing dma buffer
      nvme: Only allocate one dma bounce buffer for all nvme drives
      sercon: Fix missing GET_LOW() to access rx_bytes
      docs: Note v1.16.0 release

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2022-03-07 11:20:21 +01:00
Thomas Huth
5be6fd0cb9 MAINTAINERS: Update the files in the FreeBSD section
The FreeBSD CI definitions now reside in other files than .cirrs.yml.
Update the entry in MAINTAINERS accordingly.

Message-Id: <20220217141138.917292-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-03-07 10:08:57 +01:00
David Miller
8b398296d4 tests/tcg/s390x: Cleanup of mie3 tests.
Adds clobbers and merges remaining separate asm statements.

Signed-off-by: David Miller <dmiller423@gmail.com>
Message-Id: <20220301214305.2778-1-dmiller423@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[thuth: dropped changes to mie3-compl.c, whitespace fixes]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-03-07 09:44:07 +01:00
Vladimir Sementsov-Ogievskiy
743da0b401 iotests/image-fleecing: test push backup with fleecing
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220303194349.2304213-17-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:31 +01:00
Vladimir Sementsov-Ogievskiy
153bab4d4c iotests/image-fleecing: add test case with bitmap
Note that reads zero areas (not dirty in the bitmap) fails, that's
correct.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220303194349.2304213-16-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:31 +01:00
Vladimir Sementsov-Ogievskiy
308c6abf73 iotests.py: add qemu_io_pipe_and_status()
Add helper that returns both status and output, to be used in the
following commit

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220303194349.2304213-15-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:31 +01:00
Vladimir Sementsov-Ogievskiy
3927e5c5c8 iotests/image-fleecing: add test-case for fleecing format node
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-14-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:31 +01:00
Vladimir Sementsov-Ogievskiy
af5bcd775f block: copy-before-write: realize snapshot-access API
Current scheme of image fleecing looks like this:

[guest]                    [NBD export]
  |                              |
  |root                          | root
  v                              v
[copy-before-write] -----> [temp.qcow2]
  |                 target  |
  |file                     |backing
  v                         |
[active disk] <-------------+

 - On guest writes copy-before-write filter copies old data from active
   disk to temp.qcow2. So fleecing client (NBD export) when reads
   changed regions from temp.qcow2 image and unchanged from active disk
   through backing link.

This patch makes possible new image fleecing scheme:

[guest]                   [NBD export]
   |                            |
   | root                       | root
   v                 file       v
[copy-before-write]<------[snapshot-access]
   |           |
   | file      | target
   v           v
[active-disk] [temp.img]

 - copy-before-write does CBW operations and also provides
   snapshot-access API. The API may be accessed through
   snapshot-access driver.

Benefits of new scheme:

1. Access control: if remote client try to read data that not covered
   by original dirty bitmap used on copy-before-write open, client gets
   -EACCES.

2. Discard support: if remote client do DISCARD, this additionally to
   discarding data in temp.img informs block-copy process to not copy
   these clusters. Next read from discarded area will return -EACCES.
   This is significant thing: when fleecing user reads data that was
   not yet copied to temp.img, we can avoid copying it on further guest
   write.

3. Synchronisation between client reads and block-copy write is more
   efficient. In old scheme we just rely on BDRV_REQ_SERIALISING flag
   used for writes to temp.qcow2. New scheme is less blocking:
     - fleecing reads are never blocked: if data region is untouched or
       in-flight, we just read from active-disk, otherwise we read from
       temp.img
     - writes to temp.img are not blocked by fleecing reads
     - still, guest writes of-course are blocked by in-flight fleecing
       reads, that currently read from active-disk - it's the minimum
       necessary blocking

4. Temporary image may be of any format, as we don't rely on backing
   feature.

5. Permission relation are simplified. With old scheme we have to share
   write permission on target child of copy-before-write, otherwise
   backing link conflicts with copy-before-write file child write
   permissions. With new scheme we don't have backing link, and
   copy-before-write node may have unshared access to temporary node.
   (Not realized in this commit, will be in future).

6. Having control on fleecing reads we'll be able to implement
   alternative behavior on failed copy-before-write operations.
   Currently we just break guest request (that's a historical behavior
   of backup). But in some scenarios it's a bad behavior: better
   is to drop the backup as failed but don't break guest request.
   With new scheme we can simply unset some bits in a bitmap on CBW
   failure and further fleecing reads will -EACCES, or something like
   this. (Not implemented in this commit, will be in future)
   Additional application for this is implementing timeout for CBW
   operations.

Iotest 257 output is updated, as two more bitmaps now live in
copy-before-write filter.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220303194349.2304213-13-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:31 +01:00
Vladimir Sementsov-Ogievskiy
1c14eaabdb block: introduce snapshot-access block driver
The new block driver simply utilizes snapshot-access API of underlying
block node.

In further patches we want to use it like this:

[guest]                   [NBD export]
   |                            |
   | root                       | root
   v                 file       v
[copy-before-write]<------[snapshot-access]
   |           |
   | file      | target
   v           v
[active-disk] [temp.img]

This way, NBD client will be able to read snapshotted state of active
disk, when active disk is continued to be written by guest. This is
known as "fleecing", and currently uses another scheme based on qcow2
temporary image which backing file is active-disk. New scheme comes
with benefits - see next commit.

The other possible application is exporting internal snapshots of
qcow2, like this:

[guest]          [NBD export]
   |                  |
   | root             | root
   v       file       v
[qcow2]<---------[snapshot-access]

For this, we'll need to implement snapshot-access API handlers in
qcow2 driver, and improve snapshot-access block driver (and API) to
make it possible to select snapshot by name. Another thing to improve
is size of snapshot. Now for simplicity we just use size of bs->file,
which is OK for backup, but for qcow2 snapshots export we'll need to
imporve snapshot-access API to get size of snapshot.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220303194349.2304213-12-vsementsov@virtuozzo.com>
[hreitz: Rebased on block GS/IO split]
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:31 +01:00
Vladimir Sementsov-Ogievskiy
ce14f3b407 block/io: introduce block driver snapshot-access API
Add new block driver handlers and corresponding generic wrappers.
It will be used to allow copy-before-write filter to provide
reach fleecing interface in further commit.

In future this approach may be used to allow reading qcow2 internal
snapshots, for example to export them through NBD.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-11-vsementsov@virtuozzo.com>
[hreitz: Rebased on block GS/IO split]
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:31 +01:00
Vladimir Sementsov-Ogievskiy
3b7ca26bdf block/reqlist: add reqlist_wait_all()
Add function to wait for all intersecting requests.
To be used in the further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Nikita Lapshin <nikita.lapshin@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-10-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:30 +01:00
Vladimir Sementsov-Ogievskiy
a6426475a7 block/dirty-bitmap: introduce bdrv_dirty_bitmap_status()
Add a convenient function similar with bdrv_block_status() to get
status of dirty bitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-9-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:30 +01:00
Vladimir Sementsov-Ogievskiy
84b1e80f67 block/reqlist: reqlist_find_conflict(): use ranges_overlap()
Let's reuse convenient helper.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-8-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:30 +01:00
Vladimir Sementsov-Ogievskiy
d088e6a48a block: intoduce reqlist
Split intersecting-requests functionality out of block-copy to be
reused in copy-before-write filter.

Note: while being here, fix tiny typo in MAINTAINERS.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-7-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:30 +01:00
Vladimir Sementsov-Ogievskiy
177541e671 block/block-copy: add block_copy_reset()
Split block_copy_reset() out of block_copy_reset_unallocated() to be
used separately later.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-6-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:30 +01:00
Vladimir Sementsov-Ogievskiy
5f3a3cd7f0 block/copy-before-write: add bitmap open parameter
This brings "incremental" mode to copy-before-write filter: user can
specify bitmap so that filter will copy only "dirty" areas.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220303194349.2304213-5-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:30 +01:00
Vladimir Sementsov-Ogievskiy
1f7252e868 block/block-copy: block_copy_state_new(): add bitmap parameter
This will be used in the following commit to bring "incremental" mode
to copy-before-write filter.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-4-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:30 +01:00
Vladimir Sementsov-Ogievskiy
34ffacb7f4 block/dirty-bitmap: bdrv_merge_dirty_bitmap(): add return value
That simplifies handling failure in existing code and in further new
usage of bdrv_merge_dirty_bitmap().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-3-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:30 +01:00
Vladimir Sementsov-Ogievskiy
051f66caa2 block/block-copy: move copy_bitmap initialization to block_copy_state_new()
We are going to complicate bitmap initialization in the further
commit. And in future, backup job will be able to work without filter
(when source is immutable), so we'll need same bitmap initialization in
copy-before-write filter and in backup job. So, it's reasonable to do
it in block-copy.

Note that for now cbw_open() is the only caller of
block_copy_state_new().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-2-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:30 +01:00
Hanna Reitz
1a8fcca03f iotests: Write test output to TEST_DIR
Drop the use of OUTPUT_DIR (test/qemu-iotests under the build
directory), and instead write test output files (.out.bad, .notrun, and
.casenotrun) to TEST_DIR.

With this, the same test can be run concurrently without the separate
instances interfering, because they will need separate TEST_DIRs anyway.
Running the same test separately is useful when running the iotests with
various format/protocol combinations in parallel, or when you just want
to aggressively exercise a single test (e.g. when it fails only
sporadically).

Putting this output into TEST_DIR means that it will stick around for
inspection after the test run is done (though running the same test in
the same TEST_DIR will overwrite it, just as it used to be); but given
that TEST_DIR is a scratch directory, it should be clear that users can
delete all of its content at any point.  (And if TEST_DIR is on tmpfs,
it will just disappear on shutdown.)  Contrarily, alternative approaches
that would put these output files into OUTPUT_DIR with some prefix to
differentiate between separate test runs might easily lead to cluttering
OUTPUT_DIR.

(This change means OUTPUT_DIR is no longer written to by the iotests, so
we can drop its usage altogether.)

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220221172909.762858-1-hreitz@redhat.com>
[hreitz: Simplified `Path(os.path.join(x, y))` to `Path(x, y)`, as
         suggested by Vladimir; and rebased on 9086c76398
         ("tests/qemu-iotests: Rework the checks and spots using GNU
         sed")]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2022-03-07 09:32:28 +01:00
Thomas Huth
db4b2133b8 tests/qemu-iotests/testrunner: Quote "case not run" lines in TAP mode
In TAP mode, the stdout is reserved for the TAP protocol, so we
have to make sure to mark other lines with a comment '#' character
at the beginning to avoid that the TAP parser at the other end
gets confused.

To test this condition, run "configure" for example with:

 --block-drv-rw-whitelist=copy-before-write,qcow2,raw,file,host_device,blkdebug,null-co,copy-on-read

so that iotest 041 will report that some tests are not run due to
the missing "quorum" driver. Without this change, "make check-block"
fails since the meson tap parser gets confused by these messages.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20220223124353.3273898-1-thuth@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:19:20 +01:00
Thomas Huth
024354ea91 tests/qemu-iotests/040: Skip TestCommitWithFilters without 'throttle'
iotest 040 already has some checks for the availability of the 'throttle'
driver, but some new code has been added in the course of time that
depends on 'throttle' but does not check for its availability. Add
a check to the TestCommitWithFilters class so that this iotest now
also passes again if 'throttle' has not been enabled in the QEMU
binaries.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20220223123127.3206042-1-thuth@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:19:20 +01:00
Vladimir Sementsov-Ogievskiy
45e62b464a block: fix preallocate filter: don't do unaligned preallocate requests
There is a bug in handling BDRV_REQ_NO_WAIT flag: we still may wait in
wait_serialising_requests() if request is unaligned. And this is
possible for the only user of this flag (preallocate filter) if
underlying file is unaligned to its request_alignment on start.

So, we have to fix preallocate filter to do only aligned preallocate
requests.

Next, we should fix generic block/io.c somehow. Keeping in mind that
preallocate is the only user of BDRV_REQ_NO_WAIT and that we have to
fix its behavior now, it seems more safe to just assert that we never
use BDRV_REQ_NO_WAIT with unaligned requests and add corresponding
comment. Let's do so.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <20220215121609.38570-1-vsementsov@virtuozzo.com>
[hreitz: Rebased on block GS/IO split]
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:19:20 +01:00