mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Clément Léger	7532ca570a	qemu/osdep: Add excluded fd parameter to qemu_close_all_open_fd() In order for this function to be usable by tap.c code, add a list of file descriptors that should not be closed. Signed-off-by: Clément Léger <cleger@rivosinc.com> Message-ID: <20240802145423.3232974-5-cleger@rivosinc.com> [rth: Use max_fd in qemu_close_all_open_fd_close_range] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2024-08-05 08:21:59 +10:00
Clément Léger	ffa28f9cf5	qemu/osdep: Split qemu_close_all_open_fd() and add fallback In order to make it cleaner, split qemu_close_all_open_fd() logic into multiple subfunctions (close with close_range(), with /proc/self/fd and fallback). Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240802145423.3232974-3-cleger@rivosinc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2024-08-05 08:06:08 +10:00
Clément Léger	4ec5ebea07	qemu/osdep: Move close_all_open_fds() to oslib-posix Move close_all_open_fds() in oslib-posix, rename it qemu_close_all_open_fds() and export it. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240802145423.3232974-2-cleger@rivosinc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2024-08-05 08:06:08 +10:00
Zhao Liu	083c4e71cf	util/oslib-posix: Fix superfluous trailing semicolon Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-07-17 14:04:15 +03:00
Paolo Bonzini	44a90c0875	oslib-posix: fix memory leak in touch_all_pages touch_all_pages() can return early, before creating threads. In this case, however, it leaks the MemsetContext that it has allocated at the beginning of the function. Reported by Coverity as CID 1534922. Fixes: `04accf43df` ("oslib-posix: initialize backend memory objects in parallel", 2024-02-06) Reviewed-by: Mark Kanda <mark.kanda@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-03-08 15:51:22 +01:00
Mark Kanda	04accf43df	oslib-posix: initialize backend memory objects in parallel QEMU initializes preallocated backend memory as the objects are parsed from the command line. This is not optimal in some cases (e.g. memory spanning multiple NUMA nodes) because the memory objects are initialized in series. Allow the initialization to occur in parallel (asynchronously). In order to ensure optimal thread placement, asynchronous initialization requires prealloc context threads to be in use. Signed-off-by: Mark Kanda <mark.kanda@oracle.com> Message-ID: <20240131165327.3154970-2-mark.kanda@oracle.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2024-02-06 08:15:22 +01:00
Philippe Mathieu-Daudé	b622ee98bf	util/oslib: Have qemu_prealloc_mem() handler return a boolean Following the example documented since commit `e3fe3988d7` ("error: Document Error API usage rules"), have qemu_prealloc_mem() return a boolean indicating whether an error is set or not. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Message-Id: <20231120213301.24349-19-philmd@linaro.org>	2024-01-05 16:20:15 +01:00
Akihiko Odaki	a1eaa6281f	util: Delete checks for old host definitions IA-64 and PA-RISC host support is already removed with commit `b1cef6d02f` ("Drop remaining bits of ia64 host support"). Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20230810225922.21600-1-akihiko.odaki@daynix.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-09-15 05:26:50 -07:00
Michael Tokarev	d02d06f8f1	util: spelling fixes Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230823065335.1919380-3-mjt@tls.msk.ru> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-08-31 19:47:43 +02:00
Marc-André Lureau	8278e30c45	util: drop qemu_fork() Fortunately, qemu_fork() is no longer used since commit `a95570e3e4` ("io/command: use glib GSpawn, instead of open-coding fork/exec"). (GSpawn uses posix_spawn() whenever possible instead) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20230221124802.4103554-2-marcandre.lureau@redhat.com>	2023-03-13 15:23:37 +04:00
Markus Armbruster	a67dfa660b	Drop duplicate #include Tracked down with the help of scripts/clean-includes. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230202133830.2152150-21-armbru@redhat.com>	2023-02-08 07:28:05 +01:00
Markus Armbruster	bfe7bf8590	Don't include headers already included by qemu/osdep.h This commit was created with scripts/clean-includes. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20230202133830.2152150-19-armbru@redhat.com>	2023-02-08 07:28:05 +01:00
David Hildenbrand	e04a34e55c	util: Make qemu_prealloc_mem() optionally consume a ThreadContext ... and implement it under POSIX. When a ThreadContext is provided, create new threads via the context such that these new threads obtain a properly configured CPU affinity. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-6-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:00:56 +02:00
David Hildenbrand	e2de2c497e	util: Introduce ThreadContext user-creatable object Setting the CPU affinity of QEMU threads is a bit problematic, because QEMU doesn't always have permissions to set the CPU affinity itself, for example, with seccomp after initialized by QEMU: -sandbox enable=on,resourcecontrol=deny General information about CPU affinities can be found in the man page of taskset: CPU affinity is a scheduler property that "bonds" a process to a given set of CPUs on the system. The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs. While upper layers are already aware of how to handle CPU affinities for long-lived threads like iothreads or vcpu threads, especially short-lived threads, as used for memory-backend preallocation, are more involved to handle. These threads are created on demand and upper layers are not even able to identify and configure them. Introduce the concept of a ThreadContext, that is essentially a thread used for creating new threads. All threads created via that context thread inherit the configured CPU affinity. Consequently, it's sufficient to create a ThreadContext and configure it once, and have all threads created via that ThreadContext inherit the same CPU affinity. The CPU affinity of a ThreadContext can be configured two ways: (1) Obtaining the thread id via the "thread-id" property and setting the CPU affinity manually (e.g., via taskset). (2) Setting the "cpu-affinity" property and letting QEMU try set the CPU affinity itself. This will fail if QEMU doesn't have permissions to do so anymore after seccomp was initialized. A simple QEMU example to set the CPU affinity to host CPU 0,1,6,7 would be: qemu-system-x86_64 -S \ -object thread-context,id=tc1,cpu-affinity=0-1,cpu-affinity=6-7 And we can query it via HMP/QMP: (qemu) qom-get tc1 cpu-affinity [ 0, 1, 6, 7 ] But note that due to dynamic library loading this example will not work before we actually make use of thread_context_create_thread() in QEMU code, because the type will otherwise not get registered. We'll wire this up next to make it work. In general, the interface behaves like pthread_setaffinity_np(): host CPU numbers that are currently not available are ignored; only host CPU numbers that are impossible with the current kernel will fail. If the list of host CPU numbers does not include a single CPU that is available, setting the CPU affinity will fail. A ThreadContext can be reused, simply by reconfiguring the CPU affinity. Note that the CPU affinity of previously created threads will not get adjusted. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221014134720.168738-4-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:00:43 +02:00
David Hildenbrand	6556aadc18	util: Cleanup and rename os_mem_prealloc() Let's * give the function a "qemu_" style name make sure the parameters in the implementation match the prototype * rename smp_cpus to max_threads, which makes the semantics of that parameter clearer ... and add a function documentation. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Message-Id: <20221014134720.168738-2-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2022-10-27 11:00:28 +02:00
Guoyi Tu	3c63b4e94a	oslib-posix: Introduce qemu_socketpair() qemu_socketpair() will create a pair of connected sockets with FD_CLOEXEC set Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <17fa1eff729eeabd9a001f4639abccb127ceec81.1661240709.git.tugy@chinatelecom.cn>	2022-09-29 14:38:05 +04:00
Philippe Mathieu-Daudé	4311682ea8	cutils: Add missing dyld(3) include on macOS Commit `06680b15b4` moved qemu__exec_dir() to cutils but forgot to move the macOS dyld(3) include, resulting in the following error (when building with Homebrew GCC on macOS Monterey 12.4): [313/1197] Compiling C object libqemuutil.a.p/util_cutils.c.o FAILED: libqemuutil.a.p/util_cutils.c.o ../../util/cutils.c:1039:13: error: implicit declaration of function '_NSGetExecutablePath' [-Werror=implicit-function-declaration] 1039 \| if (_NSGetExecutablePath(fpath, &len) == 0) { \| ^~~~~~~~~~~~~~~~~~~~ ../../util/cutils.c:1039:13: error: nested extern declaration of '_NSGetExecutablePath' [-Werror=nested-externs] Fix by moving the include line to cutils. Fixes: `06680b15b4` ("include: move qemu__exec_dir() to cutils") Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220809222046.30812-1-f4bug@amsat.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-08-12 11:33:52 +01:00
Thomas Huth	0a979a1320	util: Fix broken build on Haiku A recent commit moved some Haiku-specific code parts from oslib-posix.c to cutils.c, but failed to move the corresponding header #include statement, too, so "make vm-build-haiku.x86_64" is currently broken. Fix it by moving the header #include, too. Fixes: `06680b15b4` ("include: move qemu_*_exec_dir() to cutils") Message-Id: <20220718172026.139004-1-thuth@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-07-18 20:24:36 +02:00
Marc-André Lureau	06680b15b4	include: move qemu_*_exec_dir() to cutils The function is required by get_relocated_path() (already in cutils), and used by qemu-ga and may be generally useful. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20220525144140.591926-2-marcandre.lureau@redhat.com>	2022-05-28 11:42:56 +02:00
Marc-André Lureau	ff5927baa7	util: rename qemu_block() socket functions The qemu_block() functions are meant to be be used with sockets (the win32 implementation expects SOCKET) Over time, those functions where used with Win32 SOCKET or file-descriptors interchangeably. But for portability, they must only be used with socket-like file-descriptors. FDs can use g_unix_set_fd_nonblocking() instead. Rename the functions with "socket" in the name to prevent bad usages. This is effectively reverting commit `f9e8cacc55` ("oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()"). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-05-03 15:53:20 +04:00
Marc-André Lureau	22e135fca3	Replace fcntl(O_NONBLOCK) with g_unix_set_fd_nonblocking() Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2022-05-03 15:47:38 +04:00
Marc-André Lureau	a7241974ce	Replace qemu_pipe() with g_unix_open_pipe() GLib g_unix_open_pipe() is essentially like qemu_pipe(), available since 2.30. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2022-05-03 15:17:56 +04:00
Marc-André Lureau	ad24b679d2	block: move fcntl_setfl() It is only used by block/file-posix.c, move it there. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2022-05-03 15:17:53 +04:00
Marc-André Lureau	1fbf2665e6	util: replace qemu_get_local_state_pathname() Simplify the function to only return the directory path. Callers are adjusted to use the GLib function to build paths, g_build_filename(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-39-marcandre.lureau@redhat.com>	2022-04-21 17:09:09 +04:00
Marc-André Lureau	1b34d08f0b	util: use qemu_create() in qemu_write_pidfile() qemu_open_old(O_CREATE) should be replaced with qemu_create() which handles Error reporting. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-38-marcandre.lureau@redhat.com>	2022-04-21 17:09:09 +04:00
Marc-André Lureau	96eb9b2b47	util: use qemu_write_full() in qemu_write_pidfile() Mostly for correctness. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-37-marcandre.lureau@redhat.com>	2022-04-21 17:09:09 +04:00
Marc-André Lureau	548fb0da73	qga: move qga_get_host_name() The function is specific to qemu-ga, no need to share it in QEMU. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com> Message-Id: <20220420132624.2439741-32-marcandre.lureau@redhat.com>	2022-04-21 17:09:09 +04:00
Marc-André Lureau	73991a9222	include: move qemu_msync() to osdep The implementation depends on the OS. (and longer-term goal is to move cutils to a common subproject) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-21-marcandre.lureau@redhat.com>	2022-04-21 17:03:51 +04:00
Marc-André Lureau	0f9668e0c1	Remove qemu-common.h include from most units Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:55 +02:00
Marc-André Lureau	e9c4e0a8e5	Move fcntl_setfl() to oslib-posix It is only implemented for POSIX anyway. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220323155743.1585078-30-marcandre.lureau@redhat.com> [Add braces around if statements. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:55 +02:00
Marc-André Lureau	8e3b0cbb72	Replace qemu_real_host_page variables with inlined functions Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 10:50:38 +02:00
Peter Maydell	1a11265d7e	util: Put qemu_vfree() in memalign.c qemu_vfree() is the companion free function to qemu_memalign(); put it in memalign.c so the allocation and free functions are together. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220226180723.1706285-9-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2022-03-07 13:16:24 +00:00
Peter Maydell	5c8c714a0a	util: Share qemu_try_memalign() implementation between POSIX and Windows The qemu_try_memalign() functions for POSIX and Windows used to be significantly different, but these days they are identical except for the actual allocation function called, and the POSIX version already has to have ifdeffery for different allocation functions. Move to a single implementation in memalign.c, which uses the Windows _aligned_malloc if we detect that function in meson. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220226180723.1706285-7-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2022-03-07 13:15:24 +00:00
Peter Maydell	bc0fecc1c2	util: Return valid allocation for qemu_try_memalign() with zero size Currently qemu_try_memalign()'s behaviour if asked to allocate 0 bytes is rather variable: * on Windows, we will assert * on POSIX platforms, we get the underlying behaviour of the posix_memalign() or equivalent function, which may be either "return a valid non-NULL pointer" or "return NULL" Explictly check for 0 byte allocations, so we get consistent behaviour across platforms. We handle them by incrementing the size so that we return a valid non-NULL pointer that can later be passed to qemu_vfree(). This is permitted behaviour for the posix_memalign() API and is the most usual way that underlying malloc() etc implementations handle a zero-sized allocation request, because it won't trip up calling code that assumes NULL means an error. (This includes our own qemu_memalign(), which will abort on NULL.) This change is a preparation for sharing the qemu_try_memalign() code between Windows and POSIX. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2022-03-07 13:14:07 +00:00
Peter Maydell	ac8057a11b	util: Unify implementations of qemu_memalign() We implement qemu_memalign() in both oslib-posix.c and oslib-win32.c, but the two versions are essentially the same: they call qemu_try_memalign(), and abort() after printing an error message if it fails. The only difference is that the win32 version prints the GetLastError() value whereas the POSIX version prints strerror(errno). However, this is a bug in the win32 version: in commit `dfbd0b873a` in 2020 we changed the implementation of qemu_try_memalign() from using VirtualAlloc() (which sets the GetLastError() value) to using _aligned_malloc() (which sets errno), but didn't update the error message to match. Replace the two separate functions with a single version in a new memalign.c file, which drops the unnecessary extra qemu_oom_check() function and instead prints a more useful message including the requested size and alignment as well as the errno string. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220226180723.1706285-4-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2022-03-07 13:09:20 +00:00
Peter Maydell	1c6c3b764d	util: Make qemu_oom_check() a static function The qemu_oom_check() function, which we define in both oslib-posix.c and oslib-win32.c, is now used only locally in that file; make it static. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-3-peter.maydell@linaro.org	2022-03-07 13:09:20 +00:00
Peter Maydell	b85ea5fa2f	include: Move qemu_madvise() and related #defines to new qemu/madvise.h The function qemu_madvise() and the QEMU_MADV_* constants associated with it are used in only 10 files. Move them out of osdep.h to a new qemu/madvise.h header that is included where it is needed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org	2022-02-21 13:30:20 +00:00
David Hildenbrand	dd4fc60585	util/oslib-posix: Fix missing unlock in the error path of os_mem_prealloc() We're missing an unlock in case installing the signal handler failed. Fortunately, we barely see this error in real life. Fixes: `a960d6642d` ("util/oslib-posix: Support concurrent os_mem_prealloc() invocation") Fixes: CID 1468941 Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta@ionos.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Cc: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20220111120830.119912-1-david@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-02-06 04:33:50 -05:00
David Hildenbrand	29b838c05d	util/oslib-posix: Forward SIGBUS to MCE handler under Linux Temporarily modifying the SIGBUS handler is really nasty, as we might be unlucky and receive an MCE SIGBUS while having our handler registered. Unfortunately, there is no way around messing with SIGBUS when MADV_POPULATE_WRITE is not applicable or not around. Let's forward SIGBUS that don't belong to us to the already registered handler and document the situation. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-8-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 19:30:13 -05:00
David Hildenbrand	a960d6642d	util/oslib-posix: Support concurrent os_mem_prealloc() invocation Add a mutex to protect the SIGBUS case, as we cannot mess concurrently with the sigbus handler and we have to manage the global variable sigbus_memset_context. The MADV_POPULATE_WRITE path can run concurrently. Note that page_mutex and page_cond are shared between concurrent invocations, which shouldn't be a problem. This is a preparation for future virtio-mem prealloc code, which will call os_mem_prealloc() asynchronously from an iothread when handling guest requests. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-7-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
David Hildenbrand	ac86e5c37d	util/oslib-posix: Avoid creating a single thread with MADV_POPULATE_WRITE Let's simplify the case when we only want a single thread and don't have to mess with signal handlers. Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-6-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
David Hildenbrand	89aec6411c	util/oslib-posix: Don't create too many threads with small memory or little pages Let's limit the number of threads to something sane, especially that - We don't have more threads than the number of pages we have - We don't have threads that initialize small (< 64 MiB) memory Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-5-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
David Hildenbrand	dba506788b	util/oslib-posix: Introduce and use MemsetContext for touch_all_pages() Let's minimize the number of global variables to prepare for os_mem_prealloc() getting called concurrently and make the code a bit easier to read. The only consumer that really needs a global variable is the sigbus handler, which will require protection via a mutex in the future either way as we cannot concurrently mess with the SIGBUS handler. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
David Hildenbrand	a384bfa32e	util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc() Let's sense support and use it for preallocation. MADV_POPULATE_WRITE does not require a SIGBUS handler, doesn't actually touch page content, and avoids context switches; it is, therefore, faster and easier to handle than our current approach. While MADV_POPULATE_WRITE is, in general, faster than manual prefaulting, and especially faster with 4k pages, there is still value in prefaulting using multiple threads to speed up preallocation. More details on MADV_POPULATE_WRITE can be found in the Linux commits 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ\|WRITE) to prefault page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ\|WRITE)"), and in the man page proposal [1]. This resolves the TODO in do_touch_pages(). In the future, we might want to look into using fallocate(), eventually combined with MADV_POPULATE_READ, when dealing with shared file/fd mappings and not caring about memory bindings. [1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
David Hildenbrand	6c427ab926	util/oslib-posix: Let touch_all_pages() return an error Let's prepare touch_all_pages() for returning differing errors. Return an error from the thread and report the last processed error. Translate SIGBUS to -EFAULT, as a SIGBUS can mean all different kind of things (memory error, read error, out of memory). When allocating memory fails via the current SIGBUS-based mechanism, we'll get: os_mem_prealloc: preallocating memory failed: Bad address Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-2-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
David Hildenbrand	8dbe22c686	memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap() Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The new flag has the following semantics: " RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge pages if applicable) is skipped: will bail out if not supported. When not set, the OS will do the reservation, if supported for the memory type. " Allow passing it into: - memory_region_init_ram_nomigrate() - memory_region_init_resizeable_ram() - memory_region_init_ram_from_file() ... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag. Bail out if the flag is not supported, which is the case right now for both, POSIX and win32. We will add Linux support next and allow specifying RAM_NORESERVE via memory backends. The target use case is virtio-mem, which dynamically exposes memory inside a large, sparse memory area to the VM. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-9-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-15 20:27:38 +02:00
David Hildenbrand	b444f5c079	util/mmap-alloc: Pass flags instead of separate bools to qemu_ram_mmap() Let's pass flags instead of bools to prepare for passing other flags and update the documentation of qemu_ram_mmap(). Introduce new QEMU_MAP_ flags that abstract the mmap() PROT_ and MAP_ flag handling and simplify it. We expose only flags that are currently supported by qemu_ram_mmap(). Maybe, we'll see qemu_mmap() in the future as well that can implement these flags. Note: We don't use MAP_ flags as some flags (e.g., MAP_SYNC) are only defined for some systems and we want to always be able to identify these flags reliably inside qemu_ram_mmap() -- for example, to properly warn when some future flags are not available or effective on a system. Also, this way we can simplify PROT_ handling as well. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-8-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-15 20:27:38 +02:00
Brad Smith	29c3d213f4	oslib-posix: Remove OpenBSD workaround for fcntl("/dev/null", F_SETFL, O_NONBLOCK) failure OpenBSD prior to 6.3 required a workaround to utilize fcntl(F_SETFL) on memory devices. Since modern verions of OpenBSD that are only officialy supported and buildable on do not have this issue I am garbage collecting this workaround. Signed-off-by: Brad Smith <brad@comstyle.com> Message-Id: <YGYECGXQhdamEJgC@humpty.home.comstyle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-04 13:47:07 +02:00
Jagannathan Raman	44a4ff31c0	memory: alloc RAM from file at offset Allow RAM MemoryRegion to be created from an offset in a file, instead of allocating at offset of 0 by default. This is needed to synchronize RAM between QEMU & remote process. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 609996697ad8617e3b01df38accc5c208c24d74e.1611938319.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-02-09 20:53:56 +00:00
Stefan Hajnoczi	369d6dc4de	memory: add readonly support to memory_region_init_ram_from_file() There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when creating a memory region from a file. This functionality is needed since the underlying host file may not allow writing. Add a bool readonly argument to memory_region_init_ram_from_file() and the APIs it calls. Extend memory_region_init_ram_from_file() rather than introducing a memory_region_init_rom_from_file() API so that callers can easily make a choice between read/write and read-only at runtime without calling different APIs. No new RAMBlock flag is introduced for read-only because it's unclear whether RAMBlocks need to know that they are read-only. Pass a bool readonly argument instead. Both of these design decisions can be changed in the future. It just seemed like the simplest approach to me. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20210104171320.575838-2-stefanha@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-02-01 17:07:34 -05:00

1 2 3

142 Commits