Copy each guest kernel's default value, then bound it
against reserved_va or the host address space.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Ensure that the chosen values for mmap_next_start and
task_unmapped_base are within the guest address space.
Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The man page states:
> Note that older kernels which do not recognize the MAP_FIXED_NOREPLACE
> flag will typically (upon detecting a collision with a preexisting
> mapping) fall back to a “non-MAP_FIXED” type of behavior: they will
> return an address that is different from the requested address.
> Therefore, backward-compatible software should check the returned
> address against the requested address.
https://man7.org/linux/man-pages/man2/mmap.2.html
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230802071754.14876-3-akihiko.odaki@daynix.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Passing MAP_FIXED_NOREPLACE to host will fail for reserved_va because
the address space is reserved with mmap. Replace it with MAP_FIXED
in that case.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230802071754.14876-2-akihiko.odaki@daynix.com>
[rth: Expand inline commentary.]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The trivial length 0 check can be moved up, simplifying some
of the other cases. The end < start test is handled by
guest_range_valid_untagged.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-27-richard.henderson@linaro.org>
Use page_check_range instead, which uses the interval tree
instead of checking each page individually.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-26-richard.henderson@linaro.org>
All of the guest to host page adjustment is handled by
mmap_reserve_or_unmap; there is no need to duplicate that.
There are no failure modes for munmap after alignment and
guest address range have been validated.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-23-richard.henderson@linaro.org>
If !reserved_va, munmap instead and assert success.
Update all callers.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-22-richard.henderson@linaro.org>
Use 'last' variables instead of 'end' variables; be careful
about avoiding overflow. Assert that the mmap succeeded.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-21-richard.henderson@linaro.org>
Complete the transition within the mmap functions to a formulation
that does not overflow at the end of the address space.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230707204054.8792-20-richard.henderson@linaro.org>
Use the interval tree to find empty space, rather than
probing each page in turn.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-19-richard.henderson@linaro.org>
Use 'last' variables instead of 'end' variables.
Always zero MAP_ANONYMOUS fragments, which we previously
failed to do if they were not writable; early exit in case
we allocate a new page from the kernel, known zeros.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-16-richard.henderson@linaro.org>
Use 'last' variables instead of 'end' variables.
When host page size > guest page size, detect when
adjacent host pages have the same protection and
merge that expanded host range into fewer syscalls.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-15-richard.henderson@linaro.org>
We build with _FILE_OFFSET_BITS=64, so off_t = off64_t = uint64_t.
With an extra cast, this fixes emulation of mmap2, which could
overflow the computation of the full value of offset.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-14-richard.henderson@linaro.org>
Split out from validate_prot_to_pageflags, as there is not
one single host_prot for the entire range. We need to adjust
prot for every host page that overlaps multiple guest pages.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-13-richard.henderson@linaro.org>
Fix all checkpatch.pl errors within mmap.c.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230707204054.8792-5-richard.henderson@linaro.org>
There is an overflow problem in mmap_find_vma_reserved:
when reserved_va == UINT32_MAX, end may overflow to 0.
Rather than a larger rewrite at this time, simply avoid
the final byte of the VA, which avoids searching the
final page, which avoids the overflow.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1741
Fixes: 95059f9c ("include/exec: Change reserved_va semantics to last byte")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230629080835.71371-1-richard.henderson@linaro.org>
Change the semantics to be the last byte of the guest va, rather
than the following byte. This avoids some overflow conditions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass the address of the last byte to be changed, rather than
the first address past the last byte. This avoids overflow
when the last page of the address space is involved.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass the address of the last byte to be changed, rather than
the first address past the last byte. This avoids overflow
when the last page of the address space is involved.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1528
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Both parameters have a different value on the parisc platform, so first
translate the target value into a host value for usage in the native
madvise() syscall.
Those parameters are often used by security sensitive applications (e.g.
tor browser, boringssl, ...) which expect the call to return a proper
return code on failure, so return -EINVAL if qemu fails to forward the
syscall to the host OS.
While touching this code, enhance the comments about MADV_DONTNEED.
Tested with testcase of tor browser when running hppa-linux guest on
x86-64 host.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <Y5iwTaydU7i66K/i@p100>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When PAGE_RESET is set, we are replacing pages with new
content, which means that we need to invalidate existing
cached data, such as TranslationBlocks. Perform the
reset invalidate while we're doing other invalidates,
which allows us to remove the separate invalidates from
the user-only mmap/munmap/mprotect routines.
In addition, restrict invalidation to PAGE_EXEC pages.
Since cdf7130851, we have validated PAGE_EXEC is present
before translation, which means we can assume that if the
bit is not present, there are no translations to invalidate.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The hppa platform uses an upwards-growing stack and required in Linux
kernels < 5.18 an executable stack for signal processing. For that some
executables and libraries are marked to have an executable stack, for
which glibc uses the mprotect() syscall to mark the stack like this:
mprotect(xfa000000,4096,PROT_EXEC|PROT_READ|PROT_WRITE|PROT_GROWSUP).
Currently qemu will return -TARGET_EINVAL for this syscall because of the
checks in validate_prot_to_pageflags(), which doesn't allow the
PROT_GROWSUP or PROT_GROWSDOWN flags and thus triggers this error in the
guest:
error while loading shared libraries: libc.so.6: cannot enable executable stack as shared object requires: Invalid argument
Allow mprotect() to handle both flags and thus fix the guest.
The glibc tst-execstack testcase can be used to reproduce the issue.
Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20220924114501.21767-7-deller@gmx.de>
[lvivier: s/elif TARGET_HPPA/elif defined(TARGET_HPPA)/]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This is a follow-up for commit 892a4f6a75 ("linux-user: Add partial
support for MADV_DONTNEED"), which added passthrough for anonymous
mappings. File mappings can be handled in a similar manner.
In order to do that, mark pages, for which mmap() was passed through,
with PAGE_PASSTHROUGH, and then allow madvise() passthrough for these
pages. Drop the explicit PAGE_ANON check, since anonymous mappings are
expected to have PAGE_PASSTHROUGH anyway.
Add PAGE_PASSTHROUGH to PAGE_STICKY in order to keep it on mprotect().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220725125043.43048-1-iii@linux.ibm.com>
Message-Id: <20220906000839.1672934-5-iii@linux.ibm.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
MADV_DONTNEED has a different value on alpha, compared to all the other
architectures. Fix by using TARGET_MADV_DONTNEED instead of
MADV_DONTNEED.
Fixes: 892a4f6a75 ("linux-user: Add partial support for MADV_DONTNEED")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220906000839.1672934-3-iii@linux.ibm.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
On the parisc architecture the stack grows upwards.
Move the TASK_UNMAPPED_BASE to high memory area as it's done by the
kernel on physical machines.
Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20220918194555.83535-9-deller@gmx.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Currently it's possible to execute pages that do not have PAGE_EXEC
if there is an existing translation block. Fix by invalidating TBs
that touch the affected pages.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20220817150506.592862-2-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
aarch64 stores MTE tags in target_date, and they should be reset by
MADV_DONTNEED.
Signed-off-by: Vitaly Buka <vitalybuka@google.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220711220028.2467290-1-vitalybuka@google.com>
[lv: fix code style issues]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
We have extra stuff to log at the same time.
Hoist the qemu_log_lock/unlock to the caller and use fprintf.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220417183019.755276-23-richard.henderson@linaro.org>
Replace the global variables with inlined helper functions. getpagesize() is very
likely annotated with a "const" function attribute (at least with glibc), and thus
optimization should apply even better.
This avoids the need for a constructor initialization too.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu.h is included in various non-linux-user files (which
mostly want the TaskState struct and the functions for
doing usermode access to guest addresses like lock_user(),
unlock_user(), get_user*(), etc).
Split out the parts that are only used in linux-user itself
into a new user-internals.h. This leaves qemu.h with basically
three things:
* the definition of the TaskState struct
* the user-access functions and macros
* do_brk()
all of which are needed by code outside linux-user that
includes qemu.h.
The addition of all the extra #include lines was done with
sed -i '/include.*qemu\.h/a #include "user-internals.h"' $(git grep -l 'include.*qemu\.h' linux-user)
(and then undoing the change to fpa11.h).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210908154405.15417-8-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Split out the mmap prototypes into a new header user-mmap.h
which we only include where required.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210908154405.15417-6-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signal the translator to use host atomic instructions for
guest operations, insofar as it is possible. This is the
best we can do to allow the guest to interact atomically
with other processes.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/121
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210612060828.695332-1-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Remember the PROT_MTE bit as PAGE_MTE/PAGE_TARGET_2.
Otherwise this does not yet have effect.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-25-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The places that use these are better off using untagged
addresses, so do not provide a tagged versions. Rename
to make it clear about the address type.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use g2h_untagged in contexts that have no cpu, e.g. the binary
loaders that operate before the primary cpu is created. As a
colollary, target_mmap and friends must use untagged addresses,
since they are used by the loaders.
Use g2h_untagged on values returned from target_mmap, as the
kernel never applies a tag itself.
Use g2h_untagged on all pc values. The only current user of
tags, aarch64, removes tags from code addresses upon branch,
so "pc" is always untagged.
Use g2h with the cpu context on hand wherever possible.
Use g2h_untagged in lock_user, which will be updated soon.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Record whether the backing page is anonymous, or if it has file
backing. This will allow us to get close to the Linux AArch64
ABI for MTE, which allows tag memory only on ram-backed VMAs.
The real ABI allows tag memory on files, when those files are
on ram-backed filesystems, such as tmpfs. We will not be able
to implement that in QEMU linux-user.
Thankfully, anonymous memory for malloc arenas is the primary
consumer of this feature, so this restricted version should
still be of use.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This data can be allocated by page_alloc_target_data() and
released by page_set_flags(start, end, prot | PAGE_RESET).
This data will be used to hold tag memory for AArch64 MTE.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210212184902.1251044-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If mremap() is called without the MREMAP_MAYMOVE flag with a start address
just before the end of memory (reserved_va) where new_size would exceed
it (and GUEST_ADDR_MAX), the assert(end - 1 <= GUEST_ADDR_MAX) in
page_set_flags() would trigger.
Add an extra guard to the guest_range_valid() checks to prevent this and
avoid asserting binaries when reserved_va is set.
This meant a bug I was seeing locally now gives the same behaviour
regardless of whether reserved_va is set or not.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <70c46e7b999bafbb01d54bfafd44b420d0b782e9.camel@linuxfoundation.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
If mremap succeeds, an additional check is performed to ensure that the
new address range fits into the target address space. This check was
previously perfomed in host address space, with the upper bound fixed to
abi_ulong.
This patch replaces the static check with a call to `guest_range_valid`,
performing the range check against the actual size of the target address
space. It also moves the corresponding block to prevent it from being
called incorrectly when the mapping itself fails.
Signed-off-by: Tobias Koch <tobias.koch@nonterra.com>
Message-Id: <20201028213833.26592-1-tobias.koch@nonterra.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Transform the prot bit to a qemu internal page bit, and save
it in the page tables.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20201021173749.111103-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Executable guest pages are never directly executed by
the host, but do need to be readable for translation.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200519185645.3915-3-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The kernel will return -EINVAL for bits set in the prot argument
that are unknown or invalid. Previously we were simply cropping
out the bits that we care about.
Introduce validate_prot_to_pageflags to perform this check in a
single place between the two syscalls. Differentiate between
the target and host versions of prot. Compute the qemu internal
page_flags value at the same time.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200519185645.3915-2-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Relaxing the restrictions on 64 bit guests leads to the user being
able to attempt to map right at the edge of addressable memory. This
in turn lead to address overflow tripping the assert in page_set_flags
when the end address wrapped around.
Detect the wrap earlier and correctly -ENOMEM the guest (in the
reported case LTP mmap15).
Fixes: 7d8cbbabcb
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reported-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200605154929.26910-15-alex.bennee@linaro.org>
Fixes: https://bugs.launchpad.net/bugs/1876373
This code path in mmap occurs when a page size is decreased with mremap. When a section of pages is shrunk, qemu calls mmap_reserve on the pages that were released. However, it has the diff operation reversed, subtracting the larger old_size from the smaller new_size. Instead, it should be subtracting the smaller new_size from the larger old_size. You can also see in the previous line of the change that this mmap_reserve call only occurs when old_size > new_size.
Bug: https://bugs.launchpad.net/qemu/+bug/1876373
Signed-off-by: Jonathan Marler <johnnymarler@gmail.com>
Reviewded-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200502161225.14346-1-johnnymarler@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>