-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCAAGBQJX+LTGAAoJEHAbT2saaT5ZIBwH+wfho+xxruEjro6qPvSAtdKk
BBsOWBfBoqWfbAbOxxCO8ina2nA7p5XbyzSXUr94nZhvZMB9BkgL6la03gdS0Yr2
jHf0J9mM8fIbMQFsEKGOPcdpvU7VEXeFwridZYzypiRvbNSdWK3SKVBKgz2ADNhb
l4Tos81IZeH/mw8HcU3XgSGSTV4JuKP4XsnmwlFMa8/sWM/X3vVgx5IG26KURZQm
pW720jcX0meSfji5YvhspfbBbp1g2EorTZb6iLcZf+OUIB6XkViMisVasnyOo2HJ
cehPlhAHixwq1kXGItc1fs11VloZ6hvEZ7kZ615jAdsD2sGJObtGDxgyJW3+gPo=
=HPHj
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging
trivial patches for 2016-10-08
# gpg: Signature made Sat 08 Oct 2016 09:56:38 BST
# gpg: using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg: aka "Michael Tokarev <mjt@corpit.ru>"
# gpg: aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5
# Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931 4B22 701B 4F6B 1A69 3E59
* remotes/mjt/tags/trivial-patches-fetch: (26 commits)
net/filter-mirror: Fix mirror initial check typo
virtio: rename the bar index field name in VirtIOPCIProxy
linux-user: include <poll.h> instead of <sys/poll.h>
char: fix missing return in error path for chardev TLS init
CODING_STYLE: Fix a typo ("have" vs. "has")
bitmap: refine and move BITMAP_{FIRST/LAST}_WORD_MASK
build-sys: fix find-in-path
m68k: change default system clock for m5208evb
exec: remove unused compacted argument
usb: ehci: fix memory leak in ehci_process_itd
qapi: make the json schema files more regular.
maint: Add module_block.h to .gitignore
MAINTAINERS: Some updates related to the SH4 machines
MAINTAINERS: Add some more MIPS related files
MAINTAINERS: Add usermode related config files
MAINTAINERS: Add some more pattern to recognize all win32 related files
MAINTAINERS: Add some more rocker related files
MAINTAINERS: Add header files to CRIS section
MAINTAINERS: Add some more files to the virtio section
MAINTAINERS: Add some SPARC machine related files
...
# Conflicts:
# MAINTAINERS
This removes the last usage of <sys/poll.h> in the code base.
Signed-off-by: Felix Janda <felix.janda@posteo.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There is a potential race if several threads exit at once. To serialise
the exits extend the lock above the initial checking of the CPU list.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20160930213106.20186-11-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
EDQUOT is defined for Mips platform in Linux kernel in such a way
that it has different value than on most other platforms. However,
correspondent TARGET_EDQUOT for Mips is missing in Qemu code. Moreover,
TARGET_EDQUOT is missing from the table for conversion of error codes
from host to target. This patch fixes these problems.
Without this patch, syscalls add_key(), keyctl(), link(), mkdir(), mknod(),
open(), rename(), request_key(), setxattr(), symlink(), and write() will not
be able to return the right error code in some scenarios on Mips platform.
(Some of these syscalls are not yet supported in Qemu, but once they are
supported, they will need correct EDQUOT handling.)
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
The function that is changed in this patch is supposed to indicate that
there was certain argument rearrangement related to 64-bit arguments on
32-bit platforms. The background on such rearrangements can be found,
for example, in the man page for syscall(2).
However, for 64-bit Mips architectures there is no such rearrangement,
and this patch reflects it.
Signed-off-by: Aleksandar Rikalo <aleksandar.rikalo@imgtec.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
TARGET_NR_select can have three different implementations:
1- to always return -ENOSYS
microblaze, ppc, ppc64
-> TARGET_WANT_NI_OLD_SELECT
2- to take parameters from a structure pointed by arg1
(kernel sys_old_select)
i386, arm, m68k
-> TARGET_WANT_OLD_SYS_SELECT
3- to take parameters from arg[1-5]
(kernel sys_select)
x86_64, alpha, s390x,
cris, sparc, sparc64
Some (new) architectures don't define NR_select,
4- but only NR__newselect with sys_select:
mips, mips64, sh
5- don't define NR__newselect, and use pselect6 syscall:
aarch64, openrisc, tilegx, unicore32
Reported-by: Timothy Pearson <tpearson@raptorengineering.com>
Reported-by: Allan Wirth <awirth@akamai.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
We currently make no checks on the flags passed to the clone syscall,
which means we will not fail clone attempts which ask for features
that we can't implement. Add sanity checking of the flags to clone
(which we were already doing in the "this is a fork" path, but not
for the "this is a new thread" path), tidy up the checking in
the fork path to match it, and check that the fork case isn't trying
to specify a custom termination signal.
This is helpful in causing some LTP test cases to fail cleanly
rather than behaving bizarrely when we let the clone succeed
but didn't provide the semantics requested by the flags.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The 'nptl_flags' variable in do_fork() is set to a copy of
'flags', and then the CLONE_NPTL_FLAGS are cleared out of 'flags'.
However the only effect of this is that the later check on
"if (flags & CLONE_PARENT_SETTID)" is never true. Since we
will already have done the setting of parent_tidptr in clone_func()
in the child thread, we don't need to do it again.
Delete the dead if() and the clearing of CLONE_NPTL_FLAGS from
'flags', and then use 'flags' where we were previously using
'nptl_flags', so we can delete the unnecessary variable.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Instead of assuming in queue_signal() that all callers are passing
a siginfo structure which uses the _sifields._sigfault part of the
union (and thus a si_type of QEMU_SI_FAULT), make callers pass
the si_type they require in as an argument.
[RV adjusted to apply]
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The shmat() handling needs to do target-specific handling
of the attach address for shmat():
* if the SHM_RND flag is passed, the address is rounded
down to a SHMLBA boundary
* if SHM_RND is not passed, then the call is failed EINVAL
if the address is not a multiple of SHMLBA
Since SHMLBA is target-specific, we need to do this
checking and rounding in QEMU and can't leave it up to the
host syscall.
Allow targets to define TARGET_FORCE_SHMLBA and provide
a target_shmlba() function if appropriate, and update
do_shmat() to honour them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
do_ioctl_dm() should return target errno values, not host ones;
correct an accidental use of a host errno in an error path.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
lock_user() can return NULL, which typically means the syscall
should fail with EFAULT. Add checks in various places where
Coverity spotted that we were missing them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Do an initial range check on the ppoll syscall's nfds argument,
to avoid possible overflow in the calculation of the lock_user()
size argument. The host kernel will later apply the rather lower
limit based on RLIMIT_NOFILE as appropriate.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The kernel checks that the maxevents parameter to epoll_wait
is non-negative and not larger than EP_MAX_EVENTS. Add this
check to our implementation, so that:
* we fail these cases EINVAL rather than EFAULT
* we don't pass negative or overflowing values to the
lock_user() size calculation
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The linux utimensat syscall differs in semantics from the
libc function because the syscall combines the features
of utimensat() and futimens(). Rather than trying to
split these apart in order to call the two libc functions
which then call the same underlying syscall, just always
directly make the host syscall. This fixes bugs in some
of the corner cases which should return errors from the
syscall but which we were incorrectly directing to futimens().
This doesn't reduce the set of hosts that our syscall
implementation will work on, because if the direct syscall
fails ENOSYS then the libc functions would also fail ENOSYS.
(The system call has been in the kernel since 2.6.22 anyway.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The POSIX standard mandates that for a connected socket recvfrom()
must ignore the msg_name and msg_namelen fields. This is awkward
for QEMU because we will attempt to copy them from guest address
space. Handle this by not immediately returning a TARGET_EFAULT
if the copy failed, but instead passing a known-bad address
to the host kernel, which can then return EFAULT or ignore the
value appropriately.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The sendmsg and recvmsg syscalls use a different errno to indicate
an overlarge iovec length from readv and writev. Handle this
special case in do_sendrcvmsg_locked() to avoid getting the
default errno returned by lock_iovec().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In the kernel the length of an iovec is generally handled as
an unsigned long, not an integer; fix the parameter to
lock_iovec() accordingly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In 9c37146782 I've tried to fix a broken build with older
linux-headers. However, I didn't do it properly. The solution
implemented here is to grab the enums that caused the problem
initially, and rename their values so that they are "QEMU_"
prefixed. In order to guarantee matching values with actual
enums from linux-headers, the enums are seeded with starting
values from the original enums.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 75c14d6e8a97c4ff3931d69c13eab7376968d8b4.1471593869.git.mprivozn@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The fix I've made there was wrong. I mean, basically what I did
there was equivalent to:
#if 0
some code;
#endif
This reverts commit 9c37146782.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 40d61349e445c1ad5fef795da704bf7ed6e19c86.1471593869.git.mprivozn@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The llseek syscall takes two 32-bit arguments, offset_high
and offset_low, which must be combined to form a single
64-bit offset. Unfortunately we were combining them with
(uint64_t)arg2 << 32) | arg3
and arg3 is a signed type; this meant that when promoting
arg3 to a 64-bit type it would be sign-extended. The effect
was that if the offset happened to have bit 31 set then
this bit would get sign-extended into all of bits 63..32.
Explicitly cast arg3 to abi_ulong to avoid the erroneous
sign extension.
Reported-by: Chanho Park <parkch98@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Chanho Park <parkch98@gmail.com>
Message-id: 1470938379-1133-1-git-send-email-peter.maydell@linaro.org
In c5dff280 we tried to make us understand netlink messages more.
So we've added a code that does some translation. However, the
code assumed linux-headers to be at least version 4.4 of it
because most of the symbols there (if not all of them) were added
in just that release. This, however, breaks build on systems with
older versions of the package.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-id: 23806aac6db3baf7e2cdab4c62d6e3468ce6b4dc.1471340849.git.mprivozn@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In do_brk(), we were inadvertently truncating the size
of a requested brk() from the guest by putting it into an
'int' variable. This meant that we would incorrectly report
success back to the guest rather than a failed allocation,
typically resulting in the guest then segfaulting. Use
abi_ulong instead.
This fixes a crash in the '31370.cc' test in the gcc libstdc++ test
suite (the test case starts by trying to allocate a very large
size and reduces the size until the allocation succeeds).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The target_semid_ds structure is not correct for all
architectures: the padding fields should only exist for:
* 32-bit ABIs
* x86
It is also misnamed, since it is following the kernel
semid64_ds structure (QEMU doesn't support the legacy
semid_ds structure at all). Rename the struct, provide
a correct generic definition and allow the oddball x86
architecture to provide its own version.
This fixes broken SYSV semaphores for all our 64-bit
architectures except x86 and ppc.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use g_strlcpy() rather than strcpy() to copy the uname string
into the structure we return to the guest for the uname syscall.
This avoids overrunning the buffer if the user passed us an
overlong string via the QEMU command line.
We fix a comment typo while we're in the neighbourhood.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In open_self_cmdline() we look for a 0 in the buffer we read
from /prc/self/cmdline. We were incorrectly passing the length
of our buf[] array to memchr() as the length to search, rather
than the number of bytes we actually read into it, which could
be shorter. This was spotted by Coverity (because it could
result in our trying to pass a negative length argument to
write()).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
If userspace specifies a short buffer for a target sockaddr,
the kernel will only copy in as much as it has space for
(or none at all if the length is zero) -- see the kernel
move_addr_to_user() function. Mimic this in QEMU's
host_to_target_sockaddr() routine.
In particular, this fixes a segfault running the LTP
recvfrom01 test, where the guest makes a recvfrom()
call with a bad buffer pointer and other parameters which
cause the kernel to set the addrlen to zero; because we
did not skip the attempt to swap the sa_family field we
segfaulted on the bad address.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Many syscalls which take a sigset_t argument also take an argument
giving the size of the sigset_t. The kernel insists that this
matches its idea of the type size and fails EINVAL if it is not.
Implement this logic in QEMU. (This mostly just means some LTP test
cases which check error cases now pass.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Nested types are used by the kernel to send link information and
protocol properties.
We can see following errors with "ip link show":
Unimplemented nested type 26
Unimplemented nested type 26
Unimplemented nested type 18
Unimplemented nested type 26
Unimplemented nested type 18
Unimplemented nested type 26
This patch implements nested types 18 (IFLA_LINKINFO) and
26 (IFLA_AF_SPEC).
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
As we convert sockaddr for AF_PACKET family for sendto() (target to
host) we need also to convert this for getsockname() (host to target).
arping uses getsockname() to get the the interface address and uses
this address with sendto().
Tested with:
/sbin/arping -D -q -c2 -I eno1 192.168.122.88
...
getsockname(3, {sa_family=AF_PACKET, proto=0x806, if2,
pkttype=PACKET_HOST, addr(6)={1, 10c37b6b9a76}, [18]) = 0
...
sendto(3, "..." 28, 0,
{sa_family=AF_PACKET, proto=0x806, if2, pkttype=PACKET_HOST,
addr(6)={1, ffffffffffff}, 20) = 28
...
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Netlink is byte-swapping data in the guest memory (it's bad).
It's ok when the data come from the host as they are generated by the
host.
But it doesn't work when data come from the guest: the guest can
try to reuse these data whereas they have been byte-swapped.
This is what happens in glibc:
glibc generates a sequence number in nlh.nlmsg_seq and calls
sendto() with this nlh. In sendto(), we byte-swap nlmsg.seq.
Later, after the recvmsg(), glibc compares nlh.nlmsg_seq with
sequence number given in return, and of course it fails (hangs),
because nlh.nlmsg_seq is not valid anymore.
The involved code in glibc is:
sysdeps/unix/sysv/linux/check_pf.c:make_request()
...
req.nlh.nlmsg_seq = time (NULL);
...
if (TEMP_FAILURE_RETRY (__sendto (fd, (void *) &req, sizeof (req), 0,
(struct sockaddr *) &nladdr,
sizeof (nladdr))) < 0)
<here req.nlh.nlmsg_seq has been byte-swapped>
...
do
{
...
ssize_t read_len = TEMP_FAILURE_RETRY (__recvmsg (fd, &msg, 0));
...
struct nlmsghdr *nlmh;
for (nlmh = (struct nlmsghdr *) buf;
NLMSG_OK (nlmh, (size_t) read_len);
nlmh = (struct nlmsghdr *) NLMSG_NEXT (nlmh, read_len))
{
<we compare nlmh->nlmsg_seq with corrupted req.nlh.nlmsg_seq>
if (nladdr.nl_pid != 0 || (pid_t) nlmh->nlmsg_pid != pid
|| nlmh->nlmsg_seq != req.nlh.nlmsg_seq)
continue;
...
else if (nlmh->nlmsg_type == NLMSG_DONE)
/* We found the end, leave the loop. */
done = true;
}
}
while (! done);
As we have a continue on "nlmh->nlmsg_seq != req.nlh.nlmsg_seq",
"done" cannot be set to "true" and we have an infinite loop.
It's why commands like "apt-get update" or "dnf update hangs".
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
fd_trans_target_to_host_data() and fd_trans_host_to_target_data() must
return the length of processed data.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Older kernels don't have F_SETPIPE_SZ and F_GETPIPE_SZ (in
particular RHEL6's system headers don't define these). Add
ifdefs so that we can gracefully fall back to not supporting
those guest ioctls rather than failing to build.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 1467304429-21470-1-git-send-email-peter.maydell@linaro.org
Adds two events to trace syscalls in syscall emulation mode (*-user):
* guest_user_syscall: Emitted before the syscall is emulated; contains
the syscall number and arguments.
* guest_user_syscall_ret: Emitted after the syscall is emulated;
contains the syscall number and return value.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 146651712411.12388.10024905980452504938.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the structure pointed by NLMSG_DATA() is bigger
than the size of NLMSG_DATA(), don't swap its fields
to avoid memory corruption.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
if we process the whole buffer, the netlink helpers can try
to swap invalid data.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Support the F_GETPIPE_SZ and F_SETPIPE_SZ fcntl operations.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The third argument to the rt_sigqueueinfo syscall is a pointer to
a siginfo_t, not a pointer to a sigset_t. Fix the error in the
arguments to lock_user(), which meant that we would not have
detected some faults that we should.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The kernel and libc have different ideas about what a sigset_t
is -- for the kernel it is only _NSIG / 8 bytes in size (usually
8 bytes), but for libc it is much larger, 128 bytes. In most
situations the difference doesn't matter, because if you pass a
pointer to a libc sigset_t to the kernel it just acts on the first
8 bytes of it, but for the ucontext_t* argument to a signal handler
it trips us up. The kernel allocates this ucontext_t on the stack
according to its idea of the sigset_t type, but the type of the
ucontext_t defined by the libc headers uses the libc type, and
so do the manipulator functions like sigfillset(). This means that
(1) sizeof(uc->uc_sigmask) is much larger than the actual
space used on the stack
(2) sigfillset(&uc->uc_sigmask) will write garbage 0xff bytes
off the end of the structure, which can trash data that
was on the stack before the signal handler was invoked,
and may result in a crash after the handler returns
To avoid this, we use a memset() of the correct size to fill
the signal mask rather than using the libc function.
This fixes a problem where we would crash at least some of the
time on an i386 host when a signal was taken.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for fcntl. This is straightforward now
that we always use 'struct fcntl64' on the host, as we don't need
to select whether to call the host's fcntl64 or fcntl syscall
(a detail that the libc previously hid for us).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the __get_user() and __put_user() to handle reading and writing the
guest structures in do_ioctl(). This has two benefits:
* avoids possible errors due to misaligned guest pointers
* correctly sign extends signed fields (like l_start in struct flock)
which might be different sizes between guest and host
To do this we abstract out into copy_from/to_user functions. We
also standardize on always using host flock64 and the F_GETLK64
etc flock commands, as this means we always have 64 bit offsets
whether the host is 64-bit or 32-bit and we don't need to support
conversion to both host struct flock and struct flock64.
In passing we fix errors in converting l_type from the host to
the target (where we were doing a byteswap of the host value
before trying to do the convert-bitmasks operation rather than
otherwise, and inexplicably shifting left by 1); these were
accidentally left over when the original simple "just shift by 1"
arm<->x86 conversion of commit 43f238d was changed to the more
general scheme of using target_to_host_bitmask() functions in 2ba7f73.
[RV: fixed ifdef guard for eabi functions]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check
is bogus without a previous inclusion of sys/mman.h. Include it in
sysemu/os-posix.h and remove it from everywhere else.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAV1gdMrRIkN7ePJvAAQhLcg/+Kby99taEuewItrA1yDs75jxOlLqaJopd
cVzo4LFRFPhIn4UEKqRQS0CGoIeU/DYOmObvuUzJxs2LyUoHoqmQOwEm5obC2a85
JrHo/NOppYBbyvvIEAAXzZDCZo0KZKVclrlT+AX5obpOSNSvAnKvEuLWq1aQ9WGN
n4AzHuFEl885cd4nFd8VK/xth89bqz6U/z8CjgIuw3mczp1XNrK5IJJwAy5epHay
GCBr9XHooW3SU971WS20RTRS0D33tKPHgCU3ZeZ3rKh4g3JNj6/ixdVgzi9NqFsQ
5DzAj/iBGhN1LtCOednRS6tUt32Bhy8G/g4O3GiXdejagAmNe2wz31cveNJ8S3W5
DK8SZAnJlz06zN5uIpOVQgDOqfXZkCp7ndq779QJoHOAnuOjJBcUbhw1myz2R3eR
6208tStWl3R0+ATEK8CZ7ejg1cUHvdzyqGJA+1nC2HaFUrBWipxN8jf2fz9vO/wG
G7zNbahvVgyJWO7bPNK4mxkb6qkWCETnCnLJsq2ZbmtPEMcINjD8vNWLNvFGVG8b
2HbinDrzh0Z9Zik5gLZfiVyP5HFaWSrJn9QRVIgaUjuIH9n3/25sl9OvW/JLjxJ+
h2P17CLnAK6dhUYc4R3wQTx2X/N2FvO4DD8iMYOcgDY6fhZ2b6EEyE9yBgQrIDbF
gU1AlC/CX+M=
=AXqa
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160608' into staging
linux-user pull request for June 2016
# gpg: Signature made Wed 08 Jun 2016 14:27:14 BST
# gpg: using RSA key 0xB44890DEDE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg: aka "Riku Voipio <riku.voipio@linaro.org>"
* remotes/riku/tags/pull-linux-user-20160608: (44 commits)
linux-user: In fork_end(), remove correct CPUs from CPU list
linux-user: Special-case ERESTARTSYS in target_strerror()
linux-user: Make target_strerror() return 'const char *'
linux-user: Correct signedness of target_flock l_start and l_len fields
linux-user: Use safe_syscall wrapper for ioctl
linux-user: Use safe_syscall wrapper for accept and accept4 syscalls
linux-user: Use safe_syscall wrapper for semop
linux-user: Use safe_syscall wrapper for epoll_wait syscalls
linux-user: Use safe_syscall wrapper for poll and ppoll syscalls
linux-user: Use safe_syscall wrapper for sleep syscalls
linux-user: Use safe_syscall wrapper for rt_sigtimedwait syscall
linux-user: Use safe_syscall wrapper for flock
linux-user: Use safe_syscall wrapper for mq_timedsend and mq_timedreceive
linux-user: Use safe_syscall wrapper for msgsnd and msgrcv
linux-user: Use safe_syscall wrapper for send* and recv* syscalls
linux-user: Use safe_syscall wrapper for connect syscall
linux-user: Use safe_syscall wrapper for readv and writev syscalls
linux-user: Fix error conversion in 64-bit fadvise syscall
linux-user: Fix NR_fadvise64 and NR_fadvise64_64 for 32-bit guests
linux-user: Fix handling of arm_fadvise64_64 syscall
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Conflicts:
configure
scripts/qemu-binfmt-conf.sh
Since TARGET_ERESTARTSYS and TARGET_ESIGRETURN are internal-to-QEMU
error numbers, handle them specially in target_strerror(), to avoid
confusing strace output like:
9521 rt_sigreturn(14,8,274886297808,8,0,268435456) = -1 errno=513 (Unknown error 513)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Make target_strerror() return 'const char *' rather than just 'char *';
this will allow us to return constant strings from it for some special
cases.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Use the safe_syscall wrapper to implement the ioctl syscall.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the accept and accept4 syscalls.
accept4 has been in the kernel since 2.6.28 so we can assume it
is always present.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the semop syscall or IPC operation.
(We implement via the semtimedop syscall to make it easier to
implement the guest semtimedop syscall later.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for epoll_wait and epoll_pwait syscalls.
Since we now directly use the host epoll_pwait syscall for both
epoll_wait and epoll_pwait, we don't need the configure machinery
to check whether glibc supports epoll_pwait(). (The kernel has
supported the syscall since 2.6.19 so we can assume it's always there.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the poll and ppoll syscalls.
Since not all host architectures will have a poll syscall, we
have to rewrite the TARGET_NR_poll handling to use ppoll instead
(we can assume everywhere has ppoll by now).
We take the opportunity to switch to the code structure
already used in the implementation of epoll_wait and epoll_pwait,
which uses a switch() to avoid interleaving #if and if (),
and to stop using a variable with a leading '_' which is in
the implementation's namespace.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the clock_nanosleep and nanosleep
syscalls.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the rt_sigtimedwait syscall.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the flock syscall.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for mq_timedsend and mq_timedreceive syscalls.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for msgsnd and msgrcv syscalls.
This is made slightly awkward by some host architectures providing
only a single 'ipc' syscall rather than separate syscalls per
operation; we provide safe_msgsnd() and safe_msgrcv() as wrappers
around safe_ipc() to handle this if needed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the send, sendto, sendmsg, recv,
recvfrom and recvmsg syscalls.
RV: adjusted to apply
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the connect syscall.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for readv and writev syscalls.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Fix errors in the implementation of NR_fadvise64 and NR_fadvise64_64
for 32-bit guests, which pass their off_t values in register pairs.
We can't use the 64-bit code path for this, so split out the 32-bit
cases, so that we can correctly handle the "only offset is 64-bit"
and "both offset and length are 64-bit" syscall flavours, and
"uses aligned register pairs" and "does not" flavours of target.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
32-bit ARM has an odd variant of the fadvise syscall which has
rearranged arguments, which we try to implement. Unfortunately we got
the rearrangement wrong.
This is a six-argument syscall whose arguments are:
* fd
* advise parameter
* offset high half
* offset low half
* len high half
* len low half
Stop trying to share code with the standard fadvise syscalls,
and just implement the syscall with the correct argument order.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).
This patch is the result of coccinelle script
scripts/coccinelle/round.cocci
CC: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If there is a signal pending during fork() the signal handler will
erroneously be called in both the parent and child, so handle any
pending signals first.
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-20-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the kill, tkill and tgkill syscalls.
Without this, if a thread sent a SIGKILL to itself it could kill the
thread before we had a chance to process a signal that arrived just
before the SIGKILL, and that signal would get lost.
We drop all the ifdeffery for tkill and tgkill, because every guest
architecture we support implements them, and they've been in Linux
since 2003 so we can assume the host headers define the __NR_tkill
and __NR_tgkill constants.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Without this a signal could vanish on thread exit.
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-26-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Fix races between signal handling and the pause syscall by
reimplementing it using block_signals() and sigsuspend().
(Using safe_syscall(pause) would also work, except that the
pause syscall doesn't exist on all architectures.)
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-28-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: tweaked commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
If multiple host signals are received in quick succession they would
be queued in TaskState then delivered to the guest in spite of
signals being supposed to be blocked by the guest signal handler's
sa_mask. Fix this by decoupling the guest signal mask from the
host signal mask, so we can have protected sections where all
host signals are blocked. In particular we block signals from
when host_signal_handler() queues a signal from the guest until
process_pending_signals() has unqueued it. We also block signals
while we are manipulating the guest signal mask in emulation of
sigprocmask and similar syscalls.
Blocking host signals also ensures the correct behaviour with respect
to multiple threads and the overrun count of timer related signals.
Alas blocking and queuing in qemu is still needed because of virtual
processor exceptions, SIGSEGV and SIGBUS.
Blocking signals inside process_pending_signals() protects against
concurrency problems that would otherwise happen if host_signal_handler()
ran and accessed the signal data structures while process_pending_signals()
was manipulating them.
Since we now track the guest signal mask separately from that
of the host, the sigsuspend system calls must track the signal
mask passed to them, because when we process signals as we leave
the sigsuspend the guest signal mask in force is that passed to
sigsuspend.
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-19-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: make signal_pending a simple flag rather than a word with two flag bits;
ensure we don't call block_signals() twice in sigreturn codepaths;
document and assert() the guarantee that using do_sigprocmask() to
get the current mask never fails; use the qemu atomics.h functions
rather than raw volatile variable access; add extra commentary and
documentation; block SIGSEGV/SIGBUS in block_signals() and in
process_pending_signals() because they can't occur synchronously here;
check the right do_sigprocmask() call for errors in ssetmask syscall;
expand commit message; fixed sigsuspend() hanging]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for sigsuspend syscalls. This
means that we will definitely deliver a signal that arrives
before we do the sigsuspend call, rather than blocking first
and delivering afterwards.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Some host syscalls take an argument specifying the size of a
host kernel's sigset_t (which isn't necessarily the same as
that of the host libc's type of that name). Instead of hardcoding
_NSIG / 8 where we do this, define and use a SIGSET_T_SIZE macro.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Some IFLA_* symbols can be missing in the host linux/if_link.h,
but as they are enums and not "#defines", check in "configure" if
last known (IFLA_PROTO_DOWN) is available and if not, disable
management of NETLINK_ROUTE protocol.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This is, for instance, needed to log in a container.
Without this, the user cannot be identified and the console login
fails with "Login incorrect".
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This is the protocol used by udevd to manage kernel events.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
rtnetlink is needed to use iproute package (ip addr, ip route)
and dhcp client.
Examples:
Without this patch:
# ip link
Cannot open netlink socket: Address family not supported by protocol
# ip addr
Cannot open netlink socket: Address family not supported by protocol
# ip route
Cannot open netlink socket: Address family not supported by protocol
# dhclient eth0
Cannot open netlink socket: Address family not supported by protocol
Cannot open netlink socket: Address family not supported by protocol
With this patch:
# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
51: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT qlen 1000
link/ether 00:16:3e:89:6b:d7 brd ff:ff:ff:ff:ff:ff
# ip addr show eth0
51: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP qlen 1000
link/ether 00:16:3e:89:6b:d7 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.197/24 brd 192.168.122.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::216:3eff:fe89:6bd7/64 scope link
valid_lft forever preferred_lft forever
# ip route
default via 192.168.122.1 dev eth0
192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.197
# ip addr flush eth0
# ip addr add 192.168.122.10 dev eth0
# ip addr show eth0
51: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP qlen 1000
link/ether 00:16:3e:89:6b:d7 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.10/32 scope global eth0
valid_lft forever preferred_lft forever
# ip route add 192.168.122.0/24 via 192.168.122.10
# ip route
192.168.122.0/24 via 192.168.122.10 dev eth0
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
On Linux the setuid(), setgid(), etc system calls have different semantics
from the libc functions. The libc functions follow POSIX and update the
credentials for all threads in the process; the system calls update only
the thread which makes the call. (This impedance mismatch is worked around
in libc by signalling all threads to tell them to do a syscall, in a
byzantine and fragile way; see http://ewontfix.com/17/.)
Since in linux-user we are trying to emulate the system call semantics,
we must implement all these syscalls to directly call the underlying
host syscall, rather than calling the host libc function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In do_msgrcv() we want to allocate a message buffer, whose size
is passed to us by the guest. That means we could legitimately
fail, so use g_try_malloc() and handle the error case, in the same
way that do_msgsnd() does.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The msgrcv ABI is a bit odd -- the msgsz argument is a size_t, which is
unsigned, but it must fail EINVAL if the value is negative when cast
to a long. We were incorrectly passing the value through an
"unsigned int", which meant that if the guest was 32-bit longs and
the host was 64-bit longs an input of 0xffffffff (which should trigger
EINVAL) would simply be passed to the host msgrcv() as 0xffffffff,
where it does not cause the host kernel to reject it.
Follow the same approach as do_msgsnd() in using a ssize_t and
doing the check for negative values by hand, so we correctly fail
in this corner case.
This fixes the msgrcv03 Linux Test Project test case, which otherwise
hangs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In a struct timespec, both fields are signed longs. Converting
them from guest to host with code like
host_ts->tv_sec = tswapal(target_ts->tv_sec);
mishandles negative values if the guest has 32-bit longs and
the host has 64-bit longs because tswapal()'s return type is
abi_ulong: the assignment will zero-extend into the host long
type rather than sign-extending it.
Make the conversion routines use __get_user() and __set_user()
instead: this automatically picks up the signedness of the
field type and does the correct kind of sign or zero extension.
It also handles the possibility that the target struct is not
sufficiently aligned for the host's requirements.
In particular, this fixes a hang when running the Linux Test Project
mq_timedsend01 and mq_timedreceive01 tests: one of the test cases
sets the timeout to -1 and expects an EINVAL failure, but we were
setting a very long timeout instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the futex syscall.
In particular, this fixes hangs when using programs that link
against the Boehm garbage collector, including the Mono runtime.
(We don't change the sys_futex() call in the implementation of
the exit syscall, because as the FIXME comment there notes
that should be handled by disabling signals, since we can't
easily back out if the futex were to return ERESTARTSYS.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the safe_syscall wrapper for the pselect and select syscalls.
Since not every architecture has the select syscall, we now
have to implement select in terms of pselect, which means doing
timeval<->timespec conversion.
(Five years on from the initial patch that added pselect support
to QEMU and a decade after pselect6 went into the kernel, it seems
safe to not try to support hosts with header files which don't
define __NR_pselect6.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Wrap execve() in the safe-syscall handling. Although execve() is not
an interruptible syscall, it is a special case: if we allow a signal
to happen before we make the host$ syscall then we will 'lose' it,
because at the point of execve the process leaves QEMU's control. So
we use the safe syscall wrapper to ensure that we either take the
signal as a guest signal, or else it does not happen before the
execve completes and makes it the other program's problem.
The practical upshot is that without this SIGTERM could fail to
terminate the process.
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-25-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: expanded commit message to explain in more detail why this is
needed, and add comment about it too]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use safe_syscall for waitpid, waitid and wait4 syscalls. Note that this
change allows us to implement support for waitid's fifth (rusage) argument
in future; for the moment we ignore it as we have done up til now.
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-18-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Adjust to new safe_syscall convention. Add fifth waitid syscall argument
(which isn't present in the libc interface but is in the syscall ABI)]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Restart open() and openat() if signals occur before,
or during with SA_RESTART.
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-17-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Adjusted to follow new -1-and-set-errno safe_syscall convention]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Restart read() and write() if signals occur before, or during with SA_RESTART
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-15-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Update to new safe_syscall() convention of setting errno]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
If a signal is delivered immediately before a blocking system call the
handler will only be called after the system call returns, which may be a
long time later or never.
This is fixed by using a function (safe_syscall) that checks if a guest
signal is pending prior to making a system call, and if so does not call the
system call and returns -TARGET_ERESTARTSYS. If a signal is received between
the check and the system call host_signal_handler() rewinds execution to
before the check. This rewinding has the effect of closing the race window
so that safe_syscall will reliably either (a) go into the host syscall
with no unprocessed guest signals pending or or (b) return
-TARGET_ERESTARTSYS so that the caller can deal with the signals.
Implementing this requires a per-host-architecture assembly language
fragment.
This will also resolve the mishandling of the SA_RESTART flag where
we would restart a host system call and not call the guest signal handler
until the syscall finally completed -- syscall restarting now always
happens at the guest syscall level so the guest signal handler will run.
(The host syscall will never be restarted because if the host kernel
rewinds the PC to point at the syscall insn for a restart then our
host_signal_handler() will see this and arrange the guest PC rewind.)
This commit contains the infrastructure for implementing safe_syscall
and the assembly language fragment for x86-64, but does not change any
syscalls to use it.
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-14-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM:
* Avoid having an architecture if-ladder in configure by putting
linux-user/host/$(ARCH) on the include path and including
safe-syscall.inc.S from it
* Avoid ifdef ladder in signal.c by creating new hostdep.h to hold
host-architecture-specific things
* Added copyright/license header to safe-syscall.inc.S
* Rewrote commit message
* Added comments to safe-syscall.inc.S
* Changed calling convention of safe_syscall() to match syscall()
(returns -1 and host error in errno on failure)
* Added a long comment in qemu.h about how to use safe_syscall()
to implement guest syscalls.
]
RV: squashed Peters "fixup! linux-user: compile on non-x86-64 hosts"
patch
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If DEBUG_ERESTARTSYS is set restart all system calls once. This
is pure debug code for exercising the syscall restart code paths
in the per-architecture cpu main loops.
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-10-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Add comment and a commented-out #define next to the commented-out
generic DEBUG #define; remove the check on TARGET_USE_ERESTARTSYS;
tweak comment message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Update the x86 main loop and sigreturn code:
* on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
* set all guest CPU state within signal.c code rather than passing it
back out as the "return code" from do_sigreturn()
* handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
that the main loop should not touch EAX
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-5-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: Commit message tweaks; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The function do_openat() is not consistent about whether it is
returning a host errno or a guest errno in case of failure.
Standardise on returning -1 with errno set (ie caller has
to call get_errno()).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)
Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
getrandom() has been introduced in kernel 3.17 and is now used during
the boot sequence of Debian unstable (stretch/sid).
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
QEMU lists deprecated system call numbers in for Aarch64. These
are never enabled for Linux kernel, so don't define them in Qemu
either. Remove the ifdef around host_to_target_stat64 since
all architectures need it now.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Our implementation of shmat() and shmdt() for linux-user was
using "zero guest address" as its marker for "entry in the
shm_regions[] array is not in use". This meant that if the
guest did a shmdt(0) we would match on an unused array entry
and call page_set_flags() with both start and end addresses zero,
which causes an assertion failure.
Use an explicit in_use flag to manage the shm_regions[] array,
so that we avoid this problem.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Pavel Shamis <pasharesearch@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
target_fd_trans is an array of "TargetFdTrans *": compute size
accordingly. Use g_renew() as proposed by Paolo.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-10-git-send-email-peter.maydell@linaro.org
Adds the definitions for the socket calls SOCKOP_sendmmsg
and SOCKOP_recvmmsg and wires them up with the rest of the code.
The necessary function do_sendrecvmmsg() is already present in
linux-user/syscall.c. After adding these two definitions and wiring
them up, I no longer receive an error message about the
unimplemented socket calls when running "apt-get update" on Debian
unstable running on qemu with glibc_2.21 on m68k.
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In this case, level is TARGET_SOL_SOCKET, but we need SOL_SOCKET for
setsockopt().
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
There is no reason to limit sigaltstack syscall to just a few
architectures and pretend it is not implemented for others.
If some architecture is not ready for this, that architecture
should be fixed instead.
This fixes LP#1516408.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This is obsolete, but if we want to use dhcp with an old distro (like debian
etch), we need it. Some users (like dhclient) use SOCK_PACKET with AF_PACKET
and the kernel allows that.
packet(7)
In Linux 2.0, the only way to get a packet socket was by calling
socket(AF_INET, SOCK_PACKET, protocol). This is still supported but
strongly deprecated. The main difference between the two methods is
that SOCK_PACKET uses the old struct sockaddr_pkt to specify an inter‐
face, which doesn't provide physical layer independence.
struct sockaddr_pkt {
unsigned short spkt_family;
unsigned char spkt_device[14];
unsigned short spkt_protocol;
};
spkt_family contains the device type, spkt_protocol is the IEEE 802.3
protocol type as defined in <sys/if_ether.h> and spkt_device is the
device name as a null-terminated string, for example, eth0.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
in PACKET(7) :
packet_socket = socket(AF_PACKET, int socket_type, int protocol);
[...]
protocol is the IEEE 802.3 protocol
number in network order. See the <linux/if_ether.h> include file for a
list of allowed protocols. When protocol is set to htons(ETH_P_ALL)
then all protocols are received. All incoming packets of that protocol
type will be passed to the packet socket before they are passed to the
protocols implemented in the kernel.
[...]
Compatibility
In Linux 2.0, the only way to get a packet socket was by calling
socket(AF_INET, SOCK_PACKET, protocol).
We need to tswap16() the protocol because on big-endian, the ABI is
waiting for, for instance for ETH_P_ALL, 0x0003 (big endian ==
network order), whereas on little-endian it is waiting for 0x0300.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Convert malloc()/ calloc() calls to g_malloc()/ g_try_malloc()/ g_new0()
All heap memory allocation should go through glib so that we can take
advantage of a single memory allocator and its debugging/tracing features.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Harmandeep Kaur <write.harmandeep@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This patch introduces a system very similar to the one used in the kernel
to attach specific functions to a given file descriptor.
In this case, we attach a specific "host_to_target()" translator to the fd
returned by signalfd() to be able to byte-swap the signalfd_siginfo
structure provided by read().
This patch allows to execute the example program given by
man signalfd(2):
#include <sys/signalfd.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
int
main(int argc, char *argv[])
{
sigset_t mask;
int sfd;
struct signalfd_siginfo fdsi;
ssize_t s;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
sigaddset(&mask, SIGQUIT);
/* Block signals so that they aren't handled
according to their default dispositions */
if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
handle_error("sigprocmask");
sfd = signalfd(-1, &mask, 0);
if (sfd == -1)
handle_error("signalfd");
for (;;) {
s = read(sfd, &fdsi, sizeof(struct signalfd_siginfo));
if (s != sizeof(struct signalfd_siginfo))
handle_error("read");
if (fdsi.ssi_signo == SIGINT) {
printf("Got SIGINT\n");
} else if (fdsi.ssi_signo == SIGQUIT) {
printf("Got SIGQUIT\n");
exit(EXIT_SUCCESS);
} else {
printf("Read unexpected signal\n");
}
}
}
$ ./signalfd_demo
^CGot SIGINT
^CGot SIGINT
^\Got SIGQUIT
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
No need to use g_malloc0 to zero the memory if we memcpy to
the whole buffer afterwards anyway. Actually, there is even
a function which combines both steps, g_memdup, so let's use
this function here instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Casting to a union type is a gcc (and clang) extension. Other compilers
might not support it. This is not a problem today, but the type casts
can be removed easily. Smatch now no longer complains like before:
linux-user/syscall.c:3190:18: warning: cast to non-scalar
linux-user/syscall.c:7348:44: warning: cast to non-scalar
Cc: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
This commit only touches allocations with size arguments of the form
sizeof(T). Same Coccinelle semantic patch as in commit b45c03f.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Under Alpha host, EAGAIN is redefined to 35, so it need be remapped too.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This patch allows to run example given by open_by_handle_at(2):
The following shell session demonstrates the use of these two programs:
$ echo 'Can you please think about it?' > cecilia.txt
$ ./t_name_to_handle_at cecilia.txt > fh
$ ./t_open_by_handle_at < fh
open_by_handle_at: Operation not permitted
$ sudo ./t_open_by_handle_at < fh # Need CAP_SYS_ADMIN
Read 31 bytes
$ rm cecilia.txt
Now we delete and (quickly) re-create the file so that it has the same
content and (by chance) the same inode.[...]
$ stat --printf="%i\n" cecilia.txt # Display inode number
4072121
$ rm cecilia.txt
$ echo 'Can you please think about it?' > cecilia.txt
$ stat --printf="%i\n" cecilia.txt # Check inode number
4072121
$ sudo ./t_open_by_handle_at < fh
open_by_handle_at: Stale NFS file handle
See the man page for source code.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Whilst calls to do_fork() are wrapped in get_errno() this does not
translate return values.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Currently, __target_cmsg_nxthdr compares a pointer derived from
target_cmsg against the msg_control field of target_msgh (through
subtraction). This failed for me when emulating i386 code under x86_64,
because pointers in the host address space and pointers in the guest
address space were not the same. This patch passes the initial value of
target_cmsg into __target_cmsg_nxthdr.
I found and fixed two more related bugs:
- __target_cmsg_nxthdr now returns the new cmsg pointer instead of the
old one.
- tgt_space (in host_to_target_cmsg) doesn't count "sizeof (struct
target_cmsghdr)" twice anymore.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Instead of creating a temporary copy for the whole environment and
the arguments, directly copy everything to the target stack.
For this to work, we have to change the order of stack creation and
copying the arguments.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Some of architectures (e.g. tilegx), several syscall macros are not
supported, so switch them.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <BLU436-SMTP457D6FC9B2B9BA87AEB22CB9660@phx.gbl>
Signed-off-by: Richard Henderson <rth@twiddle.net>
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJV8Tk7AAoJEL/70l94x66Ds3QH/3bi0RRR2NtKIXAQrGo5tfuD
NPMu1K5Hy+/26AC6mEVNRh4kh7dPH5E4NnDGbxet1+osvmpjxAjc2JrxEybhHD0j
fkpzqynuBN6cA2Gu5GUNoKzxxTmi2RrEYigWDZqCftRXBeO2Hsr1etxJh9UoZw5H
dgpU3j/n0Q8s08jUJ1o789knZI/ckwL4oXK4u2KhSC7ZTCWhJT7Qr7c0JmiKReaF
JEYAsKkQhICVKRVmC8NxML8U58O8maBjQ62UN6nQpVaQd0Yo/6cstFTZsRrHMHL3
7A2Tyg862cMvp+1DOX3Bk02yXA+nxnzLF8kUe0rYo6llqDBDStzqyn1j9R0qeqA=
=nB06
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Support for jemalloc
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
# gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream: (44 commits)
cutils: work around platform differences in strto{l,ul,ll,ull}
cpu-exec: fix lock hierarchy for user-mode emulation
exec: make mmap_lock/mmap_unlock globally available
tcg: comment on which functions have to be called with mmap_lock held
tcg: add memory barriers in page_find_alloc accesses
remove unused spinlock.
replace spinlock by QemuMutex.
cpus: remove tcg_halt_cond and tcg_cpu_thread globals
cpus: protect work list with work_mutex
scripts/dump-guest-memory.py: fix after RAMBlock change
configure: Add support for jemalloc
add macro file for coccinelle
configure: factor out adding disas configure
vhost-scsi: fix wrong vhost-scsi firmware path
checkpatch: remove tests that are not relevant outside the kernel
checkpatch: adapt some tests to QEMU
CODING_STYLE: update mixed declaration rules
qmp: Add example usage of strto*l() qemu wrapper
cutils: Add qemu_strtoull() wrapper
cutils: Add qemu_strtoll() wrapper
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When executing a 64bit target chroot on 64bit host,
the ioctl() command can mismatch.
It seems the previous commit doesn't solve the problem in
my case:
9c6bf9c7 linux-user: Fix ioctl cmd type mismatch on 64-bit targets
For example, a ppc64 chroot on an x86_64 host:
bash-4.3# ls
Unsupported ioctl: cmd=0x80087467
Unsupported ioctl: cmd=0x802c7415
The origin of the problem is in syscall.c:do_ioctl().
static abi_long do_ioctl(int fd, abi_long cmd, abi_long arg)
In this case (ppc64) abi_long is long (on the x86_64), and
cmd = 0x0000000080087467
then
if (ie->target_cmd == cmd)
target_cmd is int, so target_cmd = 0x80087467
and to compare an int with a long, the sign is extended to 64bit,
so the comparison is:
if (0xffffffff80087467 == 0x0000000080087467)
which doesn't match whereas it should.
This patch uses int in the case of the target command type
instead of abi_long.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The target payloads in cmsg conversions may not have the alignment
required by the host. Using the get_user and put_user functions is
the easiest way to handle this and also do the byte-swapping we
require.
(Note that prior to this commit target_to_host_cmsg was incorrectly
using __put_user() rather than __get_user() for the SCM_CREDENTIALS
conversion, which meant it wasn't getting the benefit of the
misalignment handling.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The previous code for handling payload length when converting
cmsg structures from host to target had a number of problems:
* we required the msg->msg_controllen to declare the buffer
to have enough space for final trailing padding (we were
checking against CMSG_SPACE), whereas the kernel does not
require this, and common userspace code assumes this. (In
particular, glibc's "try to talk to nscd" code that it will
run on startup will receive a cmsg with a 4 byte payload and
only allocate 4 bytes for it, which was causing us to do
the wrong thing on architectures that need 8-alignment.)
* we weren't correctly handling the fact that the SO_TIMESTAMP
payload may be larger for the target than the host
* we weren't marking the messages with MSG_CTRUNC when we did
need to truncate a message that wasn't truncated by the host,
but were instead logging a QEMU message; since truncation is
always the result of a guest giving us an insufficiently
sized buffer, we should report it to the guest as the kernel
does and don't log anything
Rewrite the parts of the function that deal with length to
fix these issues, and add a comment in target_to_host_cmsg
to explain why the overflow logging it does is a QEMU bug,
not a guest issue.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
We store all struct types in an array of static size without ever
checking whether we overrun it. Of course some day someone (like me
in another, ancient ALSA enabling patch set) will run into the limit
without realizing it.
So let's make the allocation dynamic. We already know the number of
structs that we want to allocate, so we only need to pass the variable
into the respective piece of code.
Also, to ensure we don't accidently overwrite random memory, add some
asserts to sanity check whether a thunk is actually part of our array.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
If QEMU forks after the CPU threads have been created, qemu_mutex_lock_iothread
will not be able to do qemu_cpu_kick_thread. There is no solution other than
assuming that forks after the CPU threads have been created will end up in an
exec. Forks before the CPU threads have been created (such as -daemonize)
have to call rcu_after_fork manually.
Notably, the oxygen theme for GTK+ forks and shows a "No such process" error
without this patch.
This patch can be reverted once the iothread loses the "kick the TCG thread"
magic.
User-mode emulation does not use the iothread, so it can also call
rcu_after_fork.
Reported by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The second and fourth argument are in/out parameters, store them back
after the syscall. Also, the fourth argument was mishandled, and EFAULT
handling was missing.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In abi_long do_ioctl_dm(), after lock_user() call, the code does
not call unlock_user() before going to failure return in default case.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
It is only a typo issue, need use tswapal(target_vec[i].iov_len) for the
len.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When failure occurs during locking of vec[i], we also need to unlock all
already locked vec[i] in failure processing code block before return.
Code in unlock_user() checks vec[i].iov_base for NULL, so there's no
need not check it .
If error is EFAULT when "i == 0", vec[i].iov_base is NULL, we can just
skip it, so can still use "while (--i >= 0)" loop condition.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When passing ancillary data through a unix socket, handle
credentials properly instead of doing a simple copy and
issuing a warning.
Signed-off-by: Alex Suykov <alex.suykov@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
linux-user passes the cmd argument of the ioctl syscall as a signed long,
but compares it to an unsigned int when iterating through the ioctl_entries
list. When the cmd is a large value like 0x80047476 (TARGET_TIOCSWINSZ on
mips64) it gets sign-extended to 0xffffffff80047476, causing the comparison
to fail and resulting in lots of spurious "Unsupported ioctl" errors.
Changing the target_cmd field in the ioctl_entries list to a signed int
causes those values to be sign-extended as well during the comparison.
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The resource argument is translated from host to target for
[gs]etprlimit but not for prlimit64. Fix this.
Signed-off-by: Felix Janda <felix.janda@posteo.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
When creating a timer handle, we give the timer id a special magic offset
of 0xcafe0000. However, we never mask that offset out of the timer id before
we start using it to dereference our timer array. So we always end up aborting
timer operations because the timer id is out of bounds.
This was not an issue before my patch e52a99f756 ("linux-user: Simplify
timerid checks on g_posix_timers range") because before we would blindly mask
anything above the first 16 bits.
This patch simplifies the code around timer id creation by introducing a proper
target_timer_id typedef that is s32, just like Linux has it. It also changes the
magic offset to a value that makes all timer ids be positive.
Reported-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Without this, builds on older systems fail with:
qemu/linux-user/syscall.c:61:25: warning: sys/timerfd.h: No such file or directory
v2: fix the usual case where CONFIG_TIMERFD is enabled..
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
We check whether the passed in timer id is negative on all calls
that involve g_posix_timers.
However, these checks are bogus. First off we limit the timer_id to
16 bits which is not what Linux does. Then we check whether it's negative
which it can't be because we masked it.
We can safely remove the masking. For the negativity check we can just
treat the timerid as unsigned and only check for upper boundaries.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The blkpg ioctl can take different payloads depending on the opcode in
its payload structure. Create a new special ioctl handler that can only
deal with partition style ones for now.
This patch fixes running parted for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Handle variable "fd_orig" going out of scope leaks the handle.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Although not technically not required by POSIX, the writev system call will
typically write out its buffers individually. That is, if the first buffer
is written successfully, but the second buffer pointer is invalid, then
the first chuck will be written and its size is returned.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The argument to the mlockall system call is not necessarily the same on
all platforms and thus may require translation prior to passing to the
host.
For example, PowerPC 64 bit platforms define values for MCL_CURRENT
(0x2000) and MCL_FUTURE (0x4000) which are different from Intel platforms
(0x1 and 0x2, respectively)
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The clock_nanosleep syscall is unusual in that it returns positive
numbers in error handling situations, versus returning -1 and setting
errno, or returning a negative errno value. On POWER, the kernel will
set the SO bit of CR0 to indicate failure in a syscall. QEMU has
generic handling to do this for syscalls with standard return values.
Add special case code for clock_nanosleep to handle CR0 properly.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Properly detect a fault when attempting to store into an invalid
struct timespec pointer.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The sched_getparam, sched_setparam and sched_setscheduler system
calls take a pointer argument to a sched_param structure. When
this pointer is null, errno should be set to EINVAL.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The msgsnd system call takes an argument that describes the message
size (msgsz) and is of type size_t. The system call should set
errno to EINVAL in the event that a negative message size is passed.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The mq_open system call takes an optional struct mq_attr pointer
argument in the fourth position. This pointer is used when O_CREAT
is specified in the flags (second) argument. It may be NULL, in
which case the queue is created with implementation defined attributes.
Change the code to properly handle the case when NULL is passed in the
arg4 position.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
For those target ABIs that use the ipc system call (e.g. POWER),
the third argument is used in the shmat path as a pointer. It
therefore must be declared as an abi_long (versus int) so that
the address bits are not lost in truncation. In fact, all arguments
to do_ipc should be declared as abit_long.
In fact, it makes more sense for all of the arguments to be declaried
as abi_long (except call).
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The semun union used in the semctl system call contains both an int (val) and
pointers. In cross-endian situations on 64 bit targets, the value passed to
semctl is an 8 byte (abi_long) value and thus does not have the 4-byte val
field in the correct location. In order to rectify this, the other half
of the union must be accessed. This is achieved in code by performing
a byte swap on the entire 8 byte union, followed by a 4-byte swap of the
first half.
Also, eliminate an extraneous (dead) line of code that sets target_su.val in
the IPC_SET/IPC_GET case.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
When the ipc system call is used to wrap a semctl system call,
the ptr argument to ipc needs to be dereferenced prior to passing
it to the semctl handler. This is because the fourth argument to
semctl is a union and not a pointer to a union.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The 64 bit PowerPC platforms eliminate the _unused1 and _unused2
elements of the semid_ds structure from <sys/sem.h>. So eliminate
these from the target_semid_ds structure.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Add support for the setns and unshare syscalls, trivially passed through to
the host. Based on patches by Paul Burton, added configure check.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Add support for the ioprio_get & ioprio_set syscalls, allowing their
use by target programs.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Adds support for the timerfd_create, timerfd_gettime & timerfd_settime
syscalls, allowing use of timerfds by target programs.
v2: By Riku - added configure check for timerfd and ifdefs
for benefit of old distributions like RHEL5.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The current code always returns the length of the path when it should
be returning the number of bytes it wrote to the output string.
Further, readlink is not supposed to append a NUL byte, but the current
snprintf logic will always do just that.
Even further, if you pass in a length of 0, you're suppoesd to get back
an error (EINVAL), but the current logic just returns 0.
Further still, if there was an error reading the symlink, we should not
go ahead and try to read the target buffer as it is garbage.
Simple test for the first two issues:
$ cat test.c
int main() {
char buf[50];
size_t len;
for (len = 0; len < 10; ++len) {
memset(buf, '!', sizeof(buf));
ssize_t ret = readlink("/proc/self/exe", buf, len);
buf[20] = '\0';
printf("readlink(/proc/self/exe, {%s}, %zu) = %zi\n", buf, len, ret);
}
return 0;
}
Now compare the output of the native:
$ gcc test.c -o /tmp/x
$ /tmp/x
$ strace /tmp/x
With what qemu does:
$ armv7a-cros-linux-gnueabi-gcc test.c -o /tmp/x -static
$ qemu-arm /tmp/x
$ qemu-arm -strace /tmp/x
Signed-off-by: Mike Frysinger <vapier@chromium.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
There were a number of bugs in the conversion of the sigevent
argument to timer_create from target to host format:
* signal number not converted from target to host
* thread ID not copied across
* sigev_value not copied across
* we never unlocked the struct when we were done
Between them, these problems meant that SIGEV_THREAD_ID
timers (and the glibc-implemented SIGEV_THREAD timers which
depend on them) didn't work.
Fix these problems and clean up the code a little by pulling
the struct conversion out into its own function, in line with
how we convert various other structs. This allows the test
program in bug LP:1042388 to run.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
While Mikhail fixed /proc/self/maps, it was noticed openat calls are
not redirected currently. Some archs don't have open at all, so
openat needs to be redirected.
Fix this by consolidating open/openat code to do_openat - open
is implemented using openat(AT_FDCWD, ... ), which according
to open(2) man page is identical.
Since all targets now have openat, remove the ifdef around sys_openat
and openat: case in do_syscall.
Cc: Mikhail Ilin <m.ilin@samsung.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Build /proc/self/maps doing a match against guest memory translation table.
Output only that map records which are valid for guest memory layout.
Signed-off-by: Mikhail Ilyin <m.ilin@samsung.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
target_to_host_sockaddr() may increase the lenth with 1 byte
for AF_UNIX sockets so allocate 1 extra byte.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Add a definition of the KDSIGACCEPT ioctl & allow its use by target
programs.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The tv argument to the settimeofday syscall is allowed to be NULL, if
the program only wishes to provide the timezone. QEMU previously
returned -EFAULT when tv was NULL. Instead, execute the syscall &
provide NULL to the kernel as the target program expected.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The settimeofday syscall accepts a tz argument indicating the desired
timezone to the kernel. QEMU previously ignored any argument provided
by the target program & always passed NULL to the kernel. Instead,
translate the argument & pass along the data userland provided.
Although this argument is described by the settimeofday man page as
obsolete, it is used by systemd as of version 213.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Calls to the mount syscall can legitimately provide NULL as the value
for the source of filesystemtype arguments, which QEMU would previously
reject & return -EFAULT to the target program. An example of this is
remounting an already mounted filesystem with different properties.
Instead of rejecting such syscalls with -EFAULT, pass NULL along to the
kernel as the target program expects.
Additionally this patch fixes a potential memory leak when DEBUG_REMAP
is enabled and lock_user_string fails on the target or filesystemtype
arguments but a prior argument was non-NULL and already locked.
Since the patch already touched most lines of the TARGET_NR_mount case,
it fixes the indentation & coding style for good measure.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Translate the SO_PASSSEC option to setsockopt to the host value &
perform the syscall as expected, allowing use of the option by target
programs.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Translate the SO_SNDBUFFORCE & SO_RCVBUFFORCE options to setsockopt to
the host values & perform the syscall as expected, allowing use of those
options by target programs.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Translate the SO_ACCEPTCONN option to the host value & execute the
syscall as expected.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
QEMU previously passed the result of the host syscall directly to the
target program. This is a problem if the host & target have different
representations of socket types, as is the case when running a MIPS
target program on an x86 host. Introduce a host_to_target_sock_type
helper function mirroring the existing target_to_host_sock_type, and
call it to translate the value provided by getsockopt when called for
the SO_TYPE option.
Signed-off-by: Paul Burton <paul@archlinuxmips.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
mmap_flags_tbl contains a list of mmap flags, and how to map them to
the target. This patch adds MAP_NORESERVE, which was missing to the
list.
Signed-off-by: Christophe Lyon <christophe.lyon@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
From MIPS documentation (Volume III):
UserLocal Register (CP0 Register 4, Select 2)
Compliance Level: Recommended.
The UserLocal register is a read-write register that is not interpreted by
the hardware and conditionally readable via the RDHWR instruction.
This register only exists if the Config3-ULRI register field is set.
Privileged software may write this register with arbitrary information and
make it accessible to unprivileged software via register 29 (ULR) of the
RDHWR instruction. To do so, bit 29 of the HWREna register must be set to a
1 to enable unprivileged access to the register.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This fixes "Cannot open audit interface - aborting." when the
EAFNOSUPPORT errno differs between the target and host
architectures (e.g. mips target and x86_64 host).
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
If the guest's "long" type is smaller than the host's, then
our sched_getaffinity wrapper needs to round the buffer size
up to a multiple of the host sizeof(long). This means that when
we copy the data back from the host buffer to the guest's
buffer there might be more than we can fit. Rather than
overflowing the guest's buffer, handle this case by returning
EINVAL or ignoring the unused extra space, as appropriate.
Note that only guests using the syscall interface directly might
run into this bug -- the glibc wrappers around it will always
use a buffer whose size is a multiple of 8 regardless of guest
architecture.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Implementations of system calls getrusage and wait4 have not previously
handled correctly cases when incorrect address of struct rusage is
passed.
This change makes sure return values are correctly set for these cases.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Use the public sigset_t instead of the glibc specific internal
__sigset_t in _syscall.
Calculate the sigevent pad size is calculated in similar way as kernel
does it instead of using glibc internal field _pad.
This is needed for building with musl libc.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Recently merged kernel ports (such as OpenRISC and Meta) have an llseek
system call instead of _llseek. This is handled for the host
architecture by defining __NR__llseek as __NR_llseek, but not for the
target architecture.
Handle it in the same way for these architectures, defining
TARGET_NR__llseek as TARGET_NR_llseek.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Jia Liu <proljc@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
QEMU already supports /proc/self/{maps,stat,auxv} so addition of
/proc/self/exe is rather trivial.
Fixes https://bugs.launchpad.net/qemu/+bug/1299190
Signed-off-by: Maxim Ostapenko <m.ostapenko@partner.samsung.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Flags NONBLOCK and CLOEXEC can have different values on the host and the
guest, so set correct host values before calling accept4().
This fixes several issues with accept4 system call and user-mode of QEMU.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Implement the capget and capset syscalls. This is useful because
simple programs like 'ls' try to use it in AArch64, and otherwise
we emit a lot of noise about it being unimplemented.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Create a wrapper for signal mask changes initiated by the guest;
(this includes syscalls and also the sigreturns from signal.c)
this will give us a place to put code which prevents the guest
from changing the handling of signals used by QEMU itself
internally.
The wrapper is called from all the guest-initiated sigprocmask, but
is not called from internal qemu sigprocmask calls.
Signed-off-by: Alex Barcelo <abarcelo@ac.upc.edu>
[PMM: Added calls to wrapper for sigprocmask uses in signal.c
when setting the signal mask on entry and exit from signal
handlers, since these also are guest-provided signal masks.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
F_GETOWN is replaced by F_GETOWN_EX inside the glibc fcntl wrapper
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
QEMU's implementation of the m68k atomic_barrier syscall, like the kernel's,
is just a no-op. However we still need to return a result code from it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
On success, sigtimedwait() returns a signal number that needs to be
translated from a host value to a target value.
This change also fixes issues with sigwait (that is implemented using
sigtimedwait()).
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Glibc when built for newer kernels assumes that the sendmmsg syscall is
available. Without it, dns resolution simply fails to work.
Wrap the syscall with existing infrastructure so that we don't have a host
dependency on sendmmsg.
To avoid locking the same area of guest memory twice (which will break if
DEBUG_REMAP is defined) we pull the lock/unlock part of do_sendrecvmsg()
out into its own function so the actual implementation can be shared.
Signed-off-by: Alexander Graf <agraf@suse.de>
[PMM: add recvmmsg support;
handle errors (which also implies support for non-blocking operations);
cap the vector length as the kernel implementation does;
don't lock guest memory twice;
support MSG_WAITFORONE flag]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The size of the UID/GID types depends on whether USE_UID16 is
defined. Define a new put_user_id() which writes a uid/gid
type to guest memory. This fixes getresuid and getresgid, which
were always storing 16 bits even if the uid type was 32 bits.
Reported-by: Michael Matz <matz@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Fix two issues in error handling in target_to_host_semarray():
* don't leak the host_array buffer if lock_user fails
* return an error if malloc() fails
v2: added missing * -Riku Voipio
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In lock_iovec() if lock_user() failed we were doing an unlock_user
but not a free(vec), which is the wrong way round. We were also
assuming that free() and unlock_user() don't touch errno, which
is not guaranteed. Fix both these problems.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Refactor do_socketcall() to do argument conversion/checking first,
according to a lookup table (which call has how many args) and
by calling the right function second with ready-to-go arguments.
This ensures that all arguments are handled as abi_long, according
to socketcall prototype, and simplifies argument handling alot too.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
addrlen parameter of recvfrom() of type socklen_t* was read into
variable of type socklen_t, that caused zeroing out of upper 4 bytes
when running s390x on top of x86_64. This patch changes addrlen type
to abi_ulong.
Signed-off-by: Pavel Zbitskiy <pavel.zbitskiy@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
optlen parameter of getsockopt() of type socklen_t* was read into
variable of type socklen_t, that caused zeroing out of upper 4 bytes
when running s390x on top of x86_64. This patch changes optlen type
to abi_ulong.
Signed-off-by: Pavel Zbitskiy <pavel.zbitskiy@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Creating target_structs header in linux-user/$arch/ and making
target_ipc_perm and target_shmid_ds its first inhabitants.
The struct defintions may/should be further fine-tuned by arch maintainers.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Some targets use a stat64 structure for the stat64 syscall while others
use a stat structure. SPARC64 used the wrong kind.
Instead of extending the conditional compilation in syscall.c, now a
macro TARGET_HAS_STRUCT_STAT64 is defined whenever a target has a
target_stat64.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Erik de Castro Lopo <erikd@mega-nerd.com>
If the host lacks SOCK_CLOEXEC, bail out with -EINVAL.
If the host lacks SOCK_ONONBLOCK, try to emulate it with fcntl()
and O_NONBLOCK.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
With nptl enabled, atomic_cmpxchg_32 and atomic_barrier
system calls are needed. This patch enabled really dummy
versions of the system calls, modeled after the m68k
kernel code.
With this patch I am able to execute m68k binaries
with qemu linux-user (busybox compiled for coldfire).
[v2] que an segfault instead of returning a EFAULT
to keep in line with kernel code.
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Previous implementation does not take into account that SOL_SOCKET constant
can be arch specific. This change fixes some issues with sendmsg/recvmsg.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This is needed to be able to run dhclient.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This patch allows to have IP addresses in correct order
in the case of "netstat -nr" when the endianess of the
guest differs from one of the host.
For instance, an m68k guest on an x86_64 host:
WITHOUT this patch:
$ netstat -nr
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 1.3.0.10 0.0.0.0 UG 0 0 0 eth0
0.3.0.10 0.0.0.0 0.255.255.255 U 0 0 0 eth0
$ cat /proc/net/route
Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT
eth0 00000000 0103000A 0003 0 0 0 000000000 0 0
eth0 0003000A 00000000 0001 0 0 0 00FFFFFF0 0 0
WITH this patch:
$ netstat -nr
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.0.3.1 0.0.0.0 UG 0 0 0 eth0
10.0.3.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
$ cat /proc/net/route
Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT
eth0 00000000 0a000301 0003 0 0 0 000000000 0 0
eth0 0a000300 00000000 0001 0 0 0 ffffff000 0 0
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
It has been pointed out on LKML that the alpha umount syscall numbers
are named wrong, and a patch to rectify that has been posted for 3.11.
Glibc works around this by treating NR_umount as NR_umount2 if
NR_oldumount exists. That's more complicated than we need in QEMU,
given that we control linux-user/*/syscall_nr.h.
This is the last instance of TARGET_NR_oldumount, so delete that from
the strace.list.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
For newer target architectures, glibc can be picky about the kernel
version: for example, it will not run on an aarch64 system unless
the kernel reports itself as at least 3.8.0. Accommodate this by
enhancing the existing support for faking the kernel version so
that each target can optionally specify a minimum version: if
the user doesn't force a specific fake version then we will override
with the minimum required version only if the real host kernel
version is insufficient.
Use this facility to let aarch64 report a minimum of 3.8.0.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-21-git-send-email-peter.maydell@linaro.org
Some syscall handlers have special code for ARM enabled that we don't
need on AArch64. Exclude AArch64 in those cases. In other places we
can share struct definitions with other targets or have to provide our
own.
With this patch applied, most syscall definitions in linux-user should
be sound for AArch64.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-16-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-9-git-send-email-john.rigby@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The m68k set_thread_area syscall implementation failed to set the
return value. Correctly set it zero, since this syscall will always
succeed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375093909-13653-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When a new thread gets created, we need to reset non arch specific state to
get the new CPU into clean state.
However this reset should happen before the arch specific CPU contents get
copied over. Otherwise we end up having clean reset state in our newly created
thread.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
SPARC is one of the CPUs which has a funny syscall ABI for the
pipe syscall; add it to the set of special cases in do_pipe().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Now all linux-user targets support building with NPTL, we can make it
mandatory. This is a good idea because:
* NPTL is no longer new and experimental; it is completely standard
* in practice, linux-user without NPTL is nearly useless for
binaries built against non-ancient glibc
* it allows us to delete the rather untested code for handling
the non-NPTL configuration
Note that this patch leaves the CONFIG_USE_NPTL ifdefs in the
bsd-user codebase alone. This makes no change for bsd-user, since
our configure test for NPTL had a "#include <linux/futex.h>"
which means bsd-user would never have been compiled with
CONFIG_USE_NPTL defined, and it still is not.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Add x86-64 implementation of cpu_set_tls() (like the kernel, we
just have to call do_arch_prctl() to set FS); this allows us to
enable NPTL.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
We can easily set the TLS on i386. Add code to do so.
Signed-off-by: Alexander Graf <agraf@suse.de>
[PMM: also remove "target_nptl=no" line from configure, for
consistency with other patches in this series]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Linux manages to have three separate orderings of the arguments to
the clone() syscall on different architectures. In the kernel these
are selected via CONFIG_CLONE_BACKWARDS and CONFIG_CLONE_BACKWARDS2.
Clean up our implementation of this to use similar #define names
rather than a TARGET_* ifdef ladder.
This includes behaviour changes fixing bugs on cris, x86-64, m68k,
openrisc and unicore32. cris had explicit but wrong handling; the
others were just incorrectly using QEMU's default, which happened
to be the equivalent of CONFIG_CLONE_BACKWARDS. (unicore32 appears
to be broken in the mainline kernel in that it tries to use arg3 for
both parent_tidptr and newtls simultaneously -- we don't attempt
to emulate this bug...)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The i386 code for the get_thread_area syscall was missing a
'break' which meant it would have fallen through into the
implementation of the following syscall; add it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
For m68k, per-thread data is a purely kernel construct with no
CPU level support. Implement it via a field in the TaskState structure,
used by cpu_set_tls() and the set_thread_area/get_thread_area
syscalls. This allows us to enable compilation with NPTL.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Move next_cpu from CPU_COMMON to CPUState.
Move first_cpu variable to qom/cpu.h.
gdbstub needs to use CPUState::env_ptr for now.
cpu_copy() no longer needs to save and restore cpu_next.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Rebased, simplified cpu_copy()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Previous implementation has failed to take into account different value of
SOCK_NONBLOCK on target and host, and existence of SOCK_CLOEXEC.
The same conversion has to be applied both for do_socket and do_socketpair,
so the code has been isolated in a static inline function.
enum sock_type in linux-user/socket.h has been extended to include
TARGET_SOCK_CLOEXEC and TARGET_SOCK_NONBLOCK, similar to definition in libc.
The patch also includes necessary code style changes (tab to spaces) in the
header file since most of the file has been touched by this change.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Message-id: 1372639454-7560-1-git-send-email-petar.jovanovic@rt-rk.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Commit c0d472b12e accidentally dropped the definition of
__NR_SYS_utimensat even though its use is guarded by
CONFIG_UTIMENSAT, not CONFIG_ATFILE. Some older glibc don't
have utimensat() (even if they have the other *at() functions).
Fix this by correctly cleaning up the sys_utimensat()
implementation and #defines, so that we always provide the
syscall if needed whether we're doing it via glibc or not.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 1371743841-26110-1-git-send-email-peter.maydell@linaro.org
This allows to pass the device name.
You can test this with the "route" command.
WITHOUT this patch:
$ sudo route add -net default gw 10.0.3.1 eth0
SIOCADDRT: Bad address
$ netstat -nr
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Ifa
10.0.3.0 0.0.0.0 255.255.255.0 U 0 0 0 eth
WITH this patch:
$ sudo route add -net default gw 10.0.3.1 eth0
$ netstat -nr
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Ifa
0.0.0.0 10.0.3.1 0.0.0.0 UG 0 0 0 eth
10.0.3.0 0.0.0.0 255.255.255.0 U 0 0 0 eth
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Some applications use /proc/$$/... (where $$ is the own pid) instead of
/proc/self/... to refer to their own proc files. Extend the interception
for open and readlink to handle this case. Also, do the same interception
in readlinkat.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
The linux-user syscall emulation layer currently supports the
openat family of syscalls via two mechanisms: simply calling
the corresponding libc functions, and making direct syscalls.
Since glibc has supported these functions since at least glibc
2.5, there's no real need to retain the (essentially untested)
direct syscall fallback code, so simply delete it. This allows
us to remove some ifdeffery that was attempting to disable
provision of some of the syscalls if the host didn't seem to
support them, which in some cases was actually wrong (eg where
there are several flavours of the syscall and we only need
one of them, not necessarily the exact one the guest has,
as with the fstatat* calls).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Message-id: 1370126121-22975-2-git-send-email-peter.maydell@linaro.org
Newer architectures may only implement the getdents64 syscall, not
getdents. Provide an implementation of getdents in terms of getdents64
so that we can run getdents-using targets on a getdents64-only host.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Message-id: 1370344377-27445-1-git-send-email-peter.maydell@linaro.org
Message-id: 1370193044-24535-1-git-send-email-peter.maydell@linaro.org
Add a space at end of line when there is no filename to print, to
conform to linux kernel format (see show_map_vma() in
fs/proc/task_mmu.c).
Signed-off-by: Christophe Lyon <christophe.lyon@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Remove a stray colon from the end of a #ifdef line. Some versions
of gcc complain about this:
linux-user/syscall.c: In function ‘do_syscall’:
linux-user/syscall.c:7606:28: error: extra tokens at end of #ifdef directive [-Werror]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If TARGET_ABI_BITS is bigger than 32 we shift by more than the size of int.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
do_semop() is called from two places, and one of these fails to convert
return error to target errno when semop fails. This patch changes the
function to always return target errno in case of an unsuccessful call.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This change makes conversion of TARGET_O_NONBLOCK and TARGET_O_CLOEXEC flags
to host flags before calling eventfd for TARGET_NR_eventfd2.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The nature of the kernel ABI for the get_robust_list and set_robust_list
syscalls means we cannot implement them in QEMU. Make get_robust_list
silently return ENOSYS rather than using the default "print message and
then fail ENOSYS" code path, in the same way we already do for
set_robust_list, and add a comment documenting why we do this.
This silences warnings which were being produced for emulating
even trivial programs like 'ls' in x86-64-on-x86-64.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Implement the accept4 syscall (which is identical to accept
but has an additional flags argument).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Implement the sendfile and sendfile64 syscalls. This implementation
passes all the LTP test cases for these syscalls.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
If the guest passes us a bogus negative length for an iovec, fail
EINVAL rather than proceeding blindly forward. This fixes some of
the error cases tests for readv and writev in the LTP.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Upstream libc has recently changed to start using
FUTEX_WAIT_BITSET instead of FUTEX_WAIT and this
is causing do_futex to return -TARGET_ENOSYS.
Pass bitset in val3 to sys_futex which will be
ignored by kernel for the FUTEX_WAIT case.
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
CPUs are never added to the composition tree, so delete is achieved
simply by removing the last references to them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
According to man reboot(2), the 4th argument is only used with
LINUX_REBOOT_CMD_RESTART2. In other cases, trying to convert
the value can generate EFAULT.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
SO_SNDTIMEO and SO_RCVTIMEO take a struct timeval, not an int
To test this, you can use :
QEMU_STRACE= ping localhost 2>&1 |grep TIMEO
568 setsockopt(3,SOL_SOCKET,SO_SNDTIMEO,{1,0},8) = 0
568 setsockopt(3,SOL_SOCKET,SO_RCVTIMEO,{1,0},8) = 0
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
All parameters must be swapped before the call of do_msgrcv().
Allow faked (debian fakeroot daemon) to work properly.
WITHOUT this patch:
$ faked-sysv --foreground --debug
using 1723744788 as msg key
msg_key=1723744788
1723744788:431
FAKEROOT: msg=131072, key=1723744788
FAKEROOT: r=-1, received message type=-150996052, message=-160219330
FAKEROOT, get_msg: Bad address
r=14, EINTR=4
fakeroot: clearing up message queues and semaphores, signal=-1
fakeroot: database save FAILED
WITH this patch:
$ faked-sysv --foreground --debug
using 1569385744 as msg key
msg_key=1569385744
1569385744:424
FAKEROOT: msg=0, key=1569385744
^C
fakeroot: clearing up message queues and semaphores, signal=2
fakeroot: database save FAILED
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Alpha, like s390x, passes all select arguments in registers.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The values of F_RDLCK, F_WRLCK, F_UNLCK, F_EXLCK, F_SHLCK
differ between alpha and other linux architectures.
This patch allows to run "dpkg" (database lock).
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
instead use the correct headers that define these functions.
Requested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: John Spencer <maillist-qemu@barfooze.de>
Reviewed-by: Amos Kong <kongjianjun@gmail.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* bonzini/header-dirs: (45 commits)
janitor: move remaining public headers to include/
hw: move executable format header files to hw/
fpu: move public header file to include/fpu
softmmu: move remaining include files to include/ subdirectories
softmmu: move include files to include/sysemu/
misc: move include files to include/qemu/
qom: move include files to include/qom/
migration: move include files to include/migration/
monitor: move include files to include/monitor/
exec: move include files to include/exec/
block: move include files to include/block/
qapi: move include files to include/qobject/
janitor: add guards to headers
qapi: make struct Visitor opaque
qapi: remove qapi/qapi-types-core.h
qapi: move inclusions of qemu-common.h from headers to .c files
ui: move files to ui/ and include/ui/
qemu-ga: move qemu-ga files to qga/
net: reorganize headers
net: move net.c to net/
...
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
this declaration is wrong:
the correct prototype on linux is:
int setgroups(size_t size, const gid_t *list);
since by default musl libc exposes this symbol in unistd.h
additionally to grp.h, the wrong declaration causes a build error.
the proper fix is to simply include the correct header.
Signed-off-by: John Spencer <maillist-qemu@barfooze.de>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The Linux syscalls underlying pread() and pwrite() take a 64 bit
offset on all architectures, even if some of them name the syscall
"pread/pwrite" rather than "pread64/pwrite64" for historical reasons.
So move the four QEMU target architectures (arm, i386, sparc,
unicore32) which were defining TARGET_NR_pread/pwrite to define
TARGET_NR_pread64/pwrite64 instead, and drop the TARGET_NR_pread/pwrite
implementation code completely.
(Based on examination of the kernel sources for the four architectures
this patch affects.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
pread64 and pwrite64 pass 64bit parameters which for some architectures need
to be aligned to special argument pairs, creating a gap argument.
Handle this special case the same way we handle it in other places of the code.
Reported-by: Alex Barcelo <abarcelo@ac.upc.edu>
Signed-off-by: Alexander Graf <agraf@suse.de>
Tested-by: Alex Barcelo <abarcelo@ac.upc.edu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The SysV PPC32 ABI dictates that long long (64bit) parameters are pass in odd/even
register pairs. Because unlike ARM and MIPS we start at an odd register number,
we can reuse the same aligning code that ARM and MIPS use.
Clarified inline comment that it is SysV ABI that requires long long aligned
parameters - Riku
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Compare signal numbers in the proper domain.
Convert all of the fields for SIGIO and SIGCHLD.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Validate count between 0 and IOV_MAX. Limit total length of
operation in the same way the kernel does.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
When reading our faked /proc/self/maps from a secondary thread,
we get an invalid stack entry. This is because ts->stack_base is not
initialized in non-primary threads.
However, ts->info is, and the stack layout information we're looking
for is there too. So let's use that one instead!
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The statfs syscall should always memset(0) its full struct extent before
writing to it. Newer versions of the syscall use one of the reserved fields
for flags, which would otherwise get stale values from uncleaned memory.
This fixes libarchive for me, which got confused about the return value of
pathconf("/", _PC_REC_XFER_ALIGN) otherwise, as it some times gave old pointers
as return value.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Report from smatch:
linux-user/syscall.c:3632 do_ioctl_dm(220) info:
redundant null check on big_buf calling free()
'big_buf' was allocated by g_malloc0, therefore free was also
replaced by g_free.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
In case when TARGET_ABI_BITS == 32 && HOST_LONG_BITS == 64, the last
byte of the target dirent structure (aka d_type byte) was never copied
from the host dirent structure, thus breaking everything that relies
on valid d_type value, e.g. glob(3).
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Also, use g_malloc to avoid NULL-deref upon OOM.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The code to initialise the target_to_host_errno_table[] array was
accidentally inside the loop through checking and initialising all
the supported ioctls. This was harmless but meant that we reinitialised the
array several hundred times on startup.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Alpha uses unbiased priority values in the syscall, with the a3
return value signaling error conditions. Therefore, properly
interpret the libc getpriority as needed for the guest rather
than passing the host value through unchanged.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Name the syscall properly for QEMU, kernel source notwithstanding.
Fix syntax errors in the code thus enabled within do_syscall.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We weren't aggregating the exceptions, nor raising signals properly.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Improve the emulation of /proc/self/maps by reading the underlying
host maps file and passing lines through with addresses adjusted
to be guest addresses. This is necessary to avoid false triggers
of the glibc check that a format string containing '%n' is not in
writable memory. (For an example see the bug reported in
https://bugs.launchpad.net/qemu-linaro/+bug/947888 where gpg aborts.)
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
After all target CPUs have been QOM'ified, we no longer need an #ifdef
to switch between object_delete() and g_free() in NPTL thread exit.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
For QOM'ified CPUs we cannot g_free() CPUArchState, we must
object_delete() the object it is embedded into.
Fixes LP#982321 (invalid free() while executing pacman with qemu-arm).
Reported-by: Serge Schneider <serge@xecdesign.com>
Reported-by: Russell Keith Davis <russell@russelldavis.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Serge Schneider <serge@xecdesign.com>
Tested-by: Russell Keith Davis <russell@russelldavis.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Add support for the prctl options PR_GET_NAME and PR_SET_NAME,
which take or return a name in a 16 byte buffer pointed to by arg2.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Clean up the odd indentation of this switch statement before
we double its size by adding new cases to it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Fallocate gets off_t parameters passed in, so we should also read them out
accordingly.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- unbreak 64-bit guests
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This patch implements all ioctls currently implemented by device mapper,
enabling us to run dmsetup and kpartx inside of linux-user.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
With the current fake /proc/self/stat implementation `ps` is
segfaulting because it expects to read PID and argv[0] as first and
second field respectively, with the latter being enclosed between
backets.
Reproducing is as easy as running: `ps` inside qemu-user chroot
with /proc mounted.
Signed-off-by: Fabio Erculiani <lxnay@sabayon.org>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Scripted conversion:
for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
sed -i "s/CPUState/CPUArchState/g" $file
done
All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Frees the identifier cpu_reset for QOM CPUs (manual rename).
Don't hide the parameter type behind explicit casts, use static
functions with strongly typed argument to indirect.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Fix format type mismatches in do_brk debug printfs.
Signed-off-by: Paul Brook <paul@codesourcery.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
If the host's page size is equal to or smaller than the target's, native
execve() will fail appropriately with E2BIG if called with too big an
environment for the target to handle. It may falsely succeed, however, if
the host's page size is bigger, and feed the executed target process an
environment that is too big for it to handle, at which point QEMU barfs and
exits, confusing procmail's autoconf script and causing the build to fail.
This patch makes sure that execve() will return E2BIG if the environment is
too large for the target.
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Implement the f and l versions (operate on fd, don't follow links)
of the setxattr, getxattr and removexattr syscalls.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
It's valid to pass a NULL value pointer to setxattr, so don't
fail this case EFAULT.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
When calling wait4 or waitpid with a status pointer and WNOHANG, the
syscall can potentially not modify the status pointer input. Now if we
have guest code like:
int status = 0;
waitpid(pid, &status, WNOHANG);
if (status)
<breakage>
then we have to make sure that in case status did not change we actually
return the guest's initialized status variable instead of our own uninitialized.
We fail to do so today, as we proxy everything through an uninitialized status
variable which for me ended up always containing the last error code.
This patch fixes some test cases when building yast2-core in OBS for ARM.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
While debugging some issues with QEMU_STRACE I stumbled over segmentation
faults that were pretty reproducible. Turns out we tried to treat a
normal return value as errno, resulting in an access over array boundaries
for the resolution.
Fix this by allowing failure to resolve invalid errnos into strings.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Gtk tries to read /proc/self/auxv to find its auxv table instead of
taking it from its own program memory space.
However, when running with linux-user, we see the host's auxv which
clearly exposes wrong information. so let's instead expose the guest
memory backed auxv tables via /proc/self/auxv as well.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The boehm gc finds the program's stack starting pointer by
checking /proc/self/stat. Unfortunately, so far it reads
qemu's stack pointer which clearly is wrong.
So let's instead fake the file so the guest program sees the
right address.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
glibc's pthread_attr_getstack tries to find the stack range from
/proc/self/maps. Unfortunately, /proc is usually the host's /proc
which means linux-user guests see qemu's stack there.
Fake the file with a constructed maps entry that exposes the guest's
stack range.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
There are a number of files in /proc that expose host information
to the guest program. This patch adds infrastructure to override
the open() syscall for guest programs to enable us to on the fly
generate guest sensible files.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In an fcntl64 failure path, we were returning directly rather than
simply breaking out of the switch statement. This skips the strace
code for printing the syscall return value, so don't do that.
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Double semicolons should be single.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Modern distributions place xattr.h in /usr/include/sys, and fold
libattr.so into libc. They also don't have an ENOATTR.
Make configure detect this, and add a qemu-xattr.h file that
directs the #include to the right place.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
For OBS, we're running a full cross-guest inside of a VM. When a build
is done there, we reboot the guest as shutdown mechanism.
Unfortunately, reboot is not implemented in linux-user. So this mechanism
fails, spilling unpretty warnings. This patch implements sys_reboot()
emulation.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
When running openat using qemu-arm, we stumbled over invalid permissions
on the created files. The reason for this is that the mode parameter gets
treates as an O_... flag, which it isn't - it's a permission bitmask.
This patch removes the needless translation of the mode parameter,
rendering permission passing of openat() to work with linux-user.
Reported-by: Dirk Mueller <dmueller@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
abi_(u)long might be different from target_ulong, so don't use tswapl
but introduce a new tswapal
Signed-off-by: Matthias Braun <matze@braunis.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Those blanks violate the coding conventions, see
scripts/checkpatch.pl.
Blanks missing after colons in the changed lines were added.
This patch does not try to fix tabs, long lines and other
problems in the changed lines, therefore checkpatch.pl reports
many violations.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* qemu-common.h is not a system include file, so it should be included
with "" instead of <>. Otherwise incremental builds might fail
because only local include files are checked for changes.
* linux-user/syscall.c included the file twice.
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This patch implements the setxattr, getxattr, and removexattr syscalls
if CONFIG_ATTR is enabled.
Note that since libattr uses indirect syscalls for these, this change
depends on the fix for indirect syscall handling on MIPS.
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: An-Cheng Huang <ancheng@ubnt.com>
Technically the new mmapped pages are already initialized to zero
since they are anonymous, however we have to take care with the
contents that come from the remaining part of the previous page: it
may contains garbage data due to a previous heap usage (grown then
shrunken).
This patch completes commit 55f08c84. The problem could be reproduced
when emulating the build process of Perl 5.12.3 on ARMedSlack 13.37:
make[1]: Entering directory `/tmp/perl-5.12.3/cpan/Compress-Raw-Bzip2'
cc -c -I. -fno-strict-aliasing -pipe -fstack-protector \
-I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 \
-O2 -DVERSION=\"2.024\" -DXS_VERSION=\"2.024\" -fPIC "-I../.." \
-DBZ_NO_STDIO decompress.c
decompress.c: In function 'BZ2_decompress':
decompress.c:621:1: internal compiler error: Segmentation fault
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Laurent ALFONSI <laurent.alfonsi@st.com>
Signed-off-by: Cédric VINCENT <cedric.vincent@st.com>
Avoid warnings like these by wrapping recv():
CC slirp/ip_icmp.o
/src/qemu/slirp/ip_icmp.c: In function 'icmp_receive':
/src/qemu/slirp/ip_icmp.c:418:5: error: passing argument 2 of 'recv' from incompatible pointer type [-Werror]
/usr/local/lib/gcc/i686-mingw32msvc/4.6.0/../../../../i686-mingw32msvc/include/winsock2.h:547:32: note: expected 'char *' but argument is of type 'struct icmp *'
Remove also casts used to avoid warnings.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
MIPS uses similar calling convention than ARM eabi, where when using
64-bit values some registers are skipped. This patch makes MIPS and ARM
eabi share the argument reordering code.
This affects ftruncate64, creating insane sized fails (or just failing).
Cc: Wesley W. Terpstra <terpstra@debian.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The codes for get/setrlimit differ between linux target platforms.
This patch adds conversion.
This is important else programs (rsyslog, python, ...) can go into a
near infinite loop trying to close all the file descriptors from 0 to
-1.
Signed-off-by: Wesley W. Terpstra <terpstra@debian.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Byte swap was applied in the wrong order with testing for
RLIM_INFINITY. On mips bigendian from an amd64 system this results in
infinity being misinterpretted as 2^31-1.
This is a serious bug because it causes setrlimit stack size to kill
all child processes. This means (for example) that 'make' can run no
children. The mechanism of failure:
1. parent sets stack size rlimit to 'infinity'
2. qemu screws this value up
3. child process fetches stack size as a large (but non-infinite) value
4. qemu tries to allocate stack before execution
5. stack allocation fails (too big) and child process dies
Signed-off-by: Wesley W. Terpstra <terpstra@debian.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Enforce the same restriction on the size of the sigset passed to
pselect6 as the Linux kernel does. This is both correct and silences
a gcc 4.6 warning about a write-only variable.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
As noticed while looking at "Bump do_syscall() up to 8 syscall arguments"
patch, sync_file_range uses a pad argument on 32bit mips. Deal with it
by reading the correct arguments when on mips.
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
On 32 bit MIPS a few syscalls have 7 arguments, and so to call
them via NR_syscall the guest needs to be able to pass 8 arguments
to do_syscall(). Raise the number of arguments do_syscall() takes
accordingly.
This fixes some gcc 4.6 compiler warnings about arg7 and arg8
variables being set and never used.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Looking at the other architectures, we should be using "how" not "arg1".
Signed-off-by: Juan Quintela <quintela@redhat.com>
[peter.maydell@linaro.org: remove unnecessary initialisation of how]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
We assign ret with the error code, but then return 0 unconditionally.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Some architectures (like Blackfin) only implement pselect6 (and skip
select/newselect). So add support for it.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
There were several remaining bugs in the previous implementation of
do_brk():
1. the value of "new_alloc_size" was one page too large when the
requested brk was aligned on a host page boundary.
2. no new pages should be (re-)allocated when the requested brk is
in the range of the pages that were already allocated
previsouly (for the same purpose). Technically these pages are
never unmapped in the current implementation.
The problem/fix can be reproduced/validated with the test-suite above:
#include <unistd.h> /* syscall(2), */
#include <sys/syscall.h> /* SYS_brk, */
#include <stdio.h> /* puts(3), */
#include <stdlib.h> /* exit(3), EXIT_*, */
#include <stdint.h> /* uint*_t, */
#include <sys/mman.h> /* mmap(2), MAP_*, */
#include <string.h> /* memset(3), */
int main()
{
int exit_status = EXIT_SUCCESS;
uint8_t *current_brk = 0;
uint8_t *initial_brk;
uint8_t *new_brk;
uint8_t *old_brk;
int failure = 0;
int i;
void test_brk(int increment, int expected_result) {
new_brk = (uint8_t *)syscall(SYS_brk, current_brk + increment);
if ((new_brk == current_brk) == expected_result)
failure = 1;
current_brk = (uint8_t *)syscall(SYS_brk, 0);
}
void test_result() {
if (!failure)
puts("OK");
else {
puts("failure");
exit_status = EXIT_FAILURE;
}
}
void test_title(const char *title) {
failure = 0;
printf("%-45s : ", title);
fflush(stdout);
}
test_title("Initialization");
test_brk(0, 1);
initial_brk = current_brk;
test_result();
test_title("Don't overlap \"brk\" pages");
test_brk(HOST_PAGE_SIZE, 1);
test_brk(HOST_PAGE_SIZE, 1);
test_result();
/* Preparation for the test "Re-allocated heap is initialized". */
old_brk = current_brk - HOST_PAGE_SIZE;
memset(old_brk, 0xFF, HOST_PAGE_SIZE);
test_title("Don't allocate the same \"brk\" page twice");
test_brk(-HOST_PAGE_SIZE, 1);
test_brk(HOST_PAGE_SIZE, 1);
test_result();
test_title("Re-allocated \"brk\" pages are initialized");
for (i = 0; i < HOST_PAGE_SIZE; i++) {
if (old_brk[i] != 0) {
printf("(index = %d, value = 0x%x) ", i, old_brk[i]);
failure = 1;
break;
}
}
test_result();
test_title("Don't allocate \"brk\" pages over \"mmap\" pages");
new_brk = mmap(current_brk, HOST_PAGE_SIZE / 2, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
if (new_brk == (void *) -1)
puts("unknown");
else {
test_brk(HOST_PAGE_SIZE, 0);
test_result();
}
test_title("All \"brk\" pages are writable (please wait)");
if (munmap(current_brk, HOST_PAGE_SIZE / 2) != 0)
puts("unknown");
else {
while (current_brk - initial_brk < 2*1024*1024*1024UL) {
old_brk = current_brk;
test_brk(HOST_PAGE_SIZE, -1);
if (old_brk == current_brk)
break;
for (i = 0; i < HOST_PAGE_SIZE; i++)
old_brk[i] = 0xAA;
}
puts("OK");
}
test_title("Maximum size of the heap > 16MB");
failure = (current_brk - initial_brk) < 16*1024*1024;
test_result();
exit(exit_status);
}
Changes introduced in patch v2:
* extend the "brk" test-suite embedded within the commit message;
* heap contents have to be initialized to zero, this bug was
exposed by "tst-calloc.c" from the GNU C library;
* don't [try to] allocate a new host page if the new "brk" is
equal to the latest allocated host page ("brk_page"); and
* print some debug information when DEBUGF_BRK is defined.
Signed-off-by: Cédric VINCENT <cedric.vincent@st.com>
Reviewed-by: Christophe Guillon <christophe.guillon@st.com>
Cc: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Since mmap() with MAP_FIXED will map over the top of existing mappings,
it's a bad idea to use it to implement brk(), because brk() with a
large size is likely to overwrite important things like qemu itself
or the host libc. So we drop MAP_FIXED and handle "mapped but at
different address" as an error case instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
This patch adds support for running s390x binaries in the linux-user emulation
code.
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Function bzero is deprecated, so replace it by function memset.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
The kernel doesn't fill the buffer provided to sched_getaffinity
with zero bytes, so neither should QEMU.
Signed-off-by: Mike McCormack <mj.mccormack@samsung.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Zeroing of the cpu array should start from &cpus[kernel_ret]
not &cpus[num_zeros_to_fill].
This fixes a crash in EFL's edje_cc running under qemu-arm.
Signed-off-by: Mike McCormack <mj.mccormack@samsung.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Quite a number of uid/gid related syscalls are only defined on systems
with USE_UID16 defined. This is apperently based on the idea that these
system calls would never be called on non-UID16 systems. Make these
syscalls available for all architectures that define them.
drop alpha hack to support selected UID16 syscalls. MIPS and PowerPC
were also defined as UID16, to get uid/gid syscalls available, drop
this error as well.
Change QEMU to reflect this.
Cc: Ulrich Hecht <uli@suse.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
We keep a list of host architectures that do llseek with the same
syscall as lseek. S390x is one of them, so let's add it to the list.
Original-patch-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
The result needs to be converted as it is stored in an array of struct
ifreq and sizeof(struct ifreq) differs according to target and host
alignment rules.
This patch allows to execute correctly the following program on arm
and m68k:
#include <stdio.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <alloca.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
int main(void)
{
int s, ret;
struct ifconf ifc;
int i;
memset( &ifc, 0, sizeof( struct ifconf ) );
ifc.ifc_len = 8 * sizeof(struct ifreq);
ifc.ifc_buf = alloca(ifc.ifc_len);
s = socket( AF_INET, SOCK_DGRAM, 0 );
if (s < 0) {
perror("Cannot open socket");
return 1;
}
ret = ioctl( s, SIOCGIFCONF, &ifc );
if (s < 0) {
perror("ioctl() failed");
return 1;
}
for (i = 0; i < ifc.ifc_len / sizeof(struct ifreq) ; i ++) {
struct sockaddr_in *s;
s = (struct sockaddr_in*)&ifc.ifc_req[i].ifr_addr;
printf("%s\n", ifc.ifc_req[i].ifr_name);
printf("%s\n", inet_ntoa(s->sin_addr));
}
}
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
PTHREAD_STACK_MIN (16KB) is somewhat inadequate for a new stack for new
QEMU threads. Set new limit to 256K which should be enough, yet doesn't
increase memory pressure significantly.
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Correct the broken attempt to calculate the third argument
to unlock_user() in the code path which unlocked the pollfd
array on return from poll() and ppoll() emulation. (This
only caused a problem if unlock_user() wasn't a no-op, eg
if DEBUG_REMAP is defined.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When emulating a 32 bit Linux user-mode program on a 64 bit target
we implement the llseek syscall in terms of lseek. Correct a bug
which meant we were silently casting the result of host lseek()
to a 32 bit integer as it passed through get_errno() and thus
throwing away the top half.
We also don't try to store the result back to userspace unless
the seek succeeded; this matches the kernel behaviour.
Thanks to Eoghan Sherry for identifying the problem and suggesting
a solution.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Support the epoll family of syscalls: epoll_create(), epoll_create1(),
epoll_ctl(), epoll_wait() and epoll_pwait(). Note that epoll_create1()
and epoll_pwait() are later additions, so we have to test separately
in configure for their presence.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Some architectures (like Blackfin) only implement ppoll (and skip poll).
So add support for it using existing poll code.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Add a configure check for the existence of linux/fiemap.h and the
IOC_FS_FIEMAP ioctl. This fixes a compilation failure on Linux
systems which don't have that header file.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement the FS_IOC_FIEMAP ioctl using the new support for
custom handling of ioctls; this is needed because the struct
that is passed includes a variable-length array.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Some ioctls (for example FS_IOC_FIEMAP) use structures whose size is
not constant. The generic argument conversion code in do_ioctl()
cannot handle this, so add support for implementing a special-case
handler for a particular ioctl which does the conversion itself.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Implement the missing syscalls sync_file_range and sync_file_range2.
The latter in particular is used by newer versions of apt on Ubuntu
for ARM.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
n setsockopt, the socket level options are translated to the hosts'
architecture before the real syscall is called, e.g.
TARGET_SO_TYPE -> SO_TYPE. This patch does the same with getsockopt.
Tested on a x86 host emulating MIPS. Without it:-
$ grep getsockopt host.strace
31311 getsockopt(3, SOL_SOCKET, 0x1007 /* SO_??? */, 0xbff17208,
0xbff17204) = -1 ENOPROTOOPT (Protocol not available)
With:-
$ grep getsockopt host.strace
25706 getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
Whitespace cleanup: Riku Voipio
Signed-off-by: Jamie Lentin <jm@lentin.co.uk>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Running programs that create large numbers of threads, such as this
snippet from libstdc++'s pthread7-rope.cc:
const int max_thread_count = 4;
const int max_loop_count = 10000;
...
for (int j = 0; j < max_loop_count; j++)
{
...
for (int i = 0; i < max_thread_count; i++)
pthread_create (&tid[i], NULL, thread_main, 0);
for (int i = 0; i < max_thread_count; i++)
pthread_join (tid[i], NULL);
}
in user-mode emulation will quickly run out of memory. This is caused
by a failure to free memory in do_syscall prior to thread exit:
/* TODO: Free CPU state. */
pthread_exit(NULL);
The first step in fixing this is to make all TaskStates used by QEMU
dynamically allocated. The TaskState used by the initial thread was
not, as it was allocated on main's stack. So fix that, free the
cpu_env, free the TaskState, and we're home free, right?
Not exactly. When we create a thread, we do:
ts = qemu_mallocz(sizeof(TaskState) + NEW_STACK_SIZE);
...
new_stack = ts->stack;
...
ret = pthread_attr_setstack(&attr, new_stack, NEW_STACK_SIZE);
If we blindly free the TaskState, then, we yank the current (host)
thread's stack out from underneath it while it still has things to do,
like calling pthread_exit. That causes problems, as you might expect.
The solution adopted here is to let the C library allocate the thread's
stack (so the C library can properly clean it up at pthread_exit) and
provide a hint that we want NEW_STACK_SIZE bytes of stack.
With those two changes, we're done, right? Well, almost. You see,
we're creating all these host threads and their parent threads never
bother to check that their children are finished. There's no good place
for the parent threads to do so. Therefore, we need to create the
threads in a detached state so the parent thread doesn't have to call
pthread_join on the child to release the child's resources; the child
does so automatically.
With those three major changes, we can comfortably run programs like the
above without exhausting memory. We do need to delete 'stack' from the
TaskState structure.
Signed-off-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
On many systems, socklen_t is defined as unsigned. This means that
checks for negative values are not meaningful.
Fix by explicitly casting to a signed integer.
This also avoids some warnings with GCC flag -Wtype-limits.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When loading a shared library that requires an executable stack,
glibc uses the mprotext PROT_GROWSDOWN flag to achieve this.
We don't support PROT_GROWSDOWN.
Add a special case to handle changing the stack permissions in this way.
Signed-off-by: Paul Brook <paul@codesourcery.com>
There's no _llseek on s390x either. Replace the existing
test for __x86_64__ with a functional test for __NR_llseek.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Libc will fallback gracefully if pselect6 is not available. Thus put
pselect6 to nowarn until the atomicity issues of the original pselect6
patch are dealt with.
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Cc: Michael Casadevall <mcasadevall@ubuntu.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Alpha passes oldset by value in a register, and returns the newset
as the return value; as compared to the standard implementation in
which both are passed by reference. This requires being able to
distinguish negative return values that are not errors. Do this in
the same way as the Alpha Linux kernel, by storing a zero in V0 in
the implementation of the syscall.
At the same time, fix a think-o in the regular sigprocmask path in
which we passed the target, rather than the host, HOW value.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Alpha passes the signal set in a register, not by reference.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
At the same time, tidy the code wrt MIPS and SH4 which have the
same two register return mechanism. Fix confusion between pipe
and pipe2 with an explicit flags=0, when the guest will not be
using the two register return mechanism.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
rlim_t conversion between host and target added.
Otherwise there are some incorrect case like
- RLIM_INFINITY on 32bit target -> 64bit host.
- RLIM_INFINITY on 64bit host -> mips and sparc target ?
- Big value(for 32bit target) on 64bit host -> 32bit target.
One is added into getrlimit, setrlimit, and ugetrlimit. It converts both
RLIM_INFINITY and value bigger than target can hold(>31bit) to RLIM_INFINITY.
Another one is added to guest_stack_size calculation introduced by
703e0e89. The rule is mostly same except the result on the case is keeping
the value of guest_stack_size.
Slightly tested for SH4, and x86_64 -linux-user on x86_64-pc-linux host.
Signed-off-by: Takashi YOSHII <takasi-y@ops.dti.ne.jp>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Check TARGET_ABI_BITS, not TARGET_LONG_BITS, when deciding
whether or not the guest needs special 64-bit stat translation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2nd arg of page_set_flags() should be start+size, but size.
Signed-off-by: Takashi YOSHII <takasi-y@ops.dti.ne.jp>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Commit c05c7a7306
breaks cross compilation for mips (and other
compilations without CONFIG_INOTIFY1):
make[1]: Entering directory `/qemu/bin/mips'
CC i386-linux-user/syscall.o
cc1: warnings being treated as errors
/qemu/linux-user/syscall.c: In function ‘do_syscall’:
/qemu/linux-user/syscall.c:7067: error: implicit declaration of function ‘sys_inotify_init1’
Cc: Riku Voipio <riku.voipio@nokia.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
ia64 has some strangenesses that need to be workaround:
- it has a __clone2() syscall instead of the using clone() one, with
different arguments, and which is not declared in the usual headers.
- ucontext.uc_sigmask is declared with type long int, while it is
actually of type sigset_t.
- uc_mcontext, uc_sigmask, uc_stack, uc_link are declared using #define,
which clashes with the target_ucontext fields. Change their names to
tuc_*, as already done for some target architectures.
New syscall which gets actively used when you have a
fresh kernel.
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
On linux/sh4
pipe() return values by r0:r1 as SH C calling convention.
pipe2() return values on memory as traditional unix way.
Signed-off-by: Takashi YOSHII <takasi-y@ops.dti.ne.jp>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Move userland PALcode handling into linux-user main loop so that
we can send signals from there. This also makes alpha_palcode.c
system-level only, so don't build it for userland. Add defines
for GENTRAP PALcall mapping to signals.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This patch for linux-user adapts the output of the emulated uname()
syscall to match the configured CPU. Tested with x86, x86-64 and arm
emulation.
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Loïc Minier <lool@dooz.org>
The stat64/fstat64 syscalls are broken for alpha linux-user.
This is because Alpha, even though it is native 64-bits, has a stat64
syscall that is different than regular stat. This means that the
"TARGET_LONG_BITS==64" check in syscall.c isn't enough. Below is
a patch that fixes things for me, although it might not be the cleanest
fix.
This issue keeps sixtrack and fma3d spec2k benchmarks from running.
Signed-off-by: Vince Weaver <vince@csl.cornell.edu>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
1. Add correct definitions of error numbers.
2. Implement SYS_osf_sigprocmask
3. Implement SYS_osf_get/setsysinfo for IEEE_FP_CONTROL.
This last requires exposing the FPCR value to do_syscall.
Since this value is actually split up into the float_status,
expose routines from helper.c to access it.
Finally, also add a float_exception_mask field to float_status.
We don't actually use it to control delivery of exceptions to
the emulator yet, but simply hold the value that we placed there
when loading/storing the FPCR.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
b55a37c981 moved the call to cpu_reset
to user emulators. But cpu_copy also initializes a CPU structure, so add the
call also there.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
everything needed to run SDL on a framebuffer device in the userspace emulator
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
updated fallocate check to new configure, added dup3 check as suggested
by Jan-Simon Möller.
Riku: updated to apply to current git.
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
In the very least, a change like this requires discussion on the list.
The naming convention is goofy and it causes a massive merge problem. Something
like this _must_ be presented on the list first so people can provide input
and cope with it.
This reverts commit 99a0949b72.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Put space between = and & when taking a pointer,
to avoid confusion with old-style "&=".
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The fstat implementation does not initialize the nanosecond fields in the
stat buffer; this caused funny values to turn up there, preventing, for
instance, cp -p from preserving timestamps because utimensat rejected
the out-of-bounds nanosecond values. Resetting the entire structure
to zero fixes that.
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
works perfectly fine with the example from getdents(2) and passes the LTP
tests (tested with s390x on x86_64 emulation)
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Fixes swaps on l_pid which were pretty much of random size. Implements
F_SETLEASE, F_GETLEASE. Now passes all LTP fcntl tests.
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
mqueue.h is only available if __NR_mq_open is defined. So don't include
it unconditionally. Similarly, the mq_* family of syscalls depend on
__NR_mq_open. Finally, the copy_{from,to}_user_mq_attr functions should
not be defined unconditionally, but only if we're going to use the mq_*
syscalls.
Signed-off-by: Nathan Froyd <froydnj@codesourcery.com
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
As setup_frame() and setup_rt_frame() are now implemented we can now
enable sigaltstack().
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Straightforward implementation. This syscall is rare enough that we
don't need to support the odder cases, just disable it if host glibc
is too old.
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Use was not consistent, in Makefile was TARGET_GPROF and in *h HAVE_GPROF
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
I used the following command to enable debugging:
perl -p -i -e 's/^\/\/#define DEBUG/#define DEBUG/g' * */* */*/*
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Fixes getrlimit implementation that overwrote the result of the syscall
instead of converting it
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
makes socketcall 64-bit clean so it works on 64-bit big-endian systems
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
pipedes is an address, it should not be signed (breaks for addresses
> 0x80000000)
Signed-off-by: Ulrich Hecht <uli@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Removes the following warning
CC i386-linux-user/syscall.o
cc1: warnings being treated as errors
/media/nfs/qemu/linux-user/syscall.c: In function ‘do_syscall’:
/media/nfs/qemu/linux-user/syscall.c:2219: warning: ‘array’ may be used uninitialized in this function
Signed-off-by: Vibi Sreenivasan <vibi_sreenivasan@cms.com>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
This patch is fixing following issues :
- commit 8fea36025b was applied to
do_getsockname instead of do_accept.
- Some syscalls were not checking properly the memory addresses passed
as argument
- Add check before syscalls made for cases like do_getpeername() where
we're using the address parameter after doing the syscall
- Fix do_accept to return EINVAL instead of EFAULT when parameters
invalid to match with linux behaviour
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
On Tue, Jun 16, 2009 at 08:19:23PM -0500, Anthony Liguori wrote:
> malc wrote:
>>
>> On my system the above line causes gcc to emit:
>>
>> In file included from /home/malc/x/rcs/git/qemu/linux-user/strace.c:12:
>> /usr/include/linux/futex.h:48: error: field `__user' has incomplete type
>> /usr/include/linux/futex.h:48: error: syntax error before '*' token
>> /usr/include/linux/futex.h:63: error: field `list' has incomplete type
>> /usr/include/linux/futex.h:83: error: field `__user' has incomplete type
>> /usr/include/linux/futex.h:83: error: syntax error before '*' token
>> make[1]: *** [strace.o] Error 1
> We had the same problem with usb-linux.c. It's broken system headers,
> the __user stuff is supposed to get removed as part of the headers
> installation.
> It builds fine on my system (Fedora 10).
Howabout something like this:
commit eb8387cb0eda32a18880664eb5f0ca5c8bf05b45
Author: Riku Voipio <riku.voipio@iki.fi>
Date: Thu Jun 18 22:44:31 2009 +0300
Subject: linux-user: include futex defines directly
Since some common distributions have broken linux/futex.h, stop
including it. Instead add the defines directly.
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
This issue has been detected with tests/linux-tests.c:
linux-test.c:330: getsockopt
327 len = sizeof(val);
328 chk_error(getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &val, &len));
329 if (val != SOCK_STREAM)
330 error("getsockopt");
In linux-user/syscall.c:do_getsockopt(), we have:
...
val = tswap32(val);
...
if (put_user_u32(val, optval_addr))
...
whereas "put_user_u32" calls in the end "__put_user" which uses "tswap32".
So the "val = tswap32(val);" is useless and wrong.
This patch removes it.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Needed to make sure the xxxat() functions are available.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Hi,
This is a new try to fix the fcntl support in linux-user. I tried to
adress all comments but as the previous version is several weeks old,
it's possible that I've missed some.
This patch doesn't handle linux specific fcntl flags. My plan is to get
this version of the patch reviewed/fixed and then, add them if wanted.
Thanks,
Arnaud
Signed-off-by: Arnaud Patard (Rtp) <arnaud.patard@rtp-net.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>