Commit Graph

2889 Commits

Author SHA1 Message Date
Rich Felker
f3ddd17380 dynamic linker bootstrap overhaul
this overhaul further reduces the amount of arch-specific code needed
by the dynamic linker and removes a number of assumptions, including:

- that symbolic function references inside libc are bound at link time
  via the linker option -Bsymbolic-functions.

- that libc functions used by the dynamic linker do not require
  access to data symbols.

- that static/internal function calls and data accesses can be made
  without performing any relocations, or that arch-specific startup
  code handled any such relocations needed.

removing these assumptions paves the way for allowing libc.so itself
to be built with stack protector (among other things), and is achieved
by a three-stage bootstrap process:

1. relative relocations are processed with a flat function.
2. symbolic relocations are processed with no external calls/data.
3. main program and dependency libs are processed with a
   fully-functional libc/ldso.

reduction in arch-specific code is achieved through the following:

- crt_arch.h, used for generating crt1.o, now provides the entry point
  for the dynamic linker too.

- asm is no longer responsible for skipping the beginning of argv[]
  when ldso is invoked as a command.

- the functionality previously provided by __reloc_self for heavily
  GOT-dependent RISC archs is now the arch-agnostic stage-1.

- arch-specific relocation type codes are mapped directly as macros
  rather than via an inline translation function/switch statement.
2015-04-13 03:04:42 -04:00
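As a rough illustration of what stage 1 of the bootstrap in f3ddd17380 has to do, the sketch below applies relative relocations with a single flat function that makes no calls and touches no globals. The `rel_entry` layout and the name `apply_relative` are assumptions made for this example; they are not the real ELF record format or the code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified relocation record (not the real ELF layout). */
struct rel_entry {
    size_t offset;  /* location to patch, relative to the load base */
    size_t addend;  /* link-time address, relative to the load base */
};

/* Stage-1 style pass: each slot becomes base + addend.
 * No function calls and no global data are needed here. */
static void apply_relative(unsigned char *base,
                           const struct rel_entry *rel, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uintptr_t *slot = (uintptr_t *)(base + rel[i].offset);
        *slot = (uintptr_t)base + rel[i].addend;
    }
}
```

The point of keeping this pass flat is that it can run before any symbolic relocation has been processed, which is what the later stages rely on.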
Rich Felker
385c01112c remove mismatched arguments from vmlock function definitions
commit f08ab9e61a introduced these
accidentally as remnants of some work I tried that did not work out.
2015-04-11 10:38:57 -04:00
Rich Felker
a2d3053354 apply vmlock wait to __unmapself in pthread_exit 2015-04-10 03:47:42 -04:00
Rich Felker
f08ab9e61a redesign and simplify vmlock system
this global lock allows certain unlock-type primitives to exclude
mmap/munmap operations which could change the identity of virtual
addresses while references to them still exist.

the original design mistakenly assumed mmap/munmap would conversely
need to exclude the same operations which exclude mmap/munmap, so the
vmlock was implemented as a sort of 'symmetric recursive rwlock'. this
turned out to be unnecessary.

commit 25d12fc0fc already shortened the
interval during which mmap/munmap held their side of the lock, but
left the inappropriate lock design and some inefficiency.

the new design uses a separate function, __vm_wait, which does not
hold any lock itself and only waits for lock users which were already
present when it was called to release the lock. this is sufficient
because of the way operations that need to be excluded are sequenced:
the "unlock-type" operations using the vmlock need only block
mmap/munmap operations that are precipitated by (and thus sequenced
after) the atomic-unlock they perform while holding the vmlock.

this allows for a spectacular lack of synchronization in the __vm_wait
function itself.
2015-04-10 02:27:52 -04:00
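The shape of the design described in f08ab9e61a can be sketched with C11 atomics. This is only an approximation under stated assumptions: the names `vm_users`, `vm_lock`, `vm_unlock` and `vm_wait` are hypothetical, and the busy-wait with `sched_yield` stands in for whatever blocking primitive a real implementation would use.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical counter of threads currently inside the vmlock. */
static atomic_int vm_users;

static void vm_lock(void)   { atomic_fetch_add(&vm_users, 1); }
static void vm_unlock(void) { atomic_fetch_sub(&vm_users, 1); }

/* Holds no lock itself: it only waits until the lock users it can
 * observe have left. Spinning until the count reaches zero is a
 * conservative simplification for this sketch. */
static void vm_wait(void)
{
    while (atomic_load(&vm_users) != 0)
        sched_yield();
}
```

In this sketch an mmap/munmap-style operation would call `vm_wait()` before changing mappings, while unlock-type primitives bracket their atomic unlock with `vm_lock()` and `vm_unlock()`.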
Rich Felker
4e98cce1c5 optimize out setting up robust list with kernel when not needed
as a result of commit 12e1e32468, kernel
processing of the robust list is only needed for process-shared
mutexes. previously the first attempt to lock any owner-tracked mutex
resulted in robust list initialization and a set_robust_list syscall.
this is no longer necessary, and since the kernel's record of the
robust list must now be cleared at thread exit time for detached
threads, optimizing it out is more worthwhile than before too.
2015-04-10 00:54:48 -04:00
Rich Felker
12e1e32468 process robust list in pthread_exit to fix detached thread use-after-unmap
the robust list head lies in the thread structure, which is unmapped
before exit for detached threads. this leaves the kernel unable to
process the exiting thread's robust list, and with a dangling pointer
which may happen to point to new unrelated data at the time the kernel
processes it.

userspace processing of the robust list was already needed for
non-pshared robust mutexes in order to perform private futex wakes
rather than the shared ones the kernel would do, but it was
conditional on linking pthread_mutexattr_setrobust and did not bother
processing the pshared mutexes in the list, which requires additional
logic for the robust list pending slot in case pthread_exit is
interrupted by asynchronous process termination.

the new robust list processing code is linked unconditionally (inlined
in pthread_exit), handles both private and shared mutexes, and also
removes the kernel's reference to the robust list before unmapping and
exit if the exiting thread is detached.
2015-04-10 00:26:34 -04:00
Rich Felker
25748db301 fix possible clobbering of syscall return values on mips
depending on the compiler's interpretation of __asm__ register names
for register class objects, it may be possible for the return value in
r2 to be clobbered by the function call to __stat_fix. I have not
observed any such breakage in normal builds and suspect it only
happens with -O0 or other unusual build options, but since there's an
ambiguity as to the semantics of this feature, it's best to use an
explicit temporary to avoid the issue.

based on reporting and patch by Eugene.
2015-04-07 12:47:19 -04:00
Szabolcs Nagy
05e0e301e3 fix getdelim to set the error indicator on all failures 2015-04-04 10:53:09 -04:00
Rich Felker
077096259d fix rpath string memory leak on failed dlopen
when dlopen fails, all partially-loaded libraries need to be unmapped
and freed. any of these libraries using an rpath with $ORIGIN
expansion may have an allocated string for the expanded rpath;
previously, this string was not freed when freeing the library data
structures.
2015-04-04 00:15:19 -04:00
Rich Felker
2963a9f794 halt dynamic linker library search on errors resolving $ORIGIN in rpath
this change hardens the dynamic linker against the possibility of
loading the wrong library due to inability to expand $ORIGIN in rpath.
hard failures such as excessively long paths or absence of /proc (when
resolving /proc/self/exe for the main executable's origin) do not stop
the path search, but memory allocation failures and any other
potentially transient failures do.

to implement this change, the meaning of the return value of
fixup_rpath function is changed. returning zero no longer indicates
that the dso's rpath string pointer is non-null; instead, the caller
needs to check. a return value of -1 indicates a failure that should
stop further path search.
2015-04-03 16:35:43 -04:00
Rich Felker
5e25d87b09 remove macro definition of longjmp from setjmp.h
the C standard specifies that setjmp is a macro, but longjmp is a
normal function. a macro version of it would be permitted (albeit
useless) for C (not C++), but would have to be a function-like macro,
not an object-like one.
2015-04-01 20:35:03 -04:00
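The distinction drawn in 5e25d87b09 between function-like and object-like macros can be shown with a small sketch. The name `twice` is hypothetical and deliberately gives the macro a different result from the function so the two are distinguishable; it is not how any libc defines longjmp.

```c
#include <assert.h>

static int twice(int x) { return 2 * x; }

/* Function-like macro: it only expands when the name is directly
 * followed by '('. An object-like macro would replace every mention
 * of the identifier, including (twice) and &twice. */
#define twice(x) (3 * (x))

static int via_macro(void)    { return twice(2); }    /* macro is used */
static int via_function(void) { return (twice)(2); }  /* real function */
```

Because the parenthesized name suppresses a function-like macro, callers can still reach the real function or take its address, which is what the standard requires of library functions masked by macros.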
Rich Felker
5d1c8c9956 harden dynamic linker library path search
transient errors during the path search should not allow the search to
continue and possibly open the wrong file. this patch eliminates most
conditions where that could happen, but there is still a possibility
that $ORIGIN-based rpath processing will have an allocation failure,
causing the search to skip such a path. fixing this is left as a
separate task.

a small bug where overly-long path components caused an infinite loop
rather than being skipped/ignored is also fixed.
2015-04-01 20:27:29 -04:00
Rich Felker
fd427c4eae move O_PATH definition back to arch bits
while it's the same for all presently supported archs, it differs at
least on sparc, and conceptually it's no less arch-specific than the
other O_* macros. O_SEARCH and O_EXEC are still defined in terms of
O_PATH in the main fcntl.h.
2015-04-01 19:31:06 -04:00
Rich Felker
abfe1f6541 aarch64: remove duplicate macro definitions in bits/fcntl.h 2015-04-01 19:25:32 -04:00
Rich Felker
dfc1a37c44 aarch64: fix definition of sem_nsems in semid_ds structure
POSIX requires the sem_nsems member to have type unsigned short. we
have to work around the incorrect kernel type using matching
endian-specific padding.
2015-04-01 19:12:18 -04:00
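The endian-specific padding mentioned in dfc1a37c44 might look roughly like the sketch below. The struct name `nsems_slot`, the member `pad` and the helper `read_nsems` are hypothetical, the kernel's wider field is assumed to be an unsigned long, and `__BYTE_ORDER__` is a GCC/Clang predefined macro rather than anything the commit necessarily uses.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct: one unsigned long kernel slot, exposed to
 * applications as an unsigned short plus padding. */
struct nsems_slot {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    unsigned short sem_nsems;
    char pad[sizeof(unsigned long) - sizeof(unsigned short)];
#else
    char pad[sizeof(unsigned long) - sizeof(unsigned short)];
    unsigned short sem_nsems;
#endif
};

/* Simulate the kernel writing a full unsigned long into the slot,
 * then read it back through the short member. */
static unsigned short read_nsems(unsigned long kernel_value)
{
    struct nsems_slot s;
    memcpy(&s, &kernel_value, sizeof kernel_value);
    return s.sem_nsems;
}
```

The padding goes after the member on little-endian targets and before it on big-endian ones, so the short always overlays the low-order bytes of the wider kernel field.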
Szabolcs Nagy
b24d813d24 aarch64: fix namespace pollution in bits/shm.h
The shm_info struct is a gnu extension and some of its members do
not have shm* prefix. This is worked around in sys/shm.h by macros,
but aarch64 didn't use those.
2015-04-01 19:05:12 -04:00
Rich Felker
115af23942 release 1.1.8 2015-03-29 23:48:12 -04:00
Szabolcs Nagy
c498efe117 regex: fix character class repetitions
Internally regcomp needs to copy some iteration nodes before
translating the AST into TNFA representation.

Literal nodes were not copied correctly: the class type and list
of negated class types were not copied so classes were ignored
(in the non-negated case an ignored char class caused the literal
to match everything).

This affects iterations when the upper bound is finite, larger
than one or the lower bound is larger than one. So e.g. the EREs

 [[:digit:]]{2}
 [^[:space:]ab]{1,4}

were treated as

 .{2}
 [^ab]{1,4}

The fix is done with minimal source modification to copy the
necessary fields, but the AST preparation and node handling
code of tre will need to be cleaned up for clarity.
2015-03-27 20:24:30 -04:00
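The two EREs from c498efe117 can be checked against any conforming regcomp with a short helper. This is a sketch using only the POSIX `regcomp`/`regexec`/`regfree` interface; the helper name `ere_match` and the added `^...$` anchors are choices made for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if the ERE matches text, 0 if it does not,
 * and -1 if the pattern fails to compile. */
static int ere_match(const char *pattern, const char *text)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With the bug present, `[[:digit:]]{2}` would behave like `.{2}` and accept a string such as "ab"; a correct implementation rejects it.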
Szabolcs Nagy
32dee9b9b1 do not treat \0 as a backref in BRE
The valid BRE backref tokens are \1 .. \9, and 0 is not a special
character either so \0 is undefined by the standard.

Such undefined escaped characters are treated as literal characters
currently, following existing practice, so \0 is the same as 0.
2015-03-23 12:28:49 -04:00
Rich Felker
11d1e2e2de fix FLT_ROUNDS regression in C++ applications
commit 559de8f5f0 redefined FLT_ROUNDS
to use an external function that can report the actual current
rounding mode, rather than always reporting round-to-nearest. however,
float.h did not include 'extern "C"' wrapping for C++, so C++ programs
using FLT_ROUNDS ended up with an unresolved reference to a
name-mangled C++ function __flt_rounds.
2015-03-23 11:26:51 -04:00
Rich Felker
fc13acc3dc fix internal buffer overrun in inet_pton
one stop condition for parsing abbreviated ipv6 addresses was missed,
allowing the internal ip[] buffer to overflow. this patch adds the
missing stop condition and masks the array index so that, in case
there are any remaining stop conditions missing, overflowing the
buffer is not possible.
2015-03-23 09:44:18 -04:00
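The index-masking defense described in fc13acc3dc can be sketched as follows. The helper `store_group` is hypothetical and is not the parser from the commit; it only shows how masking with 7 keeps every write inside an eight-element buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store one 16-bit group of an address.
 * Masking the index means that even if a stop condition is missed,
 * the write wraps around inside ip[0..7] instead of overflowing. */
static void store_group(uint16_t ip[8], int i, uint16_t value)
{
    ip[i & 7] = value;
}
```

A wrapped write still produces a wrong parse result, so the mask is a safety net on top of the corrected stop condition, not a replacement for it.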
Rich Felker
7c8c86f630 suppress backref processing in ERE regcomp
one of the features of ERE is that it's actually a regular language
and does not admit expressions which cannot be matched in linear time.
introduction of \n backref support into regcomp's ERE parsing was
unintentional.
2015-03-20 18:28:37 -04:00
Rich Felker
39dfd58417 fix memory-corruption in regcomp with backslash followed by high byte
the regex parser handles the (undefined) case of an unexpected byte
following a backslash as a literal. however, instead of correctly
decoding a character, it was treating the byte value itself as a
character. this was not only semantically unjustified, but turned out
to be dangerous on archs where plain char is signed: bytes in the
range 252-255 alias the internal codes -4 through -1 used for special
types of literal nodes in the AST.
2015-03-20 18:06:04 -04:00
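The aliasing described in 39dfd58417 comes from sign extension of a byte. The sketch below uses `signed char` explicitly so the effect is visible regardless of whether plain char is signed on the build machine; the function names are hypothetical.

```c
#include <assert.h>

/* Treating the raw byte as a character code: the value is
 * sign-extended, so high bytes become small negative numbers. */
static int naive_code(signed char byte) { return byte; }

/* Converting through unsigned char keeps the value in 0..255. */
static int safe_code(signed char byte) { return (unsigned char)byte; }
```

The bytes 252 through 255, stored in a signed char, come out of `naive_code` as -4 through -1, which is exactly the range the commit says collides with the internal node type codes.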
Rich Felker
e626deeec8 fix missing max_align_t definition on aarch64 2015-03-20 01:21:37 -04:00
Rich Felker
8c1c57a64b release 1.1.7 2015-03-18 20:38:02 -04:00
Rich Felker
d5a5045382 fix MINSIGSTKSZ values for archs with large signal contexts
the previous values (2k min and 8k default) were too small for some
archs. aarch64 reserves 4k in the signal context for future extensions
and requires about 4.5k total, and powerpc reportedly uses over 2k.
the new minimums are chosen to fit the saved context and also allow a
minimal signal handler to run.

since the default (SIGSTKSZ) has always been 6k larger than the
minimum, it is also increased to maintain the 6k usable by the signal
handler. this happens to be able to store one pathname buffer and
should be sufficient for calling any function in libc that doesn't
involve conversion between floating point and decimal representations.

x86 (both 32-bit and 64-bit variants) may also need a larger minimum
(around 2.5k) in the future to support avx-512, but the values on
these archs are left alone for now pending further analysis.

the value for PTHREAD_STACK_MIN is not increased to match MINSIGSTKSZ
at this time. this is so as not to preclude applications from using
extremely small thread stacks when they know they will not be handling
signals. unfortunately cancellation and multi-threaded set*id() use
signals as an implementation detail and therefore require a stack
large enough for a signal context, so applications which use extremely
small thread stacks may still need to avoid using these features.
2015-03-18 00:31:37 -04:00
Rich Felker
76fd01177a block all signals (even internal ones) in cancellation signal handler
previously the implementation-internal signal used for multithreaded
set*id operations was left unblocked during handling of the
cancellation signal. however, on some archs, signal contexts are huge
(up to 5k) and the possibility of nested signal handlers drastically
increases the minimum stack requirement. since the cancellation signal
handler will do its job and return in bounded time before possibly
passing execution to application code, there is no need to allow other
signals to interrupt it.
2015-03-16 20:12:49 -04:00
Rich Felker
eceaf1d29f update authors/contributors list
these additions were made based on scanning commit authors since the
last update, at the time of the 1.1.4 release.
2015-03-16 18:43:54 -04:00
Rich Felker
4b5ca13fb1 avoid sending huge names as nscd passwd/group queries
overly long user/group names are potentially a DoS vector and source
of other problems like partial writes by sendmsg, and not useful.
2015-03-15 23:46:22 -04:00
Rich Felker
49d1e7f931 simplify nscd lookup code for alt passwd/group backends
previously, a sentinel value of (FILE *)-1 was used to inform the
caller of __nscd_query that nscd is not in use. aside from being an
ugly hack, this resulted in duplicate code paths for two logically
equivalent cases: no nscd, and "not found" result from nscd.

now, __nscd_query simply skips closing the socket and returns a valid
FILE pointer when nscd is not in use, and produces a fake "not found"
response header. the caller is then responsible for closing the socket
just like it would do if it had gotten a real "not found" response.
2015-03-15 23:33:59 -04:00
Josiah Worcester
2894a44b40 add alternate backend support for getgrouplist
This completes the alternate backend support that was previously added
to the getpw* and getgr* functions. Unlike those, though, it
unconditionally queries nscd. Any groups from nscd that aren't in the
/etc/group file are added to the returned list, and any that are
present in the file are ignored. The purpose of this behavior is to
provide a view of the group database consistent with what is observed
by the getgr* functions. If group memberships reported by nscd were
honored when the corresponding group already has a definition in the
/etc/group file, the user's getgrouplist-based membership in the
group would conflict with their non-membership in the reported
gr_mem[] for the group.

The changes made also make getgrouplist thread-safe and eliminate its
clobbering of the global getgrent state.
2015-03-15 22:32:22 -04:00
Szabolcs Nagy
962cbfbf86 aarch64: fix typo in bits/ioctl.h 2015-03-14 15:49:08 -04:00
Szabolcs Nagy
38bf2d7cc3 aarch64: add struct _aarch64_ctx to signal.h
The unwind code in libgcc uses this type for unwinding across signal
handlers. On aarch64 the kernel may place a sequence of structs on the
signal stack on top of the ucontext to provide additional information.
The unwinder only needs the header, but this commit adds all the types the
kernel currently defines for this mechanism because they are part of the uapi.
2015-03-14 13:55:24 -04:00
Rich Felker
673cab5c56 align x32 pthread type sizes to be common with 32-bit archs
previously, commit e7b9887e8b aligned
the sizes with the glibc ABI. subsequent discussion during the merge
of the aarch64 port reached a conclusion that we should reject larger
arch-specific sizes, which have significant cost and no benefit, and
stick with the existing common 32-bit sizes for all 32-bit/ILP32 archs
and the x86_64 sizes for 64-bit archs.

one peculiarity of this change is that x32 pthread_attr_t is now
larger in musl than in the glibc x32 ABI, making it unsafe to call
pthread_attr_init from x32 code that was compiled against glibc. with
all the ABI issues of x32, it's not clear that ABI compatibility will
ever work, but if it's needed, pthread_attr_init and related functions
could be modified not to write to the last slot of the object.

this is not a regression versus previous releases, since on previous
releases the x32 pthread type sizes were all severely oversized
already (due to incorrectly using the x86_64 LP64 definitions).
moreover, x32 is still considered experimental and not ABI-stable.
2015-03-12 14:43:36 -04:00
Szabolcs Nagy
01ef3dd9c5 add aarch64 port
This adds complete aarch64 target support including bigendian subarch.

Some of the long double math functions are known to be broken; otherwise
interfaces should be fully functional, but at this point consider this
port experimental.

Initial work on this port was done by Sireesh Tripurari and Kevin Bortis.
2015-03-11 20:12:35 -04:00
Szabolcs Nagy
f4e4632abf math: add dummy implementations of 128 bit long double functions
This is in preparation for the aarch64 port only to have the long
double math symbols available on ld128 platforms. The implementations
should be fixed up later once we have proper tests for these functions.

Added bigendian handling for ld128 bit manipulations too.
2015-03-11 18:54:53 -04:00
Szabolcs Nagy
53cfe0c61a math: add ld128 exp2l based on the freebsd implementation
Changed the special case handling and bit manipulation to better
match the double version.
2015-03-11 18:54:50 -04:00
Szabolcs Nagy
204a69d2d9 copy the dtv pointer to the end of the pthread struct for TLS_ABOVE_TP archs
There are two main abi variants for thread local storage layout:

 (1) TLS is above the thread pointer at a fixed offset and the pthread
 struct is below that. So the end of the struct is at known offset.

 (2) the thread pointer points to the pthread struct and TLS starts
 below it. So the start of the struct is at known (zero) offset.

Assembly code for the dynamic TLSDESC callback needs to access the
dynamic thread vector (dtv) pointer which is currently at the front
of the pthread struct. So in case of (1) the asm code needs to hard
code the offset from the end of the struct which can easily break if
the struct changes.

This commit adds a copy of the dtv at the end of the struct. New members
must not be added after dtv_copy, only before it. The size of the struct
is increased a bit, but there is opportunity for size optimizations.
2015-03-11 18:53:48 -04:00
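The layout rule from 204a69d2d9 can be illustrated with a stand-in structure. The struct `thread`, its `other_state` member and the macro `DTV_COPY_FROM_END` are hypothetical; only the idea of keeping `dtv_copy` as the last member comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the thread structure. */
struct thread {
    void **dtv;          /* original location at the front */
    int other_state[5];  /* members may be added here over time */
    void **dtv_copy;     /* must remain the last member */
};

/* Distance of dtv_copy from the end of the struct. As long as it
 * stays last, this does not change when earlier members change. */
#define DTV_COPY_FROM_END \
    (sizeof(struct thread) - offsetof(struct thread, dtv_copy))

_Static_assert(DTV_COPY_FROM_END == sizeof(void **),
               "dtv_copy must be the last member");
```

Assembly that addresses the field relative to the end of the struct can then hard-code one pointer size rather than an offset that shifts with every new member.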
Rich Felker
a46677af18 fix regression in pthread_cond_wait with cancellation disabled
due to a logic error in the use of masked cancellation mode,
pthread_cond_wait did not honor PTHREAD_CANCEL_DISABLE but instead
failed with ECANCELED when cancellation was pending.
2015-03-07 14:11:01 -05:00
Szabolcs Nagy
559de8f5f0 fix FLT_ROUNDS to reflect the current rounding mode
Implemented as a wrapper around fegetround introducing a new function
to the ABI: __flt_rounds. (fegetround cannot be used directly from float.h)
2015-03-07 12:05:28 -05:00
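The mapping that a wrapper like the one in 559de8f5f0 has to perform can be sketched as below. The helper name `flt_rounds_from` is hypothetical and takes the mode as a parameter so the example needs no math library; a real wrapper would pass it the result of `fegetround()`. The return values follow the FLT_ROUNDS encoding in the C standard.

```c
#include <assert.h>
#include <fenv.h>

/* Hypothetical helper: map an <fenv.h> rounding mode to the
 * corresponding FLT_ROUNDS value. Modes an arch does not define
 * are simply left out of the switch. */
static int flt_rounds_from(int mode)
{
    switch (mode) {
#ifdef FE_TOWARDZERO
    case FE_TOWARDZERO: return 0;
#endif
#ifdef FE_TONEAREST
    case FE_TONEAREST:  return 1;
#endif
#ifdef FE_UPWARD
    case FE_UPWARD:     return 2;
#endif
#ifdef FE_DOWNWARD
    case FE_DOWNWARD:   return 3;
#endif
    }
    return -1; /* indeterminable */
}
```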
Rich Felker
bd67959f3a fix over-alignment of TLS, insufficient builtin TLS on 64-bit archs
a conservative estimate of 4*sizeof(size_t) was used as the minimum
alignment for thread-local storage, despite the only requirements
being alignment suitable for struct pthread and void* (which struct
pthread already contains). additional alignment required by the
application or libraries is encoded in their headers and is already
applied.

over-alignment prevented the builtin_tls array from ever being used in
dynamic-linked programs on 64-bit archs, thereby requiring allocation
at startup even in programs with no TLS of their own.
2015-03-06 13:27:08 -05:00
Rich Felker
2b42c8cb37 add legacy functions from sysinfo.h duplicating sysconf functionality 2015-03-04 22:10:01 -05:00
Rich Felker
380857bf21 fix signed left-shift overflow in pthread_condattr_setpshared 2015-03-04 21:46:08 -05:00
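The class of bug fixed in 380857bf21 can be sketched as follows. The function `set_pshared_bit` is hypothetical and assumes a 32-bit unsigned int with the flag kept in bit 31; it is not the code from the commit.

```c
#include <assert.h>

/* Set or clear bit 31 of an attribute word. Shifting a signed 1
 * left by 31 overflows int, which is undefined behavior, so the
 * flag is converted to unsigned before the shift. */
static unsigned set_pshared_bit(unsigned attr, int pshared)
{
    attr &= 0x7fffffffu;
    attr |= (unsigned)pshared << 31;
    return attr;
}
```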
Szabolcs Nagy
ad85fcb568 add new si_lower and si_upper siginfo_t members
new in linux v3.19 commit ee1b58d36aa1b5a79eaba11f5c3633c88231da83
used to report intel mpx bound violation information.
2015-03-04 14:50:52 -05:00
Rich Felker
9c3da8968d declare incomplete type struct itimerspec in timerfd.h
normally time.h would provide a definition for this struct, but
depending on the feature test macros in use, it may not be exposed,
leading to warnings when it's used in the function prototypes.
2015-03-04 14:38:08 -05:00
Rich Felker
91a3bd743e fix preprocessor error introduced in poll.h in last commit 2015-03-04 14:15:44 -05:00
Trutz Behn
f5011c62c3 fix POLLWRNORM and POLLWRBAND on mips
these macros have the same distinct definition on blackfin, frv, m68k,
mips, sparc and xtensa kernels. POLLMSG and POLLRDHUP additionally
differ on sparc.
2015-03-04 12:09:37 -05:00
Rich Felker
e7b9887e8b fix x32 pthread type definitions
the previous definitions were copied from x86_64. not only did they
fail to match the ABI sizes; they also wrongly encoded an assumption
that long/pointer types are twice as large as int.
2015-03-04 11:33:26 -05:00
Rich Felker
064898cfe2 remove useless check of bin match in malloc
this re-check idiom seems to have been copied from the alloc_fwd and
alloc_rev functions, which guess a bin based on non-synchronized
memory access to adjacent chunk headers then need to confirm, after
locking the bin, that the chunk is actually in the bin they locked.

the check being removed, however, was being performed on a chunk
obtained from the already-locked bin. there is no race to account for
here; the check could only fail in the event of corrupt free lists,
and even then it would not catch them but simply continue running.

since the bin_index function is mildly expensive, it seems preferable
to remove the check rather than trying to convert it into a useful
consistency check. casual testing shows a 1-5% reduction in run time.
2015-03-04 10:48:00 -05:00
Rich Felker
6de071a0be eliminate atomics in syslog setlogmask function 2015-03-04 09:44:43 -05:00