ethertype for fake VLAN header for DSA, see
linux commit bf5bc3ce8a8f32a0d45b6820ede8f9fc3e9c23df
ether: Add dedicated Ethertype for pseudo-802.1Q DSA tagging
historically, a number of 32-bit archs used long rather than int for
wchar_t, for no good reason. GCC still uses the historical types, but
clang replaced them all with int, and it seems PCC uses int too.
mismatching the compiler's type for wchar_t is not an option due to
wide string literals.
note that the mismatch does not affect C++ ABI since wchar_t is its
own builtin type/keyword in C++, distinct from both int and long, not
a typedef.
i386 already worked around this by honoring __WCHAR_TYPE__ if defined
by the compiler, and only using the official legacy ABI type if not.
add the same to the other affected archs.
it might make sense at some point to switch to using int as the
default if __WCHAR_TYPE__ is not defined, if the expectations is that
new compilers will treat int as the correct choice, but it's unlikely
that the case where __WCHAR_TYPE__ is undefined will ever be used
anyway. I actually wanted to move the definition of wchar_t to the
top-level shared alltypes.h.in, using __WCHAR_TYPE__ and falling back
to int if not defined, but that can't be done without assuming all
compilers define __WCHAR_TYPE__ thanks to some pathological archs
where the ABI has wchar_t as an unsigned type.
previously, when pthread_create failed due to inability to set
explicit scheduling according to the requested attributes, the nascent
thread was detached and made responsible for its own cleanup via the
standard pthread_exit code path. this left it consuming resources
potentially well after pthread_create returned, in a way that the
application could not see or mitigate, and unnecessarily exposed its
existence to the rest of the implementation via the global thread
list.
instead, attempt explicit scheduling early and reuse the failure path
for __clone failure if it fails. the nascent thread's exit futex is
not needed for unlocking the thread list, since the thread calling
pthread_create holds the thread list lock the whole time, so it can be
repurposed to ensure the thread has finished exiting. no pthread_exit
is needed, and freeing the stack, if needed, can happen just as it
would if __clone failed.
if setting scheduling properties succeeds, the new thread may end up
with lower priority than the caller, and may be unable to continue
running due to another intermediate-priority thread. this produces a
priority inversion situation for the thread calling pthread_create,
since it cannot return until the new thread reports success.
originally, the parent was responsible for setting the new thread's
priority; commits b8742f3260 and
40bae2d32f changed it as part of
trimming down the pthread structure. since then, commit
04335d9260 partly reversed the changes,
but did not switch responsibilities back. do that now.
commit 8f11e6127f wrongly documented
that all changes to libc.threads_minus_1 were guarded by the thread
list lock, but the decrement for failed SYS_clone took place after the
thread list lock was released.
commit 030e526392 added optreset, a BSD
extension to getopt duplicating the functionality (also an extension)
of setting optind to 0, but failed to provide a public declaration for
it. according to the BSD documentation and headers, the application is
not supposed to need to provide its own declaration.
these are presently extensions, thus named with _np to match glibc and
other implementations that provide them; however they are likely to be
standardized in the future without the _np suffix as a result of
Austin Group issue 1208. if so, both names will be kept as aliases.
due to historical accident/sloppiness in glibc, the powerpc,
powerpc64, and sh versions of struct user, defined by sys/user.h, used
struct pt_regs from the kernel asm/ptrace.h for their regs member.
this made it impossible to define the type in an API-compatible manner
without either including asm/ptrace.h like glibc does (contrary to our
policy of not depending on kernel headers), or clashing with
asm/ptrace.h's definition of struct pt_regs if both headers are
included (which is almost always the case in software using
sys/user.h).
for a long time I viewed this problem as having no reasonable fix. I
even explored the possibility of having the powerpc[64] and sh
versions of user.h just include the kernel header (breaking with
policy), but that looked like it might introduce new clashes with
sys/ptrace.h. and it would also bring in a lot of additional cruft
that makes no sense for sys/user.h to expose. glibc goes out of its
way to suppress some of that with #undef, possibly leading to
different problems. this is a rabbit-hole that should be explored no
further.
as it turns out, however, nothing actually uses struct user
sufficiently to care about the type of the regs member; most software
including sys/user.h does not even use struct user at all. so, the
problem can be fixed just by doing away with the insistence on strict
glibc API compatibility for the struct tag of the regs member.
rather than renaming the tag, which might lead to the new name
entering use as API, simply use an untagged structure inside struct
user with the same members/layout as struct pt_regs.
for sh, struct pt_dspregs is just removed entirely since it was not
used.
these members are associated with an unsupported option group. with
time_t changing size on 32-bit archs, all interfaces taking struct
sched_param arguments would need redirection and compat shims in order
to be able to continue offering these members, for no benefit. just
convert them to reserved space instead.
commit ffab43602b broke this by moving
relocations after not only the allocation of storage for the main
thread's static TLS, but after the copying of the TLS image. thus,
relocation results were not reflected in the main thread's copy. this
could be fixed by calling __reset_tls after relocations, but instead
split the allocation and installation before/after relocations so that
there's not a redundant copy.
due to commit 71af530987, updating of
static_tls_cnt needs to be kept with allocation of static TLS, before
relocations, rather than after installation.
Using common code path for all symbol lookups fixes three dlsym issues:
- st_shndx of STT_TLS symbols were not checked and thus an undefined
tls symbol reference could be incorrectly treated as a definition
(the sysv hash lookup returns undefined symbols, gnu does not, so should
be rare in practice).
- symbol binding was not checked so a hidden symbol may be returned
(in principle STB_LOCAL symbols may appear in the dynamic symbol table
for hidden symbols, but linkers most likely don't produce it).
- mips specific behaviour was not applied (ARCH_SYM_REJECT_UND) so
undefined symbols may be returned on mips.
always_inline is used to avoid relocation performance regression, the
code generation for find_sym should not be affected.
commit 7a9669e977 added use of the
symbol reference as the definition, in place of performing a lookup,
for STT_SECTION symbol references that were first found used in FDPIC.
such references may happen in certain other cases, such as
local-dynamic TLS and with relocation types that require a symbol but
that are being used for non-symbolic purposes, like the powerpc
unaligned address relocations.
in all such cases I'm aware of, the symbol referenced is a section
symbol (STT_SECTION); however, the important semantic property is not
its being a section, but rather its binding local (STB_LOCAL). check
the latter instead of the former for greater generality and semantic
correctness.
R_PPC_UADDR32 (R_PPC64_UADDR64) has the same meaning as R_PPC_ADDR32
(R_PPC64_ADDR64), except that its address need not be aligned. For
powerpc64, BFD ld(1) will automatically convert between ADDR<->UADDR
relocations when the address is/isn't at its native alignment. This
will happen if, for example, there is a pointer in a packed struct.
gold and lld do not currently generate R_PPC64_UADDR64, but pass
through misaligned R_PPC64_ADDR64 relocations from object files,
possibly relaxing them to misaligned R_PPC64_RELATIVE. In both cases
(relaxed or not) this violates the PSABI, which defines the relevant
field type as "a 64-bit field occupying 8 bytes, the alignment of
which is 8 bytes unless otherwise specified."
All three linkers violate the PSABI on 32-bit powerpc, where the only
difference is that the field is 32 bits wide, aligned to 4 bytes.
Currently musl fails to load executables linked by BFD ld containing
R_PPC64_UADDR64, with the error "unsupported relocation type 43".
This change provides compatibility with BFD ld on powerpc64, and any
static linker on either architecture that starts following the PSABI
more closely.
as a result of commit ffab43602b,
static_tls_cnt is now valid during relocations at program startup, so
it's no longer necessary to condition the check against static_tls_cnt
on this being a runtime (dlopen) relocation.
this is analogous to commit 2f1f51ae7b,
and should have been caught at the same time since it was right next
to the code moved in that commit. between final stage 3 reloc_all and
the jump to the main program's entry point, it is not valid to call
any functions which may be interposed by the application; doing so
results in execution of application code before ctors have run, and on
fdpic archs, before the main program's fdpic self-fixups have taken
place, which will produce runaway wrong execution.
somewhat analogous to commit d0b547dfb5,
but here the omission of the null timeout check was in the time64
syscall code path. this code is not yet used except on x32.
these accept the netbsd/openbsd message catalog file format,
consisting of a sorted list of set headers and a sorted list of
message headers for each set, admitting trivial binary search for
lookups.
the gnu format was not chosen because it's unusably bad. it does not
admit efficient (log time or better) lookups; rather, it requires
linear search or hash table lookups, and the hash function is awful:
it's literally set_id*msg_id.
commit 722a1ae335 inadvertently passed a
copy of {s,us} to the syscall even if the timeout argument tv was
null, thereby causing immediate timeout (polling) in place of
unlimited timeout. only archs using SYS_select were affected.
when the pattern ended with one or more literal path components, or
when the GLOB_MARK flag was passed to request that glob flag directory
results and the type obtained by readdir was unknown or inconclusive
(symlink), the stat function was called to evaluate existence and/or
determine type. however, stat fails with ENOENT for broken symlinks,
and this caused the match to be omitted from the results.
instead, use stat only for the unknown/inconclusive cases with
GLOB_MARK, and otherwise, or if stat fails, use lstat existence still
needs to be determined. this minimizes the number of costly syscalls,
performing both only in the case where GLOB_MARK is in use and there
is a final literal path component which is a broken symlink.
based on/simplified from patch by James Y Knight.
the contents conflicted with asm/ptrace.h. glibc does not provide
anything in user.h for riscv, so software cannot be depending on it.
simplified from patch submitted by Baruch Siach.
Rename user registers struct definitions to avoid conflict with the
asm/ptrace.h kernel header that defines the same structs. Use the
__riscv_mc prefix as glibc does.
The only reason we needed to preserve the link register was because we
were using a branch-link instruction to branch to __cp_cancel.
Replacing this with a branch means we can avoid the save/restore as
the link register is no longer modified.
otherwise alarm will break on 32-bit archs when time_t is changed to
64-bit. a second itimerval object is introduced for retrieving the old
value, since the setitimer function has restrict-qualified arguments.
commit 31c5fb80b9 introduced underflow
code paths for the i386 math asm, along with checks on the fpu status
word to skip the underflow-generation instructions if the underflow
flag was already raised. unfortunately, at least one such path, in
log1p, returned with 2 items on the x87 stack rather than just 1 item
for the return value. this is a violation of the ABI's calling
convention, and could cause subsequent floating point code to produce
NANs due to x87 stack overflow. if floating point results are used in
flow control, this can lead to runaway wrong code execution.
rather than reviewing each "underflow already raised" code path for
correctness, remove them all. they're likely slower than just
performing the underflow code unconditionally, and significantly more
complex.
all of this code should be ripped out and replaced by C source files
with inline asm. doing so would preclude this kind of error by having
the compiler perform all x87 stack register allocation and stack
manipulation, and would produce comparable or better code. however
such a change is a much larger project.
commit f3f96f2daa added these for the
rest of the archs, but the patch it corresponded to missed riscv64
since riscv64 was not yet upstream at the time. this caused commit
dfc81828f7 to break riscv64 build, due
to a wrong assumption that SYS_statx was unconditionally defined.
this fixes a major upcoming performance regression introduced by
commit 72f50245d0, whereby 32-bit archs
would lose vdso clock_gettime after switching to 64-bit time_t, unless
the kernel supports time64 and provides a time64 version of the vdso
function. this would incur not just one but two syscalls: first, the
failed time64 syscall, then the fallback time32 one.
overflow of the 32-bit result is detected and triggers a revert to
syscalls. normally, on a system that's not Y2038-ready, this would
still overflow, but if the process has been migrated to a
time64-capable kernel or if the kernel has been hot-patched to add
time64 syscalls, it may conceivably work.
otherwise, 32-bit archs that could otherwise share the generic
bits/ipc.h would need to duplicate the struct ipc_perm definition,
obscuring the fact that it's the same. sysvipc is not widely used and
these headers are not commonly included, so there is no performance
gain to be had by limiting the number of indirectly included files
here.
files with the existing time32 definition of IPC_STAT are added to all
current 32-bit archs now, so that when it's changed the change will
show up as a change rather than addition of a new file where it's less
obvious that the value is changing vs the generic one that was used
before.
per policy, define the feature test macro to get declarations for the
pthread_tryjoin_np and pthread_timedjoin_np functions. in the past
this has been only for checking; with 32-bit archs getting 64-bit
time_t it will also be necessary for symbols to get redirected
correctly.
to make use of {sem,shm,msg}ctl IPC_STAT functionality to provide
64-bit time_t on 32-bit archs, IPC_STAT and related macros must be
defined with bit 8 (0x100) set. allow archs to define IPC_STAT in
bits/ipc.h, and define the other macros in terms of it so that they
all get the same value of the time64 bit.
the time64 syscall has to be used if time_t is 64-bit, since there's
no way of knowing before making a syscall whether the result will fit
in 32 bits, and the 32-bit syscalls do not report overflow as an
error.
on 64-bit archs, there is no change to the code after preprocessing.
on current 32-bit archs, the result is now read from the kernel
through long[2] array, then copied into the timespec, to remove the
assumption that time_t is the same as long.
vdso clock_gettime is still used in place of a syscall if available.
32-bit archs with 64-bit time_t must use the time64 version of the
vdso function; if it's not available, performance will significantly
suffer. support for both vdso functions could be added, but would
break the ability to move a long-lived process from a pre-time64
kernel to one that can outlast Y2038 with checkpoint/resume, at least
without added hacks to identify that the 32-bit function is no longer
usable and stop using it (e.g. by seeing negative tv_sec). this
possibility may be explored in future work on the function.
the 64-bit/time64 version of the syscall is not API-compatible with
the userspace timex structure definition; fields specified as long
have type long long. so when using the time64 syscall, we have to
convert the entire structure. this was always the case for x32 as
well, but went unnoticed, meaning that clock_adjtime just passed junk
to the kernel on x32. it should be fixed now.
for the fallback case, we avoid encoding any assumptions about the new
location of the time member or naming of the legacy slots by accessing
them through a union of the kernel type and the new userspace type.
the only assumption is that the non-time members live at the same
offsets as in the (non-time64, long-based) kernel timex struct. this
property saves us from having to convert the whole thing, and avoids a
lot of additional work in compat shims.
the new code is statically unreachable for now except on x32, where it
fixes major brokenness. it is permanently unreachable on 64-bit.
without this, the SIOCGSTAMP and SIOCGSTAMPNS ioctl commands, for
obtaining timestamps, would stop working on pre-5.1 kernels after
time_t is switched to 64-bit and their values are changed to the new
time64 versions.
new code is written such that it's statically unreachable on 64-bit
archs, and on existing 32-bit archs until the macro values are changed
to activate 64-bit time_t.
without this, the SO_RCVTIMEO and SO_SNDTIMEO socket options would
stop working on pre-5.1 kernels after time_t is switched to 64-bit and
their values are changed to the new time64 versions.
new code is written such that it's statically unreachable on 64-bit
archs, and on existing 32-bit archs until the macro values are changed
to activate 64-bit time_t.
the __socketcall and __socketcall_cp macros are remnants from a really
old version of the syscall-mechanism infrastructure, and don't follow
the pattern that the "__" version of the macro returns the raw negated
error number rather than setting errno and returning -1.
for time64 purposes, some socket syscalls will need to operate on the
error value rather than returning immediately, so fix this up so they
can use it.
being "ctl" functions that take command numbers, these will be handled
like ioctl/sockopt/etc., using new command numbers for the time64
variants with an "IPC_TIME64" bit added to their values. to obtain
such a reserved bit, we reuse the IPC_64 bit, 0x100, which served only
as part of the libc-to-kernel interface, not as a public interface of
the libc functions.
using new command numbers avoids the need for compat shims (in ABIs
doing time64 through symbol redirection and compat shims) and, by
virtue of having a fixed time64 bit for all commands, we can ensure
that libc can perform the appropriate translations, even if the
application is using new commands from a newer version of the libc
headers than the libc available at runtime.
for the vast majority of 32-bit archs, the kernel {sem,shm,msq}id64_ds
definitions left padding space intended for expanding their time_t
fields to 64 bits in-place, and it would have been really nice to be
able to do time64 support that way. however the padding was almost
always in little-endian order (except on powerpc, and for msqid_ds
only on mips, where it matched the arch's byte order), and more
importantly, the alignment was overlooked. in semid_ds and msqid_ds,
the time_t members were not suitably aligned to be expanded to 64-bit,
due to the ipc_perm header consisting of 9 32-bit words -- except on
powerpc where ipc_perm contains an extra padding word. in shmid_ds,
the time_t members were suitably aligned, except that mips
(accidentally?) omitted the padding for them alltogether.
as a result, we're stuck with adding new time_t fields on the end of
the structures, and assembling the 32-bit lo/hi parts (or 16-bit hi
parts, for mips shmid_ds, which lacked sufficient reserved space for
full 32-bit hi parts) to fill them in.
all of the functional changes here are conditional on the IPC_TIME64
macro having a nonzero definition, which will only happen when
IPC_STAT is redefined for 32-bit archs, and on time_t being larger
than long, so for now the new code is all dead code.
due to the variadic signature, semctl needs to be made aware of any
new commands that take arguments. this was overlooked when commit
af55070eae added SEM_STAT_ANY.
these differ from generic only in using endian-matched padding with a
short __ipc_perm_seq field in place of the int field in generic. this
is not a documented public interface anyway, and the original intent
was to use int here. some ports just inadvertently slipped in the
kernel short+padding form.
previously these differed from generic because they needed their own
definitions of IPC_64. now that it's no longer in public header,
they're identical.
the definition of the IPC_64 macro controls the interface between libc
and the kernel through syscalls; it's not a public API. the meaning is
rather obscure. long ago, Linux's sysvipc *id_ds structures used
16-bit uids/gids and wrong types for a few other fields. this was in
the libc5 era, before glibc. the IPC_64 flag (64 is a misnomer; it's
more like 32) tells the kernel to use the modern[-ish] versions of the
structures.
the definition of IPC_64 has nothing to do with whether the arch is
32- or 64-bit. rather, due to either historical accident or
intentional obnoxiousness, the kernel only accepts and masks off the
0x100 IPC_64 flag conditional on CONFIG_ARCH_WANT_IPC_PARSE_VERSION,
i.e. for archs that want to provide, or that accidentally provided,
both. for archs which don't define this option, no masking is
performed and commands with the 0x100 bit set will fail as invalid. so
ultimately, the definition is just a matter of matching an arbitrary
switch defined per-arch in the kernel.