NetBSD

Commit Graph

Author	SHA1	Message	Date
lukem	eca6fe7e73	fix spello in comment	2011-08-05 03:55:31 +00:00
matt	def13dad30	Add __HAVE___LWP_GETTCB_FAST support (for mips and powerpc).	2011-03-17 00:43:48 +00:00
joerg	01eef02a1b	If TLS support is present, use it for pthread__self(). The initialisation order is correct in this case as _lwp_setprivate has been called already by ld.elf_so for dynamic programs or _libc_init for statically linked ones.	2011-03-16 12:39:44 +00:00
joerg	aad599979d	Add TLS support infrastructure. For dynamic binaries, ld.elf_so exports _rtld_tls_allocate and _rtld_tls_free. libpthread uses these functions to set up the thread private area of all new threads. ld.elf_so is responsible for setting up the private area for the initial thread. Similar functions are called from _libc_init for static binaries, using dl_iterate_phdr to access the ELF Program Header. Add test cases to exercise the different TLS storage models. Test cases are compiled and installed on all platforms, but are skipped on platforms not marked for TLS support. This material is based upon work partially supported by The NetBSD Foundation under a contract with Joerg Sonnenberger. It is inspired by the TLS support in FreeBSD by Doug Rabson and the clean ups of the DragonFly port of the original FreeBSD modifications.	2011-03-09 23:10:05 +00:00
joerg	b7b592d544	Back out using the thread register (if present) for now. libgcc_s's __register_frame_info gets called from libc's CSU code before the libc constructors are run. __register_frame_info in turn calls pthread_mutex_lock. libpthread is not initialised at this point and therefore pthread__self() traps when dereferencing the thread register. This worked before because the garbage from pthread__self() is effectively ignored.	2011-02-25 14:32:38 +00:00
joerg	1631a78097	Allow storing and receiving the LWP private pointer via ucontext_t on all platforms except VAX and IA64. Add fast access via register for AMD64, i386 and SH3 ports. Use this fast access in libpthread to replace the stack based pthread_self(). Implement skeleton support for Alpha, HPPA, PowerPC, SPARC and SPARC64, but leave it disabled. Ports that support this feature provide __HAVE____LWP_GETPRIVATE_FAST in machine/types.h and a corresponding __lwp_getprivate_fast in machine/mcontext.h. This material is based upon work partially supported by The NetBSD Foundation under a contract with Joerg Sonnenberger.	2011-02-24 04:28:41 +00:00
christos	66f16a1fa6	I've had this patch in my tree for a while and since it only improves the situation, I decided to commit it. There is an inherent problem with ASLR and the way the pthread library is using the thread stack. Our pthread library chooses that stack for each thread strategically so that it can find the pthread struct for each thread by masking the stack pointer and looking just below the red zone it creates. Unfortunately with ASLR you get many random values for the initial stack, and there are situations where the masked stack base ends up below the base of the stack. (This happens on x86 when the stack base happens to be 0x???02000 for example and your stackmask is 0xffe00000.) To fix this, we detect the pathological cases (this happens only in the main thread), allocate more stack, and mprotect it appropriately. Then we stash the main base and the main struct, so that when we look for the pthread struct in pthread__id, we can special case the main thread. Another way to work around the problem is unlimiting stacksize, but the proper way is to use TLS to find the thread structure and not to play games with the thread stacks.	2010-12-18 15:54:27 +00:00
ad	61cac435e4	- Convert from makecontext() -> _lwp_makecontext(). - Rely on _lwp_makecontext() to set up the thread identity register. This is not currently done (a bug), nor does libpthread use the threadreg yet. I'm doing this so the code can be used by the person working on TLS to verify that their threadreg code is working.	2009-05-17 14:49:00 +00:00
ad	a61915e94f	Remove unused code that's confusing when using cscope/opengrok.	2009-05-16 22:20:40 +00:00
ad	cbd43ffa55	Now that we have all the scheduling gunk, make these do something useful: pthread_attr_get_np pthread_attr_setschedparam pthread_attr_getschedparam pthread_attr_setschedpolicy pthread_attr_getschedpolicy	2008-06-28 10:29:37 +00:00
ad	2bcb8bf1c4	PR lib/38741 priority inversion in libpthread breaks apps that use SCHED_FIFO threads - Change condvar sync so that we never take the condvar's spinlock without first holding the caller-provided mutex. Previously, the spinlock was only taken without the mutex in an error path, but it was enough to trigger the problem described in the PR. - Even with this change, applications calling pthread_cond_signal/broadcast without holding the interlocking mutex are still subject to the problem described in the PR. POSIX discourages this saying that it leads to undefined scheduling behaviour, which seems good enough for the time being. - Elsewhere, use a hash of mutexes instead of per-object spinlocks to synchronize entry/exit from sleep queues. - Simplify how sleep queues are maintained.	2008-05-25 17:05:28 +00:00
martin	ce099b4099	Remove clause 3 and 4 from TNF licenses	2008-04-28 20:22:51 +00:00
ad	377f098ab0	Adjust mutex/rwlock definitions to match reality now that there is only one implementation of each. PR lib/38030.	2008-02-14 21:40:51 +00:00
ad	a67e1e3475	- Remove libpthread's atomic ops. - Remove the old spinlock-based mutex and rwlock implementations. - Use the atomic ops from libc.	2008-02-10 18:50:54 +00:00
christos	c6409540ef	add missing static decls.	2008-01-08 20:56:08 +00:00
ad	622bbc505a	- Use pthread__cancelled() in more places. - pthread_join(): assert that pthread_cond_wait() returns zero.	2007-12-24 16:04:20 +00:00
ad	989565f81d	- Fix pthread_rwlock_trywrlock() which was broken. - Add new functions: pthread_mutex_held_np, mutex_owner_np, rwlock_held_np, rwlock_wrheld_np, rwlock_rdheld_np. These match the kernel's locking primitives and can be used when porting kernel code to userspace. - Always create LWPs detached. Do join/exit sync mostly in userland. When looped on a dual core box this seems ~30% quicker than using lwp_wait(). Reduce number of lock acquire/release ops during thread exit.	2007-12-24 14:46:28 +00:00
ad	8077340e63	Remove the debuglog stuff. ktrace is more useful now.	2007-11-19 15:14:11 +00:00
ad	66ac2ffaf2	Mutexes: - Play scrooge again and chop more cycles off acquire/release. - Spin while the lock holder is running on another CPU (adaptive mutexes). - Do non-atomic release. Threadreg: - Add the necessary hooks to use a thread register. - Add the code for i386, using %gs. - Leave i386 code disabled until xen and COMPAT_NETBSD32 have the changes.	2007-11-13 17:20:08 +00:00
ad	15e9cec117	For PR bin/37347: - Override __libc_thr_init() instead of using our own constructor. - Add pthread__getenv() and use instead of getenv(). This is used before we are up and running and unfortunately getenv() takes locks. Other changes: - Cache the spinlock vectors in pthread__st. Internal spinlock operations now take 1 function call instead of 3 (i386). - Use pthread__self() internally, not pthread_self(). - Use __attribute__ ((visibility("hidden"))) in some places. - Kill PTHREAD_MAIN_DEBUG.	2007-11-13 15:57:10 +00:00
ad	84a6749ef2	Note that libpthread_dbg needs to be checked after making changes to libpthread.	2007-10-16 15:21:54 +00:00
ad	f1b2c1c4c9	... but preserve the linked list, for the debugger only.	2007-10-16 15:07:02 +00:00
ad	9583eeb248	Replace the global thread list with a red-black tree. From joerg@.	2007-10-16 13:41:18 +00:00
skrll	d32ed98975	Resurrect the function pointers for lock operations and allow each architecture to provide asm versions of the RAS operations. We do this because relying on the compiler to get the RAS right is not sensible. (It gets alpha wrong and hppa is suboptimal) Provide asm RAS ops for hppa. (A slightly different version) reviewed by Andrew Doran.	2007-09-24 12:19:39 +00:00
ad	20e3392edc	Add a per-mutex deferred wakeup flag so that threads doing something like the following do not wake other threads early: pthread_mutex_lock(&mutex); pthread_cond_broadcast(&cond); foo = malloc(100); /* takes libc mutexes */ pthread_mutex_unlock(&mutex);	2007-09-13 23:51:47 +00:00
ad	b0efccf4cd	Make the new mutexes faster: - Eliminate mutexattr_private and just set a bit in ptm_owner if the mutex is recursive. This forces the slow path to be taken for recursive mutexes. Overload an unused field in pthread_mutex_t to record whether or not it's an errorcheck mutex. - Streamline pthread_mutex_lock / pthread_mutex_unlock a bit more. As a side effect makes it possible to have assembly stubs for them.	2007-09-11 18:11:29 +00:00
skrll	9fdaf800d9	Merge nick-csl-alignment.	2007-09-10 11:34:05 +00:00
ad	f4fd6b797e	- Get rid of self->pt_mutexhint and use pthread__mutex_owned() instead. - Update some comments and fix minor bugs. Minor cosmetic changes. - Replace some spinlocks with mutexes and rwlocks. - Change the process private semaphores to use mutexes and condition variables instead of doing the synchronization directly. Spinlocks are no longer used by the semaphore code.	2007-09-08 22:49:50 +00:00
ad	8ccc6e060d	- Don't take the mutex's spinlock (ptr_interlock) in pthread_cond_wait(). Instead, make the deferred wakeup list a per-thread array and pass down the lwpid_t's that way. - In pthread_cond_wait(), take the mutex before dealing with early wakeup. In this way there should never be contention on the CV's spinlock if the app follows POSIX rules (there should only be contention on the user-provided mutex). - Add a port of the kernel's rwlocks. The rwlock's spinlock is only taken if there is contention. This is enabled where atomic ops are available. Right now that is only i386 and amd64 because I don't have other hardware to test with. It's trivial to add stubs for other architectures as long as they have compare-and-swap. When we have proper atomic ops the old rwlock code can be removed. - Add a new mutex implementation that's similar to the kernel's mutexes, but uses compare-and-swap to maintain the waiters list, so no spinlocks are involved. Same caveats apply as for the rwlocks.	2007-09-07 14:09:27 +00:00
ad	a6ed47a549	Add: pthread__atomic_cas_ptr, pthread__atomic_swap_ptr, pthread__membar_full This is a stopgap until the thorpej-atomic branch is complete.	2007-09-07 00:24:56 +00:00
ad	d9adedd764	Trim fat off libpthread internal spinlock operations. Makes a measurable improvement across the board.	2007-08-16 13:54:16 +00:00
ad	b8833ff53f	- Reinitialize the absolute minimum when recycling user thread state. Chops another ~10% off create/join in a loop on i386. - Disable low level debugging as this is stable. Improves benchmarks across the board by a small percentage. Uncontested mutex acquire and release in a loop becomes about 8% quicker. - Minor cleanup.	2007-08-16 12:01:49 +00:00
ad	9e28719960	Remove PT_FIXEDSTACKSIZE_LG.	2007-08-16 01:09:34 +00:00
ad	ed964af19e	Cache thread context for creation instead of setting it up every time. Speeds create/join loop by about 10-15% on i386.	2007-08-16 00:41:23 +00:00
ad	c3f8e2ee55	Change the signature of _lwp_park() to accept an lwpid_t and second hint pointer, but do so in a way that remains compatible with older pthread libraries. This can be used to wake another thread before the calling thread goes asleep, saving at least one syscall + involuntary context switch. This turns out to be a fairly large win on the condvar benchmarks that I have tried.	2007-08-07 19:04:21 +00:00
ad	7bf06aa722	Make libpthread_dbg build again.	2007-08-04 18:54:12 +00:00
ad	50fa8db4e4	Some significant performance improvements, and a fix for a race with pthread detach/join. - Make mutex acquire spin for a short time, as done with spinlocks. - Make the number of spins controllable with the env var PTHREAD_NSPINS. - Reduce the amount of time that libpthread internal spinlocks are held. - Rely more on the barrier effects of park/unpark to avoid taking spinlocks. - Simplify the locking around pthreads and the global queues. - Align per-thread sync data on a 128 byte boundary. - Offset thread stacks by a small amount to try and reduce cache thrash.	2007-08-04 13:37:48 +00:00
ad	b5a5e72af1	Mirror a fix made to the kernel's condvars: After resuming execution, the thread must check to see if it has been restarted as a result of pthread_cond_signal(). If it has, but cannot take the wakeup (because of eg a pending Unix signal or timeout) then try to ensure that another thread sees it. This is necessary because there may be multiple waiters, and at least one should take the wakeup if possible.	2007-04-12 21:36:06 +00:00
ad	a5070151ae	- Test+branch is usually cheaper than making an indirect function call, so avoid making them. - When parking an LWP on a condition variable, point the hint argument at the mutex's waiters queue. Chances are we will be awoken from that later.	2007-03-24 18:51:59 +00:00
ad	b0427b61fb	- Maintain a per-thread pointer to the last mutex acquired by the app, to be used only as a hint. Clear the pointer when releasing the mutex. - When releasing a mutex, wake all waiters. Makes it possible to transfer waiters from another object to a mutex.	2007-03-20 23:33:10 +00:00
ad	792cc0e17d	- Simplify the interface to pthread__park() and friends slightly. - If sysctl() fails, complain.	2007-03-05 23:55:40 +00:00
ad	de2138164c	Remove the PTHREAD_SA option. If M:N threads is reimplemented it's better off done with a separate library.	2007-03-02 18:53:51 +00:00
ad	d333bb5f2f	Build without sys/sa.h present.	2007-02-06 15:24:37 +00:00
ad	ded2602507	Fix bugs with and improve upon previous.	2006-12-24 18:39:45 +00:00
ad	1ac6a89b79	Conditionalised support for 1:1 threads. Needs associated kernel changes and more work to be useful.	2006-12-23 05:14:46 +00:00
yamt	41cc94b9f0	remove unused IDLESPINS.	2006-10-03 09:37:07 +00:00
chs	0e67554241	starting the pthread library (ie. calling pthread__start()) before any threads are created turned out to be not such a good idea. there are stronger requirements on what has to work in a forked child while a process is still single-threaded. so take all that stuff back out and fix the problems with single-threaded programs that are linked with libpthread differently, by checking if the library has been started and doing completely different stuff if it hasn't been: - for pthread_rwlock_timedrdlock(), just fail with EDEADLK immediately. - for sem_wait(), the only thing that can unlock the semaphore is a signal handler, so use sigsuspend() to wait for a signal. - for pthread_mutex_lock_slow(), just go into an infinite loop waiting for signals. I also noticed that there's a "sem2" test that has never worked in its single-threaded form. the problem there is that a signal handler tries to take a sem_t interlock which is already held when the signal is received. fix this too, by adding a single-threaded case for sem_trywait() that blocks signals instead of using the userland interlock.	2005-10-19 02:15:03 +00:00
chs	2415c56ed0	in pthread_mutex_lock_slow(), pthread_rwlock_timedrdlock() and sem_wait(), call pthread__start() if it hasn't already been called. this avoids an internal assertion from the library if these routines are used before any threads are created and they need to sleep. fixes PR 20256, PR 24241, PR 25722, PR 26096.	2005-10-16 00:07:24 +00:00
nathanw	916de87872	Keep the kernel updated with signal action signal masks (act.sa_mask) until threads are started, since before that the traditional signal invocation method will be used. Fixes regress/lib/libpthread/sigmask2.	2005-02-26 20:33:06 +00:00
mycroft	2b4ccae3e9	Remove pt_blockuc. If the debugger attempts to muck with the state of a blocked thread, return an error; this should be done through ptrace(2).	2004-10-12 22:17:56 +00:00
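
Commit 20e3392edc above quotes a lock, broadcast, malloc, unlock sequence. The following is a minimal C sketch of that application-side sequence, assuming a POSIX threads environment. The `struct shared`, its `ready` and `observed` fields, and the functions `waiter` and `run_broadcast_demo` are hypothetical names introduced for illustration; none of them come from libpthread. The sketch does not show the per-mutex deferred wakeup flag itself, which is internal to the library.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared state for this sketch (not a libpthread structure). */
struct shared {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             ready;     /* predicate, guarded by mutex */
    int             observed;  /* value the waiter saw after waking */
};

/* Waits until the predicate is set, then records what it saw. */
static void *waiter(void *arg)
{
    struct shared *s = arg;

    pthread_mutex_lock(&s->mutex);
    while (!s->ready)                       /* loop guards against spurious wakeups */
        pthread_cond_wait(&s->cond, &s->mutex);
    s->observed = s->ready;
    pthread_mutex_unlock(&s->mutex);
    return NULL;
}

/* Runs the sequence from the commit message.
 * Returns the value the waiter observed, or -1 on setup failure. */
int run_broadcast_demo(void)
{
    struct shared s;
    pthread_t t;
    void *foo;

    s.ready = 0;
    s.observed = 0;
    if (pthread_mutex_init(&s.mutex, NULL) != 0)
        return -1;
    if (pthread_cond_init(&s.cond, NULL) != 0) {
        pthread_mutex_destroy(&s.mutex);
        return -1;
    }
    if (pthread_create(&t, NULL, waiter, &s) != 0) {
        pthread_cond_destroy(&s.cond);
        pthread_mutex_destroy(&s.mutex);
        return -1;
    }

    pthread_mutex_lock(&s.mutex);
    s.ready = 1;
    pthread_cond_broadcast(&s.cond);        /* waiter cannot proceed yet: mutex is still held */
    foo = malloc(100);                      /* may take libc-internal mutexes */
    pthread_mutex_unlock(&s.mutex);         /* only now can the waiter reacquire the mutex */

    pthread_join(t, NULL);
    free(foo);
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.mutex);
    return s.observed;
}
```

Calling `run_broadcast_demo()` from a program's `main` exercises the sequence once. The waiter can only make progress after the final `pthread_mutex_unlock()`, which is why waking it at broadcast time, before the `malloc()` call, would be premature.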


78 Commits