This is now distinct from pthread__smt_pause, which is for spin lock
backoff with no paired wakeup.
On Arm, there is a single-bit event register per CPU, and there are two
instructions to manage it:
- wfe, wait for event -- if event register is clear, enter low power
mode and wait until event register is set; then exit low power mode
and clear event register
- sev, signal event -- sets event register on all CPUs (other
circumstances like interrupts also set the event register and cause
wfe to wake)
These can be used to reduce the power consumption of spinning for a
lock, but only if they are actually paired -- if there's no sev, wfe
might hang indefinitely. Currently only pthread_spin(3) actually
pairs them; the other lock primitives (internal lock, mutex, rwlock)
do not -- they have spin lock backoff loops, but no corresponding
wakeup to cancel a wfe.
It may be worthwhile to teach the other lock primitives to pair
wfe/sev, but that requires some performance measurement to verify
it's actually worthwhile. So for now, we just make sure not to use
wfe when there's no sev, and keep everything else the same -- this
should fix severe performance degredation in libpthread on Arm
without hurting anything else.
No change in the generated code on amd64 and i386. No change in the
generated code for pthread_spin.c on arm and aarch64 -- changes only
the generated code for pthread_lock.c, pthread_mutex.c, and
pthread_rwlock.c, as intended.
PR port-arm/57437
XXX pullup-10