Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. However, portable parallel
code is written assuming only cmpxchg which means that in
practice this is a viable alternative.
Signed-off-by: Richard Henderson <rth@twiddle.net>