The old sparc64 implementation may replace two insns, which leaves
a race condition in which a thread could be stopped at a PC in the
middle of the sequence, and when restarted does not see the complete
address computation and branches to nowhere.
The new implemetation replaces only one insn, swapping between a
direct branch and a direct call. The TCG_REG_TB register is loaded
from tb->jmp_target_addr[] in the delay slot.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>