The parallel version of STBY did not take host endianness into
account, and also computed an incorrect address for STBY_E.

Byte-swap twice to handle the merge and the store. Compute the
mask inside the function rather than passing it as a parameter.
Force-align the address rather than subtracting one.

Generalize the function to system mode by using probe_access().
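
For illustration only, the merge described above can be sketched in
portable C11. This is not the code in the patch: swap32(), be32_conv(),
byte_mask() and store_mask32() are hypothetical names, C11 atomics stand
in for the emulator's own primitives, and the guest word is assumed to
be big-endian and already resolved to an aligned host pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: unconditional 32-bit byte swap. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Hypothetical helper: convert between a big-endian memory image and a
 * host-order value. The conversion is its own inverse. */
static uint32_t be32_conv(uint32_t x)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first ? swap32(x) : x;   /* first == 1 on a little-endian host */
}

/* Mask covering `size` bytes starting at big-endian byte offset `ofs`. */
static uint32_t byte_mask(unsigned ofs, unsigned size)
{
    assert(size >= 1 && ofs + size <= 4);
    uint32_t ones = size == 4 ? 0xffffffffu : (1u << (size * 8)) - 1;
    return ones << ((4 - ofs - size) * 8);
}

/* Atomically merge the bytes of `val` selected by (ofs, size) into *word. */
static void store_mask32(_Atomic uint32_t *word, uint32_t val,
                         unsigned ofs, unsigned size)
{
    uint32_t mask = byte_mask(ofs, size);   /* computed here, not passed in */
    uint32_t old = atomic_load(word);
    uint32_t new_img;

    do {
        /* Swap 1: memory image -> guest value, then merge under the mask. */
        uint32_t merged = (be32_conv(old) & ~mask) | (val & mask);
        /* Swap 2: guest value -> memory image for the store. */
        new_img = be32_conv(merged);
    } while (!atomic_compare_exchange_weak(word, &old, new_img));
}
```

In this sketch the caller would derive the word pointer from the
force-aligned address (addr & ~3) and ofs from addr & 3. A "begin"
store at offset 1 is then store_mask32(word, val, 1, 3), and an
"ending" store covering the first two bytes is
store_mask32(word, val, 0, 2).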

Cc: qemu-stable@nongnu.org
Tested-by: Helge Deller <deller@gmx.de>
Reported-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>