Previously, every store invoked the tb_invalidate_phys_page_fast path,
making stores 40x slower than loads.
Change this code to perform TB invalidation only for stores to regions
marked as executable (i.e. emulated executable).
Even without uc_set_native_thunks, this change fixes most of the performance
issues seen with thunking to native calls.
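
As a rough illustration of the gating described above, here is a minimal,
self-contained sketch. It is not the actual patch: the Region type, the
MEM_* bits, and the helper names are all hypothetical stand-ins, and the
counter only models how often the slow invalidation path would be taken.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits; not the emulator's real definitions. */
enum { MEM_READ = 1, MEM_WRITE = 2, MEM_EXEC = 4 };

/* Hypothetical stand-in for a mapped guest memory region. */
typedef struct {
    uint32_t perms;          /* MEM_* bits for this region */
    unsigned invalidations;  /* how many times the slow path ran */
    uint8_t data[16];
} Region;

/* Stand-in for the expensive TB invalidation path. */
static void invalidate_tbs(Region *r)
{
    r->invalidations++;
}

/* Store one byte, invalidating TBs only if the region is executable. */
static void store_byte(Region *r, size_t off, uint8_t v)
{
    assert(off < sizeof r->data);
    if (r->perms & MEM_EXEC) {
        /* Translated code may exist here, so it must be invalidated. */
        invalidate_tbs(r);
    }
    r->data[off] = v;
}
```

In this sketch a store to a region without MEM_EXEC skips the invalidation
call entirely, which is the case the change is meant to make cheap.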
Signed-off-by: Andrei Warkentin <andrei.warkentin@intel.com>