Store only a single byte of the opcode in b1() - speed up shift instructions
Code cleanups
Currently no speedup and no slowdown - about the same results in my Bochs benchmarking
A lot of code reorganization in fetchdecode