The SMC problem was solved in following manner:
- Every trace linked to another remembers when it was linked (a special timestamp value called traceLinkTimeStamp)
- When true SMC happens it incremements the traceLinkTimeStamp
- Jump to the linked trace won't be allowed if traceLinkTimeStamp in the link doesn't match traceLinkTimeStamp
So SMC effectively breaks all trace links and therefore I should not care for them anymore
5%-10% speedup on OS boot benchamarks observed
Bochs instruction emulation handlers won't refer to direct fields of instructions like MODRM.NNN or MODRM.RM anymore.
Use generic source/destination indications like SRC1, SRC2 and DST.
All handlers are modified to support new notation. In addition fetchDecode module was modified to assign sources to instructions properly.
Immediate benefits:
- Removal of several duplicated handlers (FMA3 duplicated with FMA4 is a trivial example)
- Simpler to understand fetch-decode code
Future benefits:
- Integration of disassembler into Bochs CPU module, ability to disasm bx_instruction_c instance (planned)
Huge patch. Almost all source files wre modified.
the iaOpcode field has no masking anymore.
fixed bug during the code reorganization:
+ XOP: Fixed instructions with operands order depending on VEX.W (fixed VEX.W read from instruction object)
XOP: few instructions are still missing, coming soon
BX_PANIC(("VPERMILPS_VpsHpsWpsVIbR: not implemented yet"));
BX_PANIC(("VPERMILPD_VpdHpdWpdVIbR: not implemented yet"));
BX_PANIC(("VPMADCSSWD_VdqHdqWdqVIbR: not implemented yet"));
BX_PANIC(("VPMADCSWD_VdqHdqWdqVIbR: not implemented yet"));
BX_PANIC(("VFRCZPS_VpsWpsR: not implemented yet"));
BX_PANIC(("VFRCZPD_VpdWpdR: not implemented yet"));
BX_PANIC(("VFRCZSS_VssWssR: not implemented yet"));
BX_PANIC(("VFRCZSD_VsdWsdR: not implemented yet"));
feature is enabled by default when configure with --enable-all-optimizations
option, to disable handlers chaining speedups configure with
--disable-handlers-chaining
Allows to significantly cut trace cache miss latenct and find data in victim cahe instead of redoding it
8 entries VC in parallel with direct map 64K entries