Implemented VCOMISS/VCOMISD/VUCOMISS/VUCOMISD AVX512 instructions
Fix vector length values for AVX-512 (512-bit vector should have length 4)
support mis-alignment #GP exception for VMOVAPS/PD/DQA32/DQ64 AVX512 instructions
move AVX512 load/store and register move operations into dedicated file avx512_move.cc
The code already knows to disasm most of the opcodes with their operands.
- Split according to OSIZE opcodes RDFSBASE/WRFSBASE / RDGSBASE/WRGSBASE both for disasm and performance
- Minimize amount of opcode forms in ia_opcodes.h again.
For example Udq means the same as Wdq but with no memory form.
The end goal will be also merging of disasm and cpu decoder to one module and remove the disasm.
Two bug fixes on the way:
TBM: fixed 64-bit TBM instructions with memory access (did 32-bit load instead of 64-bit)
BMI2: fixed operands order for PEXT/PDEP instructions
AVX2: fixed gather instruction decoding bug from decoder alias commit
The Windows version looks almost stable, but the GTK version fails in some cases.
That's why the classic wx debugger is still available if BX_DEBUGGER_GUI is set to 0.
- added function close_debug_dialog() to handle the simulation stop case in wx
- disable all the wx debugger related code if BX_DEBUGGER_GUI is set to 1
- added enhanced debugger specific init code similar to the code in sdl.cc
- include debugger related resources on Windows
- TODO: make the GTK / wxGTK case stable and remove the wx debugger
Bugfix: VMX: VmEntry should do TPR Virtualization (TPR Shadow + APIC Access Virtualization case is affected) and even could possibly cause TPR Threshold VMEXIT
Of course no true random numbers will be generated - use standard "C" rand() function as stub.
In future it will be possible to improve (using another random generator) or even use real rdrand/rdseed intrinsics
Minor speedup (of 1-2%) was observed due to new implementation
Remove obsolete dbg_take_irq function and dbg_force_interrupt function from CPU code, the functions were not working properly anyway
fixed enabling of ADX extensions in generic CPUID when enabled through .bochsrc
Small code cleanups on the way to implementation of APIC Registers Virtualization features disclosed in recent Intel SDM rev043
Bochs instruction emulation handlers won't refer to direct fields of instructions like MODRM.NNN or MODRM.RM anymore.
Use generic source/destination indications like SRC1, SRC2 and DST.
All handlers are modified to support new notation. In addition fetchDecode module was modified to assign sources to instructions properly.
Immediate benefits:
- Removal of several duplicated handlers (FMA3 duplicated with FMA4 is a trivial example)
- Simpler to understand fetch-decode code
Future benefits:
- Integration of disassembler into Bochs CPU module, ability to disasm bx_instruction_c instance (planned)
Huge patch. Almost all source files wre modified.
for CPU emulation performance reasons, the alignment check compilation
still can be enabled using configure option --enable-alignment-check.
There is no software in the world which enable #AC exception checking, this
x86 feature is completely legacy but its emulation support costs up to 3-5%
emulation speed.
The checking for #AC exception enable still will be done, if
CPL == 3, EFLAGS.AC = 1 and CR0.AM = 1
but the alignment check is not compiled in, the Bochs will PANIC with corresponding message.
You can press 'always continue' and ignore the PANIC, the simulation will continue as if alignment checking is not enabled.
The problem with Parity is it is generally referenced very rarely so the current lazy flags code is not efficient to updated Parify flag only (because it updates low 8 bits of .result value the existing Zero Flag has to be shadowed in .auxbits.
So I flipped it around, to make Parity be shadowed in auxbits. .result now is only needed to derive Zero Flag, and both Sign and Parify are derived from .result + .auxbits (as Zero Flag is now). For the 90% of the conditional jumps that are JZ or JNZ, this is a speedup.
Parity is now derived from 8 bits in .result and 8 bits in .auxbits, and Sign is derived from one flag in .result and 1 bit in .auxbits by XOR-ing them all together. It makes the code sequences for SAHF and POPF simpler too.
------------
Implemented SVN nested paging support - the Virtual Box boots perfectly with Nested Paging guest !
A lot of code duplication was added for now - major cleanup will follow later.
! Added AMD Phenom X3 8650 (Toliman) configuration to the CPUDB - this configuration has Nested Paging enabled.
Some CPUID modules rework done to enable Toliman configuration.
Ckean up 'executable' attribute from all CPU source files.
- IO intercept is not implemented yet
- MSR intercept is not implemented yet
VMX:
Fixed Bochs PANIC crash when doing I/O access crossing VMX I/O permission bitmaps.
This can happen because access_physical_read and access_physical_write cannot access memory cross 4K boundary.
I am merging the code in order to start making shortcuts between VMX emulation and SVM emulation.
Of course SVM emulation is incomplete, completely untested and not expected to work.
But someone could already take a look one the code and give some suggestions.
Also looking for anybody with existing SVM kernels - as simple as possible - for testing.
Status:
- exceptions intercept is not implemented yet
- IO intercept is not implemented yet
- MSR intercept is not implemented yet
- virtual interrupts are not implemented yet
- CPUID is not implemented yet
No advanced SVM featurez planned - I am implementing the very basic 'Pacifica' document from 2005 using QEMU code as reference.
XOP: few instructions are still missing, coming soon
BX_PANIC(("VPERMILPS_VpsHpsWpsVIbR: not implemented yet"));
BX_PANIC(("VPERMILPD_VpdHpdWpdVIbR: not implemented yet"));
BX_PANIC(("VPMADCSSWD_VdqHdqWdqVIbR: not implemented yet"));
BX_PANIC(("VPMADCSWD_VdqHdqWdqVIbR: not implemented yet"));
BX_PANIC(("VFRCZPS_VpsWpsR: not implemented yet"));
BX_PANIC(("VFRCZPD_VpdWpdR: not implemented yet"));
BX_PANIC(("VFRCZSS_VssWssR: not implemented yet"));
BX_PANIC(("VFRCZSD_VsdWsdR: not implemented yet"));
So now the same single option will choose not only the CPUID flags but also VMX capabilities matching real HW machine.
Removed cpuid of core2_extreme_x9770 from the cpudb. I don't remember its VMX capabilities anyway.
There is another Penryn model in the cpudb - core2_penryn_t9600.
32-bit CPU using Bochs binary compiled with x86-64 support.
The commit also fixes some init.cc issues with initialization of SYSCALL/SYSRET MSR in AMD hosts and also includes code reorg.
SMP emulation. New implementation uses dynamic CPU quantum value and takes
full advantage of the trace cache. Each emulated processor will execute
the whole trace before switching to the next processor.
* It is also safe to use large (up to 16 instructions) quantum values for
the SMP emulation now and improve performance even further.
The same merge also completely fixes SF bug :
[3312237] stepN command might be not working properly
Handlers chaining speedups are also supported with SMP emulation now.
- SYSCALL/SYSRET: SYSCALL/SYSRET instructions are not supported in legacy mode for Intel processors
- CPUID: CPUID.0x80000001.EDX[11] SYSCALL/SYSRET support should not be reported outside long64 mode if legacy mode SYSCALL/SYSRET is not supported
- Added new CPUDB entry - AMD K6-2 3D proc3essor (Chomper)
with --enable-avx option. When compiled in, AVX still has to be enabled
using .bochsrc CPUID option. AVX2 FMA instructions still not implemented.
- Added support for Bit Manipulation Instructions (BMI) emulation. The BMI
instructions support can be enabled using .bochsrc CPUID option.
feature is enabled by default when configure with --enable-all-optimizations
option, to disable handlers chaining speedups configure with
--disable-handlers-chaining
now the order is going to be:
1. MSRs emulated in Bochs (msr.cc)
2. MSRs emulated in model specific derivative class of cpuid_t
3. MSR can be loaded from msrs.def file
4. MSR is not found. We can fault or ignore based on ignore_bad_msrs option
Change CPUID to generic interface which could be chosen from .bochsrc.
Bochs CPU emulation will enable/disable features (like instruction sets) according to CPUID that is selected.
TODO: Add database of CPUID from real hardware CPUs
The patch was posted in mailing list at Thu 6/16/2011.
Desription for CHANGES:
- Memory
- Added new configure option which enables RAM file backing for large guest
memory with a smaller amount host memory, without causing a panic when
host memory is exhausted (patch by Gary Cameron). To enable configure with
--enable-large-ramfile option.
[3298173] Breakpoint on VMEXIT event by Jianan Hao
Patch description:
The patch provides a new command "vmexitbp" to set breakpoint when VM guest exit. The simulation will be stopped before first HOST mode instruction is executed.
Usage:
Type "vmexitbp" in debugger command window to switch it on/off (similar to modebp).
Currently, the patch has no corresponding interface on GUI debugger. Someone may add it if interested.
Allows to significantly cut trace cache miss latenct and find data in victim cahe instead of redoding it
8 entries VC in parallel with direct map 64K entries