With this coding style each instruction could be implemented separatelly even not together with current Bochs FPU emulator.
Step-by-step I am going to transfer all FPU instructions from current Bochs FPU emulator to new style and remove an old bugged emulator.
Anyway, now I could implement all currently missed FPU instructions without hacking wm-fpu-emu.
* Fully supports
* MMX/XMM/3DNOW instruction sets
* FPU instruction
* SSE3 extensions
currently only 16/32 bit mode bug anyway, it is much better that old one ;)
Instructions MOV_CxRx and MOV_RxCx are not supported in v8086 mode according to Intel manuals.
Also these instructions are treated as register-to-register regardless to MODRM byte fields (according to AMD manuals)
Also commit fix for MOV_EwSw by Kevin
PNI could be enabled by setting BX_SUPPORT_PNI in config.h
After the feature will be fully validation I'll also add configure option.
The implemntation is ~complete. I've missed only three FPU new opcodes of FUSTTP instruction and MONITOR/WAIT instructions.
Enjoy ! ;)
From the author (see bug #663320) :
In the code there is a check to verify that an IO bitmap
is defined (io_base > BX_CPU_THIS_PTR
tr.cache.u.tss386.limit_scaled) but there is no check if
an accessed IO port's address actually falls within the
defined limit of the TSS segment. So if I define an IO
bitmap with 100 entries, port 101 may or may not be
allowed depending on whatever bytes follow the TSS in
memory
[x] fixed bug in int01 (opcode 0xF1) emulation
[x] fixed bug in x86 debugger with dr0-dr3 registers
Committed disassembler bugfix from Dirk Thierbach:
[x] fixed bug in relative addresses in Jmp, Jcc, Call and so on
* changed all %ll format descriptions to FMT_LL macro so that
Microsoft Visual C works correctly (it uses %I64)
* missing type conversions added
* cdrom.cc: variable types for win32 fixed
* removed some unused variables in eth_win32.cc and harddrv.cc
* added missing includes in make_cmos_image.c and niclist.c
check.
Commented out a number of instances of invalidate_prefetch_q(),
for branches which do not change CS since the EIP window mechanism
takes care of validating that EIP lands in the current page or not
in the main cpu loop anyways.
Fixed a couple cases (v8086 mode and real mode) of loading CS where
the EIP page window was not invalidated in segment_ctrl_pro.cc.
That may fix some aliasing problems reported before (OS2).
Notes from the author:
Here is another one of my speed up patches. Unlike my previous speedups
this one will help more platforms than just X86. It cleans up the Data
Xfer instructions. Since the Data Xfer instructions are the most often
executed instructions it gives a noticable boost in speed. The basic
optimization technique was to eliminate intermediate variables and pass
a pointer to the final destination or original source to the
read_virtual_whatever and the write_virtual_whatever functions.
According to the Intel manuals:
The LOCK prefix can be prepended only to the following instructions
and only to those forms of the instructions where the destination
operand is a memory operand: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG,
CMPXCH8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. If
the LOCK prefix is used with one of these instructions and the source
operand is a memory operand, an undefined opcode exception (#UD) will
be generated. An undefined opcode exception will also be generated if
the LOCK prefix is used with any instruction not in the above list.
Checking of the LOCK prefix done in fetchDecode state and not overloads
Bochs's execution.
- it works only on x86 with gcc2.95+
- uses the GCC function atribute "regparm(n)" to declare that certain
functions use the register calling convention
- performance improvement is about 6%
1) fixed the type of "hostPageAddr" and associated typecasts.
2) fixed the type of "pages" and associated typecasts (overloaded variable)
3) patch to cpu.cc to calculate "eipPageBias" correctly in 64 bit mode
1) fixed some errors running 32 bit compat mode. IMPORTANT FIX.
2) added IST processing (uses IST1-IST7 in 64 bit TSS)
3) cosmetic - debugging stuff to console.
a minor optimization. Also in transition from compat mode to 64 bit mode (e.g. interrupt to inner
privelege with mode change), SS may not be properly defined - this avoids other messiness.
* renamed CPU_ID to BX_CPU_ID.
with this new name there is no possibility for name contentions and BX_CPU_ID
definition could be moved out to NEED_CPU_REG_SHORTCUTS block
* returned back `unsigned BX_CPU::which_cpu(void)` function
* added BX_CPU_ID parameter for
BX_INSTR_PHY_READ(a20addr, len);
BX_INSTR_PHY_WRITE(a20addr, len);
now it will be
BX_INSTR_PHY_READ(cpu_id, a20addr, len);
BX_INSTR_PHY_WRITE(cpu_id, a20addr, len);
> CPU_ID is defined as
> #define CPU_ID (BX_CPU_THIS_PTR local_apic.get_id())
> This is not true when the APIC name is changed (true in Linux). Please
> change this to:
> #define CPU_ID (BX_CPU_THIS - BX_CPU(0))
The 64 bit variant of MOVNTI was not decoded. The proper fix for this is to work on
fetchdecode64.cc to call a 64 bit variant of SSE instructions or fail it with a
invalid op. A careful check needs to be done with the AMD manuals to determine if
there are any other SSE instructions that have a special 64 bit decoding.
PSRAW_PqQq (MMX)
PSRAD_PqQq (MMX)
PSRAW_PqIb (MMX)
PSRAD_PqIb (MMX)
PSRAW_VdqWdq (SSE)
PSRAD_VdqWdq (SSE)
PSRAW_PdqIb (SSE)
PSRAD_PdqIb (SSE)
When register was shifted by 0 bits the result produced was incorrect.
Now Bochs fully passes MMX test provided by
Hentai Yagi [hentai_yagi@yahoo.com.au] !
sash to run.
1) fixed fetchdecode64.cc to fix the operand size at 64 bits in long mode for moves
to/from CRx
2) minor patches to sse2.cc to fix unimplemented and 64 bit variants of sse2
instructions.
Because source files were added/removed it would require an update
of the windows and macos project files, so I want to wait until after 2.0.
M Makefile.in 1.51 back to 1.50
M cpu.h 1.121 back to 1.120
M fetchdecode.cc 1.37 back to 1.36
M fetchdecode64.cc 1.33 back to 1.32
M sse.cc 1.17 back to 1.16
A sse2.cc 1.27 back to 1.26 (added back)
R sse_move.cc removed
R sse_pfp.cc removed
- to bring these changes back again, all we have to do is
"cvs update -j tmp-before1 -j tmp-after1"
sse.cc -> general SSE stuff and SSE integer (MMX extensions)
sse_move.cc -> memory transfer and shuffle opcodes
sse_pfp.cc -> packed floating point operations
Description/justification:
Endian Host byte order Guest (x86) byte order
======================================================
Little FFFFFFFFEEAAAAAA FFFFFFFFEEAAAAAA
Big AAAAAAEEFFFFFFFF FFFFFFFFEEAAAAAA
F - fraction/mmx
E - exponent
A - aligment
debugger, SMP, and x86-64. A few macros were missing the CPU_ID argument,
and a few passed nonexistent variables to the instrumentation macros.
- I changed CPU_ID into a plain old macro instead of an inline call to a
trivial which_cpu() function, and removed which_cpu().
Modified Files:
cpu/cpu.h cpu/ctrl_xfer64.cc debug/dbg_main.cc
In bx_cpu_c::reset method I set bx_cpu->async_event to 2
so execution in the cpu_loop gets stopped early.
Previously, async_event was set to 0, and with repeatable
instructions, after reset, eip was incremented by the instruction
length, so execution would resume at 0xffffX (X being >0, the current
instruction length).
In halt state I check now for reset with async_event is 2, so
reset works also when the cpu is halted. (update to Peter change)
I hope I fixed this the right way, please report any strange behaviour.
For a whole lot of configure options, I put #if...#endif around code that
is specific to the option, even in files which are normally only compiled
when the option is on. This allows me to create a MS Visual C++ 6.0
workspace that supports many of these options. The workspace will basically
compile every file all the time, but the code for disabled options will
be commented out by the #if...#endif.
This may one day lead to simplification of the Makefiles and configure
scripts, but for the moment I'm leaving Makefiles and configure scripts
alone.
Affected options:
BX_SUPPORT_APIC (cpu/apic.cc)
BX_SUPPORT_X86_64 (cpu/*64.cc)
BX_DEBUGGER (debug/*)
BX_DISASM (disasm/*)
BX_WITH_nameofgui (gui/*)
BX_SUPPORT_CDROM (iodev/cdrom.cc)
BX_NE2K_SUPPORT (iodev/eth*.cc, iodev/ne2k.cc)
BX_SUPPORT_APIC (iodev/ioapic.cc)
BX_IODEBUG_SUPPORT (iodev/iodebug.cc)
BX_PCI_SUPPORT (iodev/pci*.cc)
BX_SUPPORT_SB16 (iodev/sb*.cc)
Modified Files:
cpu/apic.cc cpu/arith64.cc cpu/ctrl_xfer64.cc
cpu/data_xfer64.cc cpu/fetchdecode64.cc cpu/logical64.cc
cpu/mult64.cc cpu/resolve64.cc cpu/shift64.cc cpu/stack64.cc
debug/Makefile.in debug/crc.cc debug/dbg_main.cc debug/lexer.l
debug/linux.cc debug/parser.c debug/parser.y
disasm/dis_decode.cc disasm/dis_groups.cc gui/amigaos.cc
gui/beos.cc gui/carbon.cc gui/macintosh.cc gui/rfb.cc
gui/sdl.cc gui/term.cc gui/win32.cc gui/wx.cc gui/wxdialog.cc
gui/wxmain.cc gui/x.cc iodev/cdrom.cc iodev/eth.cc
iodev/eth_arpback.cc iodev/eth_fbsd.cc iodev/eth_linux.cc
iodev/eth_null.cc iodev/eth_packetmaker.cc iodev/eth_tap.cc
iodev/eth_tuntap.cc iodev/eth_win32.cc iodev/ioapic.cc
iodev/iodebug.cc iodev/ne2k.cc iodev/pci.cc iodev/pci2isa.cc
iodev/sb16.cc iodev/soundlnx.cc iodev/soundwin.cc
stack limits. This is needed for EROS, and probably for L4, as both
rely on this SS fault (and the corresponding GP fault) to trigger the
switch from small address spaces to large address spaces. The
push_16() code was already correct, and I find the inconsistency a bit
odd.
I'm not 100% sure about the push_64() change, so I made the change
with a comment but left it a BX_PANIC() rather than switching it to
BX_INFO. I'll ask Peter momentarily to have a look and let me know.
While I was added, changed the push_16() BX_INFO message to be
consistent with the others -- all now say 'push outside stack limits'.
- Now compiles for plain ia-32
- Fixed some printf formatting for ia32 only.
- Update to latest Win32 DLL
- Added an ICEBP (Undoc 0xF8, INT 01) facility.
- updated to use latest VGA refresh routine
a control panel, but now we're calling it a text configuration interface.
Modified:
.bochsrc Makefile.in bochs.h main.cc cpu/Makefile.in
debug/Makefile.in disasm/Makefile.in fpu/Makefile.in
gui/Makefile.in iodev/Makefile.in memory/Makefile.in