Commit Graph

56 Commits

Author SHA1 Message Date
Stanislav Shwartsman
ec1ff39a5f Splitted memory access methods for 32 and 64-bit code.
The 64-bit code got >10% speedup, the 32-bit code also got about 2% because laddr cacluation optimization
2008-05-10 18:10:53 +00:00
Stanislav Shwartsman
62e3728591 preparations for future optimizations - not necessary speedupo now 2008-04-03 17:56:59 +00:00
Stanislav Shwartsman
255d512e29 Organize bxInstruction fields differently 2008-03-31 17:33:34 +00:00
Stanislav Shwartsman
1a59913e2b Fixed BX_INFO message 2008-03-27 21:04:39 +00:00
Stanislav Shwartsman
167c7075fb Use fastcall gcc attribute for all cpu execution functions - this pure "compiler helper" optimization brings additional 2% speedup to Bochs code 2008-03-22 21:29:41 +00:00
Stanislav Shwartsman
a2897933a3 white space cleanup 2008-02-02 21:46:54 +00:00
Stanislav Shwartsman
37fbb82baa Cleanups. Move bxInstruction_c definition to separate file instr.h 2008-01-29 17:13:10 +00:00
Stanislav Shwartsman
d9984bb3a1 Eliminate BxResolve call from the heart of cpu loop and move into instructions that really require this calculation. Yes, it blows the code of EVERY CPU method but it has >15% speedup ! 2008-01-10 19:37:56 +00:00
Stanislav Shwartsman
5d4e32b8da Avoid pointer params for every read_virtual_* except 16-byte SSE and 10-byte x87 reads 2007-12-20 20:58:38 +00:00
Stanislav Shwartsman
b516589e4e Changes in write_virtual_* and pop_* functions -> avoid moving parameteres by pointer 2007-12-20 18:29:42 +00:00
Stanislav Shwartsman
903f6dea35 Split setCC functions - makes code faster and simpler 2007-12-14 21:29:36 +00:00
Stanislav Shwartsman
976af56f6d Split bit.cc to 4 files - new files bit16/32/64.cc 2007-12-07 10:59:18 +00:00
Stanislav Shwartsman
7ca78b88e9 configure/compile changes + small optimizations 2007-12-01 16:45:17 +00:00
Stanislav Shwartsman
aa00d33640 BITSCAN lazy flags evaluation optimization 2007-11-29 21:52:16 +00:00
Stanislav Shwartsman
c51888f43f Split last BxLockable opcodes -> this allows to eliminate mod==0xc0 check from fetchdecode of every instruction
reduce ACPU.CC dependencies - now that file doesn't depend of CPU
2007-11-25 20:22:10 +00:00
Stanislav Shwartsman
3daa468c02 Fixed comments in bit.cc
Revert back lock prefix changes in fetchdecode - not all lockable instructions are splitted yet ;(
2007-11-23 16:37:06 +00:00
Stanislav Shwartsman
8e909508c8 a bit faster SETL/SETNL code 2007-11-21 22:42:40 +00:00
Stanislav Shwartsman
506dc3d963 Optimize 64-bit fetchdecode prefix handling
Deparecated set_FLAG() method, setB_FLAG() method was used everywhere
Rename setB_FLAG to set_FLAG, so set_FLAG() will must receive 0/1 inly
2007-11-20 23:00:44 +00:00
Stanislav Shwartsman
1e0db62984 bit.cc speedup (small) 2007-11-18 20:21:34 +00:00
Stanislav Shwartsman
cdc9a09090 Split more opcodes 2007-11-18 18:24:46 +00:00
Stanislav Shwartsman
83f6eb6945 Changes copyrights for the files I wrote :)
Also split EqId G1 group for x86-64
2007-11-17 23:28:33 +00:00
Stanislav Shwartsman
d9e58bd598 split11b on opcode tables level - split almost eevery splittable instruction
will be continued
2007-11-17 12:44:10 +00:00
Stanislav Shwartsman
42fdd8a3a1 During Bochs benchmarking I figured out that hostasm actually slow down the emulation ... so remove this ugly code which also doesn't help :)
speedup flags update for some instructions - idea was taken from DT patch by h.johansson
2007-10-21 22:07:33 +00:00
Stanislav Shwartsman
dbb91069f4 Added SSE4_2 instructions emulation 2007-10-01 19:59:37 +00:00
Stanislav Shwartsman
0dc4badfbb Added SSE4A and SSE4_2 to disassembler
Implemented POPCNT instruction
2007-09-19 19:38:10 +00:00
Stanislav Shwartsman
8221fa6838 - Fixed zero upper 32-bit part of GPR in x86-64 mode
- CMOV_GdEd should zero upper 32-bit part of GPR register even if the
    'cmov' condition was false !
2007-01-26 22:12:05 +00:00
Stanislav Shwartsman
aa1a61bfde Add (when needed) or remove (when not needed) x86-64 compilation hack 2006-06-26 20:28:00 +00:00
Stanislav Shwartsman
b0e49a9a05 Warn if somebody used BSWAP with 16-bit opsize (behavior undefined) 2006-05-07 20:56:40 +00:00
Stanislav Shwartsman
20b14aefa6 Fix in BSWAP 64-bit mode - allow to use additional R8-R15 registers
Also fixed code duplication story with BSWAP instruction
2006-05-07 18:58:47 +00:00
Stanislav Shwartsman
5c3fba4399 Support access to SMRAM in memory object
Cleanup in CPU code
2006-03-26 18:58:01 +00:00
Stanislav Shwartsman
7b6c2587a9 Now devices could be compiled separatelly from CPU
Averything that required cpu.h include now has it explicitly and there are a lot of files not dependant by CPU at all which will compile a lot faster now ...
2006-03-06 22:03:16 +00:00
Stanislav Shwartsman
8c783bc329 Fixed cpu_mode corruption in x86-64 mode
Removed all potentially unsafe and duplicated code in setFLAGS methods to avoid such kind of problems in future
2005-09-29 17:32:32 +00:00
Stanislav Shwartsman
2b5a812674 Split last bit.cc methods according to os16/32/64 2005-07-25 04:18:20 +00:00
Stanislav Shwartsman
c026a90779 Unify coding style in CPU methods
NO AFFECT ON EMULATION RESULTS
2005-05-20 20:06:50 +00:00
Stanislav Shwartsman
760a195c9d * Fix LOCK prefix handling for x86-64
* Split BT*_EvGv functions to 3 different function according to exec mode
2004-09-17 20:47:19 +00:00
Stanislav Shwartsman
77b3886f8b Cleanup and optimize 2004-08-28 08:41:46 +00:00
Stanislav Shwartsman
eefdcece6f Speed-up BTC instruction
Speed-up SAR instruction with implementing lazy-flags for it
2004-08-15 20:12:05 +00:00
Stanislav Shwartsman
1732e54baa Fixed undocumented flags handling for some instructions.
Bugfix for CF flag handling for SHL64 instruction
2004-08-14 19:34:02 +00:00
Stanislav Shwartsman
a1f830d429 Implemented FAST lazy flags version for logic instructions.
Small code cleanup/simplification for others.
2004-08-13 20:00:03 +00:00
Stanislav Shwartsman
8f0cf91fff This commit is the first commit in long series of changes the have several purposes:
1. Review and commit patch

	[ 896733 ] Lazy flags, for more instructions, only 1 src op

   May be partially, but I hope to get all ideas from patch in

2. Get Bochs speedup after lazy flags optimization

3. Most important for me: improve correctness of emulation by handling several
   undocumented EFLAGS modifications. And finally pass

	UFLAGS - Undefined Flags Test v 3.0
	Copyright (C) Potemkin's Hackers Group (PHG) 1989,1995

   The test still fails on > 50% of its checks.
2004-08-09 21:28:47 +00:00
Stanislav Shwartsman
ddc6c33887 BX_PANIC replaced by BX_INFO 2004-07-08 20:15:23 +00:00
Stanislav Shwartsman
49c6fd55e4 Remove redundant ifdefs 2004-01-10 19:45:53 +00:00
Stanislav Shwartsman
b84f0bd0f2 This was not a cleanup. Those macros were intentionally
there to offer a way to substitute more efficient code
to do the RMW cases.  At the moment, they just map to
the normal functions.

Sorry, restored the previous version ...
2002-10-25 18:26:29 +00:00
Stanislav Shwartsman
a0c1fd60e6 Just little cleanup of macro duplicating an existing code 2002-10-25 17:23:34 +00:00
Kevin Lawton
b742ccec7e Changed eflags accessors for get_?F() to use (val32 & (1<<N)) instead
of (1 & (val32>>N)), and added a getB_?F() accessor for special
  cases which need a strict binary value (exactly 0 or 1).  Most
  code only needed a value for logical comparison.  I modified the
  special cases which do need a binary number for shifting and
  comparison between flags, to use the special getB_?F() accessor.

Cleaned up memory.cc functions a little, now that all accesses
  are within a single page.

Fixed a (not very likely encountered) bug in fetchdecode.cc (and
  fetchdecode64.cc) where a 2-byte opcode starting with a prefix
  starts at the last offset on a page.  There were no checks
  on the segment overrides for a boundary condition.  I added them.

The eflags enhancements added just a tiny bit of performance.
2002-09-22 18:22:24 +00:00
Kevin Lawton
402d02974d Moved the EFLAGS.RF check and clearing of inhibit_mask code
in cpu.cc out of the main loop, and into the asynchronous
events handling.  I went through all the code paths, and
there doesn't seem to be any reason for that code to be
in the hot loop.

Added another accessor for getting instruction data, called
modC0().  A lot of instructions test whether the mod field
of mod-nnn-rm is 0xc0 or not, ie., it's a register operation
and not memory.  So I flag this in fetchdecode{,64}.cc.
This added on the order of 1% performance improvement for
a Win95 boot.

Macroized a few leftover calls to Write_RMV_virtual_xyz()
that didn't get modified in the x86-64 merge.  Really, they
just call the real function for now, but I want to have them
available to do direct writes with the guest2host TLB pointers.
2002-09-20 03:52:59 +00:00
Kevin Lawton
6723ca9bf4 Moved more separate fields in the bxInstruction_c into bitfields
with accessors.  Had to touch a number of files to update the
access using the new accessors.

Moved rm_addr to the CPU structure, to slim down bxInstruction_c
and to prevent future instruction caching from getting sprayed
with writes to individual rm_addr fields.  There only needs to
be one.  Though need to deal with instructions which have
static non-modrm addresses, but which are using rm_addr since
that will change.

bxInstruction_c is down to about 40 bytes now.  Trying to
get down to 24 bytes.
2002-09-18 05:36:48 +00:00
Kevin Lawton
07b0df2a8a Updated accessing of modrm/sib addressing information to
use accessors.  This lets me work on compressing the
size of fetch-decode structure (now called bxInstruction_c).

I've reduced it down to about 76 bytes.  We should be able
to do much better soon.  I needed the abstraction of the
accessors, so I have a lot of freedom to re-arrange things
without making massive future changes.

Lost a few percent of performance in these mods, but my
main focus was to get the abstraction.
2002-09-17 22:50:53 +00:00
Kevin Lawton
a6ea6659ab (cpu64) Merged bit.cc. 2002-09-15 03:38:52 +00:00
Kevin Lawton
491035fcb2 I extended the guest-to-host TLB acceleration across the
Read-Modify-Write instructions.  The first read phase stores
the host pointer in the "pages" field if a direct use pointer
is available.  The Write phase first checks if a pointer was
issued and uses it for a direct write if available.

I chose the "pages" field since it needs to be checked by the
write_RMW_virtual variants anyways and thus needs to be
cached anyways.

Mostly the mods where to access.cc, but I did also macro-ize
the calls to write_RMW_virtual...() in files which use it
and cpu.h.  Right now, the macro is just a straight pass-through.
I tried expanding it to a quick initial check for the pointer
availability to do the write in-place, with a function call
as a fall-back.  That didn't seemed to matter at all.

Booting is not helped by this really.  The upper bound of
the gain is 5 or 6%, and that's only if you have a loop that
looks like:

label:
  add [eax], ebx   ;; mega read-modify-write instruction
  jmp label        ;; intensive loop.
2002-09-06 21:54:58 +00:00