Commit Graph

179 Commits

Author SHA1 Message Date
Kevin Lawton
0cd7346b9c - Added an instruction cache. Size is fixed for the moment,
but if you hand edit cpu/cpu.h, and change BxICacheEntries,
  you can try different sizes.  I'll make this more flexible
  with configure.  For now, use "--enable-icache" with no parameters.

- Modified fetchdecode.cc/fetchdecode64.cc just enough so that
  instructions which encode a direct address now use a memory
  resolution function which just sticks the immediate address
  into rm_addr.  With cached instructions we need this.
2002-09-19 19:17:20 +00:00
Kevin Lawton
6723ca9bf4 Moved more separate fields in the bxInstruction_c into bitfields
with accessors.  Had to touch a number of files to update the
access using the new accessors.

Moved rm_addr to the CPU structure, to slim down bxInstruction_c
and to prevent future instruction caching from getting sprayed
with writes to individual rm_addr fields.  There only needs to
be one.  Though need to deal with instructions which have
static non-modrm addresses, but which are using rm_addr since
that will change.

bxInstruction_c is down to about 40 bytes now.  Trying to
get down to 24 bytes.
2002-09-18 05:36:48 +00:00
Kevin Lawton
07b0df2a8a Updated accessing of modrm/sib addressing information to
use accessors.  This lets me work on compressing the
size of fetch-decode structure (now called bxInstruction_c).

I've reduced it down to about 76 bytes.  We should be able
to do much better soon.  I needed the abstraction of the
accessors, so I have a lot of freedom to re-arrange things
without making massive future changes.

Lost a few percent of performance in these mods, but my
main focus was to get the abstraction.
2002-09-17 22:50:53 +00:00
Kevin Lawton
2eb34d13a6 Added a configure option, "--enable-pae". The x86-64 enhancements
already had Physical Address Extensions support - we can now
compile for PAE support for x86-32 mode as well.
2002-09-16 21:55:57 +00:00
Kevin Lawton
cc6d44f9c6 Fixed cpu/paging.cc for non-global-pages support compile. 2002-09-16 21:10:31 +00:00
Kevin Lawton
a851db3e36 Ooops, cpu/paging.cc was hosed on that last commit. Fixed that. 2002-09-16 21:01:55 +00:00
Kevin Lawton
5eb4e247bc Merged the final filed ("paging.cc") from Peter Tattam's x86-64
enhancement to bochs.  You can now configure with
--enable-guest2host-tlb.

Force the support of big pages (PSE) when x86-64 is configured.

Reverted back to only one kind of TLB entry style, since everything
is ported.

Fixed one bug in io.cc with as_64 and the index registers.
There are others, as noticed by Peter.
2002-09-16 20:23:38 +00:00
Bryce Denney
9eb006ee66 - cr4 has turned into a struct. fix two cases where it was used as if it
was still a number by calling the accessor.
2002-09-16 13:14:06 +00:00
Kevin Lawton
08576b24be I implemented Global pages. Though, I haven't tested them. :^)
You need to use '--enable-global-pages' to configure in support.
If you have something to boot that uses them, give them a
spin.  Really the were introduced for PPro and above, but
I haven't put in any limits.  CPUID and CR4 report the proper
bits when configured, regardless of --enable-cpu-level at the
moment.
2002-09-10 03:52:32 +00:00
Kevin Lawton
425ad824c0 I changed the TLB entry from 3 dwords to 4, and (when you compile
with GCC) align them with the GCC special alignment attribute.
Since there was then one available field, I split the protection
attributes and native host pointers into their own fields.

Before, with 3 dwords per TLB entry, some entries (about 3/8)
were spanning two processor cache lines (assuming a 32-byte
cache line).  Now, they all fit within one cache line.

Knocked about 1.4% off Win95 boot time, probably more off normal
software runs.
2002-09-10 00:01:01 +00:00
Kevin Lawton
51c93e12a1 The paging unit gets notified of all CR0/CR3/CR4 updates so
it can decide how to proceed.  Some of those bits are necessary
to make TLB invalidation decisions.  INVLPG doesn't cause
a whole TLB flush anymore, just one page.  Some of the
current CPU behaviours model the P6, especially on CR0
reloads.  Earlier processors kept some pre-change pre-fetched
instructions until a branch.  We could probably model that
by setting a flag, and letting the revalidate_prefetch_q
function cause serialization.

The TLB flush code only invalidates entries which are not
already invalidated for the case where the TLB invalidation
ID trick is not in use.
2002-09-07 05:21:28 +00:00
Gregory Alexander
4f6039f533 Macroize BX_TLB_QUICK_INVALIDATE code.
Kevin Lawton says he doesn't get a performance benefit.

I'm not sure if I do.  Either way, the difference isn't
very large.

This code may get removed if it turns out to be useless.
2002-09-06 19:21:55 +00:00
Gregory Alexander
afdccad36c Oops, had to fix a bunch of parentheses.
Why | has precedence under == (or is it =)
I still don't understand.
2002-09-06 16:29:49 +00:00
Gregory Alexander
1c3ae99300 Speed-up for TLB invalidates as proposed by Peter Tattam.
I had been planning on this same thing in a similar form
for the I$, so this made a lot of sense, and was easy to
implement.
2002-09-06 14:58:56 +00:00
Kevin Lawton
f29f9ef021 Fixed Big-endian case of --enable-guest2host-tlb. I macro'ized the
direct reads/writes from native variables to the x86 (guest)
memory image.  Look at the end of bochs.h.  Don't know if that's
the right place to put them, but here you can extend these
macros to platform-specific asm() code if you like, or just
use the generic C code I supplied.  Some platforms have special
instructions for byte-order swapping etc.  Also, you can't
make any assumptions about the alignment of the pointers
passed.
2002-09-05 04:56:11 +00:00
Kevin Lawton
83eb7d4199 Updated comments with info on the new TLB permissions storage
and strategy, so that others can understand it.
2002-09-05 03:09:59 +00:00
Kevin Lawton
f0c9896964 Now, when you compile with --enable-guest2host-tlb, non-paged
mode uses the notion of the guest-to-host TLB.  This has the
benefit of allowing more uniform and streamlined acceleration
code in access.cc which does not have to check if CR0.PG
is set, eliminating a few instructions per guest access.
Shaved just a little off execution time, as expected.

Also, access_linear now breaks accesses which span two pages,
into two calls the the physical memory routines, when paging
is off, just like it always has for paging on.  Besides
being more uniform, this allows the physical memory access
routines to known the complete data item is contained
within a single physical page, and stop reapplying the
A20ADDR() macro to pointers as it increments them.
Perhaps things can be optimized a little more now there too...
I renamed the routines to {read,write}PhysicalPage() as
a reminder that these routines now operate on data
solely within one page.

I also added a little code so that the paging module is
notified when the A20 line is tweaked, so it can dump
whatever mappings it wants to.
2002-09-05 02:31:24 +00:00
Kevin Lawton
d07c1c0bb0 I rehashed the way the paging code stores protection bits,
so that a compare of the current access could be done more
efficiently against the cached values, both in the normal
paging routines, and in the accelerated code in access.cc.

This cut down the amount of code path needed to get to
direct use of a host address nicely, and speed definitely
got a boost as a result, especially if you use the
--enable-guest2host-tlb option.

The CR0.WP flag was a real pain, because it imparts
a complication on the way protections work.  Fortunately
it's not a high-change flag, so I just base the new
cached info on the current CR0.WP value, and dump
the TLB cache when it changes.
2002-09-04 08:59:13 +00:00
Bryce Denney
765f21fbc3 - disable #warning on MSVC++ because it doesn't understand it 2002-09-03 15:56:24 +00:00
Kevin Lawton
3a5f338419 Integrated patches for:
- Paging code rehash.  You must now use --enable-4meg-pages to
    use 4Meg pages, with the default of disabled, since we don't well
    support 4Meg pages yet.  Paging table walks model a real CPU
    more closely now, and I fixed some bugs in the old logic.
  - Segment check redundancy elimination.  After a segment is loaded,
    reads and writes are marked when a segment type check succeeds, and
    they are skipped thereafter, when possible.
  - Repeated IO and memory string copy acceleration.  Only some variants
    of instructions are available on all platforms, word and dword
    variants only on x86 for the moment due to alignment and endian issues.
    This is compiled in currently with no option - I should add a configure
    option.
  - Added a guest linear address to host TLB.  Actually, I just stick
    the host address (mem.vector[addr] address) in the upper 29 bits
    of the field 'combined_access' since they are unused.  Convenient
    for now.  I'm only storing page frame addresses.  This was the
    simplest for of such a TLB.  We can likely enhance this.  Also,
    I only accelerated the normal read/write routines in access.cc.
    Could also modify the read-modify-write versions too.  You must
    use --enable-guest2host-tlb, to try this out.  Currently speeds
    up Win95 boot time by about 3.5% for me.  More ground to cover...
  - Minor mods to CPUI/MOV_CdRd for CMOV.
  - Integrated enhancements from Volker to getHostMemAddr() for PCI
    being enabled.
2002-09-01 20:12:09 +00:00
Bryce Denney
7e04c23d2f - check in Mike Reiker's 4meg page code from a patch that he submitted last
November 17.
2002-06-19 15:49:07 +00:00
Bryce Denney
daf2a9fb55 - add RCS Id to header of every file. This makes it easier to know what's
going on when someone sends in a modified file.
2001-10-03 13:10:38 +00:00
Bryce Denney
284178479b - apply patch "patch.ifdef-paging-tlb" 2001-08-10 18:42:24 +00:00
Todd T.Fries
2bbb1ef8eb strip '\n' from BX_{INFO,DEBUG,ERROR,PANIC}
don't need it, moved the output of it into the general io functions.
saves space, as well as removes the confusing output if a '\n' is left off
2001-05-30 18:56:02 +00:00
Bryce Denney
49664f7503 - parts of the SMP merge apparantly broke the debugger and this revision
tries to fix it.  The shortcuts to register names such as AX and DL are
  #defines in cpu/cpu.h, and they are defined in terms of BX_CPU_THIS_PTR.
  When BX_USE_CPU_SMF=1, this works fine.  (This is what bochs used for
  a long time, and nobody used the SMF=0 mode at all.)  To make SMP bochs
  work, I had to get SMF=0 mode working for the CPU so that there could
  be an array of cpus.

  When SMF=0 for the CPU, BX_CPU_THIS_PTR is defined to be "this->" which
  only works within methods of BX_CPU_C.  Code outside of BX_CPU_C must
  reference BX_CPU(num) instead.
- to try to enforce the correct use of AL/AX/DL/etc. shortcuts, they are
  now only #defined when "NEED_CPU_REG_SHORTCUTS" is #defined.  This is
  only done in the cpu/*.cc code.
2001-05-24 18:46:34 +00:00
Bryce Denney
e61d00351f - merged BRANCH-smp-bochs into main branch. For details see comments
in BRANCH-smp-bochs revisions.
- The general task was to make multiple CPU's which communicate
  through their APICs.  So instead of BX_CPU and BX_MEM, we now have
  BX_CPU(x) and BX_MEM(y).  For an SMP simulation you have several
  processors in a shared memory space, so there might be processors
  BX_CPU(0..3) but only one memory space BX_MEM(0).  For cosimulation,
  you could have BX_CPU(0) with BX_MEM(0), then BX_CPU(1) with
  BX_MEM(1).  WARNING: Cosimulation is almost certainly broken by the
  SMP changes.
- to simulate multiple CPUs, you have to give each CPU time to execute
  in turn.  This is currently implemented using debugger guards.  The
  cpu loop steps one CPU for a few instructions, then steps the
  next CPU for a few instructions, etc.
- there is some limited support in the debugger for two CPUs, for
  example printing information from each CPU when single stepping.
2001-05-23 08:16:07 +00:00
Todd T.Fries
bdb89cd364 merge in BRANCH-io-cleanup.
To see the commit logs for this use either cvsweb or
cvs update -r BRANCH-io-cleanup and then 'cvs log' the various files.

In general this provides a generic interface for logging.

logfunctions:: is a class that is inherited by some classes, and also
.   allocated as a standalone global called 'genlog'.  All logging uses
.   one of the ::info(), ::error(), ::ldebug(), ::panic() methods of this
.   class through 'BX_INFO(), BX_ERROR(), BX_DEBUG(), BX_PANIC()' macros
.   respectively.
.
.   An example usage:
.     BX_INFO(("Hello, World!\n"));

iofunctions:: is a class that is allocated once by default, and assigned
as the iofunction of each logfunctions instance.  It is this class that
maintains the file descriptor and other output related code, at this
point using vfprintf().  At some future point, someone may choose to
write a gui 'console' for bochs to which messages would be redirected
simply by assigning a different iofunction class to the various logfunctions
objects.

More cleanup is coming, but this works for now.  If you want to see alot
of debugging output, in main.cc, change onoff[LOGLEV_DEBUG]=0 to =1.

Comments, bugs, flames, to me: todd@fries.net
2001-05-15 14:49:57 +00:00
Bryce Denney
a6fef54678 - update copyright dates to 2001 for all mandrake headers
- for bochs files with other header, replaced with current mandrake header
2001-04-10 02:20:02 +00:00
cvs
beff63eb32 - entered original Bochs snapshot bochs-2000_0325a.tar.gz from
ftp.bochs.com
2001-04-10 01:04:59 +00:00