Commit Graph

133 Commits

Author SHA1 Message Date
Bryce Denney
41c3bb736d - remove extraneous newline 2002-09-10 18:56:36 +00:00
Kevin Lawton
08576b24be I implemented Global pages. Though, I haven't tested them. :^)
You need to use '--enable-global-pages' to configure in support.
If you have something to boot that uses them, give them a
spin.  Really the were introduced for PPro and above, but
I haven't put in any limits.  CPUID and CR4 report the proper
bits when configured, regardless of --enable-cpu-level at the
moment.
2002-09-10 03:52:32 +00:00
Kevin Lawton
112bf27f29 Fixed bug in tasking.cc found by Scott Duplichan. When paging
if off, we were still reading CR3 from the TSS and reloading
it!  This was causing problems with a DOS extender.  When
paging is turned back on, CR3 would be incorrect.
2002-09-10 01:39:40 +00:00
Kevin Lawton
425ad824c0 I changed the TLB entry from 3 dwords to 4, and (when you compile
with GCC) align them with the GCC special alignment attribute.
Since there was then one available field, I split the protection
attributes and native host pointers into their own fields.

Before, with 3 dwords per TLB entry, some entries (about 3/8)
were spanning two processor cache lines (assuming a 32-byte
cache line).  Now, they all fit within one cache line.

Knocked about 1.4% off Win95 boot time, probably more off normal
software runs.
2002-09-10 00:01:01 +00:00
Kevin Lawton
59d00a46a3 Fixed two calls to dtranslate_linear in paging.cc to use
BX_READ not 0.  BX_READ was 10.  While I was at it, I did
change BX_{READ,WRITE,RW} to {0,1,2} rather than {10,11,12}
in case that helps optimize code.

There may be more paging checks we should do before changing
any state, to avoid receiving a page fault in the middle.
I put some extra comments in there.
2002-09-09 21:59:10 +00:00
uid94540
293cbc01ea Got rid of very old BX_SUPPORT_TASKING define. That originated
way back when I first added paging support.
2002-09-09 19:48:58 +00:00
Kevin Lawton
1e22357b06 Very small #ifdef mods so that all the static functions would
be compiled out, when MMX is not enabled for a compile.  Eliminates
the unused warnings from the compiler.
2002-09-09 17:13:13 +00:00
Kevin Lawton
414e97bc32 Enhanced the repeat IO accelerations (enabled by --enable-repeat-speedups)
to request bulk IO operations to IO devices which are bulk IO aware.
Currently, I modified only harddrv.cc to be aware.  I added some
fields to the bx_devices_c class for the IO instructions to
place requests and receive responses from the IO device emulation.
Devices except the hard drive, don't monitor these fields so they
respond as normal.  The hard drive now monitors these fields for
bulk requests, and if enabled, it memcpy()'s data straight from
the disk buffer to memory.  This eliminates numerous inp/outp calling
sequences per disk sector.

I used the fields in bx_devices_c so that I would not have to
disrupt most IO device modules.  Enhancements can be made to
other devices if they use high-bandwidth IO via in/out instructions.
2002-09-09 16:56:56 +00:00
Bryce Denney
be659a09b3 - check in Stanislav Shwartsman's patch "bochs-mmx.patch-endian-support".
He writes: Detailed description: MMX instruction set support.
  Also supports BIG_ENDIAN systems. Tested on Solaris and HP1100.
- modified files:
    configure.in cpu/Makefile.in cpu/cpu.h cpu/fetchdecode.cc
    cpu/proc_ctrl.cc fpu/fpu_system.h fpu/wmFPUemu_glue.cc
- added files: cpu/i387.h cpu/mmx.cc
2002-09-09 16:11:25 +00:00
Kevin Lawton
0d7a5fdf3c I rehashed the way the EFLAGS register was stored internally.
All the EFLAGS bits used to be cached in separate fields.  I left
a few of them in separate fields for now - might remove them
at some point also.  When the arithmetic fields are known
(ie they're not in lazy mode), they are all cached in a
32-bit EFLAGS image, just like the x86 EFLAGS register expects.
All other eflags are store in the 32-bit register also, with
a few also mirrored in separate fields for now.

The reason I did this, was so that on x86 hosts, asm() statements
can be #ifdef'd in to do the calculation and get the native
eflags results very cheaply.  Just to test that it works, I
coded ADD_EdId() and ADD_EwIw() with some conditionally compiled
asm()s for accelerated eflags processing and it works.

-Kevin
2002-09-08 04:08:14 +00:00
Kevin Lawton
51c93e12a1 The paging unit gets notified of all CR0/CR3/CR4 updates so
it can decide how to proceed.  Some of those bits are necessary
to make TLB invalidation decisions.  INVLPG doesn't cause
a whole TLB flush anymore, just one page.  Some of the
current CPU behaviours model the P6, especially on CR0
reloads.  Earlier processors kept some pre-change pre-fetched
instructions until a branch.  We could probably model that
by setting a flag, and letting the revalidate_prefetch_q
function cause serialization.

The TLB flush code only invalidates entries which are not
already invalidated for the case where the TLB invalidation
ID trick is not in use.
2002-09-07 05:21:28 +00:00
Kevin Lawton
491035fcb2 I extended the guest-to-host TLB acceleration across the
Read-Modify-Write instructions.  The first read phase stores
the host pointer in the "pages" field if a direct use pointer
is available.  The Write phase first checks if a pointer was
issued and uses it for a direct write if available.

I chose the "pages" field since it needs to be checked by the
write_RMW_virtual variants anyways and thus needs to be
cached anyways.

Mostly the mods where to access.cc, but I did also macro-ize
the calls to write_RMW_virtual...() in files which use it
and cpu.h.  Right now, the macro is just a straight pass-through.
I tried expanding it to a quick initial check for the pointer
availability to do the write in-place, with a function call
as a fall-back.  That didn't seemed to matter at all.

Booting is not helped by this really.  The upper bound of
the gain is 5 or 6%, and that's only if you have a loop that
looks like:

label:
  add [eax], ebx   ;; mega read-modify-write instruction
  jmp label        ;; intensive loop.
2002-09-06 21:54:58 +00:00
Gregory Alexander
4f6039f533 Macroize BX_TLB_QUICK_INVALIDATE code.
Kevin Lawton says he doesn't get a performance benefit.

I'm not sure if I do.  Either way, the difference isn't
very large.

This code may get removed if it turns out to be useless.
2002-09-06 19:21:55 +00:00
Bryce Denney
80a3900b8b - apply a patch I've been working on
- modified files: config.h.in cpu/init.cc debug/dbg_main.cc gui/control.cc
  gui/siminterface.cc gui/siminterface.h gui/wxdialog.cc gui/wxdialog.h
  gui/wxmain.cc gui/wxmain.h iodev/keyboard.cc

----------------------------------------------------------------------
Patch name: patch.wx-show-cpu2
Author: Bryce Denney
Date: Fri Sep  6 12:13:28 EDT 2002

Description:

Second try at implementing the "Debug:Show Cpu" and "Debug:Show
Keyboard" dialog with values that change as the simulation proceeds.
(Nobody gets to see the first try.)  This is the first step toward
making something resembling a wxWindows debugger.

First, variables which are going to be visible in the CI must be
registered as parameters.  For some variables, it might be acceptable
to change them from Bit32u into bx_param_num_c and access them only
with set/get methods, but for most variables it would be a horrible
pain and wreck performance.

To deal with this, I introduced the concept of a shadow parameter.  A
normal parameter has its value stored inside the struct, but a shadow
parameter has only a pointer to the value.  Shadow params allow you to
treat any variable as if it was a parameter, without having to change
its type and access it using get/set methods.  Of course, a shadow
param's value is controlled by someone else, so it can change at any
time.

To demonstrate and test the registration of shadow parameters, I
added code in cpu/init.cc to register a few CPU registers and
code in iodev/keyboard.cc to register a few keyboard state values.
Now these parameters are visible in the Debug:Show CPU and
Debug:Show Keyboard dialog boxes.

The Debug:Show* dialog boxes are created by the ParamDialog class,
which already understands how to display each type of parameter,
including the new shadow parameters (because they are just a subclass
of a normal parameter class).  I have added a ParamDialog::Refresh()
method, which rereads the value from every parameter that it is
displaying and changes the displayed value.  At the moment, in the
Debug:Show CPU dialog, changing the values has no effect.  However
this is trivial to add when it's time (just call CommitChanges!).  It
wouldn't really make sense to change the values unless you have paused
the simulation, for example when single stepping with the debugger.

The Refresh() method must be called periodically or else the dialog
will show the initial values forever.  At the moment, Refresh() is
called when the simulator sends an async event called
BX_ASYNC_EVT_REFRESH, created by a call to SIM->refresh_ci ().

Details:
- implement shadow parameter class for Bit32s, called bx_shadow_num_c.
  implement shadow parameter class for Boolean, called bx_shadow_bool_c.
  more to follow (I need one for every type!)
- now the simulator thread can request that the config interface refresh
  its display.  For now, the refresh event causes the CI to check every
  parameter it is watching and change the display value.  Later, it may
  be worth the trouble to keep track of which parameters have actually
  changed.  Code in the simulator thread calls SIM->refresh_ci(), which
  creates an async event called BX_ASYNC_EVT_REFRESH and sends it to
  the config interface.  When it arrives in the wxWindows gui thread,
  it calls RefreshDialogs(), which calls the Refresh() method on any
  dialogs that might need it.
- in the debugger, SIM->refresh_ci() is called before every prompt
  is printed.  Otherwise, the refresh would wait until the next
  SIM->periodic(), which might be thousands of cycles.  This way,
  when you're single stepping, the dialogs update with every step.
- To improve performance, the CI has a flag (MyFrame::WantRefresh())
  which tells whether it has any need for refresh events.  If no
  dialogs are showing that need refresh events, then no event is sent
  between threads.
- add a few defaults to the param classes that affect the settings of
  newly created parameters.  When declaring a lot of params with
  similar settings it's more compact to set the default for new params
  rather than to change each one separately.  default_text_format is
  the printf format string for displaying numbers.  default_base is
  the default base for displaying numbers (0, 16, 2, etc.)
- I added to ParamDialog to make it able to display modeless dialog
  boxes such as "Debug:Show CPU".  The new Refresh() method queries
  all the parameters for their current value and changes the value in
  the wxWindows control.  The ParamDialog class still needs a little
  work; for example, if it's modal it should have Cancel/Ok buttons,
  but if it's going to be modeless it should maybe have Apply (commit
  any changes) and Close.
2002-09-06 16:43:26 +00:00
Gregory Alexander
afdccad36c Oops, had to fix a bunch of parentheses.
Why | has precedence under == (or is it =)
I still don't understand.
2002-09-06 16:29:49 +00:00
Gregory Alexander
1c3ae99300 Speed-up for TLB invalidates as proposed by Peter Tattam.
I had been planning on this same thing in a similar form
for the I$, so this made a lot of sense, and was easy to
implement.
2002-09-06 14:58:56 +00:00
Bryce Denney
85b3dfe60f - fix minor problems with static member function declarations:
- bx_gen_reg cannot be declared with BX_SMF or it can't read gen_reg
    when static member functions are turned on.
  - use "BX_CPU_C_PREFIX" instead of "BX_CPU_C::" for get_segment_base.
- the SMF (static member function) tricks are just plain wierd. The only way to
  really be sure that you're not breaking something is to try compiling it with
  SMF on and with SMF off.  e.g. "configure && make" and
  "configure --enable-processors=2 && make".
2002-09-05 20:16:40 +00:00
Stanislav Shwartsman
2d2651a0f3 Added some useful debug/information methods for BX_CPU class 2002-09-05 19:46:20 +00:00
Stanislav Shwartsman
611d983900 Added get_REGISTER functions for all registers 2002-09-05 19:12:02 +00:00
Kevin Lawton
f29f9ef021 Fixed Big-endian case of --enable-guest2host-tlb. I macro'ized the
direct reads/writes from native variables to the x86 (guest)
memory image.  Look at the end of bochs.h.  Don't know if that's
the right place to put them, but here you can extend these
macros to platform-specific asm() code if you like, or just
use the generic C code I supplied.  Some platforms have special
instructions for byte-order swapping etc.  Also, you can't
make any assumptions about the alignment of the pointers
passed.
2002-09-05 04:56:11 +00:00
Kevin Lawton
83eb7d4199 Updated comments with info on the new TLB permissions storage
and strategy, so that others can understand it.
2002-09-05 03:09:59 +00:00
Kevin Lawton
f0c9896964 Now, when you compile with --enable-guest2host-tlb, non-paged
mode uses the notion of the guest-to-host TLB.  This has the
benefit of allowing more uniform and streamlined acceleration
code in access.cc which does not have to check if CR0.PG
is set, eliminating a few instructions per guest access.
Shaved just a little off execution time, as expected.

Also, access_linear now breaks accesses which span two pages,
into two calls the the physical memory routines, when paging
is off, just like it always has for paging on.  Besides
being more uniform, this allows the physical memory access
routines to known the complete data item is contained
within a single physical page, and stop reapplying the
A20ADDR() macro to pointers as it increments them.
Perhaps things can be optimized a little more now there too...
I renamed the routines to {read,write}PhysicalPage() as
a reminder that these routines now operate on data
solely within one page.

I also added a little code so that the paging module is
notified when the A20 line is tweaked, so it can dump
whatever mappings it wants to.
2002-09-05 02:31:24 +00:00
Kevin Lawton
8a1baa6bb8 Added ::{read,write}_virtual_qword() functions as per Stanislav's request.
I have not tested these functions, but they model the format and
acceleration principals of the byte/word/dword functions.  Give them
a try on both little/big endian machines.
2002-09-04 20:23:54 +00:00
Kevin Lawton
d07c1c0bb0 I rehashed the way the paging code stores protection bits,
so that a compare of the current access could be done more
efficiently against the cached values, both in the normal
paging routines, and in the accelerated code in access.cc.

This cut down the amount of code path needed to get to
direct use of a host address nicely, and speed definitely
got a boost as a result, especially if you use the
--enable-guest2host-tlb option.

The CR0.WP flag was a real pain, because it imparts
a complication on the way protections work.  Fortunately
it's not a high-change flag, so I just base the new
cached info on the current CR0.WP value, and dump
the TLB cache when it changes.
2002-09-04 08:59:13 +00:00
Kevin Lawton
54bc40971c Fixed repeated IO/string instruction acceleration bug. Not all the
checks were honoring the EFLAGS.DF bit, but assuming it was always
equal to 0 (increment upward).  Plus some general cleanup of the
acceleration code.

I left the default of '--enable-repeat-speedups' to disabled, but
it seems pretty solid.  Definitely adds performance for disk
heavy workloads.
2002-09-03 19:38:27 +00:00
Bryce Denney
765f21fbc3 - disable #warning on MSVC++ because it doesn't understand it 2002-09-03 15:56:24 +00:00
Kevin Lawton
3f2d28f86c Added guest2host TLB tricks to read-modify-write variants of
access routines in access.cc, completing the upgrade of
those routines.  You do need '--enable-guest2host-tlb', before
you get the speedups for now.  The guest2host mods seem pretty
solid, though I do need to see what effects the A20 line has
on this cache and the paging TLB in general.
2002-09-03 04:54:28 +00:00
Kevin Lawton
746f09b427 There's a bug in the repeated IO & mem copy speedups. I
added --enable-repeat-speedups with default to disabled.
Reconfigure/recompile and the speedup code will be #ifdef'd
out for now.  It manifested as junk written to the VGA screen
while booting/running Windows.

Also made some more mods to the main cpu loop.  Moved the
handling of EXT/errorno outside the main loop, much like
the extra EIP/ESP commits were moved, for a little better
performance.

I changed the fetch_ptr/bytesleft method of fetching to
a slightly different model, which calculates a window
for which EIP will be valid (land on the current page),
and a bias which when applied to EIP will be from
0..upper_page_limit.  Speed is about the same for either
method, but a pseudo-op/threaded-interpreter will plug
in better with this and be faster.
2002-09-02 18:44:35 +00:00
Kevin Lawton
3d8e5f8b61 Removed the BX_FETCHDECODE_CACHE mods, and the patch that
Bryce created for use of ensuring all mods were removed
cleanly.
2002-09-01 23:02:36 +00:00
Kevin Lawton
3a5f338419 Integrated patches for:
- Paging code rehash.  You must now use --enable-4meg-pages to
    use 4Meg pages, with the default of disabled, since we don't well
    support 4Meg pages yet.  Paging table walks model a real CPU
    more closely now, and I fixed some bugs in the old logic.
  - Segment check redundancy elimination.  After a segment is loaded,
    reads and writes are marked when a segment type check succeeds, and
    they are skipped thereafter, when possible.
  - Repeated IO and memory string copy acceleration.  Only some variants
    of instructions are available on all platforms, word and dword
    variants only on x86 for the moment due to alignment and endian issues.
    This is compiled in currently with no option - I should add a configure
    option.
  - Added a guest linear address to host TLB.  Actually, I just stick
    the host address (mem.vector[addr] address) in the upper 29 bits
    of the field 'combined_access' since they are unused.  Convenient
    for now.  I'm only storing page frame addresses.  This was the
    simplest for of such a TLB.  We can likely enhance this.  Also,
    I only accelerated the normal read/write routines in access.cc.
    Could also modify the read-modify-write versions too.  You must
    use --enable-guest2host-tlb, to try this out.  Currently speeds
    up Win95 boot time by about 3.5% for me.  More ground to cover...
  - Minor mods to CPUI/MOV_CdRd for CMOV.
  - Integrated enhancements from Volker to getHostMemAddr() for PCI
    being enabled.
2002-09-01 20:12:09 +00:00
Kevin Lawton
d52b23daf1 Made some very minor mods, to make CPUID aware of CMOV instructions
for BX_CPU_LEVEL >= 6, and to have the CMOV instructions generate
an undefined opcode exception after printing info that they were
called, if BX_CPU_LEVEL <= 5.  I suppose we could have a separate
configure option, but mirroring Intel, CMOV is available as of
Pentium Pro.

For now, you have to compile with --enable-cpu-level=6 for CMOV
support to be compiled in.
2002-09-01 04:01:14 +00:00
Volker Ruppert
38666a2cfb - PCI memory handling moved to bx_mem_c
* shadow RAM array and fetch function are now a part of the memory code
  * removed unnecessary PCI macros and functions load_ROM() and mem_read()
2002-08-31 12:24:41 +00:00
Bryce Denney
44ec9a0fc8 - update Makefile dependencies on nearly everything 2002-08-27 22:43:57 +00:00
Bryce Denney
10f56f60d1 - I thought I would add mmx.o ONLY when --enable-mmx was present. Stanislav
has been assuming we would compile mmx.o all the time, so I took out the
  conditional compile stuff.
2002-08-26 19:07:00 +00:00
Bryce Denney
61ff7bbb06 - fix it so that cpu/mmx.o is compiled in when MMX is enabled. 2002-08-26 16:40:21 +00:00
Bryce Denney
f780d81523 - recreate dependencies by running the gcc -MM thing again
- include references to mmx.o but leave it commented until it actually
  gets checked in.
2002-08-26 16:22:07 +00:00
Christophe Bothamy
17adce9633 - added MOV_CdRd in v8086 mode (from Martin Str|mberg) 2002-08-10 12:06:26 +00:00
Volker Ruppert
0bcee8caf7 - POPFD implemented for vm86 (tested with MS-DOS 6.2 and EMM386) 2002-08-05 19:45:32 +00:00
Christophe Bothamy
1f577b31fa - ouput unknown MSR regsiter number 2002-08-01 07:23:11 +00:00
Bryce Denney
1403a59ec4 - apply patch from Zwane Mwaikambo <zwane@linuxpower.ca> posted to
mailing list.
2002-07-25 13:30:07 +00:00
Bryce Denney
eb0974f0ce - if misaligned or wrong size write, print the address and length! 2002-07-23 15:32:14 +00:00
Bryce Denney
7dd83e2140 - removed my antisocial asserts from the apic code, and changed them to
BX_PANICs.
2002-07-21 13:56:49 +00:00
Volker Ruppert
27fedb5aba - AAM can generate an exception (divide by 0)
- AAM: modification of flags depends on AL, not AX
- AAM always clears CF and AF
- AAD can also modify AF, CF and OF
- DAA can also clear the CF
2002-07-06 11:02:35 +00:00
Christophe Bothamy
badef8cec8 - included instinc's patch.stack-return-from-v86 2002-06-27 13:31:54 +00:00
Bryce Denney
7e04c23d2f - check in Mike Reiker's 4meg page code from a patch that he submitted last
November 17.
2002-06-19 15:49:07 +00:00
Gregory Alexander
5d7c6627fd I botched the linked list implementation pretty badly.
Kudos to TLD for fixing it for me.
2002-06-06 23:03:09 +00:00
Gregory Alexander
1be5b1d46c Added a linked list to further speed up icache invalidates.
These should be pretty snappy now.  It's time to generate
some actual statistics.

 Modified Files:
 	cpu/cpu.cc cpu/cpu.h cpu/init.cc memory/memory.cc
2002-06-05 21:51:30 +00:00
Gregory Alexander
c41505e342 Added a RPN directory for the cache to help make invalidates
faster.  Hopefully this won't slow things down too much.

 	config.h.in cpu/cpu.cc cpu/cpu.h memory/memory.cc
2002-06-05 03:59:31 +00:00
Gregory Alexander
fda1b874e9 Check in FETCHDECODE Caching, with changes.
Specific changes from the patch:

1.) renamed fdcache_eip to fdcache_ip, as it is using
the RIP instead of the EIP.

2.) added a Boolean array fdcache_is32 which uses is32
to determine icache hits.  Otherwise we could run 32-bit
code as 16-bit or vice versa.


 Modified Files:
 	config.h.in cpu/cpu.cc cpu/cpu.h memory/memory.cc
2002-06-03 22:39:11 +00:00
Bryce Denney
30aaf4088e - commit patch.wxwindows.gz in the main branch. Now you can try out
the wxwindows interface by just "configure --with-wx; make"

  Modified Files:
    Makefile.in bochs.h config.h.in configure configure.in
    load32bitOShack.cc logio.cc main.cc cpu/cpu.cc cpu/cpu.h
    debug/dbg_main.cc gui/Makefile.in gui/control.cc gui/gui.cc
    gui/siminterface.cc gui/siminterface.h gui/x.cc iodev/cdrom.cc
    iodev/keyboard.cc memory/misc_mem.cc
  Added Files:
    README-wxWindows wxbochs.rc gui/wx.cc gui/wxmain.cc
    gui/wxmain.h gui/bitmaps/cdromd.xpm
    gui/bitmaps/configbutton.xpm gui/bitmaps/copy.xpm
    gui/bitmaps/floppya.xpm gui/bitmaps/floppyb.xpm
    gui/bitmaps/mouse.xpm gui/bitmaps/paste.xpm
    gui/bitmaps/power.xpm gui/bitmaps/reset.xpm
    gui/bitmaps/snapshot.xpm
  Removed Files:
    patches/patch.wxwindows.gz
2002-04-18 00:22:20 +00:00