access routines in access.cc, completing the upgrade of
those routines. You do need '--enable-guest2host-tlb', before
you get the speedups for now. The guest2host mods seem pretty
solid, though I do need to see what effects the A20 line has
on this cache and the paging TLB in general.
added --enable-repeat-speedups with default to disabled.
Reconfigure/recompile and the speedup code will be #ifdef'd
out for now. It manifested as junk written to the VGA screen
while booting/running Windows.
Also made some more mods to the main cpu loop. Moved the
handling of EXT/errorno outside the main loop, much like
the extra EIP/ESP commits were moved, for a little better
performance.
I changed the fetch_ptr/bytesleft method of fetching to
a slightly different model, which calculates a window
for which EIP will be valid (land on the current page),
and a bias which when applied to EIP will be from
0..upper_page_limit. Speed is about the same for either
method, but a pseudo-op/threaded-interpreter will plug
in better with this and be faster.
- Paging code rehash. You must now use --enable-4meg-pages to
use 4Meg pages, with the default of disabled, since we don't well
support 4Meg pages yet. Paging table walks model a real CPU
more closely now, and I fixed some bugs in the old logic.
- Segment check redundancy elimination. After a segment is loaded,
reads and writes are marked when a segment type check succeeds, and
they are skipped thereafter, when possible.
- Repeated IO and memory string copy acceleration. Only some variants
of instructions are available on all platforms, word and dword
variants only on x86 for the moment due to alignment and endian issues.
This is compiled in currently with no option - I should add a configure
option.
- Added a guest linear address to host TLB. Actually, I just stick
the host address (mem.vector[addr] address) in the upper 29 bits
of the field 'combined_access' since they are unused. Convenient
for now. I'm only storing page frame addresses. This was the
simplest for of such a TLB. We can likely enhance this. Also,
I only accelerated the normal read/write routines in access.cc.
Could also modify the read-modify-write versions too. You must
use --enable-guest2host-tlb, to try this out. Currently speeds
up Win95 boot time by about 3.5% for me. More ground to cover...
- Minor mods to CPUI/MOV_CdRd for CMOV.
- Integrated enhancements from Volker to getHostMemAddr() for PCI
being enabled.
Specific changes from the patch:
1.) renamed fdcache_eip to fdcache_ip, as it is using
the RIP instead of the EIP.
2.) added a Boolean array fdcache_is32 which uses is32
to determine icache hits. Otherwise we could run 32-bit
code as 16-bit or vice versa.
Modified Files:
config.h.in cpu/cpu.cc cpu/cpu.h memory/memory.cc
fixed in patch.smp-instr-trace for Bochs 1.3, but the patch conflicted
with the latest source. It was simple enough to just make the changes by
hand. This should fix bug [ #532321 ] SMP debug: trace-on fails
BEFORE it is executed. Print the registers at this time, BEFORE the
instruction, since they are the values BEFORE the instruction is executed.
The important result of this is that in TRACE output, both the instruction
causing an exception and the first instruction of the exception handler
are BOTH printed.
I'm working on getting this behavior in the debugger user-interface.
Modified Files:
cpu/cpu.cc debug/dbg_main.cc
that TICK is called so I put a trace call just before each TICK.
This seems best, since the trace has a chance to print before the tick
can trigger time-based events elsewhere in the system.
has run. This ensures that the prev_eip and prev_esp that is used
for tracing and breakpoint checks is correct even in the cycle after
an interrupt or trap.
> This patch fixes a number of debugger problems.
> - with trace-on, simulation time would pass 5x faster than usual, so
> interrupts and other timed events would happen at different times
> - with trace-on, breakpoints were ignored
> - with trace-on, control-C would not stop the processor and return to the
> debugger.
>
> This patch changes the execution quantum for the debugger to 1, which means
> that cpu_loop is asked to do one instruction at a time. This may cause
> bochs with the debugger to be slower than before.
>
> I haven't tested without the debugger yet, so I don't know if the timing
> of events matches or not.
by thomas.petazzoni@meridon.com. Bryce introduced this bug in
revision 1.9 when split the code into separate #ifdefs for single
CPU and multiple CPU. Comments on the patch are:
> The following patch addresses a bug concerning the exception 1 (debug)
> which is being raised during HALT under certain conditions. It
> appears only on recent versions (1.2.1 or last CVS), and not on
> version 2000-0104.
BX_SUPPORT_APIC were used. To follow the pattern used by other
names like this, I changed them all to BX_SUPPORT_APIC.
Thanks to Tom Lindström for chasing this down!
BX_CPU_C bx_cpu;
BX_MEM_C bx_mem;
and when more than one processor, use
BX_CPU_C *bx_cpu_array[BX_SMP_PROCESSORS];
BX_MEM_C *bx_mem_array[BX_ADDRESS_SPACES];
The changeover is controlled by BX_SMP_PROCESSORS, but there are only
a few code changes since nearly all code uses the BX_CPU(n) and BX_MEM(n)
macros.
- This turns out to make a 10% speed difference! With this revision,
the CVS version now gets 95% of the performance of the 3/25/2000
snapshot, which I've been using as my baseline.
is the first attempt to regain the performance of pre-SMP bochs
(1.1.2). When simulating only one processor, stay in cpu_loop forever
as pre-SMP versions did. The overhead of returning from cpu_loop over
and over was slowing us down.
tries to fix it. The shortcuts to register names such as AX and DL are
#defines in cpu/cpu.h, and they are defined in terms of BX_CPU_THIS_PTR.
When BX_USE_CPU_SMF=1, this works fine. (This is what bochs used for
a long time, and nobody used the SMF=0 mode at all.) To make SMP bochs
work, I had to get SMF=0 mode working for the CPU so that there could
be an array of cpus.
When SMF=0 for the CPU, BX_CPU_THIS_PTR is defined to be "this->" which
only works within methods of BX_CPU_C. Code outside of BX_CPU_C must
reference BX_CPU(num) instead.
- to try to enforce the correct use of AL/AX/DL/etc. shortcuts, they are
now only #defined when "NEED_CPU_REG_SHORTCUTS" is #defined. This is
only done in the cpu/*.cc code.
in BRANCH-smp-bochs revisions.
- The general task was to make multiple CPU's which communicate
through their APICs. So instead of BX_CPU and BX_MEM, we now have
BX_CPU(x) and BX_MEM(y). For an SMP simulation you have several
processors in a shared memory space, so there might be processors
BX_CPU(0..3) but only one memory space BX_MEM(0). For cosimulation,
you could have BX_CPU(0) with BX_MEM(0), then BX_CPU(1) with
BX_MEM(1). WARNING: Cosimulation is almost certainly broken by the
SMP changes.
- to simulate multiple CPUs, you have to give each CPU time to execute
in turn. This is currently implemented using debugger guards. The
cpu loop steps one CPU for a few instructions, then steps the
next CPU for a few instructions, etc.
- there is some limited support in the debugger for two CPUs, for
example printing information from each CPU when single stepping.
To see the commit logs for this use either cvsweb or
cvs update -r BRANCH-io-cleanup and then 'cvs log' the various files.
In general this provides a generic interface for logging.
logfunctions:: is a class that is inherited by some classes, and also
. allocated as a standalone global called 'genlog'. All logging uses
. one of the ::info(), ::error(), ::ldebug(), ::panic() methods of this
. class through 'BX_INFO(), BX_ERROR(), BX_DEBUG(), BX_PANIC()' macros
. respectively.
.
. An example usage:
. BX_INFO(("Hello, World!\n"));
iofunctions:: is a class that is allocated once by default, and assigned
as the iofunction of each logfunctions instance. It is this class that
maintains the file descriptor and other output related code, at this
point using vfprintf(). At some future point, someone may choose to
write a gui 'console' for bochs to which messages would be redirected
simply by assigning a different iofunction class to the various logfunctions
objects.
More cleanup is coming, but this works for now. If you want to see alot
of debugging output, in main.cc, change onoff[LOGLEV_DEBUG]=0 to =1.
Comments, bugs, flames, to me: todd@fries.net