to give the compiler some hints:
BX_CPP_AttrPrintf(formatArg, firstArg)
BX_CPP_AttrNoReturn()
The first is to tell the compiler that a function receives printf-like
arguments so it can do some smart argument checking w.r.t. the
format string. The 2nd tells the compiler that the function does
not ever return; it's not used yet, but I'd like to use it on
exception() after we fix the situation of it returning for debugging.
I fixed one parameter mismatch in cpu/ by deleting a deprecated
debug print statement. There are several other mismatches in
other code modules.
exit out of cpu_loop() and back to the caller can be honored.
Previously, code in this function was a part of cpu_loop so
a "return;" would already do that. Now, a value is passed
back to cpu_loop() to denote such a request, and then a return
is executed from cpu_loop().
I haven't tested this yet, but previously I must have broke
certain debugging requests by moving the code to a separate
function and not fixing the "return;" statements.
Symptom: Linux kernel 2.4.19 would hang in random places. CPU still
running, but in dle loop.
Cause: if APIC interrupt occurred while a PIC interrupt was pending, the
PIC interrupt would be lost. This is because either an APIC or PIC
interrupt would trash any pending interrupt event because INTR is only a state,
not an event queue.
Temporary fix: reworked apic.cc to have it's own copy of INTR state. cpu.cc now
checks for both cpu.INTR and local_apic.INTR.
Need to do further research to see if local_apic and pic can be integrated in such
a way as properly manage the combined effects of both devices accessing INTR state.
value and a change-mask, rather than passing all the boolean
change flags as arguments.
Recoded the POPF instruction in flag_ctrl.cc to use the
new writeEFlags() function, and to make it more sane.
Also, the old write_flags() and write_eflags() functions
redirect to writeEFlags() for now. Later, when we get
back in a development mode, it would be better to make
all calls use the new function and get rid of the old ones.
been using the Boolean type for a number of multi-bit fields on the
assumption that it is actually many bits wide. However, this assumption is
unsafe and has caused some bugs that are hard to track down.
- in the Carbon library on MacOS X, Boolean is defined to be an unsigned char.
This has been causing some of the EFLAGS accessors to fail (bits 8-31)
because they depended on Boolean being 32 bits wide. I changed these
accessors to return Bit32u instead. I believe that this will finally fix
[ 618388 ] Unable to boot under MacOS X.
- It would be possible to create a bochs specific type for booleans (bx_bool),
but it's cleaner to simply use "Boolean" when we actually mean a 1-bit true
or false field, and Bit8u/Bit32u when it is a multibit field.
32-bits rather than 64. This is possible, because there is
always an active null (heartbeat) timer, with periodicity
of less than or equal to the maximum 32-bit int value.
This generates a little less code in the hot part of cpu_loop,
and saved about 3% execution time on a Win95 boot.
Moved the asynchronous handling code from cpu_loop() to its
own function since it's a long path. This neatened up the
code a little (less gotos and all), and made it more clear
to use a "while (1)" around the iterative code in cpu_loop().
so that windows types can be used in fields, for example in cdrom.h:
#ifdef WIN32
HANDLE cdrom_interface::hFile;
#endif
- since every file includes bochs.h, I removed includes of <windows.h>
everywhere else
- modified: bochs.h cpu/extdb.cc gui/win32.cc gui/wx.cc iodev/cdrom.cc
iodev/eth_win32.cc iodev/floppy.cc
coverage of the high-frequency eflags instructions. That should
complete the asm() eflags updates for now, as we should be stabilizing
moving towards bochs 2.0.
These seem to be working better, are a more simple design,
easier to understand, and AFAIK don't have race conditions
in them like the old ones do.
Re-coded the apic timer, to return cycle accurate values
which vary with each iteration of a read from a guest OS.
The previous implementation had very poor resolution. It
also didn't check the mask bit to see if an apic timer
interrupt should occur on countdown to 0. The apic timer
now calls its own bochs timer, rather than tag on the
one in iodev/devices.cc.
I needed to use one new function which is an inline in
pc_sytem.h. That would have to be added to the old pc_system.h if
we have to back-out to it.
Linux/x86-64 now boots until it hits two undefined opcodes:
FXRSTOR (0f ae). This restores FPU, MMX, XMM and MXCSR registers
from a 512-byte region of memory. We don't implement this yet.
MOVNTDQ (66 0f e7). This is a move involving an XMM register.
The 0x66 prefix is used so it's a double quadword, rather than
MOVNTQ (0f e7) which operates on a single quadword.
The Linux kernel panic is on the MOVNTQD opcodes. Perhaps that's
because that opcode is used in exception handling of the 1st?
Looks like we need to implement some new instructions.
instead of winmm being a part of GUI_LINK_OPTS_WIN32 only, it is
placed in @DEVICE_LINE_OPTS@ so that it will be used for sdl, rfb, wx,
etc.
- solve compile problems when building bximage, niclist, and any other
console based program. The compile flags returned by wx-config and
sdl-config did strange things to these console programs, for example
redefining main to SDL_main. Because I wanted to use the
configure-generated CFLAGS to compile the programs, but I wanted to
avoid including GUI specific compile options, I split up the configure's
@CFLAGS@ variable into @CFLAGS@ and @GUI_CFLAGS@, and split
@CXXFLAGS@ into @CXXFLAGS@ and @GUI_CXXFLAGS@. All programs in the
Bochs binary will use both, but the console programs will just use
@CFLAGS@ or @CXXFLAGS@.
- gui/Makefile.in, I no longer use the gui specific CFLAGS variables,
SDL_CFLAGS and WX_CXXFLAGS. These values are included in CFLAGS and
CXXFLAGS now.
- modified: configure.in, configure, all Makefile.in's
restart another one in wxWindows. Fixed that. Also, on restart, the
apic id's left over from the first run were causing panics. Fixed that.
- modified: main.cc cpu/apic.cc cpu/cpu.h cpu/init.cc
up pc_system.h. Moved all variables under the private: section,
as well as a few member functions. The string instructions
were accessing a field directly (only reads), so I indirected
that via an inline member function for better abstraction.
all available optimizations in one shot.
Finished one last case of an instruction which could but didn't use
the Read-Modify-Write variants of access.cc functions.
Started going through the integer instructions, merging obvious cases
where there are two "if (modrm==11b) {" clauses and very little
action in between, and cleaning up the aweful indentation leftover
from many years ago when those instructions were implemented using
cut-and-paste. We may get a little extra performance out of these
mods, but they'll also be easier after I'm finished to enhance
with asm() statements to knock out the lazy flags processing on x86.
now simply return a cached value which is set upon mode changes.
The biggest problem was protected_mode() which did something like:
return CR0.PM && ! EFLAGS.VM
This adds up when it was being executed many times in branch functions
etc. Now, cached values are set and sampled instead.
Used patch.disasm to do
1) clean up the disasm output to make the dispaly of extra stuff optional.
2) included the part of the patch which displays displacements as
proper addresses.
and Jas Sandys-Lumsdaine to split out common instructions into
variants which deal with the mod=11b case (Reg-Reg) and the
other cases (which do memory ops). Actually, I only split
MOV_GwEw and MOV_GdEd for now. According to some instrumentation
of a Win95 boot, they were the most frequently used opcode by far.
Essentially, when I coded a few of the instructions to use
asm()s for acceleration of the eflags, I got lazy and only
used the asm() to compute eflags and let the normal C operation
do the actual operation. Jas's patch, moved the asm()s such
that they now do the work of the operation as well.
The patches look great. The code reads a lot better as well.
Further work can be done to give the compiler more options with
register scheduling.
were simply replacements of the eflags mask constants with
the macro names already in cpu.h for asm() statements. I forgot
to use the macros for some instructions.
0x000008d5 -> EFlagsOSZAPCMask
0x000008d4 -> EFlagsOSZAPMask
Some things changed in the ctrl_xfer*.cc, fetchdecode*.cc,
and cpu.cc since the original patches, so I did some patch
integration by hand. Check the placement of the
macros BX_INSTR_FETCH_DECODE_COMPLETED() and BX_INSTR_OPCODE()
in cpu.cc to make sure I go them right. Also, I changed the
parameters to BX_INSTR_OPCODE() to update them to the new code.
I put some comments before each of these to help determine if
the placement is right.
These macros are only compiled in if you are gathering instrumentation
data from bochs, so they shouldn't effect others.
Created 64-bit versions of some branch instructions and
changed fetchdecode64.cc to use them instead. This keeps the
#ifdef pollution down for 32-bit code and made fixing them
easier. They needed to clear the upper bits of RIP for
16-bit operand sizes. They also should not have had a protection
limit check in them, especially since that field is still
32-bit in cpu.h, so there's no way to set nominal 64-bit values.
The 32-bit versions were also not honoring the upper 32-bits
of RIP.
LOOPNE64_Jb
LOOPE64_Jb
LOOP64_Jb
JCXZ64_Jb
Changed all occurances of JCC_Jw/JCC_Jd in fetchdecode64.cc to
use JCC_Jq, which was coded already. Both JMP_Jq and JCC_Jq are
now fixed w.r.t. 16-bit opsizes and upper RIP bit clearing.
63..16 when a 16-bit operand size JMP is executed. Previous
fix cleared only 63..32. I since realized, this is the case
which does parallel the 32-bit semantics.
fetching 64-bit address opcode info, which was incorrect.
Fixed. Got rid of BxImmediate_Oq. fetchdecode64.cc now
uses BxImmediateO, like the fetch routine does. Addresses which
are embedded in the opcode, have a size which depends on
the current addressing size. For long-mode, this is
either 64 (default) or 32 (AddrSize over-ride). BxImmediate_O
now conditionally fetches based on AddrSize.
64-bit bug#2: In JMP_Jq(), when the current operand size is
16-bits, the upper dword of RIP was not being cleared. The
semantics with this case are weird - one would think the
top 48 bits would be cleared, but apparently only the top
32 bits are. Anyways, I fixed this.
Replaced some of the messy immediate fetching (byte-by-byte) in
fetchdecode64.cc with ReadHost{Q,D}WordFromLittleEndian() calls
for cleanliness. Should do this for all the cases, plus
the 32-bit stuff.
Since the SYSCALL replaces the LOADALL instruction, it is incompatible with
earlier CPU types.
At moment, the SYSCALL is only enabled by x86-64 emulation, but the code
can be incorporated in IA32 only emulations.
Instructions added:
0F 05 SYSCALL (replaces LOADALL)
0F 07 SYSRET (new)
TODO: restructure #if ... so that it can be used by non x86-64 emulations.