102 Commits

Author SHA1 Message Date
Kevin Lawton
3183ab7102 Added some preliminary configure and config.h stuff for
SSE/SSE2 for Stanislav.  Also, some method prototypes and
  skeletal functions in access.cc for read/write double quadword
  features.

Also cleaned up one warning in protect_ctrl.cc for non-64 bit compiles.
  There was an unused variable, only used for 64-bit.
2002-10-11 01:11:11 +00:00
Peter Tattam
b968c4e5c8 Latest round of patches/fixups to get 64 bit emulation further.
This is an interim update to allow others to test.

We have userland code running!!! (up to a point)

Able to start executing "sash" as /sbin/init in userland from linux 64 bit
kernel until it crashes trying to access a null pointer.  No kernel panics
though, just a segfault loop.
2002-10-08 14:43:18 +00:00
Kevin Lawton
b8d7f5c88e Moved the asm() statements from the arithmetic instruction emulation
into inline functions with asm() statements in cpu.h.  This cleans
  up the *.cc code (which now doesn't have any asm()s in it), and
  centralizes the asm() code so constraints can be modified in one
  place.  This also makes it easier to cover more instructions
  with asm()s for more efficient eflags handling.
2002-10-07 22:51:58 +00:00
Kevin Lawton
d7a16521c4 Replaced "return;" statements associated with bx_guard.special_unwind_stack
hack with longjmp() back to cpu.cc main decode loop, and added a
  check in there to return control when bx_guard.special_unwind_stack
  is set (compiling with debugger enabled only).

If in the debugger you try to execute further instructions
  (which you shouldn't), other fields need to be reset I would
  think, such as EXT and errorno, and have to make sure ESP/EIP
  are corrected properly.  Basically, this hack is only good
  for examining the current situation of a nasty fault.
2002-10-06 22:08:18 +00:00
Kevin Lawton
b1d2f7ae48 Added a return value to handleAsyncEvent() so that requests to
exit out of cpu_loop() and back to the caller can be honored.
  Previously, code in this function was a part of cpu_loop so
  a "return;" would already do that.  Now, a value is passed
  back to cpu_loop() to denote such a request, and then a return
  is executed from cpu_loop().

  I haven't tested this yet, but previously I must have broke
  certain debugging requests by moving the code to a separate
  function and not fixing the "return;" statements.
2002-10-05 14:51:25 +00:00
Peter Tattam
db0a37824c Fixed elusive APIC interrupt problems when bochs compiled for P6 or later.
Symptom:  Linux kernel 2.4.19 would hang in random places.  CPU still
running, but in dle loop.

Cause: if APIC interrupt occurred while a PIC interrupt was pending, the
PIC interrupt would be lost.  This is because either an APIC or PIC
interrupt would trash any pending interrupt event because INTR is only a state,
not an event queue.

Temporary fix: reworked apic.cc to have it's own copy of INTR state. cpu.cc now
checks for both cpu.INTR and local_apic.INTR.

Need to do further research to see if local_apic and pic can be integrated in such
a way as properly manage the combined effects of both devices accessing INTR state.
2002-10-05 10:25:31 +00:00
Kevin Lawton
0d22bbafc2 Added a new function writeEFlags() which takes a 32-bit eflags
value and a change-mask, rather than passing all the boolean
  change flags as arguments.

Recoded the POPF instruction in flag_ctrl.cc to use the
  new writeEFlags() function, and to make it more sane.

Also, the old write_flags() and write_eflags() functions
  redirect to writeEFlags() for now.  Later, when we get
  back in a development mode, it would be better to make
  all calls use the new function and get rid of the old ones.
2002-10-05 06:33:10 +00:00
Bryce Denney
d754550d47 - a boolean variable is represented by just 1 bit, 0=false or 1=true. We have
been using the Boolean type for a number of multi-bit fields on the
  assumption that it is actually many bits wide.  However, this assumption is
  unsafe and has caused some bugs that are hard to track down.
- in the Carbon library on MacOS X, Boolean is defined to be an unsigned char.
  This has been causing some of the EFLAGS accessors to fail (bits 8-31)
  because they depended on Boolean being 32 bits wide.  I changed these
  accessors to return Bit32u instead.  I believe that this will finally fix
  [ 618388 ] Unable to boot under MacOS X.
- It would be possible to create a bochs specific type for booleans (bx_bool),
  but it's cleaner to simply use "Boolean" when we actually mean a 1-bit true
  or false field, and Bit8u/Bit32u when it is a multibit field.
2002-10-04 22:25:22 +00:00
Kevin Lawton
66452e9898 Replaced tabs in cpu/*.{cc,h} files with spaces. 2002-10-04 17:04:33 +00:00
Kevin Lawton
f64eb0e16a Changed the hot countdown timer in pc_system.* files to be
32-bits rather than 64.  This is possible, because there is
  always an active null (heartbeat) timer, with periodicity
  of less than or equal to the maximum 32-bit int value.

  This generates a little less code in the hot part of cpu_loop,
  and saved about 3% execution time on a Win95 boot.

Moved the asynchronous handling code from cpu_loop() to its
  own function since it's a long path.  This neatened up the
  code a little (less gotos and all), and made it more clear
  to use a "while (1)" around the iterative code in cpu_loop().
2002-10-04 16:26:10 +00:00
Kevin Lawton
ee47fabac0 Committed new bochs internal timers (in pc_system.{cc,h}.
These seem to be working better, are a more simple design,
  easier to understand, and AFAIK don't have race conditions
  in them like the old ones do.

Re-coded the apic timer, to return cycle accurate values
  which vary with each iteration of a read from a guest OS.
  The previous implementation had very poor resolution.  It
  also didn't check the mask bit to see if an apic timer
  interrupt should occur on countdown to 0.  The apic timer
  now calls its own bochs timer, rather than tag on the
  one in iodev/devices.cc.

I needed to use one new function which is an inline in
  pc_sytem.h.  That would have to be added to the old pc_system.h if
  we have to back-out to it.

Linux/x86-64 now boots until it hits two undefined opcodes:

  FXRSTOR (0f ae).  This restores FPU, MMX, XMM and MXCSR registers
    from a 512-byte region of memory.  We don't implement this yet.

  MOVNTDQ (66 0f e7).  This is a move involving an XMM register.
    The 0x66 prefix is used so it's a double quadword, rather than
    MOVNTQ (0f e7) which operates on a single quadword.

  The Linux kernel panic is on the MOVNTQD opcodes.  Perhaps that's
  because that opcode is used in exception handling of the 1st?

  Looks like we need to implement some new instructions.
2002-10-03 15:47:13 +00:00
Bryce Denney
0d28420aa2 - provide dbg_xlate_linear2phy when running as GDB stub 2002-10-03 04:53:53 +00:00
Bryce Denney
be4005269b - many parameters in cpu were being redefined if you stop simulation and
restart another one in wxWindows.  Fixed that.  Also, on restart, the
  apic id's left over from the first run were causing panics.  Fixed that.
- modified: main.cc cpu/apic.cc cpu/cpu.h cpu/init.cc
2002-09-30 22:18:53 +00:00
Kevin Lawton
67721c48f4 The convience functions protected_mode(), v8086_mode() and real_mode()
now simply return a cached value which is set upon mode changes.
  The biggest problem was protected_mode() which did something like:

    return CR0.PM && ! EFLAGS.VM

  This adds up when it was being executed many times in branch functions
  etc.  Now, cached values are set and sampled instead.
2002-09-29 22:38:18 +00:00
Kevin Lawton
a5537449cd Split out reg-reg and reg-memory cases for a few other high-profile
instructions, mainly variants of MOV.  Had to update fetchdecode64
  to keep it inline with the 32-bit mods.
2002-09-29 19:21:38 +00:00
Stanislav Shwartsman
d495bd75a6 fter integration of SplitMod11b changes Bochs failed to compile in SMP mode.
I fixed the compilation errors in CVS, smbd please check if the fix is property;
2002-09-28 09:38:58 +00:00
Kevin Lawton
08a89fe7b6 Performance mod: I implemented a suggestion from Peter Tattam
and Jas Sandys-Lumsdaine to split out common instructions into
  variants which deal with the mod=11b case (Reg-Reg) and the
  other cases (which do memory ops).  Actually, I only split
  MOV_GwEw and MOV_GdEd for now.  According to some instrumentation
  of a Win95 boot, they were the most frequently used opcode by far.
2002-09-28 05:38:11 +00:00
Kevin Lawton
13a1e55f20 Committed patches/patch-bochs-instrumentation from Stanislav.
Some things changed in the ctrl_xfer*.cc, fetchdecode*.cc,
and cpu.cc since the original patches, so I did some patch
integration by hand.  Check the placement of the
macros BX_INSTR_FETCH_DECODE_COMPLETED() and BX_INSTR_OPCODE()
in cpu.cc to make sure I go them right.  Also, I changed the
parameters to BX_INSTR_OPCODE() to update them to the new code.
I put some comments before each of these to help determine if
the placement is right.

These macros are only compiled in if you are gathering instrumentation
data from bochs, so they shouldn't effect others.
2002-09-28 00:54:05 +00:00
Stanislav Shwartsman
e6adebfe2d Added MMX opcodes to x86-64 mode
Fixed problem with fetching extra byte in ESCx opcodes if FPU is disabled
2002-09-27 09:56:40 +00:00
Kevin Lawton
47f2e7c404 Got rid of the KPL64Hacks macro. The fixes below eliminated it.
Created 64-bit versions of some branch instructions and
  changed fetchdecode64.cc to use them instead.  This keeps the
  #ifdef pollution down for 32-bit code and made fixing them
  easier.  They needed to clear the upper bits of RIP for
  16-bit operand sizes.  They also should not have had a protection
  limit check in them, especially since that field is still
  32-bit in cpu.h, so there's no way to set nominal 64-bit values.
  The 32-bit versions were also not honoring the upper 32-bits
  of RIP.

  LOOPNE64_Jb
  LOOPE64_Jb
  LOOP64_Jb
  JCXZ64_Jb

Changed all occurances of JCC_Jw/JCC_Jd in fetchdecode64.cc to
  use JCC_Jq, which was coded already.  Both JMP_Jq and JCC_Jq are
  now fixed w.r.t. 16-bit opsizes and upper RIP bit clearing.
2002-09-27 07:01:02 +00:00
Kevin Lawton
6d74a334d6 64-bit bug#1: Instructions such as MOV_ALOq were always
fetching 64-bit address opcode info, which was incorrect.

  Fixed.  Got rid of BxImmediate_Oq.  fetchdecode64.cc now
  uses BxImmediateO, like the fetch routine does.  Addresses which
  are embedded in the opcode, have a size which depends on
  the current addressing size.  For long-mode, this is
  either 64 (default) or 32 (AddrSize over-ride).  BxImmediate_O
  now conditionally fetches based on AddrSize.

64-bit bug#2: In JMP_Jq(), when the current operand size is
  16-bits, the upper dword of RIP was not being cleared.  The
  semantics with this case are weird - one would think the
  top 48 bits would be cleared, but apparently only the top
  32 bits are.  Anyways, I fixed this.

Replaced some of the messy immediate fetching (byte-by-byte) in
  fetchdecode64.cc with ReadHost{Q,D}WordFromLittleEndian() calls
  for cleanliness.  Should do this for all the cases, plus
  the 32-bit stuff.
2002-09-26 21:32:26 +00:00
Peter Tattam
67082a5b50 Implemented SWAPGS instruction.
Note that it is unusual to decode (see SGDT instruction)
2002-09-25 14:09:08 +00:00
Bryce Denney
8f9bec3919 - remove unused, and incorrect MSR fields 2002-09-25 13:26:04 +00:00
Peter Tattam
a0d90e9b39 Implemented SYSCALL and SYSRET as part of x86-64 emulation.
Since the SYSCALL replaces the LOADALL instruction, it is incompatible with
earlier CPU types.

At moment, the SYSCALL is only enabled by x86-64 emulation, but the code
can be incorporated in IA32 only emulations.

Instructions added:

0F 05		SYSCALL		(replaces LOADALL)
0F 07		SYSRET		(new)

TODO:  restructure #if ... so that it can be used by non x86-64 emulations.
2002-09-25 12:54:41 +00:00
Kevin Lawton
3c09fdb363 I updated code that was using !!get_CF() (or other arithmetic flag) to
use getB_CF() etc.  getB_CF() and friends are only for a relatively
  small number of cases where a true boolean/binary number (0 or 1) is required
  rather than 0 or non-0 as is returned by get_CF().
2002-09-24 18:33:38 +00:00
Kevin Lawton
aeca26fc04 Declaration of loadSRegLMNominal() is now only defined for 64-bit. 2002-09-24 16:39:33 +00:00
Kevin Lawton
26ebda0775 Got rid of INIT_64_DESCRIPTOR in all places. Added/replaced it with
loadSRegLMNominal() which should be used to load a segment register
  in long-mode with nominal values which are compatible with existing
  checks and expectations for descriptor cache values.

Fixed 64-bit iret to not do a descriptor fetch if SS selector is null.
  Also load SS with loadSRegLMNorminal() in the same case.
2002-09-24 16:35:44 +00:00
Bryce Denney
de0e58c2c5 These changes are from Peter Tattam
- fix load_ss, remove load_ss_null
- change the "#if KPL64Hacks" around msr stuff into "#if BX_IGNORE_BAD_MSR"
- remove "#if KPL64Hacks" from BX_CPU_C::can_push
- segment_ctrl_pro.cc: bug fix to ss == null handling in 64 bit mode

Modified: cpu/cpu.h cpu/ctrl_xfer_pro.cc cpu/exception.cc
cpu/proc_ctrl.cc cpu/segment_ctrl_pro.cc cpu/stack_pro.cc
2002-09-24 08:29:06 +00:00
Kevin Lawton
281e62d8b1 I integrated my hacks to get Linux/x86-64 booting. To keep
these from interfering from a normal compile here's what I did.
In config.h.in (which will generate config.h after a configure),
I added a #define called KPL64Hacks:

  #define KPL64Hacks

*After* running configure, you must set this by hand.  It will
default to off, so you won't get my hacks in a normal compile.
This will go away soon.  There is also a macro just after that
called BailBigRSP().  You don't need to enabled that, but you
can.  In many of the instructions which seemed like they could
be hit by the fetchdecode64() process, but which also touched
EIP/ESP, I inserted a macro.  Usually this macro expands to nothing.
If you like, you can enabled it, and it will panic if it finds
the upper bits of RIP/RSP set.   This helped me find bugs.

Also, I cleaned up the emulation in ctrl_xfer{8,16,32}.cc.
There were some really old legacy code snippets which directly
accessed operands on the stack with access_linear.  Lots of
ugly code instead of just pop_32() etc.  Cleaning those up,
minimized the number of instructions which directly manipulate
the stack pointer, which should help in refining 64-bit support.
2002-09-24 00:44:56 +00:00
Kevin Lawton
6e7a2e91f2 Added more x86 specific asm() code to directly handle eflags return
values for some common instructions (like test/and/cmp).  Only
  compiles in on x86 of course.
2002-09-22 22:22:16 +00:00
Bryce Denney
a785453194 - fixed another case of get_##flag##(void) 2002-09-22 19:06:46 +00:00
Bryce Denney
fda29cd55b - in definition of ArithmeticalFlag, we had "getB_##flag##(void)",
which says to paste getB_ with flag and then paste with (.  It should
  be "getB_##flag(void)".  Some preprocessors are complaining about pasting
  the symbol with the paren.
2002-09-22 19:03:24 +00:00
Kevin Lawton
b742ccec7e Changed eflags accessors for get_?F() to use (val32 & (1<<N)) instead
of (1 & (val32>>N)), and added a getB_?F() accessor for special
  cases which need a strict binary value (exactly 0 or 1).  Most
  code only needed a value for logical comparison.  I modified the
  special cases which do need a binary number for shifting and
  comparison between flags, to use the special getB_?F() accessor.

Cleaned up memory.cc functions a little, now that all accesses
  are within a single page.

Fixed a (not very likely encountered) bug in fetchdecode.cc (and
  fetchdecode64.cc) where a 2-byte opcode starting with a prefix
  starts at the last offset on a page.  There were no checks
  on the segment overrides for a boundary condition.  I added them.

The eflags enhancements added just a tiny bit of performance.
2002-09-22 18:22:24 +00:00
Bryce Denney
5933d94c91 - add typecast to Bit32u to avoid lots of useless -Wall warnings. The constant
expression on the right side of the comparison was turning out signed, while
  the expression on the left was unsigned.
2002-09-22 02:53:09 +00:00
Kevin Lawton
3bfeab23c9 Split out JZ/JNZ instructions from JCC because they were called
so frequently.
Coded asm() statements for INC/DEC_ERX() instructions.
Cleaned up the iCache a litle including a bug fix.  The
  generation ID was decrementing the whole field including
  some high meta bits.  That could roll over after 1 Billion
  cycles.  I know only decrement if the field is valid, to
  save the write.
I implemented inline functions which can serve the value of
  the arithmetic flags if they are cached, and redirect to
  the lazy_flags.cc routines if not.
Most of this was just prep work for adding more asm() statements
  for native eflags processing when on x86.
2002-09-22 01:52:21 +00:00
Kevin Lawton
e2e219eda0 Modified the way that the register field (low 3 bits of a few opcodes
also extended by the REX.B field on Hammer) is passed to instructions.
I rearranged the bxInstruction_c to free up a field to be used
to pass this info when mod-rm bytes are not used.  This got rid
of the ugly ((i->b1 & 7) + i->rex_b) code.

Probably shaved just a very little run time off Hammer emulation,
and even less on x86-32.  The resultant is a little cleaner anyways.
2002-09-20 23:17:51 +00:00
Kevin Lawton
402d02974d Moved the EFLAGS.RF check and clearing of inhibit_mask code
in cpu.cc out of the main loop, and into the asynchronous
events handling.  I went through all the code paths, and
there doesn't seem to be any reason for that code to be
in the hot loop.

Added another accessor for getting instruction data, called
modC0().  A lot of instructions test whether the mod field
of mod-nnn-rm is 0xc0 or not, ie., it's a register operation
and not memory.  So I flag this in fetchdecode{,64}.cc.
This added on the order of 1% performance improvement for
a Win95 boot.

Macroized a few leftover calls to Write_RMV_virtual_xyz()
that didn't get modified in the x86-64 merge.  Really, they
just call the real function for now, but I want to have them
available to do direct writes with the guest2host TLB pointers.
2002-09-20 03:52:59 +00:00
Gregory Alexander
88e64f9521 Fix big endian compile problem. 2002-09-20 03:06:39 +00:00
Kevin Lawton
0cd7346b9c - Added an instruction cache. Size is fixed for the moment,
but if you hand edit cpu/cpu.h, and change BxICacheEntries,
  you can try different sizes.  I'll make this more flexible
  with configure.  For now, use "--enable-icache" with no parameters.

- Modified fetchdecode.cc/fetchdecode64.cc just enough so that
  instructions which encode a direct address now use a memory
  resolution function which just sticks the immediate address
  into rm_addr.  With cached instructions we need this.
2002-09-19 19:17:20 +00:00
Kevin Lawton
4e51dcae40 Converted all the remaining available separate fields in bxInstruction_c
to bitfields.  bxInstruction_c is now 24 bytes, including 4 for
the memory addr resolution function pointer, and 4 for the
execution function pointer (16 + 4 + 4).

Coded more accessors, to abstract access from most code.
2002-09-18 08:00:43 +00:00
Kevin Lawton
6723ca9bf4 Moved more separate fields in the bxInstruction_c into bitfields
with accessors.  Had to touch a number of files to update the
access using the new accessors.

Moved rm_addr to the CPU structure, to slim down bxInstruction_c
and to prevent future instruction caching from getting sprayed
with writes to individual rm_addr fields.  There only needs to
be one.  Though need to deal with instructions which have
static non-modrm addresses, but which are using rm_addr since
that will change.

bxInstruction_c is down to about 40 bytes now.  Trying to
get down to 24 bytes.
2002-09-18 05:36:48 +00:00
Kevin Lawton
07b0df2a8a Updated accessing of modrm/sib addressing information to
use accessors.  This lets me work on compressing the
size of fetch-decode structure (now called bxInstruction_c).

I've reduced it down to about 76 bytes.  We should be able
to do much better soon.  I needed the abstraction of the
accessors, so I have a lot of freedom to re-arrange things
without making massive future changes.

Lost a few percent of performance in these mods, but my
main focus was to get the abstraction.
2002-09-17 22:50:53 +00:00
Bryce Denney
f1a3e0307a - add #if BX_CPU_LEVEL>=4 around cr0.wp and cr4 so that i386 will compile 2002-09-17 22:14:33 +00:00
Stanislav Shwartsman
a43bd93b98 just little clean of the code 2002-09-17 14:36:39 +00:00
Kevin Lawton
3d4210fd3f Got rid of a couple fields in BxInstruction_t that were
no longer used.  Also rearranged that struct a little
to be more compressed.  Over time, I'm going to reduce
it further, for use with future accelerations.
2002-09-17 04:20:42 +00:00
Kevin Lawton
5eb4e247bc Merged the final filed ("paging.cc") from Peter Tattam's x86-64
enhancement to bochs.  You can now configure with
--enable-guest2host-tlb.

Force the support of big pages (PSE) when x86-64 is configured.

Reverted back to only one kind of TLB entry style, since everything
is ported.

Fixed one bug in io.cc with as_64 and the index registers.
There are others, as noticed by Peter.
2002-09-16 20:23:38 +00:00
Bryce Denney
6a50742b20 - clean up ^M pollution from working in cygwin 2002-09-16 17:00:16 +00:00
Bryce Denney
42f412c43b - MS VC++ did not accept initialization of static const fields in the
class declaration, for example:
     static const unsigned os_64=0, as_64=0;
  After reading some suggestions on usenet, I changed these into
  enums instead, like this:
     enum { os_64=0, as_64=0 };
2002-09-16 15:21:51 +00:00
Kevin Lawton
278e27d5fe Merged proc_ctrl.cc. Also fixed a bug in CR4 reloading; we were
printing a message when a reserved bit was set, but not causing
a #GP(0).  As well, I force a new PAE support option to 1 when
Hammer support is enabled.
2002-09-14 23:17:55 +00:00
Kevin Lawton
93d05990cc Updated CR4 to use the patented Bryce bitfields accessor method for
both cpu32 and cpu64, to make upcoming merging easier, and the
code cleaner.  Compiled for debug as well, and fixed CR4 for that
also.
2002-09-14 19:21:41 +00:00