Bochs

Author	SHA1	Message	Date
Stanislav Shwartsman	0b60100a0d	Merged patch for Håkan T. Johansson TLB access bit optimizations	2005-06-14 20:55:57 +00:00
Stanislav Shwartsman	6fa52214b0	Canonical address check for RIP in x86-64	2005-04-17 18:54:54 +00:00
Stanislav Shwartsman	1755589376	Separate pageWriteStamp from ICACHE. The pageWriteStamp has a totally independent structure and could be used in future with the icache structure. Also it could be significantly sped up using BX_SMF analog constructions.	2005-04-10 19:42:48 +00:00
Stanislav Shwartsman	52041f60d4	Support for X86_64 in debug CPU method. Fixed debug messages printed from read_virtual_checks.	2005-03-30 19:56:02 +00:00
Stanislav Shwartsman	fd13784231	Small cleanup in access.cc. VME feature code should be valid only for CPU LEVEL >= 4.	2005-03-12 19:34:18 +00:00
Stanislav Shwartsman	c30e89289b	Fixed R/O pages access in CPL=3 (TLB accessBits bug)	2005-03-03 20:24:52 +00:00
Stanislav Shwartsman	b25088bf2f	Merge patch [1153327] ignore segment bases in x86-64 by Avi Kivity	2005-02-28 18:56:05 +00:00
Stanislav Shwartsman	42a5a899c2	Improvement in the speed of general memory access. The idea was taken from patch written by LightCone	2005-01-25 20:41:43 +00:00
Stanislav Shwartsman	0d09a8c8a8	fix code duplication	2004-11-26 19:53:04 +00:00
Stanislav Shwartsman	69c0b06955	Fixes in disassembler. Split REPEAT instructions according to opsize to speed up execution: each REPEATABLE instruction is now split into 3 different instructions, one for 16-bit operand size, one for 32-bit and one for 64-bit. The correct instruction is chosen in the fetchdecode step.	2004-11-20 23:26:32 +00:00
Stanislav Shwartsman	645e04860e	For now: disable fetching from physical address 0xFFFFFFF0 after #RESET because ICACHE does not support physical addresses > mem.len. This is the first part of the fix, the rest is coming soon.	2004-11-18 23:16:36 +00:00
Stanislav Shwartsman	41daacdf80	fixed BX_CPU_THIS pointers	2004-11-05 10:13:15 +00:00
Stanislav Shwartsman	6cdb42d909	Slightly optimized memory access functions. Values are now calculated only if they are actually needed.	2004-09-13 20:48:11 +00:00
Stanislav Shwartsman	a1f830d429	Implemented FAST lazy flags version for logic instructions. Small code cleanup/simplification for others.	2004-08-13 20:00:03 +00:00
Stanislav Shwartsman	f9bd2b74be	1. Fixed bug in FSUB instruction. 2. Fixed bug [ 989478 ] I-Cache and undefined Instruktions: The L4 microkernel uses an undefined instruction (LOCK NOP) to trap special requests into the kernel. The handler fixes this up and gives the user a special code page with syscall stubs. If you're not using the I-Cache optimization, everything works fine on Bochs. But if you enable the I-Cache (--enable-icache), then the undefined opcode exception is thrown only once for every virtual address at which it occurs. See the demodisk of the L4KA::pistachio (http://www.l4ka.org/projects/pistachio/download.php). In this case the pingpong benchmark of this demo is of interest. Everything runs fine until the program tries to spawn a new task for its measurements. This new task shares the code of the creating program. But the new task stops executing at the undefined instruction explained above and no exception is thrown.	2004-07-29 20:15:19 +00:00
Stanislav Shwartsman	5c5b556f24	Merge softfloat-fpu-implementation_ver4_branch branch	2004-06-18 14:11:11 +00:00
Stanislav Shwartsman	ac739aa8b7	Fixed possible compilation problem	2003-10-24 20:06:12 +00:00
Stanislav Shwartsman	ac20b6405a	- FXSAVE/FXRSTOR instructions should be available in P6 mode - Added second UD2 opcode to fetchdecode - Added RDPMC instruction to fetchdecode - 'changes' updated	2003-10-24 18:34:16 +00:00
Peter Tattam	cb492ae7b5	x86-64 emulation. Perform Canonical Address Checking. Only does basic checking (only offset, not offset+size-1)	2003-03-13 00:37:40 +00:00
Christophe Bothamy	50efc3b8c7	- apply Conn Clark's patch.perf-regparm-cclark : - it works only on x86 with gcc2.95+ - uses the GCC function attribute "regparm(n)" to declare that certain functions use the register calling convention - performance improvement is about 6%	2003-03-02 23:59:12 +00:00
Stanislav Shwartsman	8665979c87	* Fixed behavior of BX_INSTR_MEM_DATA callback for RMW memory accesses See instrumentation.txt for details	2003-02-28 20:51:08 +00:00
Peter Tattam	94880d1412	Fix guest2host and related optimizations to work on 64 bit host. 1) fixed the type of "hostPageAddr" and associated typecasts. 2) fixed the type of "pages" and associated typecasts (overloaded variable) 3) patch to cpu.cc to calculate "eipPageBias" correctly in 64 bit mode	2003-02-28 02:37:18 +00:00
Stanislav Shwartsman	cdfc3cbce4	instrumentation enhancements: * renamed CPU_ID to BX_CPU_ID. with this new name there is no possibility for name contentions and BX_CPU_ID definition could be moved out to NEED_CPU_REG_SHORTCUTS block * returned back `unsigned BX_CPU::which_cpu(void)` function * added BX_CPU_ID parameter for BX_INSTR_PHY_READ(a20addr, len); BX_INSTR_PHY_WRITE(a20addr, len); now it will be BX_INSTR_PHY_READ(cpu_id, a20addr, len); BX_INSTR_PHY_WRITE(cpu_id, a20addr, len);	2003-02-13 15:04:11 +00:00
Stanislav Shwartsman	5803e20240	Changed policy of SSE/SSE2 checking	2002-11-13 21:00:05 +00:00
Bryce Denney	5e520261db	Add plugin support to Bochs by merging all the changes from the BRANCH_PLUGINS branch! Authors: Bryce Denney Christophe Bothamy Kevin Lawton (we grabbed a lot of plugin code from plex86) Testing help from: Volker Ruppert Don Becker (Psyon) Jeremy Parsons (Br'fin) The change log is too long to paste in here. To read the change log, do cvs log patches/patch.final-from-BRANCH_PLUGINS.gz All the changes and a detailed description are contained in a patch called patch.final-from-BRANCH_PLUGINS.gz. To look at the complete patch, do cvs upd -r1.1 patches/patch.final-from-BRANCH_PLUGINS.gz Then you will have a local copy of the patch, which you can gunzip and play with however you want. Modified Files: .bochsrc Makefile.in aclocal.m4 bochs.h config.h.in configure configure.in gdbstub.cc logio.cc main.cc pc_system.cc pc_system.h state_file.h bios/Makefile.in bios/rombios.c cpu/Makefile.in cpu/access.cc cpu/apic.cc cpu/arith16.cc cpu/arith32.cc cpu/arith8.cc cpu/cpu.cc cpu/cpu.h cpu/ctrl_xfer32.cc cpu/exception.cc cpu/fetchdecode.cc cpu/fetchdecode64.cc cpu/flag_ctrl.cc cpu/flag_ctrl_pro.cc cpu/init.cc cpu/io.cc cpu/logical16.cc cpu/logical32.cc cpu/logical8.cc cpu/paging.cc cpu/proc_ctrl.cc cpu/protect_ctrl.cc cpu/segment_ctrl_pro.cc cpu/shift16.cc cpu/shift32.cc cpu/stack64.cc cpu/string.cc cpu/tasking.cc debug/Makefile.in debug/dbg_main.cc disasm/Makefile.in doc/docbook/user/user.dbk dynamic/Makefile.in fpu/Makefile.in gui/Makefile.in gui/amigaos.cc gui/beos.cc gui/carbon.cc gui/control.cc gui/control.h gui/gui.cc gui/gui.h gui/keymap.cc gui/keymap.h gui/macintosh.cc gui/nogui.cc gui/rfb.cc gui/sdl.cc gui/sdlkeys.h gui/siminterface.cc gui/siminterface.h gui/term.cc gui/win32.cc gui/wx.cc gui/wxdialog.cc gui/wxdialog.h gui/wxmain.cc gui/wxmain.h gui/x.cc gui/keymaps/sdl-pc-de.map gui/keymaps/sdl-pc-us.map gui/keymaps/x11-pc-de.map instrument/example0/instrument.h instrument/example1/instrument.h instrument/stubs/instrument.cc instrument/stubs/instrument.h iodev/Makefile.in iodev/biosdev.cc iodev/biosdev.h iodev/cdrom.cc iodev/cmos.cc iodev/cmos.h iodev/devices.cc iodev/dma.cc iodev/dma.h iodev/eth_fbsd.cc iodev/eth_linux.cc iodev/eth_null.cc iodev/eth_tap.cc iodev/floppy.cc iodev/floppy.h iodev/guest2host.cc iodev/guest2host.h iodev/harddrv.cc iodev/harddrv.h iodev/iodebug.cc iodev/iodebug.h iodev/iodev.h iodev/keyboard.cc iodev/keyboard.h iodev/ne2k.cc iodev/ne2k.h iodev/parallel.cc iodev/parallel.h iodev/pci.cc iodev/pci.h iodev/pci2isa.cc iodev/pci2isa.h iodev/pic.cc iodev/pic.h iodev/pit.cc iodev/pit.h iodev/pit_wrap.cc iodev/pit_wrap.h iodev/sb16.cc iodev/sb16.h iodev/scancodes.cc iodev/scancodes.h iodev/serial.cc iodev/serial.h iodev/slowdown_timer.cc iodev/slowdown_timer.h iodev/unmapped.cc iodev/unmapped.h iodev/vga.cc iodev/vga.h memory/Makefile.in memory/memory.cc memory/memory.h memory/misc_mem.cc misc/bximage.c misc/niclist.c Added Files: README-plugins extplugin.h ltdl.c ltdl.h ltdlconf.h.in ltmain.sh plugin.cc plugin.h	2002-10-24 21:07:56 +00:00
Kevin Lawton	491ca837f9	Fixed double quadword routines to work for little or big endian hosts.	2002-10-11 16:18:00 +00:00
Kevin Lawton	cffded3829	Simple implementations of the new double quadword functions in access.cc for SSE[2] implementation by Stanislav.	2002-10-11 13:55:26 +00:00
Kevin Lawton	3183ab7102	Added some preliminary configure and config.h stuff for SSE/SSE2 for Stanislav. Also, some method prototypes and skeletal functions in access.cc for read/write double quadword features. Also cleaned up one warning in protect_ctrl.cc for non-64 bit compiles. There was an unused variable, only used for 64-bit.	2002-10-11 01:11:11 +00:00
Kevin Lawton	13a1e55f20	Committed patches/patch-bochs-instrumentation from Stanislav. Some things changed in the ctrl_xfer.cc, fetchdecode.cc, and cpu.cc since the original patches, so I did some patch integration by hand. Check the placement of the macros BX_INSTR_FETCH_DECODE_COMPLETED() and BX_INSTR_OPCODE() in cpu.cc to make sure I got them right. Also, I changed the parameters to BX_INSTR_OPCODE() to update them to the new code. I put some comments before each of these to help determine if the placement is right. These macros are only compiled in if you are gathering instrumentation data from bochs, so they shouldn't affect others.	2002-09-28 00:54:05 +00:00
Kevin Lawton	82fd79c546	Fixed/updated/cleaned repeat IO & memcpy speedups for Long mode. Fixed/updated/cleaned guest2host TLB speedups for Long mode. I now can boot the Linux x86-64 kernel to the VFS mount message, using all the accelerations.	2002-09-24 04:43:59 +00:00
Kevin Lawton	4150ae197e	Hopefully this fixes "Bugs item #612880", which was due to the icache pageStamp check being done too early, before it was known that the TLB entry would produce a physical address in range of the normal part of physical memory. PCI accesses were causing seg faults because of this. I haven't tested this for PCI.	2002-09-22 21:47:57 +00:00
Kevin Lawton	3bfeab23c9	Split out JZ/JNZ instructions from JCC because they were called so frequently. Coded asm() statements for INC/DEC_ERX() instructions. Cleaned up the iCache a little including a bug fix. The generation ID was decrementing the whole field including some high meta bits. That could roll over after 1 Billion cycles. I now only decrement if the field is valid, to save the write. I implemented inline functions which can serve the value of the arithmetic flags if they are cached, and redirect to the lazy_flags.cc routines if not. Most of this was just prep work for adding more asm() statements for native eflags processing when on x86.	2002-09-22 01:52:21 +00:00
Kevin Lawton	0cd7346b9c	- Added an instruction cache. Size is fixed for the moment, but if you hand edit cpu/cpu.h, and change BxICacheEntries, you can try different sizes. I'll make this more flexible with configure. For now, use "--enable-icache" with no parameters. - Modified fetchdecode.cc/fetchdecode64.cc just enough so that instructions which encode a direct address now use a memory resolution function which just sticks the immediate address into rm_addr. With cached instructions we need this.	2002-09-19 19:17:20 +00:00
Kevin Lawton	8f9c3c582d	More migration/synchronization of cpu/cpu64.	2002-09-13 04:33:42 +00:00
Kevin Lawton	6655634179	I merged the cpu/cpu.h and cpu64/cpu.h files as well as the other header files. There no longer are any .h files in cpu64/. Had to make some changes to the .cc files for dealing with accesses to eip.	2002-09-13 00:15:23 +00:00
Kevin Lawton	425ad824c0	I changed the TLB entry from 3 dwords to 4, and (when you compile with GCC) align them with the GCC special alignment attribute. Since there was then one available field, I split the protection attributes and native host pointers into their own fields. Before, with 3 dwords per TLB entry, some entries (about 3/8) were spanning two processor cache lines (assuming a 32-byte cache line). Now, they all fit within one cache line. Knocked about 1.4% off Win95 boot time, probably more off normal software runs.	2002-09-10 00:01:01 +00:00
Kevin Lawton	491035fcb2	I extended the guest-to-host TLB acceleration across the Read-Modify-Write instructions. The first read phase stores the host pointer in the "pages" field if a direct use pointer is available. The Write phase first checks if a pointer was issued and uses it for a direct write if available. I chose the "pages" field since it needs to be checked by the write_RMW_virtual variants anyway and thus needs to be cached anyway. Mostly the mods were to access.cc, but I did also macro-ize the calls to write_RMW_virtual...() in files which use it and cpu.h. Right now, the macro is just a straight pass-through. I tried expanding it to a quick initial check for the pointer availability to do the write in-place, with a function call as a fall-back. That didn't seem to matter at all. Booting is not helped by this really. The upper bound of the gain is 5 or 6%, and that's only if you have a loop that looks like: label: add [eax], ebx ;; mega read-modify-write instruction jmp label ;; intensive loop.	2002-09-06 21:54:58 +00:00
Gregory Alexander	4f6039f533	Macroize BX_TLB_QUICK_INVALIDATE code. Kevin Lawton says he doesn't get a performance benefit. I'm not sure if I do. Either way, the difference isn't very large. This code may get removed if it turns out to be useless.	2002-09-06 19:21:55 +00:00
Gregory Alexander	afdccad36c	Oops, had to fix a bunch of parentheses. Why | has precedence under == (or is it =) I still don't understand.	2002-09-06 16:29:49 +00:00
Gregory Alexander	1c3ae99300	Speed-up for TLB invalidates as proposed by Peter Tattam. I had been planning on this same thing in a similar form for the I$, so this made a lot of sense, and was easy to implement.	2002-09-06 14:58:56 +00:00
Kevin Lawton	f29f9ef021	Fixed Big-endian case of --enable-guest2host-tlb. I macro'ized the direct reads/writes from native variables to the x86 (guest) memory image. Look at the end of bochs.h. Don't know if that's the right place to put them, but here you can extend these macros to platform-specific asm() code if you like, or just use the generic C code I supplied. Some platforms have special instructions for byte-order swapping etc. Also, you can't make any assumptions about the alignment of the pointers passed.	2002-09-05 04:56:11 +00:00
Kevin Lawton	f0c9896964	Now, when you compile with --enable-guest2host-tlb, non-paged mode uses the notion of the guest-to-host TLB. This has the benefit of allowing more uniform and streamlined acceleration code in access.cc which does not have to check if CR0.PG is set, eliminating a few instructions per guest access. Shaved just a little off execution time, as expected. Also, access_linear now breaks accesses which span two pages into two calls to the physical memory routines when paging is off, just like it always has for paging on. Besides being more uniform, this allows the physical memory access routines to know the complete data item is contained within a single physical page, and stop reapplying the A20ADDR() macro to pointers as it increments them. Perhaps things can be optimized a little more now there too... I renamed the routines to {read,write}PhysicalPage() as a reminder that these routines now operate on data solely within one page. I also added a little code so that the paging module is notified when the A20 line is tweaked, so it can dump whatever mappings it wants to.	2002-09-05 02:31:24 +00:00
Kevin Lawton	8a1baa6bb8	Added ::{read,write}_virtual_qword() functions as per Stanislav's request. I have not tested these functions, but they model the format and acceleration principles of the byte/word/dword functions. Give them a try on both little/big endian machines.	2002-09-04 20:23:54 +00:00
Kevin Lawton	d07c1c0bb0	I rehashed the way the paging code stores protection bits, so that a compare of the current access could be done more efficiently against the cached values, both in the normal paging routines, and in the accelerated code in access.cc. This cut down the amount of code path needed to get to direct use of a host address nicely, and speed definitely got a boost as a result, especially if you use the --enable-guest2host-tlb option. The CR0.WP flag was a real pain, because it imparts a complication on the way protections work. Fortunately it's not a high-change flag, so I just base the new cached info on the current CR0.WP value, and dump the TLB cache when it changes.	2002-09-04 08:59:13 +00:00
Kevin Lawton	3f2d28f86c	Added guest2host TLB tricks to read-modify-write variants of access routines in access.cc, completing the upgrade of those routines. You do need '--enable-guest2host-tlb', before you get the speedups for now. The guest2host mods seem pretty solid, though I do need to see what effects the A20 line has on this cache and the paging TLB in general.	2002-09-03 04:54:28 +00:00
Kevin Lawton	3a5f338419	Integrated patches for: - Paging code rehash. You must now use --enable-4meg-pages to use 4Meg pages, with the default of disabled, since we don't support 4Meg pages well yet. Paging table walks model a real CPU more closely now, and I fixed some bugs in the old logic. - Segment check redundancy elimination. After a segment is loaded, reads and writes are marked when a segment type check succeeds, and they are skipped thereafter, when possible. - Repeated IO and memory string copy acceleration. Only some variants of instructions are available on all platforms, word and dword variants only on x86 for the moment due to alignment and endian issues. This is compiled in currently with no option - I should add a configure option. - Added a guest linear address to host TLB. Actually, I just stick the host address (mem.vector[addr] address) in the upper 29 bits of the field 'combined_access' since they are unused. Convenient for now. I'm only storing page frame addresses. This was the simplest form of such a TLB. We can likely enhance this. Also, I only accelerated the normal read/write routines in access.cc. Could also modify the read-modify-write versions too. You must use --enable-guest2host-tlb, to try this out. Currently speeds up Win95 boot time by about 3.5% for me. More ground to cover... - Minor mods to CPUI/MOV_CdRd for CMOV. - Integrated enhancements from Volker to getHostMemAddr() for PCI being enabled.	2002-09-01 20:12:09 +00:00
Bryce Denney	daf2a9fb55	- add RCS Id to header of every file. This makes it easier to know what's going on when someone sends in a modified file.	2001-10-03 13:10:38 +00:00
Bryce Denney	0f9a525717	- try again! This should fix [ #433759 ] virtual address checks can overflow and I have tested the condition much more thoroughly this time. All segment sizes should be supported.	2001-10-03 01:06:31 +00:00
Bryce Denney	6a1c01c8b5	- back out my poorly written patch.virtual-address-checks-overflow	2001-10-02 20:01:29 +00:00
Bryce Denney	beca5d6e67	- fix stupid printf-type bug	2001-10-02 18:11:06 +00:00

58 Commits