/////////////////////////////////////////////////////////////////////////
// $Id: string.cc,v 1.21 2002-10-24 21:05:55 bdenney Exp $
/////////////////////////////////////////////////////////////////////////
//
//  Copyright (C) 2001  MandrakeSoft S.A.
//
//    MandrakeSoft S.A.
//    43, rue d'Aboukir
//    75002 Paris - France
//    http://www.linux-mandrake.com/
//    http://www.mandrakesoft.com/
//
//  This library is free software; you can redistribute it and/or
//  modify it under the terms of the GNU Lesser General Public
//  License as published by the Free Software Foundation; either
//  version 2 of the License, or (at your option) any later version.
//
//  This library is distributed in the hope that it will be useful,
//  but WITHOUT ANY WARRANTY; without even the implied warranty of
//  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
//  Lesser General Public License for more details.
//
//  You should have received a copy of the GNU Lesser General Public
//  License along with this library; if not, write to the Free Software
//  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

#define NEED_CPU_REG_SHORTCUTS 1
#include "bochs.h"
#define LOG_THIS BX_CPU_THIS_PTR

#if BX_SUPPORT_X86_64==0
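// Without x86-64 support the 64-bit register names simply alias their
// 32-bit counterparts, so code below that refers to RSI/RDI/RAX still
// compiles in a 32-bit-only build.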
#define RSI ESI
#define RDI EDI
#define RAX EAX
#endif

#if BX_SUPPORT_X86_64
#define IsLongMode() (BX_CPU_THIS_PTR cpu_mode == BX_MODE_LONG_64)
#else
#define IsLongMode() (0)
#endif

/* MOVSB ES:[EDI], DS:[ESI]   DS may be overridden
 * mov string from DS:[ESI] into ES:[EDI]
 */
void
BX_CPU_C::MOVSB_XbYb(bxInstruction_c *i)
{
  unsigned seg;
  Bit8u temp8;

  if (!BX_NULL_SEG_REG(i->seg())) {
    seg = i->seg();
  }
  else {
    seg = BX_SEG_REG_DS;
  }

#if BX_CPU_LEVEL >= 3
#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rsi, rdi;

    rsi = RSI;
    rdi = RDI;

    read_virtual_byte(seg, rsi, &temp8);
    write_virtual_byte(BX_SEG_REG_ES, rdi, &temp8);

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement RSI, RDI */
      rsi--;
      rdi--;
    }
    else {
      /* increment RSI, RDI */
      rsi++;
      rdi++;
    }

    RSI = rsi;
    RDI = rdi;
  }
  else
#endif  // #if BX_SUPPORT_X86_64
  if (i->as32L()) {
    Bit32u esi, edi;

    esi = ESI;
    edi = EDI;

    read_virtual_byte(seg, esi, &temp8);
    write_virtual_byte(BX_SEG_REG_ES, edi, &temp8);

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement ESI, EDI */
      esi--;
      edi--;
    }
    else {
      /* increment ESI, EDI */
      esi++;
      edi++;
    }

    // zero extension of RSI/RDI
    RSI = esi;
    RDI = edi;
  }
  else
#endif /* BX_CPU_LEVEL >= 3 */
  { /* 16 bit address mode */
    unsigned incr;
    Bit16u si, di;

    si = SI;
    di = DI;

#if BX_SupportRepeatSpeedups
#if (BX_DEBUGGER == 0)
    /* If conditions are right, we can transfer IO to physical memory
     * in a batch, rather than one instruction at a time.
     */
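    /* The batch path below is used only when a REP prefix is present and no
     * asynchronous event is pending.  The copy is clamped so it stays within
     * a single 4K page on each side, and it falls back to the byte-at-a-time
     * path (noAcceleration16) if host pointers are unavailable or the
     * segment checks below do not allow a direct copy.
     */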
    if (i->repUsedL() && !BX_CPU_THIS_PTR async_event) {
      Bit32u byteCount;

#if BX_SUPPORT_X86_64
      if (i->as64L())
        byteCount = RCX; // Truncated to 32bits. (we're only doing 1 page)
      else
#endif
      if (i->as32L())
        byteCount = ECX;
      else
        byteCount = CX;

      if (byteCount) {
        Bit32u bytesFitSrc, bytesFitDst;
        Bit8u *hostAddrSrc, *hostAddrDst;
        unsigned pointerDelta;
        bx_segment_reg_t *srcSegPtr, *dstSegPtr;
        bx_address laddrDst, laddrSrc;
        Bit32u paddrDst, paddrSrc;

        srcSegPtr = &BX_CPU_THIS_PTR sregs[seg];
        dstSegPtr = &BX_CPU_THIS_PTR sregs[BX_SEG_REG_ES];

        // Do segment checks for the 1st word.  We do not want to
        // trip an exception beyond this, because the address would
        // be incorrect.  After we know how many bytes we will directly
        // transfer, we can do the full segment limit check ourselves
        // without generating an exception.
        read_virtual_checks(srcSegPtr, si, 1);
        laddrSrc = srcSegPtr->cache.u.segment.base + si;
        if (BX_CPU_THIS_PTR cr0.pg) {
          paddrSrc = dtranslate_linear(laddrSrc, CPL==3, BX_READ);
        }
        else {
          paddrSrc = laddrSrc;
        }
        // If we want to write directly into the physical memory array,
        // we need the A20 address.
        paddrSrc = A20ADDR(paddrSrc);

        write_virtual_checks(dstSegPtr, di, 1);
        laddrDst = dstSegPtr->cache.u.segment.base + di;
        if (BX_CPU_THIS_PTR cr0.pg) {
          paddrDst = dtranslate_linear(laddrDst, CPL==3, BX_WRITE);
        }
        else {
          paddrDst = laddrDst;
        }
        // If we want to write directly into the physical memory array,
        // we need the A20 address.
        paddrDst = A20ADDR(paddrDst);

        hostAddrSrc = BX_CPU_THIS_PTR mem->getHostMemAddr(BX_CPU_THIS,
            paddrSrc, BX_READ);
        hostAddrDst = BX_CPU_THIS_PTR mem->getHostMemAddr(BX_CPU_THIS,
            paddrDst, BX_WRITE);

        if ( hostAddrSrc && hostAddrDst ) {
          // See how many bytes can fit in the rest of this page.
          if (BX_CPU_THIS_PTR get_DF ()) {
            // Counting downward.
            bytesFitSrc = 1 + (paddrSrc & 0xfff);
            bytesFitDst = 1 + (paddrDst & 0xfff);
            pointerDelta = (unsigned) -1;
          }
          else {
            // Counting upward.
            bytesFitSrc = (0x1000 - (paddrSrc & 0xfff));
            bytesFitDst = (0x1000 - (paddrDst & 0xfff));
            pointerDelta = 1;
          }
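          // Worked example: with DF clear and paddrSrc ending in 0xffd,
          // only 0x1000 - 0xffd = 3 bytes remain in the source page; with
          // DF set and paddrSrc ending in 0x002, 3 bytes (offsets 2,1,0)
          // can be copied before the page boundary.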

          // Restrict count to the number that will fit in either
          // source or dest pages.
          if (byteCount > bytesFitSrc)
            byteCount = bytesFitSrc;
          if (byteCount > bytesFitDst)
            byteCount = bytesFitDst;
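          // Also cap the batch at the number of CPU ticks left before the
          // next scheduled event, since each byte copied is charged as one
          // tick below (see BX_TICKN).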
          if (byteCount > bx_pc_system.getNumCpuTicksLeftNextEvent())
            byteCount = bx_pc_system.getNumCpuTicksLeftNextEvent();

          // If after all the restrictions, there is anything left to do...
          if (byteCount) {
            unsigned j;
            Bit32u srcSegLimit, dstSegLimit;

            srcSegLimit = srcSegPtr->cache.u.segment.limit_scaled;
            dstSegLimit = dstSegPtr->cache.u.segment.limit_scaled;

            // For 16-bit addressing mode, clamp the segment limits to 16bits
            // so we don't have to worry about computations using si/di
            // rolling over 16-bit boundaries.
            if (!i->as32L()) {
              if (srcSegLimit > 0xffff)
                srcSegLimit = 0xffff;
              if (dstSegLimit > 0xffff)
                dstSegLimit = 0xffff;
            }

            // Before we copy memory, we need to make sure that the segments
            // allow the accesses up to the given source and dest offset.  If
            // the cache.valid bits have SegAccessWOK and ROK, we know that
            // the cache is valid for those operations, and that the segments
            // are non expand-down (thus we can make a simple limit check).
            if ( !(srcSegPtr->cache.valid & SegAccessROK) ||
                 !(dstSegPtr->cache.valid & SegAccessWOK) ) {
              goto noAcceleration16;
            }
            if ( !IsLongMode() ) {
              // Now make sure transfer will fit within the constraints of the
              // segment boundaries, 0..limit for non expand-down.  We know
              // byteCount >= 1 here.
              if (BX_CPU_THIS_PTR get_DF ()) {
                // Counting downward.
                Bit32u minOffset = (byteCount-1);
                if ( si < minOffset )
                  goto noAcceleration16;
                if ( di < minOffset )
                  goto noAcceleration16;
              }
              else {
                // Counting upward.
                Bit32u srcMaxOffset = (srcSegLimit - byteCount) + 1;
                Bit32u dstMaxOffset = (dstSegLimit - byteCount) + 1;
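                // E.g. with srcSegLimit = 0xffff and byteCount = 0x10,
                // srcMaxOffset = 0xfff0: a copy starting at si <= 0xfff0
                // ends at offset 0xffff and still fits inside the limit.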
                if ( si > srcMaxOffset )
                  goto noAcceleration16;
                if ( di > dstMaxOffset )
                  goto noAcceleration16;
              }
            }

            // Transfer data directly using host addresses.
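            // pointerDelta was set above to +1 (DF clear) or (unsigned) -1
            // (DF set), so the host pointers step forward or, relying on
            // unsigned wrap-around, backward one byte per iteration.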
            for (j=0; j<byteCount; j++) {
              * (Bit8u *) hostAddrDst = * (Bit8u *) hostAddrSrc;
              hostAddrDst += pointerDelta;
              hostAddrSrc += pointerDelta;
            }

            // Decrement the ticks count by the number of iterations, minus
            // one, since the main cpu loop will decrement one.  Also,
            // the count is predecremented before examined, so definitely
            // don't roll it under zero.
            BX_TICKN(byteCount-1);
            //bx_pc_system.num_cpu_ticks_left -= (byteCount-1);

            // Decrement eCX.  Note, the main loop will decrement 1 also, so
            // decrement by one less than expected, like the case above.
#if BX_SUPPORT_X86_64
            if (i->as64L())
              RCX -= (byteCount-1);
            else
#endif
            if (i->as32L())
              ECX -= (byteCount-1);
            else
              CX -= (byteCount-1);
            incr = byteCount;
            goto doIncr16;
          }
        }
      }
    }

noAcceleration16:

#endif  // (BX_DEBUGGER == 0)
#endif  // BX_SupportRepeatSpeedups

    read_virtual_byte(seg, si, &temp8);
    write_virtual_byte(BX_SEG_REG_ES, di, &temp8);

    incr = 1;

#if BX_SupportRepeatSpeedups
#if (BX_DEBUGGER == 0)
doIncr16:
#endif  // (BX_DEBUGGER == 0)
#endif  // BX_SupportRepeatSpeedups

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement SI, DI */
      si -= incr;
      di -= incr;
    }
    else {
      /* increment SI, DI */
      si += incr;
      di += incr;
    }

    SI = si;
    DI = di;
  }
}

void
BX_CPU_C::MOVSW_XvYv(bxInstruction_c *i)
{
  unsigned seg;
  unsigned incr;

  if (!BX_NULL_SEG_REG(i->seg())) {
    seg = i->seg();
  }
  else {
    seg = BX_SEG_REG_DS;
  }

#if BX_CPU_LEVEL >= 3
#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rsi, rdi;

    rsi = RSI;
    rdi = RDI;

    if (i->os64L()) {
      Bit64u temp64;

      read_virtual_qword(seg, rsi, &temp64);
      write_virtual_qword(BX_SEG_REG_ES, rdi, &temp64);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RSI, RDI */
        rsi -= 8;
        rdi -= 8;
      }
      else {
        /* increment RSI, RDI */
        rsi += 8;
        rdi += 8;
      }
    } /* if (i->os64L()) ... */
    else
    if (i->os32L()) {
      Bit32u temp32;

      read_virtual_dword(seg, rsi, &temp32);
      write_virtual_dword(BX_SEG_REG_ES, rdi, &temp32);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RSI, RDI */
        rsi -= 4;
        rdi -= 4;
      }
      else {
        /* increment RSI, RDI */
        rsi += 4;
        rdi += 4;
      }
    } /* if (i->os32L()) ... */
    else { /* 16 bit opsize mode */
      Bit16u temp16;

      read_virtual_word(seg, rsi, &temp16);
      write_virtual_word(BX_SEG_REG_ES, rdi, &temp16);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RSI, RDI */
        rsi -= 2;
        rdi -= 2;
      }
      else {
        /* increment RSI, RDI */
        rsi += 2;
        rdi += 2;
      }
    }

    RSI = rsi;
    RDI = rdi;
  }
  else
#endif  // #if BX_SUPPORT_X86_64
  if (i->as32L()) {
    Bit32u esi, edi;

    esi = ESI;
    edi = EDI;

#if BX_SUPPORT_X86_64
    if (i->os64L()) {
      Bit64u temp64;

      read_virtual_qword(seg, esi, &temp64);
      write_virtual_qword(BX_SEG_REG_ES, edi, &temp64);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement ESI, EDI */
        esi -= 8;
        edi -= 8;
      }
      else {
        /* increment ESI, EDI */
        esi += 8;
        edi += 8;
      }
    } /* if (i->os64L()) ... */
    else
#endif  // #if BX_SUPPORT_X86_64
    if (i->os32L()) {
      Bit32u temp32;

#if BX_SupportRepeatSpeedups
#if (BX_DEBUGGER == 0)
#if (defined(__i386__) && __i386__)
      /* If conditions are right, we can transfer IO to physical memory
       * in a batch, rather than one instruction at a time.
       */
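      /* Unlike the byte variant, this dword batch path is compiled only on
       * i386 hosts (see the __i386__ guard above); other platforms always
       * take the one-dword-at-a-time path.
       */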
|
2002-09-18 12:00:43 +04:00
|
|
|
if (i->repUsedL() && !BX_CPU_THIS_PTR async_event) {
|
Integrated patches for:
- Paging code rehash. You must now use --enable-4meg-pages to
use 4Meg pages, with the default of disabled, since we don't well
support 4Meg pages yet. Paging table walks model a real CPU
more closely now, and I fixed some bugs in the old logic.
- Segment check redundancy elimination. After a segment is loaded,
reads and writes are marked when a segment type check succeeds, and
they are skipped thereafter, when possible.
- Repeated IO and memory string copy acceleration. Only some variants
of instructions are available on all platforms, word and dword
variants only on x86 for the moment due to alignment and endian issues.
This is compiled in currently with no option - I should add a configure
option.
- Added a guest linear address to host TLB. Actually, I just stick
the host address (mem.vector[addr] address) in the upper 29 bits
of the field 'combined_access' since they are unused. Convenient
for now. I'm only storing page frame addresses. This was the
simplest for of such a TLB. We can likely enhance this. Also,
I only accelerated the normal read/write routines in access.cc.
Could also modify the read-modify-write versions too. You must
use --enable-guest2host-tlb, to try this out. Currently speeds
up Win95 boot time by about 3.5% for me. More ground to cover...
- Minor mods to CPUI/MOV_CdRd for CMOV.
- Integrated enhancements from Volker to getHostMemAddr() for PCI
being enabled.
2002-09-02 00:12:09 +04:00
|
|
|
Bit32u dwordCount;
|
|
|
|
|
2002-09-24 08:43:59 +04:00
|
|
|
#if BX_SUPPORT_X86_64
|
|
|
|
if (i->as64L())
|
|
|
|
dwordCount = RCX; // Truncated to 32bits. (we're only doing 1 page)
|
|
|
|
else
|
|
|
|
#endif
|
2002-09-18 09:36:48 +04:00
|
|
|
if (i->as32L())
|
Integrated patches for:
- Paging code rehash. You must now use --enable-4meg-pages to
use 4Meg pages, with the default of disabled, since we don't well
support 4Meg pages yet. Paging table walks model a real CPU
more closely now, and I fixed some bugs in the old logic.
- Segment check redundancy elimination. After a segment is loaded,
reads and writes are marked when a segment type check succeeds, and
they are skipped thereafter, when possible.
- Repeated IO and memory string copy acceleration. Only some variants
of instructions are available on all platforms, word and dword
variants only on x86 for the moment due to alignment and endian issues.
This is compiled in currently with no option - I should add a configure
option.
- Added a guest linear address to host TLB. Actually, I just stick
the host address (mem.vector[addr] address) in the upper 29 bits
of the field 'combined_access' since they are unused. Convenient
for now. I'm only storing page frame addresses. This was the
simplest for of such a TLB. We can likely enhance this. Also,
I only accelerated the normal read/write routines in access.cc.
Could also modify the read-modify-write versions too. You must
use --enable-guest2host-tlb, to try this out. Currently speeds
up Win95 boot time by about 3.5% for me. More ground to cover...
- Minor mods to CPUI/MOV_CdRd for CMOV.
- Integrated enhancements from Volker to getHostMemAddr() for PCI
being enabled.
2002-09-02 00:12:09 +04:00
|
|
|
dwordCount = ECX;
|
|
|
|
else
|
|
|
|
dwordCount = CX;
|
|
|
|
|
|
|
|
if (dwordCount) {
|
2002-09-03 23:38:27 +04:00
|
|
|
Bit32u dwordsFitSrc, dwordsFitDst;
|
Integrated patches for:
- Paging code rehash. You must now use --enable-4meg-pages to
use 4Meg pages, with the default of disabled, since we don't well
support 4Meg pages yet. Paging table walks model a real CPU
more closely now, and I fixed some bugs in the old logic.
- Segment check redundancy elimination. After a segment is loaded,
reads and writes are marked when a segment type check succeeds, and
they are skipped thereafter, when possible.
- Repeated IO and memory string copy acceleration. Only some variants
of instructions are available on all platforms, word and dword
variants only on x86 for the moment due to alignment and endian issues.
This is compiled in currently with no option - I should add a configure
option.
- Added a guest linear address to host TLB. Actually, I just stick
the host address (mem.vector[addr] address) in the upper 29 bits
of the field 'combined_access' since they are unused. Convenient
for now. I'm only storing page frame addresses. This was the
simplest for of such a TLB. We can likely enhance this. Also,
I only accelerated the normal read/write routines in access.cc.
Could also modify the read-modify-write versions too. You must
use --enable-guest2host-tlb, to try this out. Currently speeds
up Win95 boot time by about 3.5% for me. More ground to cover...
- Minor mods to CPUI/MOV_CdRd for CMOV.
- Integrated enhancements from Volker to getHostMemAddr() for PCI
being enabled.
2002-09-02 00:12:09 +04:00
|
|
|
Bit8u *hostAddrSrc, *hostAddrDst;
|
2002-09-03 23:38:27 +04:00
|
|
|
unsigned pointerDelta;
|
|
|
|
bx_segment_reg_t *srcSegPtr, *dstSegPtr;
|
2002-09-24 08:43:59 +04:00
|
|
|
bx_address laddrDst, laddrSrc;
|
|
|
|
Bit32u paddrDst, paddrSrc;
|
2002-09-03 23:38:27 +04:00
|
|
|
|
|
|
|
srcSegPtr = &BX_CPU_THIS_PTR sregs[seg];
|
2002-10-17 02:10:07 +04:00
|
|
|
dstSegPtr = &BX_CPU_THIS_PTR sregs[BX_SEG_REG_ES];
|
2002-09-03 23:38:27 +04:00
|
|
|
|
|
|
|
// Do segment checks for the 1st word. We do not want to
|
|
|
|
// trip an exception beyond this, because the address would
|
|
|
|
// be incorrect. After we know how many bytes we will directly
|
|
|
|
// transfer, we can do the full segment limit check ourselves
|
|
|
|
// without generating an exception.
|
|
|
|
read_virtual_checks(srcSegPtr, esi, 4);
|
|
|
|
laddrSrc = srcSegPtr->cache.u.segment.base + esi;
|
|
|
|
if (BX_CPU_THIS_PTR cr0.pg) {
|
|
|
|
paddrSrc = dtranslate_linear(laddrSrc, CPL==3, BX_READ);
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
paddrSrc = laddrSrc;
|
|
|
|
}
|
|
|
|
// If we want to write directly into the physical memory array,
|
|
|
|
// we need the A20 address.
|
|
|
|
paddrSrc = A20ADDR(paddrSrc);
|
|
|
|
|
|
|
|
write_virtual_checks(dstSegPtr, edi, 4);
|
|
|
|
laddrDst = dstSegPtr->cache.u.segment.base + edi;
|
|
|
|
if (BX_CPU_THIS_PTR cr0.pg) {
|
|
|
|
paddrDst = dtranslate_linear(laddrDst, CPL==3, BX_WRITE);
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
paddrDst = laddrDst;
|
|
|
|
}
|
|
|
|
// If we want to write directly into the physical memory array,
|
|
|
|
// we need the A20 address.
|
|
|
|
paddrDst = A20ADDR(paddrDst);
|
2002-09-02 00:12:09 +04:00
|
|
|
|
2002-09-19 23:17:20 +04:00
|
|
|
hostAddrSrc = BX_CPU_THIS_PTR mem->getHostMemAddr(BX_CPU_THIS,
|
|
|
|
paddrSrc, BX_READ);
|
|
|
|
hostAddrDst = BX_CPU_THIS_PTR mem->getHostMemAddr(BX_CPU_THIS,
|
|
|
|
paddrDst, BX_WRITE);
|
2002-09-02 00:12:09 +04:00
|
|
|
|
|
|
|
if ( hostAddrSrc && hostAddrDst ) {
|
|
|
|
// See how many dwords can fit in the rest of this page.
|
2002-09-12 22:10:46 +04:00
|
|
|
if (BX_CPU_THIS_PTR get_DF ()) {
|
2002-09-03 23:38:27 +04:00
|
|
|
// Counting downward.
|
|
|
|
// Note: 1st dword must not cross page boundary.
|
|
|
|
if ( ((paddrSrc & 0xfff) > 0xffc) ||
|
|
|
|
((paddrDst & 0xfff) > 0xffc) )
|
|
|
|
goto noAcceleration32;
|
|
|
|
dwordsFitSrc = (4 + (paddrSrc & 0xfff)) >> 2;
|
|
|
|
dwordsFitDst = (4 + (paddrDst & 0xfff)) >> 2;
|
|
|
|
pointerDelta = (unsigned) -4;
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
// Counting upward.
|
|
|
|
dwordsFitSrc = (0x1000 - (paddrSrc & 0xfff)) >> 2;
|
|
|
|
dwordsFitDst = (0x1000 - (paddrDst & 0xfff)) >> 2;
|
|
|
|
pointerDelta = 4;
|
|
|
|
}
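          // Worked example with an assumed physical address: if
          // paddrSrc == 0x00012344, then (paddrSrc & 0xfff) == 0x344, so
          //   counting upward:   dwordsFitSrc = (0x1000 - 0x344) >> 2 = 0x32f
          //   counting downward: dwordsFitSrc = (4 + 0x344) >> 2      = 0x0d2
          // i.e. how many 4-byte steps fit before running off this page in
          // the given direction.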
|
2002-09-02 00:12:09 +04:00
|
|
|
// Restrict dword count to the number that will fit in either
|
|
|
|
// source or dest pages.
|
2002-09-03 23:38:27 +04:00
|
|
|
if (dwordCount > dwordsFitSrc)
|
|
|
|
dwordCount = dwordsFitSrc;
|
|
|
|
if (dwordCount > dwordsFitDst)
|
|
|
|
dwordCount = dwordsFitDst;
|
2002-09-30 20:43:59 +04:00
|
|
|
if (dwordCount > bx_pc_system.getNumCpuTicksLeftNextEvent())
|
|
|
|
dwordCount = bx_pc_system.getNumCpuTicksLeftNextEvent();
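          // The batched dwords are charged as CPU ticks below via BX_TICKN(),
          // so clamping to getNumCpuTicksLeftNextEvent() keeps the burst from
          // running past the next scheduled event.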
|
2002-09-02 00:12:09 +04:00
|
|
|
|
|
|
|
// If after all the restrictions, there is anything left to do...
|
|
|
|
if (dwordCount) {
|
|
|
|
unsigned j;
|
2002-09-03 23:38:27 +04:00
|
|
|
Bit32u srcSegLimit, dstSegLimit;
|
|
|
|
|
|
|
|
srcSegLimit = srcSegPtr->cache.u.segment.limit_scaled;
|
|
|
|
dstSegLimit = dstSegPtr->cache.u.segment.limit_scaled;
|
|
|
|
          // For 16-bit addressing mode, clamp the segment limits to 16 bits
|
|
|
|
// so we don't have to worry about computations using si/di
|
|
|
|
// rolling over 16-bit boundaries.
|
2002-09-18 09:36:48 +04:00
|
|
|
if (!i->as32L()) {
|
2002-09-03 23:38:27 +04:00
|
|
|
if (srcSegLimit > 0xffff)
|
|
|
|
srcSegLimit = 0xffff;
|
|
|
|
if (dstSegLimit > 0xffff)
|
|
|
|
dstSegLimit = 0xffff;
|
|
|
|
}
|
2002-09-02 00:12:09 +04:00
|
|
|
|
|
|
|
// Before we copy memory, we need to make sure that the segments
|
|
|
|
// allow the accesses up to the given source and dest offset. If
|
|
|
|
// the cache.valid bits have SegAccessWOK and ROK, we know that
|
|
|
|
// the cache is valid for those operations, and that the segments
|
2002-09-03 23:38:27 +04:00
|
|
|
// are non expand-down (thus we can make a simple limit check).
|
2002-09-02 00:12:09 +04:00
|
|
|
if ( !(srcSegPtr->cache.valid & SegAccessROK) ||
|
|
|
|
!(dstSegPtr->cache.valid & SegAccessWOK) ) {
|
|
|
|
goto noAcceleration32;
|
|
|
|
}
|
2002-09-24 08:43:59 +04:00
|
|
|
if ( !IsLongMode() ) {
|
|
|
|
// Now make sure transfer will fit within the constraints of the
|
|
|
|
// segment boundaries, 0..limit for non expand-down. We know
|
|
|
|
// dwordCount >= 1 here.
|
|
|
|
if (BX_CPU_THIS_PTR get_DF ()) {
|
|
|
|
// Counting downward.
|
|
|
|
Bit32u minOffset = (dwordCount-1) << 2;
|
|
|
|
if ( esi < minOffset )
|
|
|
|
goto noAcceleration32;
|
|
|
|
if ( edi < minOffset )
|
|
|
|
goto noAcceleration32;
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
// Counting upward.
|
|
|
|
Bit32u srcMaxOffset = (srcSegLimit - (dwordCount<<2)) + 1;
|
|
|
|
Bit32u dstMaxOffset = (dstSegLimit - (dwordCount<<2)) + 1;
|
|
|
|
if ( esi > srcMaxOffset )
|
|
|
|
goto noAcceleration32;
|
|
|
|
if ( edi > dstMaxOffset )
|
|
|
|
goto noAcceleration32;
|
|
|
|
}
|
2002-09-02 00:12:09 +04:00
|
|
|
}
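          // Worked illustration of the upward check, with assumed values:
          // srcSegLimit == 0xffff and dwordCount == 16 give
          // srcMaxOffset = 0xffff - 64 + 1 = 0xffc0, so any esi <= 0xffc0
          // keeps the whole 64-byte transfer inside the segment limit.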
|
|
|
|
|
|
|
|
// Transfer data directly using host addresses.
|
|
|
|
for (j=0; j<dwordCount; j++) {
|
|
|
|
* (Bit32u *) hostAddrDst = * (Bit32u *) hostAddrSrc;
|
|
|
|
hostAddrDst += pointerDelta;
|
|
|
|
hostAddrSrc += pointerDelta;
|
|
|
|
}
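          // Copying dword-by-dword in the pointerDelta direction mirrors how
          // REP MOVSD steps element by element, so overlapping source and
          // destination ranges behave the same as in the unaccelerated path.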
|
|
|
|
// Decrement the ticks count by the number of iterations, minus
|
|
|
|
// one, since the main cpu loop will decrement one. Also,
|
|
|
|
          // the count is pre-decremented before being examined, so definitely
|
|
|
|
// don't roll it under zero.
|
2002-09-03 23:38:27 +04:00
|
|
|
BX_TICKN(dwordCount-1);
|
|
|
|
//bx_pc_system.num_cpu_ticks_left -= (dwordCount-1);
|
2002-09-02 00:12:09 +04:00
|
|
|
|
|
|
|
// Decrement eCX. Note, the main loop will decrement 1 also, so
|
|
|
|
// decrement by one less than expected, like the case above.
|
2002-09-24 08:43:59 +04:00
|
|
|
#if BX_SUPPORT_X86_64
|
|
|
|
if (i->as64L())
|
|
|
|
RCX -= (dwordCount-1);
|
|
|
|
else
|
|
|
|
#endif
|
2002-09-18 09:36:48 +04:00
|
|
|
if (i->as32L())
|
2002-09-02 00:12:09 +04:00
|
|
|
ECX -= (dwordCount-1);
|
|
|
|
else
|
|
|
|
CX -= (dwordCount-1);
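          // Example: with dwordCount == 5, the count register drops by 4 here
          // and the main rep loop's own decrement covers the fifth iteration.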
|
|
|
|
incr = dwordCount << 2; // count * 4.
|
|
|
|
goto doIncr32;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
noAcceleration32:
|
|
|
|
|
2002-09-03 23:38:27 +04:00
|
|
|
#endif // __i386__
|
|
|
|
#endif // (BX_DEBUGGER == 0)
|
|
|
|
#endif // BX_SupportRepeatSpeedups
|
|
|
|
|
2001-04-10 05:04:59 +04:00
|
|
|
read_virtual_dword(seg, esi, &temp32);
|
|
|
|
|
|
|
|
write_virtual_dword(BX_SEG_REG_ES, edi, &temp32);
|
2002-09-02 00:12:09 +04:00
|
|
|
incr = 4;
|
|
|
|
|
2002-09-03 23:38:27 +04:00
|
|
|
#if BX_SupportRepeatSpeedups
|
|
|
|
#if (BX_DEBUGGER == 0)
|
|
|
|
#if (defined(__i386__) && __i386__)
|
2002-09-02 00:12:09 +04:00
|
|
|
doIncr32:
|
2002-09-03 23:38:27 +04:00
|
|
|
#endif
|
|
|
|
#endif
|
|
|
|
#endif
|
2001-04-10 05:04:59 +04:00
|
|
|
|
2002-09-12 22:10:46 +04:00
|
|
|
if (BX_CPU_THIS_PTR get_DF ()) {
|
2001-04-10 05:04:59 +04:00
|
|
|
      /* decrement ESI, EDI */
|
2002-09-02 00:12:09 +04:00
|
|
|
esi -= incr;
|
|
|
|
edi -= incr;
|
2001-04-10 05:04:59 +04:00
|
|
|
}
|
|
|
|
else {
|
|
|
|
      /* increment ESI, EDI */
|
2002-09-02 00:12:09 +04:00
|
|
|
esi += incr;
|
|
|
|
edi += incr;
|
2001-04-10 05:04:59 +04:00
|
|
|
}
|
2002-09-18 09:36:48 +04:00
|
|
|
} /* if (i->os32L()) ... */
|
2001-04-10 05:04:59 +04:00
|
|
|
else { /* 16 bit opsize mode */
|
|
|
|
Bit16u temp16;
|
|
|
|
|
|
|
|
read_virtual_word(seg, esi, &temp16);
|
|
|
|
|
|
|
|
write_virtual_word(BX_SEG_REG_ES, edi, &temp16);
|
|
|
|
|
2002-09-12 22:10:46 +04:00
|
|
|
if (BX_CPU_THIS_PTR get_DF ()) {
|
2001-04-10 05:04:59 +04:00
|
|
|
      /* decrement ESI, EDI */
|
|
|
|
esi -= 2;
|
|
|
|
edi -= 2;
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
      /* increment ESI, EDI */
|
|
|
|
esi += 2;
|
|
|
|
edi += 2;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2002-09-15 09:09:18 +04:00
|
|
|
// zero extension of RSI/RDI
|
|
|
|
|
|
|
|
RSI = esi;
|
|
|
|
RDI = edi;
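    // For example, on x86-64 a write to a 32-bit register clears the upper
    // 32 bits, so storing the 32-bit esi/edi values here leaves RSI/RDI
    // zero-extended, as the comment above notes.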
|
2001-04-10 05:04:59 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
else
|
|
|
|
#endif /* BX_CPU_LEVEL >= 3 */
|
|
|
|
{ /* 16bit address mode */
|
|
|
|
Bit16u si, di;
|
|
|
|
|
|
|
|
si = SI;
|
|
|
|
di = DI;
|
|
|
|
|
|
|
|
#if BX_CPU_LEVEL >= 3
|
2002-09-18 09:36:48 +04:00
|
|
|
if (i->os32L()) {
|
2001-04-10 05:04:59 +04:00
|
|
|
Bit32u temp32;
|
|
|
|
|
|
|
|
read_virtual_dword(seg, si, &temp32);
|
|
|
|
|
|
|
|
write_virtual_dword(BX_SEG_REG_ES, di, &temp32);
|
|
|
|
|
2002-09-12 22:10:46 +04:00
|
|
|
if (BX_CPU_THIS_PTR get_DF ()) {
|
2001-04-10 05:04:59 +04:00
|
|
|
      /* decrement SI, DI */
|
|
|
|
si -= 4;
|
|
|
|
di -= 4;
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
      /* increment SI, DI */
|
|
|
|
si += 4;
|
|
|
|
di += 4;
|
|
|
|
}
|
2002-09-18 09:36:48 +04:00
|
|
|
} /* if (i->os32L()) ... */
|
2001-04-10 05:04:59 +04:00
|
|
|
else
|
|
|
|
#endif /* BX_CPU_LEVEL >= 3 */
|
|
|
|
{ /* 16 bit opsize mode */
|
|
|
|
Bit16u temp16;
|
|
|
|
|
2002-09-02 22:44:35 +04:00
|
|
|
#if BX_SupportRepeatSpeedups
|
2002-09-02 00:12:09 +04:00
|
|
|
#if (BX_DEBUGGER == 0)
|
|
|
|
#if (defined(__i386__) && __i386__)
|
|
|
|
      /* If conditions are right, we can do the memory-to-memory transfer
|
|
|
|
* in a batch, rather than one instruction at a time.
|
|
|
|
*/
|
2002-09-18 12:00:43 +04:00
|
|
|
if (i->repUsedL() && !BX_CPU_THIS_PTR async_event) {
|
2002-09-02 00:12:09 +04:00
|
|
|
Bit32u wordCount;
|
|
|
|
|
2002-09-24 08:43:59 +04:00
|
|
|
#if BX_SUPPORT_X86_64
|
|
|
|
if (i->as64L())
|
|
|
|
        wordCount = RCX; // Truncated to 32 bits (we're only doing one page).
|
|
|
|
else
|
|
|
|
#endif
|
2002-09-18 09:36:48 +04:00
|
|
|
if (i->as32L())
|
2002-09-02 00:12:09 +04:00
|
|
|
wordCount = ECX;
|
|
|
|
else
|
|
|
|
wordCount = CX;
|
|
|
|
|
|
|
|
if (wordCount) {
|
2002-09-03 23:38:27 +04:00
|
|
|
Bit32u wordsFitSrc, wordsFitDst;
|
2002-09-02 00:12:09 +04:00
|
|
|
Bit8u *hostAddrSrc, *hostAddrDst;
|
2002-09-03 23:38:27 +04:00
|
|
|
unsigned pointerDelta;
|
|
|
|
bx_segment_reg_t *srcSegPtr, *dstSegPtr;
|
2002-09-24 08:43:59 +04:00
|
|
|
bx_address laddrDst, laddrSrc;
|
|
|
|
Bit32u paddrDst, paddrSrc;
|
2002-09-03 23:38:27 +04:00
|
|
|
|
|
|
|
srcSegPtr = &BX_CPU_THIS_PTR sregs[seg];
|
2002-10-17 02:10:07 +04:00
|
|
|
dstSegPtr = &BX_CPU_THIS_PTR sregs[BX_SEG_REG_ES];
|
2002-09-03 23:38:27 +04:00
|
|
|
|
|
|
|
// Do segment checks for the 1st word. We do not want to
|
|
|
|
// trip an exception beyond this, because the address would
|
|
|
|
// be incorrect. After we know how many bytes we will directly
|
|
|
|
// transfer, we can do the full segment limit check ourselves
|
|
|
|
// without generating an exception.
|
|
|
|
read_virtual_checks(srcSegPtr, si, 2);
|
|
|
|
laddrSrc = srcSegPtr->cache.u.segment.base + si;
|
|
|
|
if (BX_CPU_THIS_PTR cr0.pg) {
|
|
|
|
paddrSrc = dtranslate_linear(laddrSrc, CPL==3, BX_READ);
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
paddrSrc = laddrSrc;
|
|
|
|
}
|
|
|
|
// If we want to write directly into the physical memory array,
|
|
|
|
// we need the A20 address.
|
|
|
|
paddrSrc = A20ADDR(paddrSrc);
|
|
|
|
|
|
|
|
write_virtual_checks(dstSegPtr, di, 2);
|
|
|
|
laddrDst = dstSegPtr->cache.u.segment.base + di;
|
|
|
|
if (BX_CPU_THIS_PTR cr0.pg) {
|
|
|
|
paddrDst = dtranslate_linear(laddrDst, CPL==3, BX_WRITE);
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
paddrDst = laddrDst;
|
|
|
|
}
|
|
|
|
// If we want to write directly into the physical memory array,
|
|
|
|
// we need the A20 address.
|
|
|
|
paddrDst = A20ADDR(paddrDst);
|
2002-09-02 00:12:09 +04:00
|
|
|
|
2002-09-19 23:17:20 +04:00
|
|
|
hostAddrSrc = BX_CPU_THIS_PTR mem->getHostMemAddr(BX_CPU_THIS,
|
|
|
|
paddrSrc, BX_READ);
|
|
|
|
hostAddrDst = BX_CPU_THIS_PTR mem->getHostMemAddr(BX_CPU_THIS,
|
|
|
|
paddrDst, BX_WRITE);
|
2002-09-02 00:12:09 +04:00
|
|
|
|
|
|
|
if ( hostAddrSrc && hostAddrDst ) {
|
|
|
|
// See how many words can fit in the rest of this page.
|
2002-09-12 22:10:46 +04:00
|
|
|
if (BX_CPU_THIS_PTR get_DF ()) {
|
2002-09-03 23:38:27 +04:00
|
|
|
// Counting downward.
|
|
|
|
// Note: 1st word must not cross page boundary.
|
|
|
|
if ( ((paddrSrc & 0xfff) > 0xffe) ||
|
|
|
|
((paddrDst & 0xfff) > 0xffe) )
|
|
|
|
goto noAcceleration16;
|
|
|
|
wordsFitSrc = (2 + (paddrSrc & 0xfff)) >> 1;
|
|
|
|
wordsFitDst = (2 + (paddrDst & 0xfff)) >> 1;
|
|
|
|
pointerDelta = (unsigned) -2;
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
// Counting upward.
|
|
|
|
wordsFitSrc = (0x1000 - (paddrSrc & 0xfff)) >> 1;
|
|
|
|
wordsFitDst = (0x1000 - (paddrDst & 0xfff)) >> 1;
|
|
|
|
pointerDelta = 2;
|
|
|
|
}
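          // Note: pointerDelta is declared unsigned, so the (unsigned) -2
          // value above relies on 32-bit wraparound when added to the Bit8u*
          // host pointers below to step them backward by 2; this fast path is
          // only compiled for __i386__ hosts (see the #if guard above).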
|
|
|
|
// Restrict word count to the number that will fit in either
|
2002-09-02 00:12:09 +04:00
|
|
|
// source or dest pages.
|
2002-09-03 23:38:27 +04:00
|
|
|
if (wordCount > wordsFitSrc)
|
|
|
|
wordCount = wordsFitSrc;
|
|
|
|
if (wordCount > wordsFitDst)
|
|
|
|
wordCount = wordsFitDst;
|
2002-09-30 20:43:59 +04:00
|
|
|
if (wordCount > bx_pc_system.getNumCpuTicksLeftNextEvent())
|
|
|
|
wordCount = bx_pc_system.getNumCpuTicksLeftNextEvent();
|
2002-09-02 00:12:09 +04:00
|
|
|
|
|
|
|
// If after all the restrictions, there is anything left to do...
|
|
|
|
if (wordCount) {
|
|
|
|
unsigned j;
|
2002-09-03 23:38:27 +04:00
|
|
|
Bit32u srcSegLimit, dstSegLimit;
|
|
|
|
|
|
|
|
srcSegLimit = srcSegPtr->cache.u.segment.limit_scaled;
|
|
|
|
dstSegLimit = dstSegPtr->cache.u.segment.limit_scaled;
|
|
|
|
          // For 16-bit addressing mode, clamp the segment limits to 16 bits
|
|
|
|
// so we don't have to worry about computations using si/di
|
|
|
|
// rolling over 16-bit boundaries.
|
2002-09-18 09:36:48 +04:00
|
|
|
if (!i->as32L()) {
|
2002-09-03 23:38:27 +04:00
|
|
|
if (srcSegLimit > 0xffff)
|
|
|
|
srcSegLimit = 0xffff;
|
|
|
|
if (dstSegLimit > 0xffff)
|
|
|
|
dstSegLimit = 0xffff;
|
|
|
|
}
|
2002-09-02 00:12:09 +04:00
|
|
|
|
|
|
|
// Before we copy memory, we need to make sure that the segments
|
|
|
|
// allow the accesses up to the given source and dest offset. If
|
|
|
|
// the cache.valid bits have SegAccessWOK and ROK, we know that
|
|
|
|
// the cache is valid for those operations, and that the segments
|
2002-09-03 23:38:27 +04:00
|
|
|
// are non expand-down (thus we can make a simple limit check).
|
2002-09-02 00:12:09 +04:00
|
|
|
if ( !(srcSegPtr->cache.valid & SegAccessROK) ||
|
|
|
|
!(dstSegPtr->cache.valid & SegAccessWOK) ) {
|
|
|
|
goto noAcceleration16;
|
|
|
|
}
|
2002-09-24 08:43:59 +04:00
|
|
|
if ( !IsLongMode() ) {
|
|
|
|
// Now make sure transfer will fit within the constraints of the
|
|
|
|
// segment boundaries, 0..limit for non expand-down. We know
|
|
|
|
// wordCount >= 1 here.
|
|
|
|
if (BX_CPU_THIS_PTR get_DF ()) {
|
|
|
|
// Counting downward.
|
|
|
|
Bit32u minOffset = (wordCount-1) << 1;
|
|
|
|
if ( si < minOffset )
|
|
|
|
goto noAcceleration16;
|
|
|
|
if ( di < minOffset )
|
|
|
|
goto noAcceleration16;
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
// Counting upward.
|
|
|
|
Bit32u srcMaxOffset = (srcSegLimit - (wordCount<<1)) + 1;
|
|
|
|
Bit32u dstMaxOffset = (dstSegLimit - (wordCount<<1)) + 1;
|
|
|
|
if ( si > srcMaxOffset )
|
|
|
|
goto noAcceleration16;
|
|
|
|
if ( di > dstMaxOffset )
|
|
|
|
goto noAcceleration16;
|
|
|
|
}
|
2002-09-02 00:12:09 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
// Transfer data directly using host addresses.
|
|
|
|
for (j=0; j<wordCount; j++) {
|
|
|
|
* (Bit16u *) hostAddrDst = * (Bit16u *) hostAddrSrc;
|
|
|
|
hostAddrDst += pointerDelta;
|
|
|
|
hostAddrSrc += pointerDelta;
|
|
|
|
}
|
|
|
|
// Decrement the ticks count by the number of iterations, minus
|
|
|
|
// one, since the main cpu loop will decrement one. Also,
|
|
|
|
          // the count is pre-decremented before being examined, so definitely
|
|
|
|
// don't roll it under zero.
|
2002-09-03 23:38:27 +04:00
|
|
|
BX_TICKN(wordCount-1);
|
|
|
|
//bx_pc_system.num_cpu_ticks_left -= (wordCount-1);
|
2002-09-02 00:12:09 +04:00
|
|
|
|
|
|
|
// Decrement eCX. Note, the main loop will decrement 1 also, so
|
|
|
|
// decrement by one less than expected, like the case above.
|
2002-09-24 08:43:59 +04:00
|
|
|
#if BX_SUPPORT_X86_64
|
|
|
|
if (i->as64L())
|
|
|
|
RCX -= (wordCount-1);
|
|
|
|
else
|
|
|
|
#endif
|
2002-09-18 09:36:48 +04:00
|
|
|
if (i->as32L())
|
2002-09-02 00:12:09 +04:00
|
|
|
ECX -= (wordCount-1);
|
|
|
|
else
|
|
|
|
CX -= (wordCount-1);
|
|
|
|
incr = wordCount << 1; // count * 2.
|
|
|
|
goto doIncr16;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
noAcceleration16:
|
|
|
|
|
2002-09-03 23:38:27 +04:00
|
|
|
#endif // __i386__
|
|
|
|
#endif // (BX_DEBUGGER == 0)
|
|
|
|
#endif // BX_SupportRepeatSpeedups
|
|
|
|
|
2001-04-10 05:04:59 +04:00
|
|
|
read_virtual_word(seg, si, &temp16);
|
|
|
|
|
|
|
|
write_virtual_word(BX_SEG_REG_ES, di, &temp16);
|
2002-09-02 00:12:09 +04:00
|
|
|
incr = 2;
|
|
|
|
|
2002-09-03 23:38:27 +04:00
|
|
|
#if BX_SupportRepeatSpeedups
|
|
|
|
#if (BX_DEBUGGER == 0)
|
|
|
|
#if (defined(__i386__) && __i386__)
|
2002-09-02 00:12:09 +04:00
|
|
|
doIncr16:
|
2002-09-03 23:38:27 +04:00
|
|
|
#endif
|
|
|
|
#endif
|
|
|
|
#endif
|
2001-04-10 05:04:59 +04:00
|
|
|
|
2002-09-12 22:10:46 +04:00
|
|
|
if (BX_CPU_THIS_PTR get_DF ()) {
|
2001-04-10 05:04:59 +04:00
|
|
|
/* decrement SI, DI */
|
2002-09-02 00:12:09 +04:00
|
|
|
si -= incr;
|
|
|
|
di -= incr;
|
2001-04-10 05:04:59 +04:00
|
|
|
}
|
|
|
|
else {
|
|
|
|
/* increment SI, DI */
|
Integrated patches for:
- Paging code rehash. You must now use --enable-4meg-pages to
use 4Meg pages, with the default of disabled, since we don't well
support 4Meg pages yet. Paging table walks model a real CPU
more closely now, and I fixed some bugs in the old logic.
- Segment check redundancy elimination. After a segment is loaded,
reads and writes are marked when a segment type check succeeds, and
they are skipped thereafter, when possible.
- Repeated IO and memory string copy acceleration. Only some variants
of instructions are available on all platforms, word and dword
variants only on x86 for the moment due to alignment and endian issues.
This is compiled in currently with no option - I should add a configure
option.
- Added a guest linear address to host TLB. Actually, I just stick
the host address (mem.vector[addr] address) in the upper 29 bits
of the field 'combined_access' since they are unused. Convenient
for now. I'm only storing page frame addresses. This was the
simplest for of such a TLB. We can likely enhance this. Also,
I only accelerated the normal read/write routines in access.cc.
Could also modify the read-modify-write versions too. You must
use --enable-guest2host-tlb, to try this out. Currently speeds
up Win95 boot time by about 3.5% for me. More ground to cover...
- Minor mods to CPUI/MOV_CdRd for CMOV.
- Integrated enhancements from Volker to getHostMemAddr() for PCI
being enabled.
2002-09-02 00:12:09 +04:00
|
|
|
si += incr;
|
|
|
|
di += incr;
|
2001-04-10 05:04:59 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
SI = si;
|
|
|
|
DI = di;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
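// CMPSB_XbYb: compare the byte at seg:[SI/ESI/RSI] (DS unless overridden)
// with the byte at ES:[DI/EDI/RDI], set OSZAPC from the subtraction, and
// step both index registers by 1 in the direction given by EFLAGS.DF.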
  void
BX_CPU_C::CMPSB_XbYb(bxInstruction_c *i)
{
  unsigned seg;
  Bit8u op1_8, op2_8, diff_8;

  if (!BX_NULL_SEG_REG(i->seg())) {
    seg = i->seg();
    }
  else {
    seg = BX_SEG_REG_DS;
    }

#if BX_CPU_LEVEL >= 3
#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rsi, rdi;

    rsi = RSI;
    rdi = RDI;

    read_virtual_byte(seg, rsi, &op1_8);
    read_virtual_byte(BX_SEG_REG_ES, rdi, &op2_8);

    diff_8 = op1_8 - op2_8;

    SET_FLAGS_OSZAPC_8(op1_8, op2_8, diff_8, BX_INSTR_CMPS8);

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement RSI, RDI */
      rsi--;
      rdi--;
      }
    else {
      /* increment RSI, RDI */
      rsi++;
      rdi++;
      }

    RDI = rdi;
    RSI = rsi;
    }
  else
#endif // #if BX_SUPPORT_X86_64
  if (i->as32L()) {
    Bit32u esi, edi;

    esi = ESI;
    edi = EDI;

    read_virtual_byte(seg, esi, &op1_8);
    read_virtual_byte(BX_SEG_REG_ES, edi, &op2_8);

    diff_8 = op1_8 - op2_8;

    SET_FLAGS_OSZAPC_8(op1_8, op2_8, diff_8, BX_INSTR_CMPS8);

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement ESI, EDI */
      esi--;
      edi--;
      }
    else {
      /* increment ESI, EDI */
      esi++;
      edi++;
      }

    // zero extension of RSI/RDI
    RDI = edi;
    RSI = esi;
    }
  else
#endif /* BX_CPU_LEVEL >= 3 */
    { /* 16bit address mode */
    Bit16u si, di;

    si = SI;
    di = DI;

    read_virtual_byte(seg, si, &op1_8);
    read_virtual_byte(BX_SEG_REG_ES, di, &op2_8);

#if (defined(__i386__) && defined(__GNUC__) && BX_SupportHostAsms)
    Bit32u flags32;

    asmCmp8(op1_8, op2_8, flags32);
    setEFlagsOSZAPC(flags32);
#else
    diff_8 = op1_8 - op2_8;

    SET_FLAGS_OSZAPC_8(op1_8, op2_8, diff_8, BX_INSTR_CMPS8);
#endif

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement SI, DI */
      si--;
      di--;
      }
    else {
      /* increment SI, DI */
      si++;
      di++;
      }

    DI = di;
    SI = si;
    }
}

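// CMPSW_XvYv: operand-size dependent compare of the word/dword/qword at
// seg:[SI/ESI/RSI] with the one at ES:[DI/EDI/RDI]; sets OSZAPC and steps
// both index registers by 2, 4 or 8 according to EFLAGS.DF.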
  void
BX_CPU_C::CMPSW_XvYv(bxInstruction_c *i)
{
  unsigned seg;

  if (!BX_NULL_SEG_REG(i->seg())) {
    seg = i->seg();
    }
  else {
    seg = BX_SEG_REG_DS;
    }

#if BX_CPU_LEVEL >= 3
#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rsi, rdi;

    rsi = RSI;
    rdi = RDI;

    if (i->os64L()) {
      Bit64u op1_64, op2_64, diff_64;

      read_virtual_qword(seg, rsi, &op1_64);
      read_virtual_qword(BX_SEG_REG_ES, rdi, &op2_64);

      diff_64 = op1_64 - op2_64;

      SET_FLAGS_OSZAPC_64(op1_64, op2_64, diff_64, BX_INSTR_CMPS64);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RSI, RDI */
        rsi -= 8;
        rdi -= 8;
        }
      else {
        /* increment RSI, RDI */
        rsi += 8;
        rdi += 8;
        }
      }
    else
    if (i->os32L()) {
      Bit32u op1_32, op2_32, diff_32;

      read_virtual_dword(seg, rsi, &op1_32);
      read_virtual_dword(BX_SEG_REG_ES, rdi, &op2_32);

      diff_32 = op1_32 - op2_32;

      SET_FLAGS_OSZAPC_32(op1_32, op2_32, diff_32, BX_INSTR_CMPS32);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RSI, RDI */
        rsi -= 4;
        rdi -= 4;
        }
      else {
        /* increment RSI, RDI */
        rsi += 4;
        rdi += 4;
        }
      }
    else { /* 16 bit opsize */
      Bit16u op1_16, op2_16, diff_16;

      read_virtual_word(seg, rsi, &op1_16);
      read_virtual_word(BX_SEG_REG_ES, rdi, &op2_16);

      diff_16 = op1_16 - op2_16;

      SET_FLAGS_OSZAPC_16(op1_16, op2_16, diff_16, BX_INSTR_CMPS16);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RSI, RDI */
        rsi -= 2;
        rdi -= 2;
        }
      else {
        /* increment RSI, RDI */
        rsi += 2;
        rdi += 2;
        }
      }

    RDI = rdi;
    RSI = rsi;
    }
  else
#endif // #if BX_SUPPORT_X86_64
  if (i->as32L()) {
    Bit32u esi, edi;

    esi = ESI;
    edi = EDI;

#if BX_SUPPORT_X86_64
    if (i->os64L()) {
      Bit64u op1_64, op2_64, diff_64;

      read_virtual_qword(seg, esi, &op1_64);
      read_virtual_qword(BX_SEG_REG_ES, edi, &op2_64);

      diff_64 = op1_64 - op2_64;

      SET_FLAGS_OSZAPC_64(op1_64, op2_64, diff_64, BX_INSTR_CMPS64);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement ESI, EDI */
        esi -= 8;
        edi -= 8;
        }
      else {
        /* increment ESI, EDI */
        esi += 8;
        edi += 8;
        }
      }
    else
#endif // #if BX_SUPPORT_X86_64
    if (i->os32L()) {
      Bit32u op1_32, op2_32, diff_32;

      read_virtual_dword(seg, esi, &op1_32);
      read_virtual_dword(BX_SEG_REG_ES, edi, &op2_32);

      diff_32 = op1_32 - op2_32;

      SET_FLAGS_OSZAPC_32(op1_32, op2_32, diff_32, BX_INSTR_CMPS32);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement ESI, EDI */
        esi -= 4;
        edi -= 4;
        }
      else {
        /* increment ESI, EDI */
        esi += 4;
        edi += 4;
        }
      }
    else { /* 16 bit opsize */
      Bit16u op1_16, op2_16;

      read_virtual_word(seg, esi, &op1_16);
      read_virtual_word(BX_SEG_REG_ES, edi, &op2_16);

#if (defined(__i386__) && defined(__GNUC__) && BX_SupportHostAsms)
      Bit32u flags32;

      asmCmp16(op1_16, op2_16, flags32);
      setEFlagsOSZAPC(flags32);
#else
      Bit16u diff_16;

      diff_16 = op1_16 - op2_16;

      SET_FLAGS_OSZAPC_16(op1_16, op2_16, diff_16, BX_INSTR_CMPS16);
#endif

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement ESI, EDI */
        esi -= 2;
        edi -= 2;
        }
      else {
        /* increment ESI, EDI */
        esi += 2;
        edi += 2;
        }
      }

    // zero extension of RSI/RDI
    RDI = edi;
    RSI = esi;
    }
  else
#endif /* BX_CPU_LEVEL >= 3 */
    { /* 16 bit address mode */
    Bit16u si, di;

    si = SI;
    di = DI;

#if BX_CPU_LEVEL >= 3
    if (i->os32L()) {
      Bit32u op1_32, op2_32, diff_32;

      read_virtual_dword(seg, si, &op1_32);
      read_virtual_dword(BX_SEG_REG_ES, di, &op2_32);

      diff_32 = op1_32 - op2_32;

      SET_FLAGS_OSZAPC_32(op1_32, op2_32, diff_32, BX_INSTR_CMPS32);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement SI, DI */
        si -= 4;
        di -= 4;
        }
      else {
        /* increment SI, DI */
        si += 4;
        di += 4;
        }
      }
    else
#endif /* BX_CPU_LEVEL >= 3 */
      { /* 16 bit opsize */
      Bit16u op1_16, op2_16;

      read_virtual_word(seg, si, &op1_16);
      read_virtual_word(BX_SEG_REG_ES, di, &op2_16);

#if (defined(__i386__) && defined(__GNUC__) && BX_SupportHostAsms)
      Bit32u flags32;

      asmCmp16(op1_16, op2_16, flags32);
      setEFlagsOSZAPC(flags32);
#else
      Bit16u diff_16;

      diff_16 = op1_16 - op2_16;

      SET_FLAGS_OSZAPC_16(op1_16, op2_16, diff_16, BX_INSTR_CMPS16);
#endif

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement SI, DI */
        si -= 2;
        di -= 2;
        }
      else {
        /* increment SI, DI */
        si += 2;
        di += 2;
        }
      }

    DI = di;
    SI = si;
    }
}

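// SCASB_ALXb: compare AL with the byte at ES:[DI/EDI/RDI], set OSZAPC,
// and step the destination index by 1 according to EFLAGS.DF.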
  void
BX_CPU_C::SCASB_ALXb(bxInstruction_c *i)
{
  Bit8u op1_8, op2_8;

#if BX_CPU_LEVEL >= 3
#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rdi;
    Bit8u diff_8;

    rdi = RDI;

    op1_8 = AL;

    read_virtual_byte(BX_SEG_REG_ES, rdi, &op2_8);

    diff_8 = op1_8 - op2_8;

    SET_FLAGS_OSZAPC_8(op1_8, op2_8, diff_8, BX_INSTR_SCAS8);

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement RDI */
      rdi--;
      }
    else {
      /* increment RDI */
      rdi++;
      }

    RDI = rdi;
    }
  else
#endif // #if BX_SUPPORT_X86_64
  if (i->as32L()) {
    Bit32u edi;

    edi = EDI;

    op1_8 = AL;

    read_virtual_byte(BX_SEG_REG_ES, edi, &op2_8);

#if (defined(__i386__) && defined(__GNUC__) && BX_SupportHostAsms)
    Bit32u flags32;

    asmCmp8(op1_8, op2_8, flags32);
    setEFlagsOSZAPC(flags32);
#else
    Bit8u diff_8;

    diff_8 = op1_8 - op2_8;

    SET_FLAGS_OSZAPC_8(op1_8, op2_8, diff_8, BX_INSTR_SCAS8);
#endif

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement EDI */
      edi--;
      }
    else {
      /* increment EDI */
      edi++;
      }

    // zero extension of RDI
    RDI = edi;
    }
  else
#endif /* BX_CPU_LEVEL >= 3 */
    { /* 16bit address mode */
    Bit16u di;

    di = DI;

    op1_8 = AL;

    read_virtual_byte(BX_SEG_REG_ES, di, &op2_8);

#if (defined(__i386__) && defined(__GNUC__) && BX_SupportHostAsms)
    Bit32u flags32;

    asmCmp8(op1_8, op2_8, flags32);
    setEFlagsOSZAPC(flags32);
#else
    Bit8u diff_8;

    diff_8 = op1_8 - op2_8;

    SET_FLAGS_OSZAPC_8(op1_8, op2_8, diff_8, BX_INSTR_SCAS8);
#endif

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement DI */
      di--;
      }
    else {
      /* increment DI */
      di++;
      }

    DI = di;
    }
}

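// SCASW_eAXXv: compare AX/EAX/RAX with the word/dword/qword at
// ES:[DI/EDI/RDI], set OSZAPC, and step the destination index by
// 2, 4 or 8 according to EFLAGS.DF.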
  void
BX_CPU_C::SCASW_eAXXv(bxInstruction_c *i)
{
#if BX_CPU_LEVEL >= 3
#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rdi;

    rdi = RDI;

    if (i->os64L()) {
      Bit64u op1_64, op2_64, diff_64;

      op1_64 = RAX;
      read_virtual_qword(BX_SEG_REG_ES, rdi, &op2_64);

      diff_64 = op1_64 - op2_64;

      SET_FLAGS_OSZAPC_64(op1_64, op2_64, diff_64, BX_INSTR_SCAS64);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RDI */
        rdi -= 8;
        }
      else {
        /* increment RDI */
        rdi += 8;
        }
      }
    else
    if (i->os32L()) {
      Bit32u op1_32, op2_32, diff_32;

      op1_32 = EAX;
      read_virtual_dword(BX_SEG_REG_ES, rdi, &op2_32);

      diff_32 = op1_32 - op2_32;

      SET_FLAGS_OSZAPC_32(op1_32, op2_32, diff_32, BX_INSTR_SCAS32);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RDI */
        rdi -= 4;
        }
      else {
        /* increment RDI */
        rdi += 4;
        }
      }
    else { /* 16 bit opsize */
      Bit16u op1_16, op2_16, diff_16;

      op1_16 = AX;
      read_virtual_word(BX_SEG_REG_ES, rdi, &op2_16);

      diff_16 = op1_16 - op2_16;

      SET_FLAGS_OSZAPC_16(op1_16, op2_16, diff_16, BX_INSTR_SCAS16);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RDI */
        rdi -= 2;
        }
      else {
        /* increment RDI */
        rdi += 2;
        }
      }

    RDI = rdi;
    }
  else
#endif // #if BX_SUPPORT_X86_64
  if (i->as32L()) {
    Bit32u edi;

    edi = EDI;

#if BX_SUPPORT_X86_64
    if (i->os64L()) {
      Bit64u op1_64, op2_64, diff_64;

      op1_64 = RAX;
      read_virtual_qword(BX_SEG_REG_ES, edi, &op2_64);

      diff_64 = op1_64 - op2_64;

      SET_FLAGS_OSZAPC_64(op1_64, op2_64, diff_64, BX_INSTR_SCAS64);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement EDI */
        edi -= 8;
        }
      else {
        /* increment EDI */
        edi += 8;
        }
      }
    else
#endif // #if BX_SUPPORT_X86_64
    if (i->os32L()) {
      Bit32u op1_32, op2_32, diff_32;

      op1_32 = EAX;
      read_virtual_dword(BX_SEG_REG_ES, edi, &op2_32);

      diff_32 = op1_32 - op2_32;

      SET_FLAGS_OSZAPC_32(op1_32, op2_32, diff_32, BX_INSTR_SCAS32);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement EDI */
        edi -= 4;
        }
      else {
        /* increment EDI */
        edi += 4;
        }
      }
    else { /* 16 bit opsize */
      Bit16u op1_16, op2_16;

      op1_16 = AX;
      read_virtual_word(BX_SEG_REG_ES, edi, &op2_16);

#if (defined(__i386__) && defined(__GNUC__) && BX_SupportHostAsms)
      Bit32u flags32;

      asmCmp16(op1_16, op2_16, flags32);
      setEFlagsOSZAPC(flags32);
#else
      Bit16u diff_16;

      diff_16 = op1_16 - op2_16;

      SET_FLAGS_OSZAPC_16(op1_16, op2_16, diff_16, BX_INSTR_SCAS16);
#endif

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement EDI */
        edi -= 2;
        }
      else {
        /* increment EDI */
        edi += 2;
        }
      }

    // zero extension of RDI
    RDI = edi;
    }
  else
#endif /* BX_CPU_LEVEL >= 3 */
    { /* 16bit address mode */
    Bit16u di;

    di = DI;

#if BX_CPU_LEVEL >= 3
    if (i->os32L()) {
      Bit32u op1_32, op2_32, diff_32;

      op1_32 = EAX;
      read_virtual_dword(BX_SEG_REG_ES, di, &op2_32);

      diff_32 = op1_32 - op2_32;

      SET_FLAGS_OSZAPC_32(op1_32, op2_32, diff_32, BX_INSTR_SCAS32);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement DI */
        di -= 4;
        }
      else {
        /* increment DI */
        di += 4;
        }
      }
    else
#endif /* BX_CPU_LEVEL >= 3 */
      { /* 16 bit opsize */
      Bit16u op1_16, op2_16;

      op1_16 = AX;
      read_virtual_word(BX_SEG_REG_ES, di, &op2_16);

#if (defined(__i386__) && defined(__GNUC__) && BX_SupportHostAsms)
      Bit32u flags32;

      asmCmp16(op1_16, op2_16, flags32);
      setEFlagsOSZAPC(flags32);
#else
      Bit16u diff_16;

      diff_16 = op1_16 - op2_16;

      SET_FLAGS_OSZAPC_16(op1_16, op2_16, diff_16, BX_INSTR_SCAS16);
#endif

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement DI */
        di -= 2;
        }
      else {
        /* increment DI */
        di += 2;
        }
      }

    DI = di;
    }
}

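// STOSB_YbAL: store AL to ES:[DI/EDI/RDI] and step the destination index
// by 1 according to EFLAGS.DF.  For repeated forms in the legacy-address
// paths below, the code tries to batch the stores directly into host
// memory for the remainder of the current page, and falls back to the
// one-byte-per-iteration path when segment, paging or debugger conditions
// do not allow it.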
  void
BX_CPU_C::STOSB_YbAL(bxInstruction_c *i)
{
  Bit8u al;

#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rdi;

    rdi = RDI;

    al = AL;
    write_virtual_byte(BX_SEG_REG_ES, rdi, &al);

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement RDI */
      rdi--;
      }
    else {
      /* increment RDI */
      rdi++;
      }

    RDI = rdi;
    }
  else
#endif // #if BX_SUPPORT_X86_64
    {
    Bit32u edi;
    unsigned incr;

#if BX_CPU_LEVEL >= 3
    if (i->as32L()) {
      edi = EDI;
      }
    else
#endif /* BX_CPU_LEVEL >= 3 */
      { /* 16bit address size */
      edi = DI;
      }

    al = AL;

#if BX_SupportRepeatSpeedups
#if (BX_DEBUGGER == 0)
    /* If conditions are right, we can transfer IO to physical memory
     * in a batch, rather than one instruction at a time.
     */
    if (i->repUsedL() && !BX_CPU_THIS_PTR async_event) {
      Bit32u byteCount;

#if BX_SUPPORT_X86_64
      if (i->as64L())
        byteCount = RCX; // Truncated to 32bits. (we're only doing 1 page)
      else
#endif
      if (i->as32L())
        byteCount = ECX;
      else
        byteCount = CX;

      if (byteCount) {
        Bit32u bytesFitDst;
        Bit8u *hostAddrDst;
        unsigned pointerDelta;
        bx_segment_reg_t *dstSegPtr;
        bx_address laddrDst;
        Bit32u paddrDst;

        dstSegPtr = &BX_CPU_THIS_PTR sregs[BX_SEG_REG_ES];

        // Do segment checks for the 1st word.  We do not want to
        // trip an exception beyond this, because the address would
        // be incorrect.  After we know how many bytes we will directly
        // transfer, we can do the full segment limit check ourselves
        // without generating an exception.
        write_virtual_checks(dstSegPtr, edi, 1);
        laddrDst = dstSegPtr->cache.u.segment.base + edi;
        if (BX_CPU_THIS_PTR cr0.pg) {
          paddrDst = dtranslate_linear(laddrDst, CPL==3, BX_WRITE);
          }
        else {
          paddrDst = laddrDst;
          }
        // If we want to write directly into the physical memory array,
        // we need the A20 address.
        paddrDst = A20ADDR(paddrDst);

        hostAddrDst = BX_CPU_THIS_PTR mem->getHostMemAddr(BX_CPU_THIS,
            paddrDst, BX_WRITE);

        if ( hostAddrDst ) {
          // See how many bytes can fit in the rest of this page.
          if (BX_CPU_THIS_PTR get_DF ()) {
            // Counting downward.
            bytesFitDst = 1 + (paddrDst & 0xfff);
            pointerDelta = (unsigned) -1;
            }
          else {
            // Counting upward.
            bytesFitDst = (0x1000 - (paddrDst & 0xfff));
            pointerDelta = 1;
            }
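          // Illustrative values (not from the original source): with DF=0
          // and paddrDst = 0x12ffd, (paddrDst & 0xfff) is 0xffd, so
          // bytesFitDst = 0x1000 - 0xffd = 3, i.e. only 3 more bytes fit on
          // this physical page.  With DF=1 the stores run downward, so
          // bytesFitDst = 1 + 0xffd = 0xffe bytes can be written.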
          // Restrict count to the number that will fit in either
          // source or dest pages.
          if (byteCount > bytesFitDst)
            byteCount = bytesFitDst;
          if (byteCount > bx_pc_system.getNumCpuTicksLeftNextEvent())
            byteCount = bx_pc_system.getNumCpuTicksLeftNextEvent();

          // If after all the restrictions, there is anything left to do...
          if (byteCount) {
            unsigned j;
            Bit32u dstSegLimit;

            dstSegLimit = dstSegPtr->cache.u.segment.limit_scaled;

            // For 16-bit addressing mode, clamp the segment limits to 16bits
            // so we don't have to worry about computations using si/di
            // rolling over 16-bit boundaries.
            if (!i->as32L()) {
              if (dstSegLimit > 0xffff)
                dstSegLimit = 0xffff;
              }

            // Before we copy memory, we need to make sure that the segments
            // allow the accesses up to the given source and dest offset.  If
            // the cache.valid bits have SegAccessWOK and ROK, we know that
            // the cache is valid for those operations, and that the segments
            // are non expand-down (thus we can make a simple limit check).
            if ( !(dstSegPtr->cache.valid & SegAccessWOK) ) {
              goto noAcceleration16;
              }
            if ( !IsLongMode() ) {
              // Now make sure transfer will fit within the constraints of the
              // segment boundaries, 0..limit for non expand-down.  We know
              // byteCount >= 1 here.
              if (BX_CPU_THIS_PTR get_DF ()) {
                // Counting downward.
                Bit32u minOffset = (byteCount-1);
                if ( edi < minOffset )
                  goto noAcceleration16;
                }
              else {
                // Counting upward.
                Bit32u dstMaxOffset = (dstSegLimit - byteCount) + 1;
                if ( edi > dstMaxOffset )
                  goto noAcceleration16;
                }
              }
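            // Illustrative values (not from the original source) for the
            // limit check above: counting upward with dstSegLimit = 0xffff
            // and byteCount = 16 gives dstMaxOffset = 0xfff0, so the batch
            // is accepted only while edi <= 0xfff0; otherwise we fall back
            // to the unaccelerated path at noAcceleration16.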
|
|
|
|
|
|
|
|
// Transfer data directly using host addresses.
|
|
|
|
for (j=0; j<byteCount; j++) {
|
|
|
|
* (Bit8u *) hostAddrDst = al;
|
|
|
|
hostAddrDst += pointerDelta;
|
|
|
|
}
|
|
|
|
// Decrement the ticks count by the number of iterations, minus
|
|
|
|
// one, since the main cpu loop will decrement one. Also,
|
|
|
|
// the count is predecremented before examined, so defintely
|
|
|
|
// don't roll it under zero.
|
2002-09-03 23:38:27 +04:00
|
|
|
BX_TICKN(byteCount-1);
|
|
|
|
//bx_pc_system.num_cpu_ticks_left -= (byteCount-1);
|
Integrated patches for:
- Paging code rehash. You must now use --enable-4meg-pages to
use 4Meg pages, with the default of disabled, since we don't well
support 4Meg pages yet. Paging table walks model a real CPU
more closely now, and I fixed some bugs in the old logic.
- Segment check redundancy elimination. After a segment is loaded,
reads and writes are marked when a segment type check succeeds, and
they are skipped thereafter, when possible.
- Repeated IO and memory string copy acceleration. Only some variants
of instructions are available on all platforms, word and dword
variants only on x86 for the moment due to alignment and endian issues.
This is compiled in currently with no option - I should add a configure
option.
- Added a guest linear address to host TLB. Actually, I just stick
the host address (mem.vector[addr] address) in the upper 29 bits
of the field 'combined_access' since they are unused. Convenient
for now. I'm only storing page frame addresses. This was the
simplest for of such a TLB. We can likely enhance this. Also,
I only accelerated the normal read/write routines in access.cc.
Could also modify the read-modify-write versions too. You must
use --enable-guest2host-tlb, to try this out. Currently speeds
up Win95 boot time by about 3.5% for me. More ground to cover...
- Minor mods to CPUI/MOV_CdRd for CMOV.
- Integrated enhancements from Volker to getHostMemAddr() for PCI
being enabled.
2002-09-02 00:12:09 +04:00
|
|
|
|
|
|
|
// Decrement eCX. Note, the main loop will decrement 1 also, so
|
|
|
|
// decrement by one less than expected, like the case above.
|
2002-09-24 08:43:59 +04:00
|
|
|
#if BX_SUPPORT_X86_64
|
|
|
|
if (i->as64L())
|
|
|
|
RCX -= (byteCount-1);
|
|
|
|
else
|
|
|
|
#endif
|
2002-09-18 09:36:48 +04:00
|
|
|
if (i->as32L())
|
Integrated patches for:
- Paging code rehash. You must now use --enable-4meg-pages to
use 4Meg pages, with the default of disabled, since we don't well
support 4Meg pages yet. Paging table walks model a real CPU
more closely now, and I fixed some bugs in the old logic.
- Segment check redundancy elimination. After a segment is loaded,
reads and writes are marked when a segment type check succeeds, and
they are skipped thereafter, when possible.
- Repeated IO and memory string copy acceleration. Only some variants
of instructions are available on all platforms, word and dword
variants only on x86 for the moment due to alignment and endian issues.
This is compiled in currently with no option - I should add a configure
option.
- Added a guest linear address to host TLB. Actually, I just stick
the host address (mem.vector[addr] address) in the upper 29 bits
of the field 'combined_access' since they are unused. Convenient
for now. I'm only storing page frame addresses. This was the
simplest for of such a TLB. We can likely enhance this. Also,
I only accelerated the normal read/write routines in access.cc.
Could also modify the read-modify-write versions too. You must
use --enable-guest2host-tlb, to try this out. Currently speeds
up Win95 boot time by about 3.5% for me. More ground to cover...
- Minor mods to CPUI/MOV_CdRd for CMOV.
- Integrated enhancements from Volker to getHostMemAddr() for PCI
being enabled.
2002-09-02 00:12:09 +04:00
|
|
|
ECX -= (byteCount-1);
|
|
|
|
else
|
|
|
|
CX -= (byteCount-1);
|
|
|
|
incr = byteCount;
|
|
|
|
goto doIncr16;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2001-04-10 05:04:59 +04:00
|
|
|
|
Integrated patches for:
- Paging code rehash. You must now use --enable-4meg-pages to
use 4Meg pages, with the default of disabled, since we don't well
support 4Meg pages yet. Paging table walks model a real CPU
more closely now, and I fixed some bugs in the old logic.
- Segment check redundancy elimination. After a segment is loaded,
reads and writes are marked when a segment type check succeeds, and
they are skipped thereafter, when possible.
- Repeated IO and memory string copy acceleration. Only some variants
of instructions are available on all platforms, word and dword
variants only on x86 for the moment due to alignment and endian issues.
This is compiled in currently with no option - I should add a configure
option.
- Added a guest linear address to host TLB. Actually, I just stick
the host address (mem.vector[addr] address) in the upper 29 bits
of the field 'combined_access' since they are unused. Convenient
for now. I'm only storing page frame addresses. This was the
simplest for of such a TLB. We can likely enhance this. Also,
I only accelerated the normal read/write routines in access.cc.
Could also modify the read-modify-write versions too. You must
use --enable-guest2host-tlb, to try this out. Currently speeds
up Win95 boot time by about 3.5% for me. More ground to cover...
- Minor mods to CPUI/MOV_CdRd for CMOV.
- Integrated enhancements from Volker to getHostMemAddr() for PCI
being enabled.
2002-09-02 00:12:09 +04:00
|
|
|
noAcceleration16:
|
|
|
|
|
2002-09-03 23:38:27 +04:00
|
|
|
#endif // (BX_DEBUGGER == 0)
|
|
|
|
#endif // BX_SupportRepeatSpeedups
|
|
|
|
|
Integrated patches for:
- Paging code rehash. You must now use --enable-4meg-pages to
use 4Meg pages, with the default of disabled, since we don't well
support 4Meg pages yet. Paging table walks model a real CPU
more closely now, and I fixed some bugs in the old logic.
- Segment check redundancy elimination. After a segment is loaded,
reads and writes are marked when a segment type check succeeds, and
they are skipped thereafter, when possible.
- Repeated IO and memory string copy acceleration. Only some variants
of instructions are available on all platforms, word and dword
variants only on x86 for the moment due to alignment and endian issues.
This is compiled in currently with no option - I should add a configure
option.
- Added a guest linear address to host TLB. Actually, I just stick
the host address (mem.vector[addr] address) in the upper 29 bits
of the field 'combined_access' since they are unused. Convenient
for now. I'm only storing page frame addresses. This was the
simplest for of such a TLB. We can likely enhance this. Also,
I only accelerated the normal read/write routines in access.cc.
Could also modify the read-modify-write versions too. You must
use --enable-guest2host-tlb, to try this out. Currently speeds
up Win95 boot time by about 3.5% for me. More ground to cover...
- Minor mods to CPUI/MOV_CdRd for CMOV.
- Integrated enhancements from Volker to getHostMemAddr() for PCI
being enabled.
      write_virtual_byte(BX_SEG_REG_ES, edi, &al);

      incr = 1;

#if BX_SupportRepeatSpeedups
#if (BX_DEBUGGER == 0)
doIncr16:
#endif
#endif

    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement EDI */
      edi -= incr;
    }
    else {
      /* increment EDI */
      edi += incr;
    }

#if BX_CPU_LEVEL >= 3
    if (i->as32L())
      // zero extension of RDI
      RDI = edi;
    else
#endif
      DI = edi;
  }
}
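
/*
  STOSW_YveAX: store AX, EAX or RAX (chosen by the effective operand size)
  to ES:DI/EDI/RDI, then advance the destination index by 2, 4 or 8 bytes,
  downward if EFLAGS.DF is set, upward otherwise.  The outer branches select
  the effective address size; in 32-bit address mode the updated index is
  zero-extended into RDI.
*/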
void
BX_CPU_C::STOSW_YveAX(bxInstruction_c *i)
{
#if BX_CPU_LEVEL >= 3

#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rdi;

    rdi = RDI;

    if (i->os64L()) {
      Bit64u rax;

      rax = RAX;
      write_virtual_qword(BX_SEG_REG_ES, rdi, &rax);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RDI */
        rdi -= 8;
      }
      else {
        /* increment RDI */
        rdi += 8;
      }
    } /* if (i->os64L()) ... */
    else
    if (i->os32L()) {
      Bit32u eax;

      eax = EAX;
      write_virtual_dword(BX_SEG_REG_ES, rdi, &eax);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RDI */
        rdi -= 4;
      }
      else {
        /* increment RDI */
        rdi += 4;
      }
    } /* if (i->os32L()) ... */
    else { /* 16 bit opsize mode */
      Bit16u ax;

      ax = AX;
      write_virtual_word(BX_SEG_REG_ES, rdi, &ax);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RDI */
        rdi -= 2;
      }
      else {
        /* increment RDI */
        rdi += 2;
      }
    }

    RDI = rdi;
  }
  else
#endif // #if BX_SUPPORT_X86_64
  if (i->as32L()) {
    Bit32u edi;

    edi = EDI;

#if BX_SUPPORT_X86_64
    if (i->os64L()) {
      Bit64u rax;

      rax = RAX;
      write_virtual_qword(BX_SEG_REG_ES, edi, &rax);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement EDI */
        edi -= 8;
      }
      else {
        /* increment EDI */
        edi += 8;
      }
    } /* if (i->os64L()) ... */
    else
#endif // #if BX_SUPPORT_X86_64
    if (i->os32L()) {
      Bit32u eax;

      eax = EAX;
      write_virtual_dword(BX_SEG_REG_ES, edi, &eax);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement EDI */
        edi -= 4;
      }
      else {
        /* increment EDI */
        edi += 4;
      }
    } /* if (i->os32L()) ... */
    else { /* 16 bit opsize mode */
      Bit16u ax;

      ax = AX;
      write_virtual_word(BX_SEG_REG_ES, edi, &ax);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement EDI */
        edi -= 2;
      }
      else {
        /* increment EDI */
        edi += 2;
      }
    }

    // zero extension of RDI
    RDI = edi;
  }
  else
#endif /* BX_CPU_LEVEL >= 3 */
  { /* 16bit address size */
    Bit16u di;

    di = DI;

#if BX_CPU_LEVEL >= 3
    if (i->os32L()) {
      Bit32u eax;

      eax = EAX;
      write_virtual_dword(BX_SEG_REG_ES, di, &eax);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement DI */
        di -= 4;
      }
      else {
        /* increment DI */
        di += 4;
      }
    } /* if (i->os32L()) ... */
    else
#endif /* BX_CPU_LEVEL >= 3 */
    { /* 16 bit opsize mode */
      Bit16u ax;

      ax = AX;
      write_virtual_word(BX_SEG_REG_ES, di, &ax);

      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement DI */
        di -= 2;
      }
      else {
        /* increment DI */
        di += 2;
      }
    }

    DI = di;
  }
}
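
/*
  LODSB_ALXb: load AL from seg:SI/ESI/RSI, where seg is DS unless a segment
  override prefix is present, then advance the source index by one byte
  according to EFLAGS.DF.  In 32-bit address mode the updated index is
  zero-extended into RSI.
*/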
void
BX_CPU_C::LODSB_ALXb(bxInstruction_c *i)
{
  unsigned seg;
  Bit8u al;

  if (!BX_NULL_SEG_REG(i->seg())) {
    seg = i->seg();
  }
  else {
    seg = BX_SEG_REG_DS;
  }

#if BX_CPU_LEVEL >= 3
#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rsi;

    rsi = RSI;

    read_virtual_byte(seg, rsi, &al);

    AL = al;
    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement RSI */
      rsi--;
    }
    else {
      /* increment RSI */
      rsi++;
    }

    RSI = rsi;
  }
  else
#endif // #if BX_SUPPORT_X86_64
  if (i->as32L()) {
    Bit32u esi;

    esi = ESI;

    read_virtual_byte(seg, esi, &al);

    AL = al;
    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement ESI */
      esi--;
    }
    else {
      /* increment ESI */
      esi++;
    }

    // zero extension of RSI
    RSI = esi;
  }
  else
#endif /* BX_CPU_LEVEL >= 3 */
  { /* 16bit address mode */
    Bit16u si;

    si = SI;

    read_virtual_byte(seg, si, &al);

    AL = al;
    if (BX_CPU_THIS_PTR get_DF ()) {
      /* decrement SI */
      si--;
    }
    else {
      /* increment SI */
      si++;
    }

    SI = si;
  }
}
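
/*
  LODSW_eAXXv: load AX, EAX or RAX (chosen by the effective operand size)
  from seg:SI/ESI/RSI, where seg is DS unless a segment override prefix is
  present, then advance the source index by 2, 4 or 8 bytes according to
  EFLAGS.DF.  In 32-bit address mode the updated index is zero-extended
  into RSI.
*/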
void
BX_CPU_C::LODSW_eAXXv(bxInstruction_c *i)
{
  unsigned seg;

  if (!BX_NULL_SEG_REG(i->seg())) {
    seg = i->seg();
  }
  else {
    seg = BX_SEG_REG_DS;
  }

#if BX_CPU_LEVEL >= 3
#if BX_SUPPORT_X86_64
  if (i->as64L()) {
    Bit64u rsi;

    rsi = RSI;

    if (i->os64L()) {
      Bit64u rax;

      read_virtual_qword(seg, rsi, &rax);

      RAX = rax;
      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RSI */
        rsi -= 8;
      }
      else {
        /* increment RSI */
        rsi += 8;
      }
    } /* if (i->os64L()) ... */
    else
    if (i->os32L()) {
      Bit32u eax;

      read_virtual_dword(seg, rsi, &eax);

      RAX = eax;
      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RSI */
        rsi -= 4;
      }
      else {
        /* increment RSI */
        rsi += 4;
      }
    } /* if (i->os32L()) ... */
    else { /* 16 bit opsize mode */
      Bit16u ax;

      read_virtual_word(seg, rsi, &ax);

      AX = ax;
      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement RSI */
        rsi -= 2;
      }
      else {
        /* increment RSI */
        rsi += 2;
      }
    }

    RSI = rsi;
  }
  else
#endif // #if BX_SUPPORT_X86_64
  if (i->as32L()) {
    Bit32u esi;

    esi = ESI;

#if BX_SUPPORT_X86_64
    if (i->os64L()) {
      Bit64u rax;

      read_virtual_qword(seg, esi, &rax);

      RAX = rax;
      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement ESI */
        esi -= 8;
      }
      else {
        /* increment ESI */
        esi += 8;
      }
    } /* if (i->os64L()) ... */
    else
#endif // #if BX_SUPPORT_X86_64
    if (i->os32L()) {
      Bit32u eax;

      read_virtual_dword(seg, esi, &eax);

      RAX = eax;
      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement ESI */
        esi -= 4;
      }
      else {
        /* increment ESI */
        esi += 4;
      }
    } /* if (i->os32L()) ... */
    else { /* 16 bit opsize mode */
      Bit16u ax;

      read_virtual_word(seg, esi, &ax);

      AX = ax;
      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement ESI */
        esi -= 2;
      }
      else {
        /* increment ESI */
        esi += 2;
      }
    }

    // zero extension of RSI
    RSI = esi;
  }
  else
#endif /* BX_CPU_LEVEL >= 3 */
  { /* 16bit address mode */
    Bit16u si;

    si = SI;

#if BX_CPU_LEVEL >= 3
    if (i->os32L()) {
      Bit32u eax;

      read_virtual_dword(seg, si, &eax);

      RAX = eax;
      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement SI */
        si -= 4;
      }
      else {
        /* increment SI */
        si += 4;
      }
    }
    else
#endif /* BX_CPU_LEVEL >= 3 */
    { /* 16 bit opsize mode */
      Bit16u ax;

      read_virtual_word(seg, si, &ax);

      AX = ax;
      if (BX_CPU_THIS_PTR get_DF ()) {
        /* decrement SI */
        si -= 2;
      }
      else {
        /* increment SI */
        si += 2;
      }
    }

    SI = si;
  }
}