From 4.4-Lite.
parent
ba346cac3f
commit
ef8c1b6829
|
@ -0,0 +1,225 @@
|
|||
NOTE: this description applies to the hp300 system with the old BSD
|
||||
virtual memory system. It has not been updated to reflect the new,
|
||||
Mach-derived VM system, but should still be useful.
|
||||
The new system has no fixed-address "u.", but has a fixed mapping
|
||||
for the kernel stack at 0xfff00000.
|
||||
|
||||
--------------------------------------------------------------------------
|
||||
|
||||
Some quick notes on the HPBSD VM layout and kernel debugging.
|
||||
|
||||
Physical memory:
|
||||
|
||||
Physical memory always ends at the top of the 32 bit address space; i.e. the
last addressable byte is at 0xFFFFFFFF. Hence, the start of physical memory
varies depending on how much memory is installed. The kernel variable "lowram"
contains the starting location of memory as provided by the ROM.
|
||||
|
||||
The low 128k (I think) of the physical address space is occupied by the ROM.
|
||||
This is accessible via /dev/mem *only* if the kernel is compiled with DEBUG.
|
||||
[ Maybe it should always be accessible? ]
|
||||
|
||||
Virtual address spaces:
|
||||
|
||||
The hardware page size is 4096 bytes. The hardware uses a two-level lookup.
|
||||
At the highest level is a one page segment table which maps a page table which
|
||||
maps the address space. Each 4 byte segment table entry (described in
|
||||
hp300/pte.h) contains the page number of a single page of 4 byte page table
|
||||
entries. Each PTE maps a single page of address space. Hence, each STE maps
|
||||
4Mb of address space and one page containing 1024 STEs is adequate to map the
|
||||
entire 4Gb address space.
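The two-level lookup amounts to cutting a 32 bit VA into a 10 bit segment
table index, a 10 bit page table index and a 12 bit page offset. Here is a
rough C sketch of that split. The names (va_parts, va_split and the
constants) are made up for illustration and are not taken from hp300/pte.h.

```c
#include <stdint.h>

/* Hypothetical names; the widths follow from 4096-byte pages and
 * 1024 four-byte entries per table page. */
#define PAGE_SHIFT	12		/* 4096-byte pages */
#define SEG_SHIFT	22		/* each STE maps 4Mb */
#define INDEX_MASK	0x3ffu		/* 1024 entries per table page */
#define OFFSET_MASK	0xfffu

struct va_parts {
	uint32_t ste_index;	/* which segment table entry */
	uint32_t pte_index;	/* which PTE within that page table page */
	uint32_t offset;	/* byte offset within the page */
};

static struct va_parts
va_split(uint32_t va)
{
	struct va_parts p;

	p.ste_index = (va >> SEG_SHIFT) & INDEX_MASK;
	p.pte_index = (va >> PAGE_SHIFT) & INDEX_MASK;
	p.offset = va & OFFSET_MASK;
	return p;
}
```

For instance, va_split(0xFFF00123) yields STE 1023, PTE 768 and offset 0x123.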
|
||||
|
||||
Both page and segment table entries look similar. Both have the page frame
|
||||
in the upper part and control bits in the lower. This is the opposite of
|
||||
the VAX. It is easy to convert the page frame number in an STE/PTE to a
physical address: simply mask out the low 12 bits. For example,
if a PTE contains 0xFF880019, the physical memory location mapped starts at
0xFF880000.
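The same conversion in C, as a small sketch. PG_FRAME_MASK and the function
names are invented here; only the 12 bit split comes from the text.

```c
#include <stdint.h>

#define PG_FRAME_MASK	0xfffff000u	/* hypothetical name: upper 20 bits */

/* Physical address of the page named by a PTE (or of the page table
 * page named by an STE): drop the 12 control bits. */
static uint32_t
entry_to_pa(uint32_t entry)
{
	return entry & PG_FRAME_MASK;
}

/* The page frame number itself, i.e. the physical address >> 12. */
static uint32_t
entry_to_pfn(uint32_t entry)
{
	return entry >> 12;
}
```

Going the other way, "tacking 3 zeros on" to a page frame number is just
pfn << 12.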
|
||||
|
||||
Kernel address space:
|
||||
|
||||
The kernel resides in its own virtual address space independent of all user
|
||||
processes. When the processor is in supervisor mode (i.e. interrupt or
|
||||
exception handling) it uses the kernel virtual mapping. The kernel segment
|
||||
table is called Sysseg and is allocated statically in hp300/locore.s. The
|
||||
kernel page table is called Systab and is also allocated statically in
|
||||
hp300/locore.s and consists of the usual assortment of SYSMAPs.
|
||||
The size of Systab (Syssize) depends on the configured size of the various
|
||||
maps but as currently configured is 9216 PTEs. Both segment and page tables
|
||||
are initialized at bootup in hp300/locore.s. The segment table never changes
|
||||
(except for bits maintained by the hardware). Portions of the page table
|
||||
change as needed. The kernel is mapped into the address space starting at 0.
|
||||
|
||||
Theoretically, any address in the range 0 to Syssize * 4096 (0x2400000 as
|
||||
currently configured) is valid. However, certain addresses are more common
|
||||
in dumps than others. Those are (for the current configuration):
|
||||
|
||||
    0 - 0x800000            kernel text and permanent data structures
    0x917000 - 0x91a000     u-area; 1st page is user struct, last k-stack
    0x1b1b000 - 0x2400000   user page tables, also kmem_alloc()ed data
|
||||
|
||||
User address space:
|
||||
|
||||
The user text and data are loaded starting at VA 0. The user's stack starts
|
||||
at 0xFFF00000 and grows toward lower addresses. The pages above the user
|
||||
stack are used by the kernel. From 0xFFF00000 to 0xFFF03000 is the u-area.
|
||||
The 3 PTEs for this range map (read-only) the same memory as does 0x917000
|
||||
to 0x91a000 in the kernel address space. This address range is never used
|
||||
by the kernel, but exists for utilities that assume that the u-area sits
|
||||
above the user stack. The pages from 0xFFF03000 up are not used. They
|
||||
exist so that the user stack is in the same location as in HPUX.
|
||||
|
||||
The user segment table is allocated along with the page tables from Usrptmap.
|
||||
They are contiguous in kernel VA space with the page tables coming before
|
||||
the segment table. Hence, a process has p_szpt+1 pages allocated starting
|
||||
at kernel VA p_p0br.
|
||||
|
||||
The user segment table is typically very sparse since each entry maps 4Mb.
|
||||
There are usually only two valid STEs, one at the start mapping the text/data
|
||||
portion of the page table, and one at the end mapping the stack/u-area. For
|
||||
example if the segment table was at 0xFFFFA000 there would be valid entries
|
||||
at 0xFFFFA000 and 0xFFFFAFFC.
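The two addresses in that example fall out of the STE index arithmetic. A
sketch (ste_addr is a made-up helper, not a kernel routine):

```c
#include <stdint.h>

/* Address of the STE covering `va`, given the base address of the
 * one-page segment table.  Each STE is 4 bytes and maps 4Mb. */
static uint32_t
ste_addr(uint32_t segtab_base, uint32_t va)
{
	return segtab_base + (va >> 22) * 4;
}
```

With the table at 0xFFFFA000, VA 0 (text/data) uses the first entry and VA
0xFFF00000 (stack/u-area) uses entry 1023, at base+0xFFC.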
|
||||
|
||||
Random notes:
|
||||
|
||||
An important thing to note is that there are no hardware length registers
|
||||
on the HP. This implies that we cannot "pack" data and stack PTEs into the
|
||||
same page table page. Hence, every user page table has at least 2 pages
|
||||
(3 if you count the segment table).
|
||||
|
||||
The HP maintains the p0br/p0lr and p1br/p1lr PCB fields the same as the
|
||||
VAX even though they have no meaning to the hardware. This also keeps many
|
||||
utilities happy.
|
||||
|
||||
There is no separate interrupt stack (right now) on the HPs. Interrupt
|
||||
processing is handled on the kernel stack of the "current" process.
|
||||
|
||||
Following is a list of things you might want to be able to do with a kernel
|
||||
core dump. One thing you should always have is a ps listing from the core
|
||||
file. Just do:
|
||||
|
||||
ps klaw vmunix.? vmcore.?
|
||||
|
||||
Exception related panics (i.e. those detected in hp300/trap.c) will dump
|
||||
out various useful information before panicking. If available, you should
|
||||
get this out of the /usr/adm/messages file. Finally, you should be in adb:
|
||||
|
||||
adb -k vmunix.? vmcore.?
|
||||
|
||||
Adb -k will allow you to examine the kernel address space more easily.
|
||||
It automatically maps kernel VAs in the range 0 to 0x2400000 to physical
|
||||
addresses. Since the kernel and user address spaces overlap (i.e. both
|
||||
start at 0), adb can't let you examine the address space of the "current"
|
||||
process as it does on the VAX.
|
||||
--------
|
||||
|
||||
1. Find out what the current process was at the time of the crash:
|
||||
|
||||
If you have the dump info from /usr/adm/messages, it should contain the
|
||||
PID of the active process. If you don't have this info you can just look
|
||||
at location "Umap". This is the PTE for the first page of the u-area; i.e.
|
||||
the user structure. Forget about the last 3 hex digits and compare the top
|
||||
5 to the ADDR column in the ps listing.
|
||||
|
||||
2. Locating a process' user structure:
|
||||
|
||||
Get the ADDR field of the desired process from the ps listing. This is the
|
||||
page frame number of the process' user structure. Tack 3 zeros on to the
|
||||
end to get the physical address. Note that this doesn't give you the kernel
|
||||
stack since it is in a different page than the user-structure and pages of
|
||||
the u-area are not physically contiguous.
|
||||
|
||||
3. Locating a process' proc structure:
|
||||
|
||||
First find the process' user structure as described above. Find the u_procp
|
||||
field at offset 0x200 from the beginning. This gives you the kernel VA of
|
||||
the proc structure.
|
||||
|
||||
4. Locating a process' page table:
|
||||
|
||||
First find the process' user structure as described above. The first part
|
||||
of the user structure is the PCB. The second longword (third field) of the
|
||||
PCB is pcb_ustp, a pointer to the user segment table. This pointer is
|
||||
actually the page frame number. Again adding 3 zeros yields the physical
|
||||
address. You can now use the values in the segment table to locate the
|
||||
page tables. For example, to locate the first page of the text/data part
|
||||
of the page table, use the first STE (longword) in the segment table.
|
||||
|
||||
5. Locating a process' kernel stack:
|
||||
|
||||
First find the process' page table as described above. The kernel stack
|
||||
is near the end of the user address space. So, locate the last entry in the
|
||||
user segment table (base+0xFFC) and use that entry to find the last page of
|
||||
the user page table. Look at the last 256 entries of this page
|
||||
(pagebase+0xC00). The first is the PTE for the user-structure. The second
|
||||
was intended to be a read-only page to protect the user structure from the
|
||||
kernel stack. Currently it is read/write and actually allocated. Hence
|
||||
it can wind up being a second page for the kernel stack. The third is the
|
||||
kernel stack. The last 253 should be zero. Hence, indirecting through the
|
||||
third of these last 256 PTEs will give you the kernel stack page.
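The offsets involved can be checked with a little arithmetic. This sketch
assumes, as stated earlier, that the u-area begins at user VA 0xFFF00000;
the constant and function names are invented.

```c
#include <stdint.h>

#define USER_UAREA_VA	0xfff00000u	/* assumed start of u-area (user VA) */

/* Byte offset, within the last page of the user page table, of the PTE
 * for the n'th page of the u-area (n = 0 is the user structure, n = 2
 * is the kernel stack page). */
static uint32_t
uarea_pte_offset(unsigned n)
{
	uint32_t va = USER_UAREA_VA + n * 4096u;

	return ((va >> 12) & 0x3ffu) * 4;
}
```

Under that assumption uarea_pte_offset(0) is 0xC00, i.e. entry 768 of 1024,
which leaves exactly 256 entries to the end of the page, and the kernel
stack PTE is at uarea_pte_offset(2), or 0xC08.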
|
||||
|
||||
An alternate way to do this is to use the p_addr field of the proc structure
|
||||
which is found as described above. The p_addr field is at offset 0x10 in the
|
||||
proc structure and points to the first of the PTEs mentioned above (i.e. the
|
||||
user structure PTE).
|
||||
|
||||
6. Interpreting the info in a "trap type N..." panic:
|
||||
|
||||
As mentioned, when the kernel crashes out of hp300/trap.c it will dump some
|
||||
useful information. This dates back to the days when I was debugging the
|
||||
exception handling code and had no kernel adb or even kernel crash dump code.
"trap type" (decimal) is as defined in hp300/trap.h; it doesn't really
|
||||
correlate with anything useful. "code" (hex) is only useful for MMU
|
||||
(trap type 8) errors. It is the concatenation of the MMU status register
|
||||
(see hp300/cpu.h) in the high 16 bits and the 68020 special status word
|
||||
(see the 020 manual page 6-17) in the low 16. "v" (hex) is the virtual
|
||||
address which caused the fault. "pid" (decimal) is the ID of the process
|
||||
running at the time of the exception. Note that if we panic in an interrupt
|
||||
routine, this process may not be related to the panic. "ps" (hex) is the
|
||||
value of the 68020 status register (see page 1-4 of 020 manual) at the time
|
||||
of the crash. If the 0x2000 bit is on, we were in supervisor (kernel) mode
|
||||
at the time, otherwise we were in user mode. "pc" (hex) is the value of the
|
||||
PC saved on the hardware exception frame. It may *not* be the PC of the
|
||||
instruction causing the fault (see the 020 manual for details). The 0x2000
|
||||
bit of "ps" dictates whether this is a kernel or user VA. "sfc" and "dfc"
|
||||
are the 68020 source/destination function codes. They should always be one.
|
||||
"p0" and "p1" are the VAX-like region registers. They are of the form:
|
||||
|
||||
<length> '@' <kernel VA>
|
||||
|
||||
where both are in hex. Following these values are a dump of the processor
|
||||
registers (hex). Check the address registers for values close to "v", the
|
||||
fault address. Most faults are caused by dereferences of bogus pointers.
|
||||
Most such dereferences are the result of 020 instructions using the:
|
||||
|
||||
<address-register> '@' '(' offset ')'
|
||||
|
||||
addressing mode. This can help you track down the faulting instruction (since
|
||||
the PC may not point to it). Note that the value of a7 (the stack pointer) is
|
||||
ALWAYS the user SP. This is brain-dead, I know. Finally, there is a dump of the
|
||||
stack (user/kernel) at the time of the offense. Before kernel crash dumps,
|
||||
this was very useful.
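Pulling the "code" and "ps" values apart by hand gets tedious, so here is a
small C sketch of the decoding described above. The struct, function and
macro names are invented; only the bit positions come from the text.

```c
#include <stdint.h>

#define PS_SUPERVISOR	0x2000u		/* supervisor bit in "ps", per the text */

struct trap_code {
	uint16_t mmu_status;	/* high 16 bits: MMU status register */
	uint16_t ssw;		/* low 16 bits: 68020 special status word */
};

/* Split the "code" value printed for MMU (trap type 8) errors. */
static struct trap_code
split_trap_code(uint32_t code)
{
	struct trap_code t;

	t.mmu_status = (uint16_t)(code >> 16);
	t.ssw = (uint16_t)(code & 0xffffu);
	return t;
}

/* Nonzero if the saved status register says we were in kernel mode,
 * which also tells you whether "pc" is a kernel or user VA. */
static int
was_supervisor(uint32_t ps)
{
	return (ps & PS_SUPERVISOR) != 0;
}
```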
|
||||
|
||||
7. Converting a kernel virtual address to a physical address:
|
||||
|
||||
Adb -k already does this for you, but sometimes you want to know what the
|
||||
resulting physical address is rather than what is there. Doing this is
|
||||
simply a matter of indexing into the kernel page table. In theory we would
|
||||
first have to do a lookup in the kernel segment table, but we know that the
|
||||
kernel page table is physically contiguous so this isn't necessary. The
|
||||
base of the system page table is "Sysmap", so to convert an address V just
|
||||
divide the address by 4096 to get the page number, multiply that by 4 (the
|
||||
size of a PTE in bytes) to get a byte offset, and add that to "Sysmap".
|
||||
This gives you the address of the PTE mapping V. You can then get the
|
||||
physical address by masking out the low 12 bits of the contents of that PTE.
|
||||
To wit:
|
||||
|
||||
*(Sysmap+(VA%1000*4))&fffff000
|
||||
|
||||
where VA is the virtual address in question.
|
||||
|
||||
This technique should also work for user virtual addresses if you replace
"Sysmap" with the value of the appropriate process's P0BR. This works
|
||||
because a user's page table is *virtually* contiguous in the kernel
|
||||
starting at P0BR, and adb will handle translating the kernel virtual addresses
|
||||
for you.
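For reference, the same lookup written out in C. In the adb expression "%"
is integer division and 1000 is read as hex. This is only a sketch: the
page table is passed in as an ordinary array (in practice you would read it
out of the core file), the names are made up, and no valid bit is checked.
Unlike the adb expression, it also ORs the byte offset back in.

```c
#include <stddef.h>
#include <stdint.h>

/* `sysmap` stands in for the kernel page table as an array of 4-byte
 * PTEs.  Returns 0 if `va` is beyond the table (0 is never a real
 * physical address here, since memory sits at the top of the space). */
static uint32_t
kva_to_pa(const uint32_t *sysmap, size_t nptes, uint32_t va)
{
	uint32_t page = va >> 12;		/* VA / 4096 */

	if (page >= nptes)
		return 0;			/* outside the mapped range */
	/* Indexing a uint32_t array supplies the "multiply by 4". */
	return (sysmap[page] & 0xfffff000u) | (va & 0xfffu);
}
```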
|
|
@ -0,0 +1,145 @@
|
|||
Overview:
|
||||
--------
|
||||
|
||||
(Some of this is gleaned from an article in the September 1986
|
||||
Hewlett-Packard Journal and info in the July 1987 HP Communicator)
|
||||
|
||||
Page and segment table entries mimic the Motorola 68851 PMMU,
|
||||
in an effort at upward compatibility. The HP MMU uses a two
|
||||
level translation scheme. There are separate (but equal!)
|
||||
translation tables for both supervisor and user modes. At the
|
||||
lowest level are page tables. Each page table consists of one
|
||||
or more 4k pages of 1024x4 byte page table entries. Each PTE
|
||||
maps one 4k page of VA space. At the highest level is the
|
||||
segment table. The segment table is a single 4K page of 1024x4
|
||||
byte entries. Each entry points to a 4k page of PTEs. Hence
|
||||
one STE maps 4Mb of VA space and one page of STEs is sufficient
|
||||
to map the entire 4Gb address space (what a coincidence!). The
|
||||
unused valid bit in page and segment table entries must be
|
||||
zero.
|
||||
|
||||
There are separate translation lookaside buffers for the user
|
||||
and supervisor modes, each containing 1024 entries.
|
||||
|
||||
To augment the 68020's instruction cache, the HP CPU has an
|
||||
external cache. A direct-mapped, virtual cache implementation
|
||||
is used with 16 Kbytes of cache on 320 systems and 32 Kbytes on
|
||||
350 systems. Each cache entry can contain instructions or data,
|
||||
from either user or supervisor space. Separate valid bits are
|
||||
kept for user and supervisor entries, allowing for discriminatory
|
||||
flushing of the cache.
|
||||
|
||||
MMU translation and cache-miss detection are done in parallel.
|
||||
|
||||
|
||||
Segment table entries:
|
||||
------- ----- -------
|
||||
|
||||
bits 31-12:  Physical page frame number of PT page
bits 11-4:   Reserved at zero
             (can software use them?)
bit 3:       Reserved at one
bit 2:       Set to 1 if segment is read-only, otherwise read-write
bits 1-0:    Valid bits
             (hardware uses bit 1)
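As a sketch, the STE layout above could be expressed in C like this. The
macro and function names are hypothetical and chosen for this note; the
real definitions live in hp300/pte.h and may differ.

```c
#include <stdint.h>

/* Hypothetical names for the STE fields listed above. */
#define STE_FRAME	0xfffff000u	/* bits 31-12: PT page frame */
#define STE_MBO		0x00000008u	/* bit 3: reserved, must be one */
#define STE_RO		0x00000004u	/* bit 2: read-only segment */
#define STE_V		0x00000002u	/* bit 1: valid bit used by hardware */

/* Build a valid STE pointing at the page table page at `pt_pa`. */
static uint32_t
ste_make(uint32_t pt_pa, int read_only)
{
	uint32_t ste = (pt_pa & STE_FRAME) | STE_MBO | STE_V;

	if (read_only)
		ste |= STE_RO;
	return ste;
}

static int
ste_valid(uint32_t ste)
{
	return (ste & STE_V) != 0;
}
```

So a read-write STE for a page table page at 0xFF880000 would read
0xFF88000A under these assumptions.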
|
||||
|
||||
|
||||
Page table entries:
|
||||
---- ----- -------
|
||||
|
||||
bits 31-12:  Physical page frame number of page
bits 11-7:   Available for software use
bit 6:       If 1, inhibits caching of data in this page.
             (both instruction and external cache)
bit 5:       Reserved at zero
bit 4:       Hardware modify bit
bit 3:       Hardware reference bit
bit 2:       Set to 1 if page is read-only, otherwise read-write
bits 1-0:    Valid bits
             (hardware uses bit 0)
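A C sketch of decoding a PTE according to the list above, handy when staring
at page tables in a dump. Again the names are hypothetical, not those of
hp300/pte.h.

```c
#include <stdint.h>

/* Hypothetical names for the PTE control bits listed above. */
#define PTE_CI		0x00000040u	/* bit 6: cache inhibit */
#define PTE_MOD		0x00000010u	/* bit 4: hardware modify bit */
#define PTE_REF		0x00000008u	/* bit 3: hardware reference bit */
#define PTE_RO		0x00000004u	/* bit 2: read-only page */
#define PTE_V		0x00000001u	/* bit 0: valid bit used by hardware */

struct pte_info {
	uint32_t pfn;		/* bits 31-12 */
	int valid;
	int read_only;
	int referenced;
	int modified;
	int cache_inhibit;
};

static struct pte_info
pte_decode(uint32_t pte)
{
	struct pte_info i;

	i.pfn = pte >> 12;
	i.valid = (pte & PTE_V) != 0;
	i.read_only = (pte & PTE_RO) != 0;
	i.referenced = (pte & PTE_REF) != 0;
	i.modified = (pte & PTE_MOD) != 0;
	i.cache_inhibit = (pte & PTE_CI) != 0;
	return i;
}
```

For example, a PTE of 0xFF880019 decodes as frame 0xFF880, valid, read-write,
referenced and modified.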
|
||||
|
||||
|
||||
Hardware registers:
|
||||
-------- ---------
|
||||
|
||||
The hardware has four longword registers controlling the MMU.
|
||||
The registers can be accessed as shortwords also (remember to
|
||||
add 2 to addresses given below).
|
||||
|
||||
5F4000: Supervisor mode segment table pointer. Loaded (as longword)
|
||||
with page frame number (i.e. Physaddr >> 12) of the segment
|
||||
table mapping supervisor space.
|
||||
5F4004: User mode segment table pointer. Loaded (as longword) with
|
||||
page frame number of the segment table mapping user space.
|
||||
5F4008: TLB control register. Used to invalidate large sections of the
|
||||
TLB. More info below.
|
||||
5F400C: MMU command/status register. Defined as follows:
|
||||
|
||||
bit 15: If 1, indicates a page table fault occurred
|
||||
bit 14: If 1, indicates a page fault occurred
|
||||
bit 13: If 1, indicates a protection fault (write to RO page)
|
||||
bit 6: MC68881 enable. Tied to chip enable line.
|
||||
(set this bit to enable)
|
||||
bit 5: MC68020 instruction cache enable. Tied to Instruction
|
||||
cache disable line. (set this bit to enable)
|
||||
bit 3: If 1, indicates an MMU related bus error occurred.
|
||||
Bits 13-15 are now valid.
|
||||
bit 2: External cache enable. (set this bit to enable)
|
||||
bit 1: Supervisor mapping enable. Enables translation of
|
||||
supervisor space VAs.
|
||||
bit 0: User mapping enable. Enables translation of user
|
||||
space VAs.
|
||||
|
||||
|
||||
Any bits set by the hardware are cleared only by software.
|
||||
(i.e. bits 3,13,14,15)
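A sketch of how the status bits might be interpreted in C. The names are
made up, and the order in which bits 15-13 are tested is my own choice since
nothing above says whether more than one can be set at once. Clearing the
bits by writing them back as zero is likewise an assumption.

```c
#include <stdint.h>

/* Hypothetical names for the status bits described above. */
#define MMU_PTF		0x8000u	/* bit 15: page table fault */
#define MMU_PF		0x4000u	/* bit 14: page fault */
#define MMU_WPF		0x2000u	/* bit 13: write to read-only page */
#define MMU_FAULT	0x0008u	/* bit 3: MMU related bus error */

enum mmu_fault {
	MMU_NONE,		/* bit 3 clear: bits 13-15 not valid */
	MMU_PTFAULT,
	MMU_PGFAULT,
	MMU_PROT,
	MMU_UNKNOWN		/* bit 3 set but no cause bit */
};

static enum mmu_fault
mmu_fault_kind(uint16_t status)
{
	if ((status & MMU_FAULT) == 0)
		return MMU_NONE;
	if (status & MMU_PTF)
		return MMU_PTFAULT;
	if (status & MMU_PF)
		return MMU_PGFAULT;
	if (status & MMU_WPF)
		return MMU_PROT;
	return MMU_UNKNOWN;
}

/* Value with the hardware-set bits cleared and the enable bits kept. */
static uint16_t
mmu_clear_faults(uint16_t status)
{
	return status & (uint16_t)~(MMU_PTF | MMU_PF | MMU_WPF | MMU_FAULT);
}
```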
|
||||
|
||||
Invalidating TLB:
|
||||
------------ ---
|
||||
|
||||
All translations:
|
||||
Read the TLB control register (5F4008) as a longword.
|
||||
|
||||
User translations only:
|
||||
Write a longword 0 to TLB register or set the user
|
||||
segment table pointer.
|
||||
|
||||
Supervisor translations only:
|
||||
Write a longword 0x8000 to TLB register or set the
|
||||
supervisor segment table pointer.
|
||||
|
||||
A particular VA translation:
|
||||
Set destination function code to 3 ("purge" space),
|
||||
write a longword 0 to the VA whose translation we are to
|
||||
invalidate, and restore function code. This apparently
|
||||
invalidates any translation for that VA in both the user
|
||||
and supervisor TLBs. Here is what I did:
|
||||
|
||||
#define FC_PURGE    3
#define FC_USERD    1
_TBIS:
    movl    sp@(4),a0       | VA to invalidate
    moveq   #FC_PURGE,d0    | change address space
    movc    d0,dfc          |   for destination
    moveq   #0,d0           | zero to invalidate?
    movsl   d0,a0@          | hit it
    moveq   #FC_USERD,d0    | back to old
    movc    d0,dfc          |   address space
    rts                     | done
|
||||
|
||||
|
||||
Invalidating the external cache:
|
||||
------------ --- -------- -----
|
||||
|
||||
Everything:
|
||||
Toggle the cache enable bit (bit 2) in the MMU control
|
||||
register (5F400C). Can be done by ANDing and ORing the
|
||||
register location.
|
||||
|
||||
User:
|
||||
Change the user segment table pointer register (5F4004),
|
||||
i.e. read the current value and write it back.
|
||||
|
||||
Supervisor:
|
||||
Change the supervisor segment table pointer register
|
||||
(5F4000), i.e. read the current value and write it back.
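
The three flushes above might look like this in C. This is a sketch only:
the registers are passed in as pointers so the code can be tried against
ordinary variables, and nothing here says where 5F400C, 5F4004 and 5F4000
end up in kernel VA space. The names are invented.

```c
#include <stdint.h>

#define MMU_CEN		0x0004u		/* bit 2: external cache enable */

/* Flush the whole external cache by turning the enable bit off and
 * back on ("ANDing and ORing the register location").  Note that this
 * leaves the cache enabled even if it was disabled on entry. */
static void
ecache_flush_all(volatile uint32_t *mmucmd)
{
	*mmucmd &= ~MMU_CEN;
	*mmucmd |= MMU_CEN;
}

/* Flush the user (or supervisor) entries by rewriting the matching
 * segment table pointer register with its current value. */
static void
ecache_flush_by_stp(volatile uint32_t *stp)
{
	uint32_t cur = *stp;

	*stp = cur;
}
```

The register contents are unchanged afterwards; the writes themselves are
what matter to the hardware.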
|
|
@ -0,0 +1,127 @@
|
|||
Here is a list of hp300 specific kernel compilation options and what they
|
||||
mean:
|
||||
|
||||
HAVEVAC
|
||||
Compiles in support for virtually addressed cache (VAC) found on
|
||||
hp320 and 350 machines. Should only be defined when HP320 and/or
|
||||
HP350 is.
|
||||
|
||||
HP320
|
||||
Support for old hp320 machines: 16mhz 68020, HP MMU, 16mhz 68881
|
||||
and VAC. Compiles in support for a VAC, HP MMU, and the 98620A
|
||||
16-bit DMA channel. Forces the definition of HAVEVAC.
|
||||
|
||||
HP350
|
||||
Support for old hp350 machines: 25mhz 68020, HP MMU, 20mhz 68881
|
||||
and VAC. Compiles in support for a VAC and the HP MMU. Differs
|
||||
from HP320 in that it has no support for 16-bit DMA controller.
|
||||
Forces the definition of HAVEVAC.
|
||||
|
||||
HP330
|
||||
Support for old hp330 (and 318/319) machines: 16mhz 68020, 68851 PMMU
|
||||
and 16mhz 68881. Compiles in support for PMMU.
|
||||
|
||||
HP360
|
||||
Support for old hp360 (and 340) machines: 25mhz 68030+MMU and 25mhz
|
||||
68882. Compiles in support for PMMU and 68030. Differs from HP330
|
||||
in support for 68030 on-chip data cache.
|
||||
|
||||
HP370
|
||||
Support for old hp370 (and current 345/375/400) machines: 33 (50) mhz
|
||||
68030+MMU and 33 (50) mhz 68882. Compiles in support for PMMU, 68030
|
||||
and off-chip physically addressed cache. Differs from 360 in only one
|
||||
place, in dealing with flushing the external cache.
|
||||
|
||||
HP380
|
||||
Support for "current" hp380/425 (and 433) machines: 25 (33) mhz 68040
|
||||
with MMU/FPU. Compiles in support for 68040.
|
||||
|
||||
HPFPLIB
|
||||
Compiles in support to link with HP-UX's version of Motorola's 68040
|
||||
FP emulation library (hp300/hpux_float.o). Kernel will build and run
|
||||
without this option, but many binaries will core dump. Should not be
|
||||
defined unless HP380 is.
|
||||
|
||||
|
||||
USELEDS
|
||||
Twinkle the hp4xx front panel (or hp3xx internal) LEDs in the HP
|
||||
designated way. Somewhat frivolous, but the heartbeat LED is
|
||||
useful to see if your machine is alive.
|
||||
|
||||
PANICBUTTON
|
||||
Compiles in code which will enable a "force-crash" HIL keyboard
|
||||
sequence. When the Reset key is typed twice in succession (within
|
||||
half a second) the kernel will panic. Note that the HIL Reset key
|
||||
sends an NMI to the processor which will get the CPU's attention no
|
||||
matter what it is doing (i.e. as long as it isn't halted). Alas,
|
||||
also note that the NMI is only sent when the keyboard is in "cooked"
|
||||
(ITE) mode. If it is in "raw" mode (i.e. X-server is running) the
|
||||
Reset key is just another keypress event. A cheesy substitute in
|
||||
this case is holding down the upper right-most unlabeled key and
|
||||
then pressing the unlabeled key to its left. Note that this only
|
||||
works if HIL (level 1) interrupts are not masked.
|
||||
|
||||
DEBUG
|
||||
Compiles in a variety of consistency checks and debug printfs
|
||||
throughout the hp300 MD code and device drivers.
|
||||
|
||||
COMPAT_HPUX
|
||||
Enables HP-UX binary compatibility mode. Allows a variety of
|
||||
"recent" HP-UX binaries to be run unchanged. Due to the
|
||||
evolutionary and "as-needed" nature of this code, "recent" is
|
||||
anywhere from release 6.2 to 8.0 of HP-UX. It will run 8.0
|
||||
shared-library binaries (assuming all the necessary shared-libraries
|
||||
are installed in the filesystem).
|
||||
|
||||
COMPAT_OHPUX
|
||||
Compile in old 4.2-ish HP-UX (pre-6.0?) compatibility code.
|
||||
|
||||
FPCOPROC
|
||||
Compile in code to support the 68881 and above FPU. Should always
|
||||
be defined, since all supported SPUs have one. Don't even know if
|
||||
it will compile, much less work, without this option. Defined in
|
||||
the prototype makefile (hp300/conf/Makefile.hp300).
|
||||
|
||||
DCMSTATS
|
||||
Compile in code to collect a variety of transmit/receive statistics
|
||||
for the 98642 4-port MUX.
|
||||
|
||||
WAITHIST
|
||||
Compile in code to collect statistics about the distribution of
|
||||
wait-times for various busy waits in the SCSI host-adaptor driver.
|
||||
|
||||
STACKCHECK
|
||||
Enables two types of kernel stack checking in hp300/hp300/locore.s:
|
||||
1. stack "overflow". On every clock interrupt we ensure that
|
||||
the current kernel stack has not grown into the user struct
|
||||
page, i.e. size exceeded UPAGES-1 pages.
|
||||
2. stack "underflow". Before every rte to user mode we ensure
|
||||
that we will be exactly at the base of the stack after the
|
||||
exception frame has been popped.
|
||||
This option can degrade performance considerably; use it only if
|
||||
you suspect a problem with kernel stacks.
|
||||
|
||||
SCSI_REVPRI
|
||||
Changes autoconf to start matching logical SCSI devices starting
|
||||
at slave 6 and working backwards instead of starting at slave 0
|
||||
and working up. Later releases of the HP boot ROM search for
|
||||
boot devices in this manner. This is apparently the order in
|
||||
which priority is given to slaves on the host adaptor. Define
|
||||
this if you use wildcarding and want to stay in sync with the
|
||||
boot ROM's strategy.
|
||||
|
||||
MAPPEDCOPY
|
||||
Use page remapping to do large copyin/copyouts. When defined
|
||||
the default is to use mapped copy for operations on one page
|
||||
or more except on machines with virtually-indexed caches.
|
||||
See initcpu() in machdep.c.
|
||||
|
||||
BUFFERS_UNMANAGED
|
||||
Set up the buffer cache "below" the machine independent VM.
|
||||
Normally, in startup() we use vm_map operations to initially
|
||||
assign physical memory to the buffers. This creates a map with
|
||||
a huge number of map entries (twice the number of buffers)
|
||||
which serve no purpose since remaining buffer operations
|
||||
(i.e. pagemove) work below the MI layer anyway. Defining this
|
||||
symbol will cause startup() to use pmap operations to map the
|
||||
initial pages leaving the buffer_map one big entry.
|
|
@ -0,0 +1,41 @@
|
|||
Following are some observations about the BSD hp300 pmap module that
|
||||
may prove useful for other pmap modules:
|
||||
|
||||
1. pmap_remove should be efficient with large, sparsely populated ranges.
|
||||
|
||||
Profiling of exec/exit intensive work loads showed that much time was
|
||||
being spent in pmap_remove. This was primarily due to calls from exec
|
||||
when deallocating the stack segment. Since the current implementation
|
||||
of the stack is to "lazy allocate" the maximum possible stack size
|
||||
(typically 16-32mb) when the process is created, pmap_remove will be
|
||||
called with a large chunk of largely empty address space. It is
|
||||
important that this routine be able to quickly skip over large chunks
|
||||
of allocated but unpopulated VA space. The hp300 pmap module did check
|
||||
for unpopulated "segments" (which map 4mb chunks) and skipped them fairly
|
||||
efficiently but once it found a valid segment descriptor (STE), it rather
|
||||
clumsily moved forward over the PTEs mapping that segment. Particularly
|
||||
bad was that for every PTE it would recheck that the STE was valid even
|
||||
though we should already know that.
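
To make the point concrete, here is a toy model in C of the loop structure
being argued for: look at the STE once per 4mb segment, skip the segment
outright if it is invalid, and otherwise walk its PTEs without rechecking.
This is not the hp300 pmap code; the structure and names are invented and
all TLB/cache and bookkeeping work is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define NPTEPG		1024u		/* PTEs per page table page */
#define PG_V		0x1u		/* PTE valid (bit 0) */

/* Toy model: ptpage[i] is NULL when segment i has no valid STE,
 * otherwise it points at that segment's 1024 PTEs. */
struct toy_pmap {
	uint32_t *ptpage[1024];
};

/* Invalidate all mappings for pages [spage, epage); returns the number
 * of PTEs actually examined, to show how much work was skipped. */
static size_t
toy_remove(struct toy_pmap *pm, uint32_t spage, uint32_t epage)
{
	size_t examined = 0;
	uint32_t pg = spage;

	while (pg < epage) {
		uint32_t seg = pg / NPTEPG;
		uint32_t segend = (seg + 1) * NPTEPG;	/* start of next segment */
		uint32_t stop = segend < epage ? segend : epage;
		uint32_t *pt = pm->ptpage[seg];		/* checked once per segment */

		if (pt != NULL) {
			for (; pg < stop; pg++) {
				examined++;
				if (pt[pg % NPTEPG] & PG_V)
					pt[pg % NPTEPG] = 0;
			}
		}
		pg = stop;	/* also skips the rest of an empty segment */
	}
	return examined;
}
```

Removing a 20mb stack range in which only the top segment is populated
touches 1024 PTEs here rather than 5120.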
|
||||
|
||||
pmap_protect can benefit from similar optimizations though it is
|
||||
(currently) not called with large regions.
|
||||
|
||||
Another solution would be to change the way stack allocation is done
|
||||
(i.e. don't preallocate the entire address range) but I think it is
|
||||
important to be able to efficiently support such large, sparse ranges
|
||||
that might show up in other applications (e.g. a randomly accessed
|
||||
large mapped file).
|
||||
|
||||
2. Bit operations (i.e. ~,&,|) are more efficient than bitfields.
|
||||
|
||||
This is a 68k/gcc issue, but if you are trying to squeeze out maximum
|
||||
performance...
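
For illustration, the two styles side by side, using the read-only bit of a
PTE. The bitfield struct is shown only for contrast; C leaves the ordering of
bitfields within a word to the implementation, so its layout is an assumption.
The names are made up.

```c
#include <stdint.h>

/* Bitfield view of a PTE (layout illustrative only). */
struct pte_bits {
	unsigned int pfn:20;
	unsigned int soft:5;
	unsigned int ci:1;
	unsigned int :1;
	unsigned int mod:1;
	unsigned int ref:1;
	unsigned int ro:1;
	unsigned int valid:2;
};

#define PG_RO	0x4u	/* bit 2: read-only */

/* Mask view: a single OR or AND on the whole word. */
static uint32_t
pte_set_ro(uint32_t pte)
{
	return pte | PG_RO;
}

static uint32_t
pte_clear_ro(uint32_t pte)
{
	return pte & ~PG_RO;
}
```

With masks, several bits can also be tested or changed in one operation,
e.g. (pte & (PG_RO | 1)) to check read-only and valid together.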
|
||||
|
||||
3. Don't flush TLB/caches for inactive mappings.
|
||||
|
||||
On the hp300 the TLBs are either designed as, or used in such a way that,
|
||||
they are flushed on every context switch (i.e. there are no "process
|
||||
tags"). Hence, doing TLB flushes on mappings that aren't associated with
|
||||
either the kernel or the currently running process is a waste. Seems
|
||||
pretty obvious but I missed it for many years. An analogous argument
|
||||
applies to flushing untagged virtually addressed caches (a la the 320/350).
|
|
@ -0,0 +1,28 @@
|
|||
This directory contains random snippets related to various hp300 issues.
|
||||
|
||||
Debug.tips Some tips for debugging kernel problems. Out of date but
|
||||
may contain some useful info.
|
||||
|
||||
HPMMU.notes Most of this information was collected by David Davis in
|
||||
1987 or so while he was working on hp200/300 BSD.
|
||||
|
||||
Options Kernel configuration options that are either defined in
|
||||
the prototype makefile or that can be specified in a config
|
||||
file.
|
||||
|
||||
Pmap.notes Some (mostly HP-specific) observations I made while cleaning
|
||||
up the hp300 pmap module.
|
||||
|
||||
README This.
|
||||
|
||||
README.68040 Notes and copyright information for the 68040 floating point
|
||||
emulation package.
|
||||
|
||||
TODO.dev A fairly ancient list of projects related to IO devices.
|
||||
|
||||
TODO.hp300 A much more up to date list of general hp300 projects.
|
||||
|
||||
Mike Hibler (mike@cs.utah.edu)
|
||||
Center for Software Science
|
||||
University of Utah
|
||||
February, 1993.
|
|
@ -0,0 +1,37 @@
|
|||
# @(#)README.68040 8.1 (Berkeley) 6/10/93
|
||||
|
||||
In order to do floating point on the 68040 you need to incorporate the
|
||||
HP-UX 7.05 version of Motorola's FP emulation library as hpux_float.o
|
||||
in this directory.
|
||||
|
||||
To build a new kernel using this library, set the option HPFPLIB in the
|
||||
config file. You can build a kernel without the FP library by configuring
|
||||
a kernel without the HPFPLIB option, but be warned that many binaries will
|
||||
not run.
|
||||
|
||||
We are required by Hewlett-Packard and Motorola to state the following
|
||||
items.
|
||||
|
||||
1. No source code is being licensed.
|
||||
|
||||
2. The object code can only be used for operation on Motorola M68040
|
||||
based systems, and NOT ported to other architectures.
|
||||
|
||||
3. Recipients of the code will not reverse compile or disassemble the
|
||||
code, except to the extent that such restrictions may be prohibited
|
||||
by law.
|
||||
|
||||
4. The Motorola Inc. and Hewlett-Packard Company copyright notices will
|
||||
be reproduced in the relevant bootup code.
|
||||
|
||||
5. The code is provided ``AS IS'' without warranty or obligation of any
|
||||
kind. THERE SHALL BE NO LIABILITY FOR INCIDENTAL OR CONSEQUENTIAL
|
||||
DAMAGE ARISING FROM OR RELATED TO USE OR POSSESSION OF THE CODE. THE
|
||||
IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
|
||||
PURPOSE ARE EXPRESSLY DISCLAIMED.
|
||||
|
||||
6. Recipients of the code will adhere to the U.S. Export Administration
|
||||
Laws and Regulations and will not export or re-export the code or any
|
||||
technical data related to the code to any proscribed country listed in
|
||||
the U.S. Export Administration Regulations unless properly authorized
|
||||
as may be required by the U.S. Government.
|
|
@ -0,0 +1,17 @@
|
|||
[ this is old -- mike ]
|
||||
|
||||
Oh, where do I begin...
|
||||
|
||||
1. Integrate 98628A (single port buffered RS232) driver.
|
||||
2. Integrate 1/2" 9-track reel tape driver (from Mt Xinu).
|
||||
3. SCSI: sync support and connect/disconnect.
|
||||
4. VME/EISA adaptor drivers needed.
|
||||
5. Centronics driver for 345/375.
|
||||
6. HP-IB (SCSI?) improvement: attempt to get more activity on a single
|
||||
bus (e.g. overlap seeks with transfers).
|
||||
7. Support for more modern (post-DaVinci) displays.
|
||||
|
||||
----
|
||||
Mike Hibler
|
||||
University of Utah CSS group
|
||||
mike@cs.utah.edu
|
|
@ -0,0 +1,85 @@
|
|||
1. Create and use an interrupt stack.
|
||||
Well actually, use the master SP for kernel stacks instead of
|
||||
the interrupt SP. Right now we use the interrupt stack for
|
||||
everything. Allows for more accurate accounting of systime.
|
||||
In theory, could also allow for smaller kernel stacks but we
|
||||
only use one page anyway.
|
||||
|
||||
2. Copy/clear primitives could be tuned.
|
||||
What is best is highly CPU and cache dependent. One thing to look
|
||||
at are the copyin/copyout primitives. Rather than looping using
|
||||
MOVS instructions, you could map an entire page at a time and use
|
||||
bcopy, MOVE16, or whatever. This would lose big on the VAC models
|
||||
however.
|
||||
|
||||
3. Sendsig/sigreturn are pretty bogus.
|
||||
Currently we can call a signal handler even if an exception
|
||||
occurs in the middle of an instruction. This causes the handler
|
||||
to return right back to the middle of the offending instruction
|
||||
which will most likely lead to another exception/signal.
|
||||
Technically, I feel this is the correct behavior but it requires
|
||||
saving a lot of state on the user's stack, state that we don't
|
||||
really want the user messing with. Other 68k implementations
|
||||
(e.g. Sun) will delay signals or abort execution of the current
|
||||
instruction to reduce saved state. Even if we stick with the
|
||||
current philosophy, the code could be cleaned up.
|
||||
|
||||
4. Ditto for AST and software interrupt emulation.
|
||||
Both are possibly over-elaborate and inefficiently implemented.
|
||||
We could possibly handle them by using an appropriately planted
|
||||
PS trace bit.
|
||||
|
||||
5. Make use of transparent translation registers on 030/040 MMU.
|
||||
With a little rearranging of the KVA space we could use one to
|
||||
map the entire external IO space [ 600000 - 20000000 ). Since
|
||||
the translation must be 1-1, this would limit the kernel to 6mb
|
||||
(some would say that is hardly a limit) or divide it into two
|
||||
pieces. Another promising use would be to map physical memory
|
||||
within the kernel. This allows a much simpler and more efficient
|
||||
implementation of /dev/mem, pmap_zero_page, pmap_copy_page and
|
||||
possibly even kernel-user cross address space copies. However,
|
||||
it does eat up a significant piece of kernel address space.
|
||||
|
||||
6. Create a 32-bit timer.
|
||||
Timers 2 and 3 on the MC6840 clock chip can be concatenated together to
|
||||
get a 32-bit countdown timer. There are at least three uses for this:
|
||||
1. Monitoring the interval timer ("clock") to detect lost "ticks".
|
||||
(Idea from Scott Marovich)
|
||||
2. Implement the DELAY macro properly instead of approximating with
|
||||
the current "while (--count);" loop. Because of caches, the current
|
||||
method is potentially way off.
|
||||
3. Export as a user-mappable timer for high-precision (4us) timing.
|
||||
Note that by doing this we can no longer use timer 3 as a separate
|
||||
statistics/profiling timer. Should be able to compile-time (runtime?)
|
||||
select between the two.
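
A sketch of what using such a timer could look like in C. Which of timers 2
and 3 supplies the high half, and how the chip is actually read, are left
open; read_timer is a hypothetical accessor and all names are invented.

```c
#include <stdint.h>

/* Combine two 16-bit countdown readings into one 32-bit count. */
static uint32_t
timer32(uint16_t hi, uint16_t lo)
{
	return ((uint32_t)hi << 16) | lo;
}

/* Ticks elapsed between two readings of a countdown timer; unsigned
 * subtraction handles a single wrap through zero. */
static uint32_t
ticks_elapsed(uint32_t earlier, uint32_t later)
{
	return earlier - later;
}

/* A DELAY expressed in timer ticks rather than loop iterations, so the
 * result does not depend on cache behavior. */
static void
delay_ticks(uint32_t (*read_timer)(void), uint32_t ticks)
{
	uint32_t start = read_timer();

	while (ticks_elapsed(start, read_timer()) < ticks)
		continue;
}
```

The same ticks_elapsed() comparison against the expected clock interval
would serve for detecting lost ticks.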
|
||||
|
||||
7. Conditional MMU code should be restructured.
|
||||
Right now it reflects the evolutionary path of the code: 320/350 MMU
|
||||
was supported and PMMU support was glued on. The latter can be ifdef'ed
|
||||
out when not needed, but not all of the former (e.g. ``mmutype'' tests).
|
||||
Also, PMMU is made to look like the HP MMU, somewhat hamstringing it.
|
||||
Since HP MMU models are dead, the excess baggage should be on their side (though
|
||||
it could be argued that they benefit more from the minor performance
|
||||
impact). MMU code should probably not be ifdef'ed on model type, but
|
||||
rather on more relevant tags (e.g. MMU_HP, MMU_MOTO).
|
||||
|
||||
8. Redo cache handling.
|
||||
There are way too many routines which are specific to particular
|
||||
cache types. We should be able to come up with a more coherent
|
||||
scheme (though HP 68k boxes have just about every caching scheme
|
||||
imaginable: internal/external, physical/virtual, writeback/writethrough).
|
||||
See, for example, Wheeler and Bershad in ASPLOS 92.
|
||||
|
||||
9. Sort the free page list.
|
||||
The DMA hardware on the 300 cannot do scatter/gather IO. For example,
|
||||
if an 8k system buffer consists of two non-contiguous physical pages
|
||||
it will require two DMA transfers (and hence two interrupts) to do the
|
||||
operation. It would take only one transfer if they were physically
|
||||
contiguous. By keeping the free list ordered we could potentially
|
||||
allocate contiguous pages and reduce the number of interrupts. We can
|
||||
consider doing this since pages in the free list are not reclaimed and
|
||||
thus we don't have to worry about distorting any LRU behavior.
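
The cost being described is easy to state in code: without scatter/gather,
a buffer needs one DMA transfer per run of physically contiguous pages. A
small C sketch (dma_transfers is an invented helper, not a kernel routine):

```c
#include <stddef.h>
#include <stdint.h>

/* Number of DMA transfers needed for a buffer whose pages have the
 * given physical page frame numbers: one per contiguous run. */
static size_t
dma_transfers(const uint32_t *pfn, size_t npages)
{
	size_t runs = 0;
	size_t i;

	for (i = 0; i < npages; i++) {
		if (i == 0 || pfn[i] != pfn[i - 1] + 1)
			runs++;
	}
	return runs;
}
```

An 8k buffer on frames 0xFF880 and 0xFF900 counts as two transfers; on
0xFF880 and 0xFF881 it counts as one.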
|
||||
----
|
||||
Mike Hibler
|
||||
University of Utah CSS group
|
||||
mike@cs.utah.edu
|