Kernel doesn't use it, and it could be regenerated in the kernel if it did need it.
This also unlocks the apic range the bios can use. Previously the apic ids would have
to fit within 0..MAX_CPUS or it'd reject the cpu. Some boxes (mine in particular)
seem to sparsely populate the apic id so that the range is pretty large.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@37108 a95241bf-73f2-0310-859d-f6bbb57e9c96
was used.
* Renamed X86VMTranslationMap to X86VMTranslationMap32Bit and pulled the paging
method agnostic part into new base class X86VMTranslationMap.
* Moved X86PagingStructures into its own header/source pair.
* Moved pgdir_virt from X86PagingStructures to X86PagingStructures32Bit where
it is actually used.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@37055 a95241bf-73f2-0310-859d-f6bbb57e9c96
* Renamed i386_context_switch() to x86_context_switch().
* x86_context_switch() no longer sets the page directory.
arch_thread_context_switch() does that explicitly, now. This allows to solve
the TODO by reordering releasing the previous paging structures reference and
setting the new page directory.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@37024 a95241bf-73f2-0310-859d-f6bbb57e9c96
* Renamed vm_translation_map_arch_info to X86PagingStructures, and all
members and local variables of that type accordingly.
* arch_thread_context_switch(): Added TODO: The still active paging structures
can indeed be deleted before we stop using them.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@37022 a95241bf-73f2-0310-859d-f6bbb57e9c96
where appropriate.
* Typedef'ed page_num_t to phys_addr_t and used it in more places in
vm_page.{h,cpp}.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@36937 a95241bf-73f2-0310-859d-f6bbb57e9c96
interrupts (MSI).
* Add the remaining IDT entries and redirection functions in the interrupt code.
* Make the PIC end_of_interrupt() return a result to indicate whether the vector
was handled by this PIC. If it isn't we now issue a apic_end_of_interrupt()
in the assumption of apic local interrupt, MSI or IPI. This also removes
the need for the gUsingIOAPIC global and doing manual apic_end_of_interrupt()
calls in the SMP and timer code.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@36221 a95241bf-73f2-0310-859d-f6bbb57e9c96
other places where previously the same functionality was duplicated. Also
seperated the header which was originally arch_smp.h into apic.h and arch_smp.h
again as some of it is MP and not actually APIC.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@36182 a95241bf-73f2-0310-859d-f6bbb57e9c96
Added fields for temporary storage of the debug registers dr6 and dr7 to the
arch_cpu_info structure. The actual registers are stored at the beginning of
x86_exit_user_debug_at_kernel_entry() and read in
x86_handle_debug_exception().
The problem was that x86_exit_user_debug_at_kernel_entry() itself overwrote
dr7 and, if kernel breakpoints were enabled, dr6 could be overwritten anytime
after. So x86_handle_debug_exception() would find incorrect values in the
registers (definitely in dr7) and thus interpret the detected debug condition
incorrectly. Usually watchpoints were recognized as breakpoints.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@35951 a95241bf-73f2-0310-859d-f6bbb57e9c96
arch_debug_registers instead.
* Call arch_debug_save_registers() on all CPUs when entering the kernel
debugger.
* Added debug_get_debug_registers() to return a specified CPU's saved
registers.
* x86:
- Replaced the previous arch_debug_save_registers() implementation. Disabled
getting the registers via the gdb interface for the time being.
- Fixed the "sc", "call", and "calling" commands to also work for threads
running on another CPU.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@35907 a95241bf-73f2-0310-859d-f6bbb57e9c96
* Made the page table allocation more flexible. Got rid of sMaxVirtualAddress
and added new virtual_end address to the architecture specific kernel args.
* Increased the virtual space we reserve for the kernel to 16 MB. That
should suffice for quite a while. The previous 2 MB were too tight when
building the kernel with debug info.
* mmu_init(): The way we were translating the BIOS' extended memory map to
our physical ranges arrays was broken. Small gaps between usable memory
ranges would be ignored and instead marked allocated. This worked fine for
the boot loader and during the early kernel initialization, but after the
VM has been fully set up it frees all physical ranges that have not been
claimed otherwise. So those ranges could be entered into the free pages
list and would be used later. This could possibly cause all kinds of weird
problems, probably including ACPI issues. Now we add only the actually
usable ranges to our list.
Kernel:
* vm_page_init(): The pages of the ranges between the usable physical memory
ranges are now marked PAGE_STATE_UNUSED, the allocated ranges
PAGE_STATE_WIRED.
* unmap_and_free_physical_pages(): Don't free pages marked as unused.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@35726 a95241bf-73f2-0310-859d-f6bbb57e9c96
needs to be or'ed to the address specification), "uncached" is assumed.
* Set the memory type for the "BIOS" and "DMA" areas to write-back. Not sure, if
that's correct, but that's what was effectively used on my machines before.
* Changed x86_set_mtrrs() and the CPU module hook to also set the default memory
type.
* Rewrote the MTRR computation once more:
- Now we know all used memory ranges, so we are free to extend used ranges
into unused ones in order to simplify them for MTRR setup.
- Leverage the subtractive properties of uncached and write-through ranges to
simplify ranges of any other respectively write-back type.
- Set the default memory type to write-back, so we don't need MTRRs for the
RAM ranges.
- If a new range intersects with an existing one, we no longer just fail.
Instead we use the strictest requirements implied by the ranges. This fixes
#5383.
Overall the new algorithm should be sufficient with far less MTRRs than before
(on my desktop machine 4 are used at maximum, while 8 didn't quite suffice
before). A drawback of the current implementation is that it doesn't deal with
the case of running out of MTRRs at all, which might result in some ranges
having weaker caching/memory ordering properties than requested.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@35515 a95241bf-73f2-0310-859d-f6bbb57e9c96
* Pulled the physical page mapping functions out of vm_translation_map into
a new interface VMPhysicalPageMapper.
* Renamed vm_translation_map to VMTranslationMap and made it a proper C++
class. The functions in the operations vector have become methods.
* Added class GenericVMPhysicalPageMapper implementing VMPhysicalPageMapper
as far as possible (without actually writing new code).
* Adjusted the x86 and the PPC specifics accordingly (untested for the
latter). For the other architectures the build is, I'm afraid, seriously
broken.
The next steps will modify and extend the VMTranslationMap interface, so that
it will be possible to fix the bugs in vm_unmap_page[s]() and employ
architecture specific optimizations.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@35066 a95241bf-73f2-0310-859d-f6bbb57e9c96
system_time_nsecs(), returning the system time in nanoseconds. The function
is only really implemented for x86. For the other architectures
system_time() * 1000 is returned.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@34543 a95241bf-73f2-0310-859d-f6bbb57e9c96
all MTRRs at once.
* Added a respective x86_set_mtrrs() kernel function.
* x86 CPU module:
- Implemented the new hook.
- Prefixed most debug output with the CPU index. Otherwise it gets quite
confusing with multiple CPUs.
- generic_init_mtrrs(): No longer clear all MTRRs, if they are already
enabled. This lets us benefit from the BIOS's setup until we install our
own -- otherwise with caching disabled things are *really* slow.
* arch_vm.cpp: Completely rewrote the MTRR handling as the old one was not
only slow (O(2^n)), but also broken (resulting in incorrect setups (e.g.
with cachable ranges larger than requested)), and not working by design for
certain cases (subtractive setups intersecting ranges added later).
Now we maintain an array with the successfully set ranges. When a new range
is added, we recompute the complete MTRR setup as we need to. The new
algorithm analyzing the ranges has linear complexity and also handles range
base addresses with an alignment not matching the range size (e.g. a range
at address 0x1000 with size 0x2000) and joining of adjacent/overlapping
ranges of the same type.
This fixes the slow graphics on my 4 GB machine (though unfortunately the
8 MTRRs aren't enough to fully cover the complete frame buffer (about 35
pixel lines remain uncachable), but that can't be helped without rounding up
the frame buffer size, for which we don't have enough information). It might
also fix#1823.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@34197 a95241bf-73f2-0310-859d-f6bbb57e9c96
aren't routed correctly over the 8259, it seems.
- Removed passing the hpet_regs around, since there's a static variable.
- Added lots of debug dprintfs.
- Fixed setting the timer interrupt to edge
- Timer is initialized once.
- Use the timer 0 instead of 2.
- Renamed register definitions to be more readable
- Use 64 bits registers and unions where applicable.
- Other things I don't remember
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@33345 a95241bf-73f2-0310-859d-f6bbb57e9c96
Also shortened some defines using "TN" instead of "TIMER". It's also
the same scheme used in the specs
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@33334 a95241bf-73f2-0310-859d-f6bbb57e9c96
ROUNDUP to use '*' and '/' -- the compiler will optimize that for powers of
two anyway and this implementation works for other numbers as well.
* The thread::fault_handler use in C[++] code was broken with gcc 4. At least
when other functions were invoked. Trying to trick the compiler wasn't a
particularly good idea anyway, since the next compiler version could break
the trick again. So the general policy is to use the fault handlers only in
assembly code where we have full control. Changed that for x86 (save for the
vm86 mode, which has a similar mechanism), but not for the other
architectures.
* Introduced fault_handler, fault_handler_stack_pointer, and fault_jump_buffer
fields in the cpu_ent structure, which must be used instead of
thread::fault_handler in the kernel debugger. Consequently user_memcpy() must
not be used in the kernel debugger either. Introduced a debug_memcpy()
instead.
* Introduced debug_call_with_fault_handler() function which calls a function
in a setjmp() and fault handler context. The architecture specific backend
arch_debug_call_with_fault_handler() has only been implemented for x86 yet.
* Introduced debug_is_kernel_memory_accessible() for use in the kernel
debugger. It determines whether a range of memory can be accessed in the
way specified. The architecture specific back end
arch_vm_translation_map_is_kernel_page_accessible() has only been implemented
for x86 yet.
* Added arch_debug_unset_current_thread() (only implemented for x86) to unset
the current thread pointer in the kernel debugger. When entering the kernel
debugger we do some basic sanity checks of the currently set thread structure
and unset it, if they fail. This allows certain commands (most importantly
the stack trace command) to avoid accessing the thread structure.
* x86: When handling a double fault, we do now install a special handler for
page faults. This allows us to gracefully catch faulting commands, even if
e.g. the thread structure is toast.
We are now in much better shape to deal with double faults. Hopefully avoiding
the triple faults that some people have been experiencing on their hardware
and ideally even allowing to use the kernel debugger normally.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@32073 a95241bf-73f2-0310-859d-f6bbb57e9c96
* SMP:
- Added smp_send_broadcast_ici_interrupts_disabled(), which is basically
equivalent to smp_send_broadcast_ici(), but is only called with interrupts
disabled and gets the CPU index, so it doesn't have to use
smp_get_current_cpu() (which dereferences the current thread).
- Added cpu index parameter to smp_intercpu_int_handler().
* x86:
- arch_int.c -> arch_int.cpp
- Set up an IDT per CPU. We were using a single IDT for all CPUs, but that
can't work, since we need different tasks for the double fault interrupt
vector.
- Set the per CPU double fault task gates correctly.
- Renamed set_intr_gate() to set_interrupt_gate and set_system_gate() to
set_trap_gate() and documented them a bit.
- Renamed double_fault_exception() x86_double_fault_exception() and fixed
it not to use smp_get_current_cpu(). Instead we have the new
x86_double_fault_get_cpu() that deducts the CPU index from the used stack.
- Fixed the double_fault interrupt handler: It no longer calls int_bottom to
avoid accessing the current thread.
* debug.cpp:
- Introduced explicit debug_double_fault() to enter the kernel debugger from
a double fault handler.
- Avoid using smp_get_current_cpu().
- Don't use kprintf() before sDebuggerOnCPU is set. Otherwise
acquire_spinlock() is invoked by arch_debug_serial_puts().
Things look a bit better when the current thread pointer is broken -- we run
into kernel_debugger_loop() and successfully print the "Welcome to KDL"
message -- but we still dereference the thread pointer afterwards, so that we
don't get a usable kernel debugger yet.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@32050 a95241bf-73f2-0310-859d-f6bbb57e9c96
* Added x86_double_fault_get_cpu(), a save way to get the CPU index when in
the double fault handler. smp_get_current_cpu() requires at least a somewhat
intact thread structure, so we rather want to avoid it when handling a double
fault. There are a lot more of those dependencies in the KDL entry code.
Working on it...
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@32028 a95241bf-73f2-0310-859d-f6bbb57e9c96
* The bulk of the work -- i.e. juggling the software and hardware breakpoints,
watchpoints, and memory reads/writes -- is done in the new class
BreakpointManager.
* For the architectures a few capability macros have to be defined, one
pointing to the software breakpoint instruction opcode. Done for x86.
* Some more simplifications in the user debugger code, made possible by the
recently introduced debugger_changed_condition attribute.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@31214 a95241bf-73f2-0310-859d-f6bbb57e9c96
* Generalized address checks. The debugger can now also read the commpage.
* Added new syscall _kern_get_thread_cpu_state() to get the CPU state of a
not running thread. Introduced arch_get_thread_debug_cpu_state() for that
purpose, which is only implemented for x86 ATM (uses the new
i386_get_thread_user_iframe()).
* Don't allow a debugger to change a thread's "esp" anymore. That's the esp
register in the kernel. "user_esp" can still be changed.
* Generally set RF (resume flag) in eflags in interrupt handlers, not only
after a instruction breakpoint debug exception. This should prevent
breakpoints from being triggered more than once (e.g. when the breakpoint is
on an instruction that can cause a page fault). I still saw those with bdb
in VMware, but that might be a VMware bug.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@31045 a95241bf-73f2-0310-859d-f6bbb57e9c96
gcc could apparently assume that the register assigned to the one in the
clobber list would keep its value (as can be observed when disassembling
add_debugger_command_etc()).
Using a dummy output register works around the problem and also avoids the
unnecessary initialization of the register.
Comments explaining the mystery welcome.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@30909 a95241bf-73f2-0310-859d-f6bbb57e9c96
will return consistent values. This helps with debug measurements for the time
being. Obviously we'll have to think of something different when we support
speed-stepping on models with frequency-dependent TSCs.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@30287 a95241bf-73f2-0310-859d-f6bbb57e9c96
This is not necessary, since userland teams' page directories also
contain the kernel mappings, and avoids unnecessary TLB flushes. To make
that possible the vm_translation_map_arch_info objects are reference
counted now.
This optimization reduces the kernel time of the Haiku build on my
machine with SMP disabled a few percent, but interestingly the total
time decreases only marginally. Haven't tested with SMP yet, but for
full impact CPU affinity would be needed.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@28287 a95241bf-73f2-0310-859d-f6bbb57e9c96
added vm_memcpy_from_physical() and vm_memcpy_physical_page(), and
added respective functions to the vm_translation_map operations. The
architecture specific implementation can now decide how to implement
them most efficiently. Added generic implementations that can be used,
though.
* Changed vm_{get,put}_physical_page(). The former no longer accepts
flags (the only flag PHYSICAL_PAGE_DONT_WAIT wasn't needed anymore).
Instead it returns an implementation-specific handle that has to be
passed to the latter. Added vm_{get,put}_physical_page_current_cpu()
and *_debug() variants, that work only for the current CPU,
respectively when in the kernel debugger. Also adjusted the
vm_translation_map operations accordingly.
* Made consequent use of the physical memory operations in the source
tree.
* Also adjusted the m68k and ppc implementations with respect to the
vm_translation_map operation changes, but they are probably broken,
nevertheless.
* For x86 the generic physical page mapper isn't used anymore. It is
suboptimal in any case. For systems with small memory it is too much
overhead, since one can just map the complete physical memory (that's
not done yet, though). For systems with large memory it counteracts
the VM strategy to reuse the least recently used pages. Since those
pages will most likely not be mapped by the page mapper anymore, it
will keep remapping chunks. This was also the reason why building
Haiku in Haiku was significantly faster with only 256 MB RAM (since
that much could be kept mapped all the time).
Now we're using a different strategy: We have small pools of virtual
page slots per CPU that are used for the physical page operations
(memset_physical(), memcpy_*_physical()) with CPU-pinned thread.
Furthermore we have four slots per translation map, which are used to
map page tables.
These changes speed up the Haiku image build in Haiku significantly. On
my Core2 Duo 2.2 GHz 2 GB machine about 40% to 20 min 40 s (KDEBUG
disabled, block cache debug disabled). Still more than factor 3 slower
than FreeBSD and Linux, though.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@28244 a95241bf-73f2-0310-859d-f6bbb57e9c96
* memset() is now available through the commpage.
* CPU modules can provide a model-optimized memset().
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@27952 a95241bf-73f2-0310-859d-f6bbb57e9c96
arch_vm_aspace_swap().
* The x86 implementation does now maintain a bit mask per
vm_translation_map_arch_info indicating on which CPUs the address
space is active. This allows flush_tmap() to avoid ICI for user
address spaces when the team isn't currently running on any other CPU.
In this context ICI is relatively expensive, particularly since we map
most pages via vm_map_page() and therefore invoke flush_tmap() pretty
much for every single page.
This optimization speeds up a "hello world" compilation about 20% on
my machine (KDEBUG turned off, freshly booted), but interestingly it
has virtually no effect on the "-j2" haiku build time.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@27912 a95241bf-73f2-0310-859d-f6bbb57e9c96
be used now. Tested only with VMware so far.
* apm_shutdown() is now called with interrupts turned on.
* Renamed arch_cpu.c to arch_cpu.cpp.
* Minor cleanup.
git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@27404 a95241bf-73f2-0310-859d-f6bbb57e9c96