Commit Graph

492 Commits

Author SHA1 Message Date
Paweł Dziepak
ac97d35790 kernel/arch: remove leftover debug message
Polite fault handlers are nice, but we like the silent ones even more.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-25 21:57:32 +02:00
Paweł Dziepak
95e97463d2 kernel: add generic wrapper for accessing user memory
This patch adds user_access() which can be used to gracefully handle
page faults that may happen when accessing user memory. It is used
by arch_cpu_user{memcpy, memset, strlcpy}() to allow using optimized
functions from the standard library.

Currently only x64 uses this, but nothing really is arch specific here.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 22:39:07 +02:00
Paweł Dziepak
396b74228e kernel/x86_64: save fpu state at interrupts
The kernel is allowed to use fpu anywhere so we must make sure that
user state is not clobbered by saving fpu state at interrupt entry.
There is no need to do that in case of system calls since all fpu
data registers are caller saved.

We do not need, though, to save the whole fpu state at task swich
(again, thanks to calling convention). Only status and control
registers are preserved. This patch actually adds xmm0-15 register
to clobber list of task swich code, but the only reason of that is
to make sure that nothing bad happens inside the function that
executes that task swich. Inspection of the generated code shows
that no xmm registers are actually saved.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:52 +02:00
Paweł Dziepak
b41f281071 boot/x86_64: enable sse early
Enable SSE as a part of the "preparation of the environment to run any
C or C++ code" in the entry points of stage2 bootloader.

SSE2 is going to be used by memset() and memcpy().

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:52 +02:00
Paweł Dziepak
6156a508ad kernel/x86[_64]: remove get_optimized_functions from cpu modules
The possibility to specify custom memcpy and memset implementations
in cpu modules is currently unused and there is generally no point
in such feature.

There are only 2 x86 vendors that really matter and there isn't
very big difference in performance of the generic optmized versions
of these funcions across different models. Even if we wanted different
versions of memset and memcpy depending on the processor model or
features much better solution would be to use STT_GNU_IFUNC and save
one indirect call.

Long story short, we don't really benefit in any way from
get_optimized_functions and the feature it implements and it only adds
unnecessary complexity to the code.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:51 +02:00
Ithamar R. Adema
eea45d0a32 ARM: cleanup of bootloader memory mapping
* Removes default mapping of a portion of the RAM (will be done
  as needed)
* Passes on the page directory area to kernel, so on early vm init
  the kernel can use the area for pagetable allocation.
* Leaves it to the platform to pass in physical memory range(s). This
  will ultimately come from FDT.
* Fix long standing issue with allocation of the heap, potentially
  causing other part of the bootloader to overwrite the heap.
* Implements pagetable allocator in kernel for early vm mapping.

This fixes the first PANIC seen, we now just get the same one later
on when the VM is up... more to come...
2014-09-07 20:56:15 +02:00
Ithamar R. Adema
6048591e9d Revert "Added check to ensure KDL does not include frames beyond kernel entry in the backtrace. This prevents KDL from faulting when printing backtrace on ARM."
This reverts commit 3fbb24680c.

As I mentioned in #11131, this fix is not correct, and works around
the problem. The real reason was that arch_debug_call_with_fault_handler
was not working properly, so the fault handler went crazy.

With commit eb92810 that is fixed so this can be reverted.
2014-09-07 19:15:01 +02:00
PulkoMandy
83f5e2a258 Fix stack alignment for bootloader.
The ARM SP is pointing to the top item of the stack, not the first free
byte. This was confusing dprintf making it fail to print 64bit integers.
2014-09-02 17:01:27 +02:00
Arvind S Raj
3fbb24680c Added check to ensure KDL does not include frames beyond kernel entry in the backtrace. This prevents KDL from faulting when printing backtrace on ARM. 2014-09-02 13:39:57 +02:00
Paweł Dziepak
4b75a1e237 kernel/x86_64: implement x86_swap_pgdir in C++
No reason not to inline this function.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-08-25 23:07:29 +02:00
Paweł Dziepak
2e2c9bd3d0 os/support: implement atomic_*() using GCC builtin helpers
If GCC knows what these functions are actually doing the resulting
code can be optimized better what is especially noticeable in case of
invocations of atomic_{or,and}() that ignore the result. Obviously,
everything is inlined what also improves performance.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-08-25 23:05:07 +02:00
Arvind S Raj
82d287ddcb Reserve 8MB space for kernel before RAM_loader
...so that kernel does not overwrite the loader.

Signed-off-by: Adrien Destugues <pulkomandy@pulkomandy.tk>

Fixes #11067.
2014-08-08 17:39:33 +02:00
François Revol
c8826605df Guard header for use by assembler code
Somehow it ends up being used by shell.S for the verdex bootstrap.
2014-07-23 13:27:23 +02:00
PulkoMandy
4a2260f21a Let the bootloader know about ARMv7.
When an ARMv7 CPU is detected, immediately turn on the FPU. This allows
us to use vsnprintf in the TRACE call in that function, as our libc is
compiled with floating point support and will trigger a fault if the FPU
is not available.

This lets the boot go further, and crash in mmu_init. Next steps:
* Find why mmu_init is crashing
* Setup some fault handlers, otherwise we call uboot ones, and they are
not very helpful. They will also probably not work once the mmu is
enabledvery helpful. They will also probably not work once the mmu is
enabledvery helpful. They will also probably not work once the mmu is
enabled...
2014-06-13 22:15:54 +02:00
Pawel Dziepak
76636769bd kernel/x86_64: inline x86_{read, write}_msr()
This patch makes it possible to inline rdmsr and wrmsr instruction. The
performance impact shouldn't be significant since they are used relatively
rarely and wrmsr is usually a serializing instruction, but there is no reason
not to do so.
2014-05-06 21:41:49 +02:00
Pawel Dziepak
88e8e24c84 kernel/x86_64: improve context switch implementation
The goal of this patch is to amortize the cost of context switch by making
the compiler aware that context switch clobbers all registers. Because all
register need to be saved anyway there is no additional cost of using
callee saved register in the function that does the context switch.
2014-05-06 21:15:55 +02:00
Pawel Dziepak
9db5b975f9 kernel/x86_64: rework of IDT handling code
Similarly to previous patch regarding GDT this is mostly a rewrite of
IDT handling code from C to C++. Thanks to constexpr IDT is now entirely
generated at compile-time.
2014-05-06 14:59:54 +02:00
Pawel Dziepak
cd59bf4349 kernel/x86_64: x86_64 gdt handling code overhaul
Virtually no functional change, just rewriting the code from
"C in *.cpp files" to C++. Use of constexpr may be advantageous but
that code is not performance critical anyway.
2014-05-06 14:59:53 +02:00
Alexander von Gluck IV
d035469704 ARM: Name beagleboneblk back to beagle
* Pulkomandy pointed out that all Beagle hardware is
  very similar so we could likely get away with a single
  ARM target board.
2014-02-26 15:33:40 -06:00
Alexander von Gluck IV
8cfbbff4df ARM: Fix dmb opcode 2 on ARMv6
* Typo, also fix tabs
* Sorry for the spam, this should be correct now
2014-02-26 13:22:18 -06:00
Alexander von Gluck IV
b6994f96c0 ARM: Break apart ARMv5 and older dsb/dmb
* Add isb just because.
* pdziepak pointed out that ARMv5 and before
  had different barrier support.
* pdziepak also mentioned that dsb was too strong
  for __sync_synchronize
2014-02-26 13:17:21 -06:00
Alexander von Gluck IV
a21611e439 ARM: Add ARMv6 or older __sync_synchronize built-in
* On ARMv6 or older, we do a simulated dsb.
* Move __sync_synchronize into thread.c in libroot
  and use the new arch_atomic.h dsb/dmb defines.
* Gets arm @bootstrap-raw to end of bootstrap.
2014-02-26 12:51:51 -06:00
Alexander von Gluck IV
6d3363214f ARM: Simplify board specification
* Don't assume verdex as it isn't clear this was
  occurring.
* Make an educated guess on HAIKU_BOOT_PLATFORM
  based on provided board (but still allow it to
  be overridden)
* Error out if user doesn't populate
  HAIKU_BOOT_PLATFORM or enters an unknown board
  name.
* You need to add "-sHAIKU_BOOT_BOARD=xxx" to
  your jam to build for the proper ARM device.
* Rename beagle to beagleboneblk as per the
  documentation.
2014-02-26 00:27:18 -06:00
Ithamar R. Adema
5cef6be21f arm/atomic: fixup arch_atomic.h
* Remove _inline functions, since we're not using inlines
* Use compiler barrier instead of GCC builtin
2014-02-15 11:46:11 +01:00
Alexander von Gluck IV
35171b073d arm: Miscellaneous build fixes
* Use atomic_get_and_set for return value
* Atomics are no longer volatile
* Add missing arch_cpu_pause stub
* Move arch_cpu_idle to arch_cpu header to match
  other architectures
2014-02-12 23:37:15 -06:00
Alexander von Gluck IV
8018e8fa91 arm: Rework hrev46863 to use gcc built-in
* Those calls were indeed v7+ only, and our toolchain
  is v6.
2014-02-12 23:34:08 -06:00
Alexander von Gluck IV
92b2e03d0d arm: Add initial memory barrier functions
* These likely need reviewed by someone better
  at arm assembly. (#10537)
2014-02-12 23:11:11 -06:00
Pawel Dziepak
527da4ca8a x86[_64]: Separate bootloader and kernel GDT and IDT logic
From now on bootloader sets up its own minimal valid GDT and IDT. Then
the kernel replaces them with its own tables.
2014-01-28 00:44:02 +01:00
Pawel Dziepak
e1720098c6 kernel: No need for arch specific ifdefs in arch/atomic.h 2014-01-20 04:09:17 +01:00
Alexander von Gluck IV
c9e66bfc9b kernel: Add missing smp memory barrier calls. Set max cpu to 1
* Set max cpu to 1 for PPC until atomic functions are finished
* We have atomic functions inline in the kernel and assembly
  code in libroot post-scheduler merge... isn't that a lot of
  duplication?
2014-01-19 19:33:21 -06:00
Alexander von Gluck IV
fb8026e82b kernel: Add missing PPC CPU functions for idle / pause 2014-01-19 14:38:01 -06:00
Alexander von Gluck IV
524bea3553 kernel: fix missing cpu cache defines non-x86
* Regression introduced due to scheduler change
* Other other non-x86, non-ppc, and non-arm platforms
  need evalulated for this metric
2014-01-19 14:27:09 -06:00
Alexander von Gluck IV
6647d2c95a kernel: fix missing SMP_MAX_CPUS on non-x86
* Regression introduced due to scheduler change
* Drop MAX_BOOT_CPUS as it is no longer used
2014-01-19 14:09:51 -06:00
Pawel Dziepak
8cf8e53774 kernel/x86: Inline atomic functions and memory barriers 2014-01-06 09:08:53 +01:00
Pawel Dziepak
b89b5d3826 x86: Make arch/smp.h a C++ only header 2013-12-20 22:05:26 +01:00
Pawel Dziepak
6639514437 x86[_64]: Support assigning MSI IRQs to arbitrary CPU 2013-12-20 01:07:08 +01:00
Pawel Dziepak
e3d001ff02 x86: Implement multicast ICIs 2013-12-19 19:35:44 +01:00
Pawel Dziepak
735f67481f x86: Debugger can now use dr3 2013-12-17 04:31:29 +01:00
Pawel Dziepak
a5b070f1fa x86: Store pointer to the current thread in gs:0
Apparently, reading from dr3 is slower than reading from memory
with cache hit.

Also, depending on hypervisor configuration, accessing dr3 may cause
a VM exit (and, at least on kvm, it does), what makes it much slower
than a memory access even when there is a cache miss.
2013-12-17 04:08:51 +01:00
Pawel Dziepak
611376fef7 x86: Let each CPU have its own GDT 2013-12-17 03:57:20 +01:00
Pawel Dziepak
1bc7045fdf kernel, libroot: Introduce new API for obtaining system info 2013-12-16 03:58:43 +01:00
Pawel Dziepak
3e0e3be760 boot, kernel: Replace MAX_BOOT_CPUS with SMP_MAX_CPUS 2013-12-06 19:43:08 +01:00
Pawel Dziepak
7629d527c5 kernel: Use CPUSet in ICI code instead of cpu_mask_t 2013-12-06 03:08:39 +01:00
Pawel Dziepak
3514fd77f7 kernel: Reduce lock contention when processing ICIs 2013-11-29 03:36:44 +01:00
Pawel Dziepak
7db89e8dc3 kernel: Rework cpuidle module
* Create new interface for cpuidle modules (similar to the cpufreq
   interface)
 * Generic cpuidle module is no longer needed
 * Fix and update Intel C-State module
2013-11-25 23:50:27 +01:00
Pawel Dziepak
0e94a12f8e kernel: Make CACHE_LINE_ALIGN visible in the whole kernel 2013-11-25 00:35:15 +01:00
Pawel Dziepak
d897a478d7 kernel: Allow reassigning IRQs to logical processors 2013-11-18 04:55:25 +01:00
Pawel Dziepak
9c0ff0eed1 kernel: Add cpufreq module for Intel P-states
Since Sandy Bridge managing P-states on Intel processors is much easier
and more powerful than when using previous versions of EIST.
2013-10-30 00:55:03 +01:00
Pawel Dziepak
36cc64a9b3 x86[_64]: Add CPU cache topology detection for AMD and Intel CPUs 2013-10-02 23:48:03 +02:00
Pawel Dziepak
4110b730db x86[_64]: Add support for CPUID sub-leaves
Some CPUID leaves may contain one or more sub-leaves accessed by setting
ECX to an appropriate value.
2013-10-01 20:31:18 +02:00