This patch introduces randomization of commpage position. From now on commpage
table contains offsets from begining to of the commpage to the particular
commpage entry. Similary addresses of symbols in ELF memory image "commpage"
are just offsets from the begining of the commpage.
This patch also updates KDL so that commpage entries are recognized and shown
correctly in stack trace. An update of Debugger is yet to be done.
Set execute disable bit for any page that belongs to area with neither
B_EXECUTE_AREA nor B_KERNEL_EXECUTE_AREA set.
In order to take advanage of NX bit in 32 bit protected mode PAE must be
enabled. Thus, from now on it is also enabled when the CPU supports NX bit.
vm_page_fault() takes additional argument which indicates whether page fault
was caused by an illegal instruction fetch.
x86_userspace_thread_exit() is a stub originally placed at the bottom of
each thread user stack that ensures any thread invokes exit_thread() upon
returning from its main higher level function.
Putting anything that is expected to be executed on a stack causes problems
when implementing data execution prevention. Code of x86_userspace_thread_exit()
is now moved to commpage which seems to be much more appropriate place for it.
When mmap() is invoked without specifying address hint B_RANDOMIZED_ANY_ADDRESS
is used.
Otherwise, unless MAP_FIXED flag is set (which requires mmap() to return an area
positioned exactly at given address), B_RANDOMIZED_BASE_ADDRESS is used.
When forking a process team user data area is not cloned but a new one is
created instead. However, the new one has to be at exactly the same address
parent's team user data area is. When process is exec then team user data
area may be recreated at random position.
This patch also make sure that instances of struct user_thread in team user
data are each in separate cache line in order to prevent false sharing since
these data are very likely to be accessed simultaneously from threads executing
on different CPUs. This change however reduces the number of threads process
can create. It is fixed by reserving 512kB of address space in case team user
data area needs to grow.
Randomized equivalent of B_ANY_ADDRESS. When a free space is found (as in
B_ANY_ADDRESS) the base adress is then randomized using _RandomizeAddress
pretty much like it is done in B_RANDOMIZED_BASE_ADDRESS.
B_RAND_BASE_ADDRESS is basically B_BASE_ADDRESS with non-deterministic created
area's base address.
Initial start address is randomized and then the algorithm looks for a large
enough free space in the interval [randomized start, end]. If it fails then
the search is repeated in the interval [original start, randomized start]. In
case it also fails the algorithm falls back to B_ANY_ADDRESS
(B_RANDOMIZED_ANY_ADDRESS when it is implemented) just like B_BASE_ADDRESS does.
Randomization range is limited by kMaxRandomize and kMaxInitialRandomize.
Inside the page randomization of initial user stack pointer is not only a part
of ASLR implementation but also a performance improvement that helps
eliminating aligned 64 kB data access.
Minimal user stack size is increased to 8 kB in order to ensure that regardless
of initial stack pointer value there is still enough space on stack.
The physical memory map area was not included in the kernel virtual
address space range (it was below KERNEL_BASE). This caused problems
if an I/O operation took place on physical memory mapped there (the
bad address error seen in #9547 was occurring in lock_memory_etc()).
Changed KERNEL_BASE and KERNEL_SIZE to cover the area and add a null
area that covers all of it. Also changed X86VMTranslationMap64Bit to
handle large pages in Query(), as the physical map area uses large
pages.
As pointed out by Hamish the alignment used && in second case
instead of &, which meant it was never used.
Another error when a32 was 0xFF000000 and b32 was 0xFF00FFFF
would return a non zero value. A simple fix for the issues
with going over to the byte by byte comparison failed, so
rather than leave broken code I remove this for the time being.
Not the best code I've written obviously.
The standard states that F_GETLK should check whether given lock would be
blocked by another one and return description of the conflicting one (or
set l_type to F_UNLCK if there is no collision).
Current implementation of F_GETLK performs completely different actions, it
"Retrieves the first lock that has been set by the current team". Moreover,
if there are no locks (advisory_locking == NULL) an error is returned
instead of l_type set to F_UNLCK.
Unlike Classic PPC, Book-E CPUs do not have hardware page-table walk,
rather only a limited set of software-managed TLBs. Also, translation
is never disabled even when page-faulting.
The Linux Book-E port pins part of the RAM in one or more TLBs, and
allocates kernel memory from it for the kernel itself and exception
vectors, as well as page tables.
cf. http://kernel.org/doc/ols/2003/ols2003-pages-340-350.pdf
We take a similar approach, but instead of using a tree-like page
directory we reserve a large part of this mapped range for an
hashed page table similar to the Classic one, to later allow
factoring code out. The kernel and boot modules will also be
allocated there, and the rest should be used as SLAB areas.
Note doing so will not allow implementing proper permissions for
the areas allocated there since they are all handled by a single TLB.
Also note Book-E does not standardize the MMU implementation itself.
For now only AMCC440 type MMU is supported. This will need to be
cleaned up later on.
The size variable at this point is actually a page count.
The test should never be true anyway though. Maybe we should use a
pages variable for clarity?
* Read the PVR register to determine the CPU type.
* Also check the FDT /cpu/cpu@0 node for model string
(we use the FDT calls directly since the OF wrapper isn't initialized yet).
* Use this to enable the FPU APU hack for 440 (instead of #ifdef).
* Moved some functionality into their own files so that they can easily
be reused by other code.
* Added crc32() function from FreeBSD. Implemented CRC handling and
validation.
* Implemented missing write functionality.
This replaces the use of a few BSD-specific functions, as well
as the direct references to _open/_close et-al.
BFS doesn't support the FTS_NOSTAT directory link count optimization,
and no statfs() function is available, so we simply turn that off.
* Added the aforementioned functions.
* create_area_etc() now takes a guard size parameter.
* The thread_info::stack_base/end range now refers to the usable range
only.
The value computed isn't actually used anywhere. It just ensured that
a panic would be triggered if we "skipped" to virtual addresses further
along. This shouldn't be problematic however.
This makes it less likely that uninitialized entries cause troubles.
Also panic if we encounter an unknown entry type instead of defaulting
to 4K pages.
And actually use the virtual address for it later on. This wasn't
problematic as the virtual and physical addresses are identity mapped,
but it seems more correct to do it in this order.
As there are only 8 bits for the index in the coarse page table entries
the maximum index is 256. This makes us correctly move to the next page
directory once we've run through all entries. Fixes missing unmap of
pages that crossed that boundary and consequent panic "page still has
mappings" when the page was removed from a cache.
Since we have the same setup with a loaded and mapped boot archive, we
can reuse the MemoryDevice implemented in uboot. This gets the loader
to the stage where it loads and attempts to boot the kernel.
An archive (ramfs) to be loaded can be specified in the raspberry pi
config.txt with a certain base address. We can use this to put our
floppy boot archive into memory on startup.
During the start procedure we now map that archive so we can later
load the kernel from it.
Add more fields to arch framebuffer to hold the physical address and
size of the framebuffer. Then fill these in when mapping the
framebuffer to virtual memory.
These can be used for on-screen debug output with relatively little
effort, as they just need a plain framebuffer definition to work.
Some stubs are added to not clutter up the kernel sources with too
many ifdefs.
Making the fields protected allows them to be set by arch framebuffer
implementations. The getters can be used to retrieve the configuration
from outside the implementation.
The stack pointer is set up so that it uses the space below our .text
section at 0x8000. The stack pointer actually points at one entry less
than the specified address, so it starts at 0x8000 - sizeof(uint32) and
grows downwards from there.
* Raspberry Pi is broken now after
the other recent arm work... needs
more investigation.
* Comment out stage2 header as it
links to headers with c++ code.
Need to verify entry.s can call
c++ code (I think it's mangled to
the assembly or something)
* Fix naming of code entry to match
other arm code.
This will get libroot.so to build, which should give us much more compilation
coverage for ARM now too. Will have to go through all the libroot code with
a tooth comb anyway once userland comes up...
(Will consider replacing the glibc mess with bsd for all non gcc2 platforms)
Will sort this out properly when userland is coming up. Need to get
a basic libroot version working first, so I can build a proper HaikuImage
to boot from ;)
These are pretty generic 32bit target files, so just copy them over.
Once we get userland properly starting, we can review these to see
if they need any changes.
This adds a few generic implementations of basic arithmetic functions. These
would normally be implemented in assembly, but add them for easy bringup of
new architectures.
This enables new platforms to start with a minimal set of changes to libroot.
There is no ARM port for the glibc version we're using mostly, so I'm picking
up files from more recent glibc and will probably need to hack around in them,
as glibc seem to have cleaned up their arch support a lot these days.
When iterator->current is NULL, hash_next() assumes we've reached the
end of a bucket (linked list) and moves to the next one. Wehn the first
element of a linked list was removed in hash_remove_current()
iterator->current was set to NULL, causing the next call to hash_next()
to skip over the rest of the list of that bucket.
To fix this we now decrement iterator->bucket by one, so the next call
to hash_next() correctly arrives at the new first element of the same
bucket. Doing it this way avoids having to search backwards through the
table to find the actual previous item.
This caused modules to be skipped in module_init_post_boot_device()
when normalizing module image paths so some of the module images ended
up non-normalized. This could then cause images to be loaded a second
time for modules that were part of an actually already loaded image.
This setup is present for the PCI module with the pci/x86 submodule
and would lead to a second copy of the PCI module image to be loaded
but without being initialized, eventually leading to #8684.
The affected module images were pretty much random, as it depended on
the order in which they were loaded from the file system, in this case
the boot floppy archive of the El-Torito boot part of ISO and anyboot
images. The r1alpha4 release images unfortunately had the module files
ordered in the archive just so that the PCI module image would be
skipped, allowing #8684 to happen on many systems with MSI support.
Since the block cache uses hash_remove_current() as well in some cases,
it is possible that transactions in its list could've been skipped.
Cursory testing didn't reveal this to be a usual case, and it is
possible that in the pattern it is used there, the bug wouldn't be
triggered at all. It's still possible that it caused rare misbehaviour
though.
Since we're using multi-part uImage format, we can add the FDT as
a seperate "blob" in the uImage, if the used U-Boot version is not
"FDT enabled".
This is used for example for our Verdex target. Currently I've got
a local hack in the platform/u-boot/Jamfile, looking into pulling
in the FDT files and a proper Jam setup to do that properly...
This detects everything up to ARMv6 right now. Need to check more
recent ARM ARMs for ARMv7 detection.
The detected details get passed on to the kernel, which can use
the pre-detected info for selecting right pagetable format and such.
Copyright removal of Axel done after agreement with Axel @ BeGeistert
that for files that were copy/pasted from x86 arch and then fully
replaced the implementation, removal of original copyright holder is
allowed, since their actual code is gone ;)
Remove the dummies from the C code and implement them in assembly,
due to the label referencing issues with the fault handler.
This code is ripe for optimisation, my ARM assembly is pretty
basic ;)
Does work though, and gets us one step closer to a full arch.
As noted during BeGeistert and today again by kallisti5, there's a
Pentium reference in the ARM bootloader code.
Correct the message to something more appropriate....
Thanks to Rene for the suggestion ;)
This is to make sure all ARM platforms will benefit from planned work on this
MMU/CPU code. The less code duplicated, the better.
Compile-tested for all supported ARM platforms
This also implements the fault handler correctly now, and cleans up the
exception handling. Seems a lot more stable now, no unexpected panics or
faults happening anymore.
This will generate asm_offsets.h which makes our assembly code
easier to maintain by preventing hardcoded offsets for fields within structures.
(copied from X86 and removed the X86 specifics)
* don't enforce a zero boundary or a zero alignment
* when going to the next range, takes alignment into account.
It could previously just be enforced again through alignment and loop infinite.
* it should help with some FreeBSD based drivers
This contains both the common ARM(v5) vector handling as well as
the PXA(verdex) specific interrupt controller code, to be seperated
when ARM support for FDT is implemented.
Functional enough to handle interrupts, needs work on KDL support.
* General fixes to get the refactored framebuffer code to work
(across all 3 supported architectures)
* PXA (verdex) specific fixes to framebuffer code.
Now properly displays the (greyed) icons on the framebuffer!
Signed-off-by: Ithamar R. Adema <ithamar@upgrade-android.com>
* since the FDT linux boot method doesn't pass the uimage, we can't
use it to pass the kernel+driver tgz in a multi-file uimage.
* instead we check for the linux initrd properties in the /chosen node.
(cherry picked from my sam460ex branch)
* we first try to find 'serial', 'serial0' or 'serial1' in /aliases
* extract the required properties from the found node and use them
* fallback to the hardcoded UART from the board definition header
(cherry picked from my sam460ex branch)
* add some helpers for Flattened Device Trees, for now a dump call
* dump the passed FDT on startup for now
Conflicts:
src/system/boot/platform/u-boot/Jamfile
(cherry picked from my sam460ex branch)
* For the sam460ex and likely some ARM boards we will try to boot
using the passed FDT, as it's the recommended method now.
(cherry picked from my sam460ex branch)
* The only implementation that would accept more than 2 TB was the one in
scsi_disk. But even that one was limited to 63 TB.
* Now there is a new utility function devfs_compute_geometry_size() which
does it correctly for sizes up to 2^64 which should be good enough for
quite some time :-)
* This fixes bug #8992.
The function fill_team_info() completely ignored the user id and the
group id of the process (fields info->uid and info->gid respectively).
Since the info structure was zeroed earlier, the ps output showed uid
and gid of each process equal to zero.
The patch fixes the problem by properly initializing the members with
effective uid and gid. Now the output is correct.
Fixes#8995.
Signed-off-by: Ryan Leavengood <leavengood@gmail.com>
* For now this allows linking the kernel and the pci bus manager.
* Could later on be turned into a wrapper to FDT methods since the
concepts are similar.
* When a block was only used in a sub-transaction, it was thrown away,
but the transaction::num_blocks field was not decremented.
* This caused transactions never considered finished which eventually
led to bug #8942. This does not explain the disk corruption occurring
in #8969, though.
* Don't redefine incorrect cpu headers in framebuffer code
* Drop unused err
* Fix missing parentheses as per gcc
* Fix Raspberry Pi Build
* Fix overo build due to missing header
* Proper framebuffer code is chosen based on hardware
* This change could extend into other arch code as well
* François gave permission to update his copyrights
* Minimal functional change
* These video sources would be good cannidates to be
refactored as classes. (like the arm serial code)
* No functional change. There are some order style issues
in some of the code (the top externs), but I decided to
not fix them as I can't build these atm to test.
sFreeAreaCount wasn't decremented after removing an area from
sFreeAreas, thus causing the loop to continue until enountering and
crashing on a NULL pointer after removing the last area. Introduce
helper methods _PushFreeArea() and _PopFreeArea() to ensure this cannot
easily happen again.
Fixes ticket #8972.
* Avoid floating point numbers in the kernel
* Warning would always show if custom swap file in use
* Don't change a custom swap file size if low space occurs
* Ram > 1GB? Don't double the memory for the automatic size
* Heavily based on Hamish Morrison's GCI work with some
modified logic and cleanup. #3723
* Adds automatic swap as well as user specified swap
* Limits:
Automatic: (ram * 2) up to 25% of the disk
User: user specified up to 90% of the disk
* Supports changing the swap disk location
* The ASSERT() I introduced in r44585 was incorrect: when the sub transaction
used block_cache_get_empty() to get the block, there is no original_data for
a reason.
* Added a test case that reproduces this situation.
* The block must be moved to the unused list in this situation, though, or else
it might contain invalid data. Since the block can only be allocated in the
current transaction, this should not be a problem, though, AFAICT.
The lowest 4 bits of the MSR serves as a hint to the hardware to
favor performance or energy saving. 0 means a hint preference for
highest performance while 15 corresponds to the maximum energy
savings. A value of 7 translates into a hint to balance performance
with energy savings.
The default reset value of the MSR is 0. If BIOS doesn't intialize
the MSR, the hardware will run in performance state. This patch
initialize the MSR with value of 7 for balance between performance
and energy savings
Signed-off-by: Fredrik Holmqvist <fredrik.holmqvist@gmail.com>
* In cache_abort_sub_transaction(), the original_data can already be freed
when the block is being removed from the transaction.
* block_cache::_GetUnusedBlock() no longer frees original/parent data - it
now requires them to be freed already (it makes no sense to have them still
around at this point).
* AFAICT the previous version did not have any negative consequences besides
freeing the original data late.
* cache_abort_sub_transaction() was setting the transaction_next pointer to
NULL in order to remove a block from a transaction -- however, it forgot to
actually remove it from the transaction's block list.
* Minor restructuring.
* since the FDT linux boot method doesn't pass the uimage, we can't
use it to pass the kernel+driver tgz in a multi-file uimage.
* instead we check for the linux initrd properties in the /chosen node.
* we first try to find 'serial', 'serial0' or 'serial1' in /aliases
* extract the required properties from the found node and use them
* fallback to the hardcoded UART from the board definition header
Renamed {32,64}/int.cpp to {32,64}/descriptors.cpp, which now contain
functions for GDT and TSS setup that were previously in arch_cpu.cpp,
as well as the IDT setup code. These get called from the init functions
in arch_cpu.cpp, rather than having a bunch of ifdef'd chunks of code
for 32/64.
Hoard reserves a chunk of the address space to grow the heap into.
As there is a much larger address space available on 64-bit systems,
we may as well reserve a larger chunk of address space (64GB for now,
though could probably reserve a lot more than that and still leave
plenty of room for other areas).
- If a trace entry has a stack trace, attempt to demangle the associated symbols.
Could be enhanced further to also demangle the arguments but doesn't yet.
Interestingly there are some mangled symbols that our demangler appears to
not handle correctly (gcc4).
Kernel mode code on x86_64 needs to be built with -mno-red-zone as
interrupts would corrupt the red zone if it were in use. However, the
kernel is linked with libsupc++, which was not compiled with
-mno-red-zone. If an interrupt occurred in libsupc++ code the red zone
would get corrupted. This was causing random panics, particularly under
heavy system load. Therefore, on x86_64 a separate build of libsupc++
with -mno-red-zone is now done for the kernel to use. Note: this commit
will require a rerun of configure and rebuild of cross tools.
Initializing the IO-APIC will initialize the PCI module, which does
read the MSI config of the devices only when MSIs are available.
Since we initialized them only after that, that condition wasn't met.
Later, due to the uninitialized arch info, MSIs were still marked as
available (0xcc = 204 MSIs). Due to the also uninitialized configured
count, they were always deemed busy however, in effect just breaking
MSI support whereever IO-APICs were available.
This bug was introduced by changing IS_USER_ADDRESS to check against
USER_BASE AND USER_TOP rather than just !IS_KERNEL_ADDRESS. Faults
on addresses outside both the user and kernel address spaces (i.e. the
gap between user and kernel) would result in addressSpace being NULL,
but addressSpace was being used without checking for NULL at one point.
If an uncanonical address is accessed a general protection fault will
be raised. When in the debugger, uncanonical address faults should be
handled by the fault handler (if any).
Reused x86 arch_user_debugger.cpp, with a few minor changes to make
the code work for both 32 and 64 bit. Something isn't quite working
right, if a breakpoint is hit the kernel will hang. Other than that
everything appears to work correctly.
* Changed IS_USER_ADDRESS to check an address using USER_BASE and
USER_SIZE, rather than just !IS_KERNEL_ADDRESS. The old check would
allow user buffers to point into the physical memory map area.
* Added an unmapped hole at the end of the bottom half of the address
space which catches buffers that cross into the uncanonical address
region. This also removes the need to check for uncanonical return
addresses in the syscall handler, it is no longer possible for the
return address to be uncanonical under normal circumstances. All
cases in which the return address might be changed by the kernel
are still handled via the IRET path.