The goal of this patch is to amortize the cost of context switch by making
the compiler aware that context switch clobbers all registers. Because all
register need to be saved anyway there is no additional cost of using
callee saved register in the function that does the context switch.
Similarly to previous patch regarding GDT this is mostly a rewrite of
IDT handling code from C to C++. Thanks to constexpr IDT is now entirely
generated at compile-time.
Virtually no functional change, just rewriting the code from
"C in *.cpp files" to C++. Use of constexpr may be advantageous but
that code is not performance critical anyway.
* Instead of forcing the hash-table to use a copy of the key,
introduce and use TypeOperation template to avoid taking a
reference of a reference type (which gcc2 doesn't allow).
For potential boot volumes with older packages states the respective
item in the boot volume menu now has a sub menu for selecting a state.
The boot loader functionality for this feature is complete -- i.e. the
respective kernel is loaded and the name of the old state is added to
the kernel args -- but kernel packagefs and package daemon support is
still missing.
After load_image() the child thread is suspended and the parent is
expected to resume it later. However, it is possible that the parent
attempts to resume its child after it has been notified that the image
had been loaded but before the child managed to suspend itself. In such
case the child would suspends itself after that wake up attempt and,
consequently will not be ever resumed.
To mitigate that problem flag Thread::going_to_suspend has been added
which helps synchronizing thread suspension and continuation in a similar
way that "traditional" thread blocking is performed. This means that
the child should behave in a following manner: set its going_to_suspend flag,
notify the parent (i.e. any thread that may want to resume it), acquire
its scheduler_lock and suspend itself if the going_to_suspend flag is set.
The parent should follow pattern: clear going_to_suspend flag of the thread
that is about to be resumed, acquire that thread scheduler_lock and enqueue
it in a run queue if it is suspended.
Thanks Oliver for reporting the bug and identifying what causes it.
Most of the actual UserEvent work is done in DPC so that we don't have
to care about the limitations of the context in which UserEvent::Fire()
is invoked. This requires appropriate management of lifetime of UserEvent
instances to make sure that DoDPC() method is always called on a valid
object.
* Add isb just because.
* pdziepak pointed out that ARMv5 and before
had different barrier support.
* pdziepak also mentioned that dsb was too strong
for __sync_synchronize
* On ARMv6 or older, we do a simulated dsb.
* Move __sync_synchronize into thread.c in libroot
and use the new arch_atomic.h dsb/dmb defines.
* Gets arm @bootstrap-raw to end of bootstrap.
* Don't assume verdex as it isn't clear this was
occurring.
* Make an educated guess on HAIKU_BOOT_PLATFORM
based on provided board (but still allow it to
be overridden)
* Error out if user doesn't populate
HAIKU_BOOT_PLATFORM or enters an unknown board
name.
* You need to add "-sHAIKU_BOOT_BOARD=xxx" to
your jam to build for the proper ARM device.
* Rename beagle to beagleboneblk as per the
documentation.
* Use atomic_get_and_set for return value
* Atomics are no longer volatile
* Add missing arch_cpu_pause stub
* Move arch_cpu_idle to arch_cpu header to match
other architectures
For non-US keyboards, the extra 102th/105th key is used to reach \. But,
we also need it to report | when shifted (this is the key left to
"enter").
This affects only USB keyboards. Thanks to gordoncjp for reporting!
UserEvent can be fired from scheduler_reschedule() i.e. while holding current
thread scheduler_lock. If the current thread goes sleep and during reschedule
one of its timers sends a signel to it, then scheduler_enqueue_in_run_queue()
attempts to acquire again its scheduler_lock resulting in a deadlock.
There was also a minor issue with both scheduler_reschedule() and
scheduler_enqueue_in_run_queue() acquiring current CPU scheduler mode lock.
* Set max cpu to 1 for PPC until atomic functions are finished
* We have atomic functions inline in the kernel and assembly
code in libroot post-scheduler merge... isn't that a lot of
duplication?
Add boot loader debug menu option "Save syslog from previous session
during boot". If enabled (defaults to true), the previous session's
debug syslog data is copy to a separate buffer and passed to the
kernel, which writes it back to the file /var/log/previous_syslog.
As long as Haiku still boots, this should now be the most convenient way
to retrieve the output from a kernel crash.
Previous implementation based on the actual load of each core and share
each thread has in that load turned up to be very problematic when
balancing load on very heavily loaded systems (i.e. more threads
consuming all available CPU time than there is logical CPUs).
The new approach is to estimate how much load would a thread produce
if it had all CPU time only for itself. Summing such load estimations
of each thread assigned to a given core we get a rank that contains
much more information than just simple actual core load.
This field forces kernel to track each CPU load all the time. It is not
a problem with the current scheduler on a multicore systems, but on
single core machnies or with any other future scheduler this field may
become just an unnecessary burden. It isn't difficult for an application
to compute CPU load by itself when it needs it.
atomic_{get, set}64() are problematic on architectures without 64 bit
compare and swap.
Also, using sequential lock instead of atomic access ensures that
any reads from cpu_ent::active_time won't require any writes to shared
memory.
The client code is not supposed to change the topology info.
It would be also nice if cpu_topology_node::children was an array of
pointers to const but that would require several const_casts in the
topology tree generation code so it's probably not worth it.
Apparently, reading from dr3 is slower than reading from memory
with cache hit.
Also, depending on hypervisor configuration, accessing dr3 may cause
a VM exit (and, at least on kvm, it does), what makes it much slower
than a memory access even when there is a cache miss.
Add get_safemode_option_early() and get_safemode_boolean_early() to get
safemode options before the kernel heap has been initialized. They use a
simplified parser.
* VMTranslationMap:
- Add DebugPrintMappingInfo(): Given a virtual address it is supposed
to print the paging structure information for that address. To be
implemented by derived classes.
- Add DebugGetReverseMappingInfo(): Given a physical addresss it is
supposed to find all virtual addresses mapped to it. To be
implemented by derived classes.
* X86VMTranslationMapPAE: Implement the new methods
DebugPrintMappingInfo() and DebugGetReverseMappingInfo().
* Add KDL command "mapping". It supports both virtual address lookups
and reverse lookups.
* VMAddressSpace: Add randomizingEnabled property.
* VMUserAddressSpace: Randomize addresses only when randomizingEnabled
property is set.
* create_team_arg(): Check, if the team's environment contains
"DISABLE_ASLR=1". Set the team's address space property
randomizingEnabled accordingly in load_image_internal() and
exec_team().
* Create new interface for cpuidle modules (similar to the cpufreq
interface)
* Generic cpuidle module is no longer needed
* Fix and update Intel C-State module
It's a browser for the system package content, where entries can be
selected to blacklist them. The selected entries are removed from the
packagefs instance in the boot loader, so that e.g. selected drivers
won't be picked up. The paths are also added to the safe mode driver
settings and will be interpreted when the system packagefs instance is
mounted by the kernel.
* Make Menu and MenuItem polymorphic.
* MenuItem:
- Make SetMarked() virtual, so it can be overridden.
- Add SetSubmenu() and Supermenu().
- Delete the submenu in the destructor.
* Menu:
- Add Entered()/Exited() hooks. They frame the time the user navigates
the menu or any of its submenus. The hooks allow for subclasses
populating their item list dynamically.
- Add SortItems().
* Update boot loader menu copyright text to include 2013, now that it is
over soon. :-)
* pin idle threads to their specific CPUs
* allow scheduler to implement SMP_MSG_RESCHEDULE handler
* scheduler_set_thread_priority() reworked
* at reschedule: enqueue old thread after dequeueing the new one
* Thread::scheduler_lock protects thread state, priority, etc.
* sThreadCreationLock protects thread creation and removal and list of
threads in team.
* Team::signal_lock and Team::time_lock protect list of threads in team
as well.
* Scheduler uses its own internal locking.
* The UNMAP command is theoretically much faster, as it can get many block
ranges instead of just a single range.
* Furthermore, the ATA TRIM command resembles it much better.
* Therefore, fs_trim_data now gets an array of ranges, and we use SCSI UNMAP
to trim.
* Updated BFS code to collect array ranges to fully support the new
fs_trim_data possibilities.
* No need for the atomically changed variables to be declared as
volatile.
* Drop support for atomically getting and setting unaligned data.
* Introduce atomic_get_and_set[64]() which works the same as
atomic_set[64]() used to. atomic_set[64]() does not return the
previous value anymore.
The flag main purpose is to avoid race conditions between event handler
and cancel_timer(). However, cancel_timer() is safe even without
using gSchedulerLock.
If the event is scheduled to happen on other CPU than the CPU that
invokes cancel_timer() then cancel_timer() either disables the event
before its handler starts executing or waits until the event handler
is done.
If the event is scheduled on the same CPU that calls cancel_timer()
then, since cancel_timer() disables interrupts, the event is either
executed before cancel_timer() or when the timer interrupt handler
starts running the event is already disabled.
* Replace ports list mutex with R/W-lock.
* Move team port list protection to separate array of mutexes.
Relieve contention on sPortsLock by removing Team::port_list from its
protected items. With this, set_port_owner() only needs to acquire the
sPortsLock for reading.
* Add another hash table holding the ports by name. Used by find_port()
so it doesn't have to iterate over the list anymore.
* Use slab-based memory allocator for port messages. sPortQuotaLock was
acquired on every message send or receive and was thus another point
of contention. The lock is not necessary anymore.
* Lock for port hashes and Port::lock are no longer locked in a nested
fashion to reduce chances of blocking other threads.
* Make operations concurrency-safe by adding an atomically accessed
Port::state which provides linearization points to port creation and
deletion. Both operations are now divided into logical and physical
parts, the logical part just updating the state and the physical part
adding/remove it to/from the port hash and team port list.
* set_port_owner() is the only remaining function which still locks
Port::lock and one or two of sTeamListLock[] in a nested fashion.
Since it needs to move the port from one team list to another and
change Port::owner, there's no way around.
* Ports are now reference counted to make accesses to already-deleted
ports safe.
* Should fix#8007.
Simple scheduler behaves exactly the same as affine scheduler with a
single core. Obviously, affine scheduler is more complicated thus
introduces greater overhead but quite a lot of multicore logic has been
disabled on single core systems in the previous commit.
There is a global heap of cores, where the key is the highest priority
of threads running on that core. Moreover, for each core there is
a heap of logical processors on this core where the key is the priority
of currently running thread.
The per-core heap is used for load balancing among logical processors
on that core. The global heap is used in initial decision where to put
the thread (note that the algorithm that makes this decision is not
complete yet).
Simple scheduler is used when we do not have to worry about cache affinity
(i.e. single core with or without SMT, multicore with all cache levels
shared).
When we replace gSchedulerLock with more fine grained locking affine
scheduler should also be chosen when logical CPU count is high (regardless
of cache).
In SMP systems simple scheduler will be used only when all logical
processors share all levels of cache and the number of CPUs is low.
In such systems we do not have to care about cache affinity and
the contention on the lock protecting shared run queue is low. Single
run queue makes load balancing very simple.
Kernel support for yielding to all (including lower priority) threads
has been removed. POSIX sched_yield() remains unchanged.
If a thread really needs to yield to everyone it can reduce its priority
to the lowest possible and then yield (it will then need to manually
return to its prvious priority upon continuing).
Each thread has its minimal priority that depends on the static priority.
However, it is still able to starve threads with even lower priority
(e.g. CPU bound threads with lower static priority). To prevent this
another penalty is introduced. When the minimal priority is reached
penalty (count mod minimal_priority) is added, where count is the number
of time slices since the thread reached its minimal priority. This prevents
starvation of lower priorirt threads (since all CPU bound threads may have
their priority temporaily reduced to 1) but preserves relation between
static priorities - when there are two CPU bound threads the one with
higher static priority would get more CPU time.
This adds the -mapcs-frame compiler flag for ARM to have "stable"
stack frames, adds support to the kernel for dumping stack crawls,
and initial support for iframes. There' much more functionality
to unlock in KDL, but this makes debugging already a lot more
comfortable.....
Since both platforms can boot the same kernel we must accept either
arg, so we make sure they are identical for now.
TODO: use a union or KMessage maybe?