Commit Graph

5552 Commits

Author SHA1 Message Date
Pawel Dziepak
45ff530069 scheduler: Be more demanding when cancelling penalties 2013-10-25 04:12:13 +02:00
Pawel Dziepak
9d7e2acf34 scheduler: Update scheduler_set_operation_mode() 2013-10-24 03:24:53 +02:00
Pawel Dziepak
978fc08065 scheduler: Remove support for running different schedulers
The simple scheduler behaves exactly the same as the affine scheduler with
a single core. Obviously, the affine scheduler is more complicated and thus
introduces greater overhead, but quite a lot of multicore logic has been
disabled on single core systems in the previous commit.
2013-10-24 02:04:03 +02:00
Pawel Dziepak
e927edd376 scheduler_affine: Disable logic not needed on current topology 2013-10-24 01:33:12 +02:00
Pawel Dziepak
7e1ecb9315 kernel: Protect scheduler_set_thread_priority() with lock 2013-10-24 00:59:10 +02:00
Pawel Dziepak
31a75d402f kernel: Protect lock internals with per-lock spinlock 2013-10-24 00:01:18 +02:00
Pawel Dziepak
d6efe8ee75 kernel: Update cpu_ent::active_time atomically 2013-10-23 21:56:14 +02:00
Pawel Dziepak
453bf75027 scheduler_affine: Try not to get overloaded by small tasks 2013-10-23 21:32:27 +02:00
Pawel Dziepak
2df11d8a80 scheduler_affine: Put small tasks on a single core 2013-10-23 01:59:25 +02:00
Pawel Dziepak
8d471bc3d9 scheduler_affine: Store cores with CPU bound threads separately
This is preparation for small task packing. We want to have as many idle
cores as possible. To achieve that we put all threads on the most heavily
loaded core (so the other ones can become idle). However, we don't really
want to do that if there are CPU bound tasks and if any of the cores
becomes overloaded.
2013-10-22 23:52:40 +02:00
Pawel Dziepak
f823aacf59 scheduler_affine: Remove old code 2013-10-22 01:21:51 +02:00
Pawel Dziepak
2e0ee59462 scheduler_affine: Migrate threads from overloaded cores
* Keep the number of CPU bound threads on cores balanced.
* If possible, migrate normal threads from cores with CPU bound ones to
  the less busy cores.
2013-10-22 01:18:03 +02:00
Pawel Dziepak
7aba623f52 scheduler_affine: Balance number of threads assigned to CPUs
When the thread cannot be run immediately, assign it to the core with
the lowest number of CPU bound threads and assigned threads.
2013-10-21 21:39:40 +02:00
Pawel Dziepak
afe1735d7d scheduler_affine: Expire old cache affinities
Performance mode:
 If there has been a lot of activity on the core since the thread went
 to sleep, its data in the cache has probably been overwritten.

Power saving mode:
 If the thread went to sleep a long time ago, either there has been a
 lot of activity on its core or the core has been idle and it may
 be more efficient to wake another one.
2013-10-21 19:19:28 +02:00
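The two expiry rules described in this commit can be sketched roughly as follows. This is only an illustration of the idea, not the kernel's code: `SchedulerMode`, `SleepRecord`, `affinity_expired()` and both limit parameters are hypothetical names, and the real thresholds are not given in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// All names and thresholds here are hypothetical, chosen for illustration.
enum class SchedulerMode { Performance, PowerSaving };

struct SleepRecord {
    int64_t wentToSleepAt;       // time (us) when the thread went to sleep
    int64_t coreActiveTimeThen;  // core's accumulated active time (us) then
};

// Returns true when the thread's cache affinity should no longer be honoured.
bool affinity_expired(SchedulerMode mode, const SleepRecord& record,
    int64_t now, int64_t coreActiveTimeNow,
    int64_t activityLimit, int64_t sleepLimit)
{
    assert(now >= record.wentToSleepAt);
    assert(coreActiveTimeNow >= record.coreActiveTimeThen);

    if (mode == SchedulerMode::Performance) {
        // A lot of work ran on the core since the thread went to sleep,
        // so its cached data has probably been overwritten.
        return coreActiveTimeNow - record.coreActiveTimeThen > activityLimit;
    }

    // Power saving: after a long sleep the core was either busy or has
    // gone idle, and waking it may be less efficient than using another.
    return now - record.wentToSleepAt > sleepLimit;
}
```

Performance mode looks only at how much the core has run, while power saving mode looks only at elapsed time, which mirrors the two cases in the message.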
Pawel Dziepak
7ea42e7add kernel: Remove invoke_scheduler_if_idle 2013-10-21 02:38:57 +02:00
Pawel Dziepak
ea79da9500 kernel: Remove support for thread_queue 2013-10-21 02:30:20 +02:00
Pawel Dziepak
74192fd984 scheduler_affine: Fix compilation warning 2013-10-21 02:21:14 +02:00
Pawel Dziepak
84812e6033 scheduler_affine: Correctly assign CPUs to idle threads 2013-10-21 02:20:09 +02:00
Pawel Dziepak
cd8d4e39fd kernel: Introduce scheduler modes of operation 2013-10-21 02:17:00 +02:00
Pawel Dziepak
4b446279e6 scheduler_affine: Use CPU topology tree to create ID mappings 2013-10-21 01:34:31 +02:00
Pawel Dziepak
343c489689 kernel: Create CPU topology tree 2013-10-21 01:33:35 +02:00
Pawel Dziepak
a6d4233e59 scheduler_affine: Choose wisely which core to wake up
The longer a core is idle, the deeper the idle state it has entered. That's
why the scheduler should always choose the core that has gone idle
most recently (both for performance and power saving reasons).

Moreover, if there is more than one package, the scheduler should
minimize the number of packages with at least one core active when
power saving is the priority. Conversely, as many packages as possible
should be used when aiming for high performance.
2013-10-20 23:26:32 +02:00
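A minimal sketch of the selection policy described above, assuming a flat list of idle cores and a set of packages that already have an active core. `IdleCore`, `choose_core_to_wake()` and `SchedulerMode` are hypothetical names; the actual scheduler uses its own data structures.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical types for illustration only.
enum class SchedulerMode { Performance, PowerSaving };

struct IdleCore {
    int id;
    int package;
    int64_t idleSince;  // timestamp at which the core went idle
};

// Returns the id of the core to wake, or -1 if no core is idle.
int choose_core_to_wake(const std::vector<IdleCore>& idleCores,
    const std::set<int>& activePackages, SchedulerMode mode)
{
    const IdleCore* best = nullptr;
    bool bestPreferred = false;

    for (const IdleCore& core : idleCores) {
        bool packageActive = activePackages.count(core.package) != 0;
        // Power saving prefers packages that are already active;
        // performance prefers packages that are still fully idle.
        bool preferred = (mode == SchedulerMode::PowerSaving)
            ? packageActive : !packageActive;

        // Package preference comes first, then the most recently idle core.
        bool better = best == nullptr
            || (preferred && !bestPreferred)
            || (preferred == bestPreferred
                && core.idleSince > best->idleSince);
        if (better) {
            best = &core;
            bestPreferred = preferred;
        }
    }

    return best != nullptr ? best->id : -1;
}
```

When no core matches the package preference, the sketch falls back to the most recently idle core overall. That fallback is an assumption; the commit message does not say what happens in that case.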
Pawel Dziepak
da3a48f4a8 scheduler_affine: Use min-max heap as per-core CPU heap 2013-10-17 19:23:27 +02:00
Pawel Dziepak
278c9784a1 scheduler_affine: Use global core heap and per-core CPU heaps
There is a global heap of cores, where the key is the highest priority
of threads running on that core. Moreover, for each core there is
a heap of logical processors on this core where the key is the priority
of currently running thread.

The per-core heap is used for load balancing among logical processors
on that core. The global heap is used in the initial decision of where
to put the thread (note that the algorithm that makes this decision is
not complete yet).
2013-10-17 02:11:28 +02:00
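The two-level structure can be illustrated with standard library heaps. This is a sketch under stated assumptions: `Core`, `Entry` and `pick_target()` are hypothetical names, the heaps are rebuilt on every call for brevity (the kernel keeps them up to date instead), and choosing the lowest key at both levels is an assumed policy, since the commit notes that the placement algorithm is not complete.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// (key, id) pairs; std::greater turns priority_queue into a min-heap.
using Entry = std::pair<int, int>;
using MinHeap
    = std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>>;

// Hypothetical description of a core and its logical processors.
struct Core {
    int id;
    std::vector<Entry> cpus;  // (priority of running thread, logical CPU id)
};

// Returns (core id, logical CPU id) of the least important running thread.
std::pair<int, int> pick_target(const std::vector<Core>& cores)
{
    assert(!cores.empty());

    // Global heap: key is the highest priority running on each core.
    MinHeap coreHeap;
    for (std::size_t i = 0; i < cores.size(); i++) {
        assert(!cores[i].cpus.empty());
        int highest = cores[i].cpus[0].first;
        for (const Entry& cpu : cores[i].cpus)
            highest = std::max(highest, cpu.first);
        coreHeap.push(Entry(highest, static_cast<int>(i)));
    }

    // Per-core heap: key is the priority of the thread on each logical CPU.
    const Core& core = cores[coreHeap.top().second];
    MinHeap cpuHeap(std::greater<Entry>(), core.cpus);

    return std::make_pair(core.id, cpuHeap.top().second);
}
```

With two cores whose logical CPUs run threads of priority (10, 5) and (7, 3), the global heap selects the second core (key 7) and the per-core heap then selects the logical CPU running the priority 3 thread.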
Pawel Dziepak
3ec1d8da42 scheduler_affine: Add logic shared with simple scheduler
The scheduler is in a very early stage. There is no thread migration and
the algorithms choosing a CPU for a thread are very simple.

Since the affine scheduler is going to use one run queue per core, on
single core machines it will work exactly the same as the simple scheduler.
That would allow us to have only one scheduler implementation usable
on all kinds of machines.
2013-10-16 23:50:18 +02:00
Pawel Dziepak
824ed26c51 kernel: Fully detect CPU topology before initializing scheduler 2013-10-16 20:02:56 +02:00
Pawel Dziepak
cf863a5040 kernel: Decide whether to use simple or affine scheduler
Simple scheduler is used when we do not have to worry about cache affinity
(i.e. single core with or without SMT, multicore with all cache levels
shared).

When we replace gSchedulerLock with more fine grained locking, the affine
scheduler should also be chosen when the logical CPU count is high
(regardless of cache).
2013-10-16 18:39:25 +02:00
Pawel Dziepak
ebec24f9e0 kernel: Add support for disabling CPUs in scheduler 2013-10-15 02:02:54 +02:00
Pawel Dziepak
3de2c5ceec kernel: Add support for pinned threads 2013-10-15 01:47:28 +02:00
Pawel Dziepak
51d1e9ada0 kernel: Remove scheduler_simple_smp 2013-10-15 00:37:19 +02:00
Pawel Dziepak
f20ad54be2 kernel: Add support for SMP systems to simple scheduler
In SMP systems the simple scheduler will be used only when all logical
processors share all levels of cache and the number of CPUs is low.
In such systems we do not have to care about cache affinity and
the contention on the lock protecting the shared run queue is low. A single
run queue makes load balancing very simple.
2013-10-15 00:29:04 +02:00
Pawel Dziepak
298314fe4b libroot: Update sched_get_priority_{max, min}()
SCHED_RR is a real-time scheduling policy.
SCHED_FIFO and SCHED_SPORADIC are not supported (at least for now).
2013-10-09 20:57:42 +02:00
Pawel Dziepak
29e65827fd kernel: Remove possibility to yield to all threads
Kernel support for yielding to all (including lower priority) threads
has been removed. POSIX sched_yield() remains unchanged.

If a thread really needs to yield to everyone it can reduce its priority
to the lowest possible and then yield (it will then need to manually
return to its previous priority upon continuing).
2013-10-09 20:42:34 +02:00
Pawel Dziepak
fee8009184 kernel: Add another penalty for CPU bound threads
Each thread has a minimal priority that depends on its static priority.
However, it is still able to starve threads with even lower priority
(e.g. CPU bound threads with lower static priority). To prevent this,
another penalty is introduced. When the minimal priority is reached, a
penalty of (count mod minimal_priority) is added, where count is the number
of time slices since the thread reached its minimal priority. This prevents
starvation of lower priority threads (since all CPU bound threads may have
their priority temporarily reduced to 1) but preserves the relation between
static priorities: when there are two CPU bound threads, the one with
higher static priority gets more CPU time.
2013-10-09 20:13:47 +02:00
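The arithmetic of this additional penalty can be worked through with a small helper. `priority_at_minimum()` is a hypothetical name; only the formula (count mod minimal_priority) comes from the commit message, and the sketch assumes the penalty is subtracted from the minimal priority.

```cpp
#include <cassert>

// Hypothetical helper, not the kernel's API. Computes the priority of a
// CPU bound thread that has already dropped to its minimal priority.
int priority_at_minimum(int minimalPriority, int slicesSinceMinimum)
{
    assert(minimalPriority > 0 && slicesSinceMinimum >= 0);

    // The extra penalty cycles through 0 .. minimalPriority - 1.
    int extraPenalty = slicesSinceMinimum % minimalPriority;
    return minimalPriority - extraPenalty;
}
```

For a minimal priority of 5 the result cycles 5, 4, 3, 2, 1, 5, ... so the thread periodically drops to 1 and lets lower priority threads run. A thread with a minimal priority of 3 cycles 3, 2, 1, 3, ... and therefore spends more of its time at low priorities, which is how the relation between static priorities is preserved.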
Pawel Dziepak
24dbeeddb2 kernel: Give longer time slice to lower priority threads 2013-10-09 02:25:21 +02:00
Pawel Dziepak
3e91b082c8 libroot: Do not rely on thread_yield() 2013-10-09 02:07:08 +02:00
Pawel Dziepak
879ceb60d8 kernel: Remove superfluous casts 2013-10-09 01:45:07 +02:00
Pawel Dziepak
130000e068 kernel: Dump scheduler specific thread data 2013-10-09 01:37:00 +02:00
Pawel Dziepak
f256b4aca7 kernel: Use SimpleRunQueue as run queue type everywhere 2013-10-09 01:20:40 +02:00
Pawel Dziepak
0896565a6e kernel: Support sched_yield() properly
sched_yield() should not yield to the threads with lower priority.
2013-10-09 01:18:55 +02:00
Pawel Dziepak
ee69e53630 kernel: Minor improvements, separate priority and yield logic 2013-10-08 21:36:49 +02:00
Pawel Dziepak
9363e99b19 kernel: Remove Thread::next_priority 2013-10-08 20:21:35 +02:00
Pawel Dziepak
346e789a21 kernel: Fix style issues 2013-10-08 20:15:21 +02:00
Pawel Dziepak
94f4574d78 kernel: Move thread retrieving code to separate function 2013-10-08 20:13:10 +02:00
Pawel Dziepak
03e3a82953 kernel: Allow threads to yield CPU properly 2013-10-08 06:41:20 +02:00
Pawel Dziepak
bab69bdb47 kernel: Force high priority threads to yield less often 2013-10-08 04:53:30 +02:00
Pawel Dziepak
547b8c76c7 kernel: Cancel penalty only if the thread actually waits
Require the thread to give up CPU for at least one time slice before
cancelling its penalty.
2013-10-08 04:50:23 +02:00
Pawel Dziepak
21808e8f0b kernel: Limit maximum priority penalty
The maximum penalty a thread can receive is now limited depending on
the real thread priority. However, this makes it possible to starve
threads with priority lower than that limit. To prevent that, threads
that have already earned the maximum penalty are periodically forced
to yield CPU to all other threads.
2013-10-08 02:54:58 +02:00
Pawel Dziepak
31e65090db kernel: Use standard compliant version of variadic macros 2013-10-08 01:34:55 +02:00
Pawel Dziepak
e083bca041 kernel: Allow threads to always finish their time slice
Until now, when a thread was preempted by a higher priority
thread it was then placed at the end of its priority FIFO and given
a new time slice. This patch changes that, allowing the thread to
complete its time slice (when the higher priority threads are done),
unless there was very little time left, in which case this time is added
to the next time slice.

Apart from making the algorithm more fair, this change makes it easier to
identify CPU bound threads. (Earlier they could 'hide' by being
preempted by a higher priority thread and consequently never using
their whole time slice).
2013-10-08 01:08:05 +02:00
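The rule for a preempted thread's next run can be sketched as a single function. `quantum_after_preemption()` and the `minimalLeftover` threshold are hypothetical; the commit only says "very little time left" without giving a value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper; minimalLeftover is an assumed tunable, not a value
// taken from the kernel. All times are in microseconds.
int64_t quantum_after_preemption(int64_t timeLeft, int64_t fullSlice,
    int64_t minimalLeftover)
{
    assert(timeLeft >= 0 && timeLeft <= fullSlice);

    // Enough time remains: the thread resumes and finishes its slice.
    if (timeLeft >= minimalLeftover)
        return timeLeft;

    // Too little remains to be worth a context switch: the leftover is
    // added to the next full time slice.
    return fullSlice + timeLeft;
}
```

A thread preempted with 2000 us left of a 3000 us slice would next run for 2000 us, while one preempted with only 200 us left (below an assumed 500 us threshold) would next run for 3200 us.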
Pawel Dziepak
565e7a977d kernel: Update SchedulerTracing::EnqueueThread 2013-10-07 21:52:45 +02:00
Pawel Dziepak
72f844835e kernel: Do not require double parentheses in TRACE statements. 2013-10-07 21:29:10 +02:00
Pawel Dziepak
82c26e1f1f kernel: Punish CPU bound threads
This patch appears to fix #8007.

A thread that consumes its whole quantum has its priority reduced. The
penalty is cancelled when the thread voluntarily gives up CPU. Real-time
threads are not affected.

The problem of thread starvation is not solved completely. The worst case
latency is still unbounded (even in systems with bounded number of threads).
When a middle priority thread is constantly preempted by high priority
threads it would not earn the penalty, thus the lower priority threads
still can be starved. Moreover, the punishment is probably too aggressive
as it reduces priority of virtually all CPU bound threads to 1.
2013-10-07 21:11:32 +02:00
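The basic penalty mechanism from this commit might look like the sketch below. `ThreadState`, `effective_priority()`, `quantum_ended()` and the constant `kFirstRealTimePriority` are hypothetical, and the real-time threshold of 100 is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>

// Assumed threshold: priorities at or above this value count as real-time.
const int kFirstRealTimePriority = 100;

// Hypothetical per-thread scheduler data.
struct ThreadState {
    int staticPriority;
    int penalty;
};

// Priority actually used for scheduling; never drops below 1.
int effective_priority(const ThreadState& thread)
{
    return std::max(1, thread.staticPriority - thread.penalty);
}

// Called when the thread stops running, either because its quantum
// expired (usedWholeQuantum) or because it gave up the CPU voluntarily.
void quantum_ended(ThreadState& thread, bool usedWholeQuantum)
{
    assert(thread.staticPriority >= 1);

    // Real-time threads are not affected.
    if (thread.staticPriority >= kFirstRealTimePriority)
        return;

    if (usedWholeQuantum) {
        // CPU bound behaviour: lower the priority by one more step.
        if (effective_priority(thread) > 1)
            thread.penalty++;
    } else {
        // The thread gave up the CPU voluntarily: cancel the penalty.
        thread.penalty = 0;
    }
}
```

A thread that keeps using its whole quantum ends up at priority 1 regardless of its static priority, which is the aggressiveness the message itself points out.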
Pawel Dziepak
6d7e291233 kernel: Allow scheduler initialization to fail 2013-10-05 20:45:07 +02:00
Pawel Dziepak
9ad558f01c kernel: Use the new runqueue in non-MP scheduler 2013-10-05 20:22:59 +02:00
Pawel Dziepak
b8c1df9b00 kernel: Add O(1) lookup and insertion priority queue 2013-10-05 20:16:06 +02:00
Pawel Dziepak
7039b950fb x86[_64]: Fix style issues 2013-10-05 18:03:00 +02:00
Pawel Dziepak
149c82a8ec kernel/util: Add bitmap implementation 2013-10-03 04:27:49 +02:00
Pawel Dziepak
7087b865e2 x86[_64]: Remove superfluous memset()s 2013-10-03 04:26:21 +02:00
Pawel Dziepak
36cc64a9b3 x86[_64]: Add CPU cache topology detection for AMD and Intel CPUs 2013-10-02 23:48:03 +02:00
Pawel Dziepak
1f50d09018 kernel/util: Add bit hack utilities 2013-10-02 21:24:46 +02:00
Pawel Dziepak
26c3861891 x86[_64]: Fix some style issues 2013-10-02 21:18:56 +02:00
Pawel Dziepak
fa6f78aee7 x86[_64]: Use uint32 for maximum CPUID leaf number 2013-10-02 21:03:34 +02:00
Pawel Dziepak
c9b6f27d94 x86[_64]: Add CPU topology detection for AMD processors 2013-10-02 02:34:35 +02:00
Pawel Dziepak
f1644d9d0b x86[_64]: Set level shift by counting bits in mask 2013-10-02 01:55:07 +02:00
Pawel Dziepak
fafeda52ea x86[_64]: Do not return too soon from detectCPUTopology() 2013-10-02 01:49:10 +02:00
Pawel Dziepak
8ec897323e x86[_64]: Add CPU topology detection for Intel processors 2013-10-02 01:19:17 +02:00
Pawel Dziepak
4110b730db x86[_64]: Add support for CPUID sub-leaves
Some CPUID leaves may contain one or more sub-leaves accessed by setting
ECX to an appropriate value.
2013-10-01 20:31:18 +02:00
Pawel Dziepak
ffd5393620 kernel/util: Make exit() available in bootloader as well 2013-10-01 19:31:48 +02:00
Pawel Dziepak
7aecb0b276 kernel/util: Make exit() available in kernel mode
Since we are using libraries originally intended for user mode in kernel
mode, providing them with some userland functions is inevitable. This
particular patch is to make zlib happy and able to call exit() when
its debug assertions fail.
2013-10-01 15:51:07 +02:00
François Revol
68bccdf6b4 M68K: More gcc options fixes 2013-09-30 04:21:41 +02:00
François Revol
4046c49f88 M68K: Account for extra parameter to create_area_etc()
Would have been nice to also fix 68k code... just sayin.
2013-09-30 04:15:27 +02:00
François Revol
835545cfd1 M68K: drop duplicate strlen
Seems we have our own now.
2013-09-30 04:09:27 +02:00
François Revol
f7d6c2f8e5 M68K: Switch to new gcc options for specifying cpu
Latest gcc converts the old ones to the new ones anyway...
including when passing to gas, which of course is not new enough,
so we have to also force gcc to pass the old one around in one case.
2013-09-30 04:02:21 +02:00
François Revol
5e0e2739c9 ARM: work around too many libgcc objects when linking libroot
jam fails in execve() trying to run the command due to
a too large arguments list because of the many objects in libgcc.

We split them into two intermediate objects,
then we link them to libroot.
2013-09-30 00:37:06 +02:00
François Revol
f9ab70a1d1 Guard the __sync_* atomic helper with __ARM__
I didn't notice I was adding to a generic file.
2013-09-29 22:43:34 +02:00
François Revol
c436d67da4 ARM: Add note about updating libstdc++
The __sync_fetch_and_add_4() helper is deprecated in newer GCC,
and should be dropped when we update libstdc++.
2013-09-29 21:02:11 +02:00
François Revol
75453edc01 ARM: Add a C version of __sync_fetch_and_add_4()
It just calls atomic_add().

No need for the asm version, it doesn't need to depend on defines.
2013-09-29 19:46:41 +02:00
François Revol
735ec4c018 ARM: Add longjmp_return.c to the libroot build
Linking was failing with undefined reference to __longjmp_return.
2013-09-29 04:03:09 +02:00
Ingo Weinhold
81291304ad Merge remote-tracking branch 'haiku/master' into package-management
Conflicts:
	build/jam/BuildSetup
	build/jam/HaikuImage
	build/jam/board/sam460ex/BoardSetup
	build/jam/board/verdex/BoardSetup
	data/catalogs/apps/icon-o-matic/fr.catkeys
	src/add-ons/kernel/drivers/audio/hda/hda_codec.cpp
	src/add-ons/kernel/drivers/disk/usb/usb_disk/usb_disk.cpp
	src/apps/debugger/files/FileManager.cpp
	src/apps/debugger/files/FileManager.h
	src/apps/debugger/user_interface/gui/inspector_window/MemoryView.cpp
	src/apps/haiku-depot/MainWindow.cpp
	src/apps/haiku-depot/MainWindow.h
	src/apps/haiku-depot/Model.cpp
	src/apps/haiku-depot/PackageInfo.h
	src/apps/haiku-depot/PackageInfoListener.h
	src/apps/haiku-depot/PackageInfoView.cpp
	src/apps/haiku-depot/PackageInfoView.h
	src/apps/haiku-depot/PackageListView.cpp
	src/apps/haiku-depot/PackageListView.h
	src/system/kernel/arch/arm/arch_timer.cpp
	src/system/libroot/os/arch/arm/atomic.S
	src/tools/translation/bitsinfo/Jamfile
	src/tools/translation/bmpinfo/Jamfile
	src/tools/translation/tgainfo/Jamfile
2013-09-27 01:55:45 +02:00
Jérôme Duval
1bc85a38d5 libroot: spawn_thread() now creates a detached pthread.
* __pthread_destroy_thread() will in turn free the pthread_thread object.
* This fixes a leak of 2072 bytes on each thread construction/destruction
  and #9945. MediaExtractor spawns a thread on construction, which leaked
  its pthread_thread object on destruction.
2013-09-26 21:30:59 +02:00
Pawel Dziepak
afaa6ed4b3 x86[_64]: Randomize initial stack pointer on alternative signal stacks
If the alternate signal stack is used, randomize the initial stack
pointer in the same way it is randomized on "normal" thread stacks.
Also, update MINSIGSTKSZ value so that regardless of where the new
stack pointer points to there is at least 4k of stack left.
2013-09-21 21:52:13 +02:00
Ithamar R. Adema
e7c330c6f3 ARM: improve error output, fix iframe reporting. 2013-09-19 03:15:06 +02:00
Ithamar R. Adema
1847d8c486 ARM: user_memcpy/memset/strlcpy: fix my horrible ARM assembly
Turns out I was way too green (and tired) last year to code this properly...
now they finally work and the kernel is a lot more stable for it.
2013-09-18 22:20:17 +02:00
Ingo Weinhold
de15b85e5c getgr{nam,gid}[_r](): Fix retrieving group members 2013-09-18 16:33:16 +02:00
Ingo Weinhold
fb8a9c4710 getpw{nam,uid}[_r]: Fix return value behavior
... when the user is not found.
2013-09-18 16:33:16 +02:00
Ingo Weinhold
222fb7a91a getgrgid_r()/getgrname_r(): Fix group not found return value 2013-09-18 16:33:16 +02:00
Ingo Weinhold
9a85313bc6 X86PagingStructuresPAE: Zero fPageDirPointerTable in constructor
... and use it as a guard in the destructor. Fixes crash when running
out of memory and Init() is not called.
2013-09-18 16:33:15 +02:00
Ithamar R. Adema
501b24c63b ARM: kernel: Make 32/64-bit atomics work for ARMv5/6
Support for 64-bit atomic operations for ARMv7+ is currently stubbed
out in libroot, but our current targets do not use it anyway.

We now select atomics-as-syscalls automatically based on the ARM
architecture we're building for. The intent is to do away with
most of the board specifics (at the very least on the kernel side)
and just specify the lowest ARMvX version you want to build for.

This will give flexibility in being able to distribute a single
image for a wide range of devices, and building a tuned system
for one specific core type.
2013-09-18 05:03:18 +02:00
Ingo Weinhold
6dee6653c2 When switching to PAE don't copy not needed PTEs
Now we check whether the virtual address corresponding to the PTE lies
in an allocated virtual address range. This fixes a cause of #8345:
The assertion would trigger when such an entry was encountered. There
might be other causes that trigger the same assertion, though.
2013-09-18 00:42:45 +02:00
Ingo Weinhold
372a666344 X86VMTranslationMapPAE: Add some ktracing for page (un)mapping 2013-09-18 00:42:45 +02:00
Ingo Weinhold
6508ce9f52 X86VMTranslationMapPAE::Map(): More info in assert 2013-09-18 00:42:44 +02:00
Ingo Weinhold
bcb7463650 arch_vm_translation_map_early_map(): Fix debug output 2013-09-18 00:42:44 +02:00
Ingo Weinhold
93495b0354 X86PagingStructuresPAE: clear fVirtualPageDirs in constructor
... not just the first element. Fixes a crash in X86VMTranslationMapPAE
destructor when running out of memory when initializing the map.
2013-09-18 00:42:44 +02:00
Ingo Weinhold
34d0d4d85e dump_page_queue(): fix output
* Determine the cache type per page instead of printing the first page's
  cache type for all pages.
* Use vm_cache_type_to_string().
2013-09-18 00:42:43 +02:00
Ithamar R. Adema
cc65466f0d ARM: kernel: Make KDL more useful on ARM
This adds the -mapcs-frame compiler flag for ARM to have "stable"
stack frames, adds support to the kernel for dumping stack crawls,
and initial support for iframes. There's much more functionality
to unlock in KDL, but this makes debugging already a lot more
comfortable.....
2013-09-17 23:04:59 +02:00
Ithamar R. Adema
740490ba82 ARM: libroot: fix setjmp/longjmp implementation.
Just a couple of lines of code, but a head full of pain ;-) Finally
got it right and now KDL can properly recover from invalid accesses.
2013-09-17 22:26:48 +02:00
Ithamar R. Adema
34ed0fe74a ARM: kernel: fix system_time() when being called too early. 2013-09-17 15:57:36 +02:00
Ithamar R. Adema
dfa5aa0c98 device_manager: Move init_node_tree to after kdl cmd registration
This helps when debugging, since when a driver/module causes a crash
while registering with the device manager, you can actually look at
the device manager state ;-)
2013-09-17 14:42:06 +02:00
Ithamar R. Adema
ba06f07660 ARM: kernel: fix timer resolution and implement basic timekeeping.
The previously used method for programming the timer did not take
into account that our timespec is 64bit while the register we poke
it into is 32 bit. Since the PXA (SoC in Verdex target) has a limited
scale of resolution (us,ms,second) we dynamically determine the one
that we can most closely match, and set that.

For e.g. snooze to work, however, we also need system_time to work.
The current implementation uses a system timer at microsecond
resolution to keep track of time.

Although the code is far from perfect, committing it now before
it gets lost, since I'm working on the infrastructure code
to properly factor out the SoC specific code out of the core
ARM architecture code (so the kernel can support more than
our poor old Verdex QEMU target ;))
2013-09-17 14:42:05 +02:00