6145 Commits

Author SHA1 Message Date
Alexander von Gluck IV
5eeb4163fa platform/u-boot: Make bcm2708 mailbox code functional
* Functional == compiles. Needs testing :-)
2015-03-06 10:17:52 -06:00
Alexander von Gluck IV
5906dbb4d4 platform/u-boot: Work towards using arm mailbox driver
* Reference bcm2708 framebuffer when it makes sense
* Add bcm2708 define to Raspberry Pi board_config.h
2015-03-06 07:47:32 -06:00
Alexander von Gluck IV
7ddf9bcf0d boot/arm: bcm2708 cleanup; no functional change 2015-03-05 22:54:21 -06:00
Alexander von Gluck IV
c798e80b79 raspberry_pi: Move over to u-boot.
* The raspberry_pi loader wasn't in great shape anyway,
  but could still contain some valuable code.
2015-03-05 22:41:47 -06:00
Alexander von Gluck IV
8a06abf132 bcm2708: Convert framebuffer driver over to new format
* Remove from raspberry_pi bootloader (which should die soon)
* Change to use new arm device layout
* Create arch_mailbox to check arm mailboxes (WIP)
2015-03-05 22:41:47 -06:00
Adrien Destugues
1736cb1d59 driver_settings: fix allocating an empty settings
I misread the condition and broke this in 0687a01. Thanks to Axel for
reviewing!
* Refactor the code again to move all the error checking to the top of
the function, to make it easier to read.
2015-01-14 13:39:35 +01:00
Adrien Destugues
bcb793d37b Fix driver_settings in kernel mode outside of drivers.
The API allows creating driver settings which are not added to the
global list, however those were left partially uninitialized, and there
was no way to cleanly delete them.

Tag such unattached settings with a ref_count of -1, and have
delete_driver_settings check for this and handle the case correctly.

Note: #10494 comment 2 says the settings for packagefs shouldn't be
added to the kernel driver settings list, which is why I went with this
solution. An alternative would be always using the list and the
reference counting, but I don't know what the consequences are.

Fixes #10494.
2015-01-14 11:53:19 +01:00
Adrien Destugues
0687a01b53 driver_settings: don't strdup(NULL)
* This is not allowed by strdup POSIX specs and GCC may use its builtin
strdup which doesn't check for it.
* also refactor parse_driver_settings_string to create the
settings_handle using settings_new, to reduce code duplication.
2015-01-14 11:53:18 +01:00
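The NULL guard described in this commit can be sketched as follows. This is a minimal illustration, not the code from driver_settings: `duplicate_string` is a hypothetical helper name, and it uses malloc() and memcpy() rather than strdup() so that it builds on any hosted C++ compiler.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not taken from the Haiku sources): duplicates a
// string, but maps a NULL input to a NULL result instead of passing NULL
// on to a function that is not required to accept it.
static char*
duplicate_string(const char* string)
{
    if (string == NULL)
        return NULL;

    size_t length = strlen(string) + 1;
    char* copy = (char*)malloc(length);
    if (copy != NULL)
        memcpy(copy, string, length);
    return copy;
}
```

The caller still has to free() the result and has to treat NULL as either "no input" or "out of memory", depending on what it passed in.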
PulkoMandy
98731302d8 fssh_api_wrapper: fix build on non-Haiku.
* I'm not sure why strings.h needs to be included before <new>, but it
wouldn't work otherwise.
2015-01-14 10:16:32 +01:00
Adrien Destugues
749bd21c45 rootfs: fs_shell build fix attempt.
Sorry, I can't test all cases when building from Haiku.

Including <new> after the fs shell wrapper makes the compiler fail
because new needs a size_t argument (not an fssh_size_t). But including
it before also fails because it includes C++ typedefs without the fssh
wrapper, leading to conflicts.

Undefining size_t just for the include of <new> isn't very clean, but
seems to work. new gets a size_t argument as it should and the other
typedefs aren't conflicting.
2015-01-13 16:21:14 +01:00
Adrien Destugues
a7d444d145 Remove old block cache implementation
This was not used anywhere except the tests written for it (which are
also removed).
2015-01-13 15:48:59 +01:00
Adrien Destugues
ace74964f1 Remove khash from the sources.
Fixes #9552.
2015-01-13 15:48:58 +01:00
Adrien Destugues
bb3092b298 rootfs: convert to BOpenHashTable.
* Add an fs-shell compatible version of BOpenHashTable in the fs_shell
to keep it working. The header is renamed to KOpenHashTable to avoid a
conflict with the OpenHashTable.h available in private/shared which is
not API compatible.
2015-01-13 15:48:58 +01:00
Adrien Destugues
9d1c3b8d4b block cache: convert to BOpenHashTable. 2015-01-13 15:48:56 +01:00
Rene Gollent
be60c04c89 modules: Fix #11746.
- When normalizing paths of the preloaded modules to their final mounted
  path, remove them from the hash table before updating their path. Otherwise,
  the remove would fail due to the hash no longer matching, which in turn
  would cause the code in question to introduce an infinite loop in the
  hash table's internal linked list due to manually rewriting the next link.
2015-01-12 19:08:24 -05:00
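Why the order matters can be shown with a small intrusive hash table. The sketch below is an assumption-laden illustration: `Module`, `ModuleTable` and `rename_module` are hypothetical names, and the table is a toy chained table, not the kernel's hash table. The point is only that removal locates the bucket from the element's current key, so the key must not change while the element is still linked.

```cpp
#include <cassert>
#include <functional>
#include <stddef.h>
#include <string>

struct Module {
    std::string path;   // key: the bucket index is derived from this
    Module* next;       // intrusive link within a bucket
};

// Toy chained hash table (hypothetical, for illustration only).
class ModuleTable {
public:
    ModuleTable()
    {
        for (size_t i = 0; i < kBucketCount; i++)
            fBuckets[i] = NULL;
    }

    void Insert(Module* module)
    {
        size_t index = _Index(module->path);
        module->next = fBuckets[index];
        fBuckets[index] = module;
    }

    // Searches only the bucket selected by the module's *current* path.
    bool Remove(Module* module)
    {
        Module** link = &fBuckets[_Index(module->path)];
        for (; *link != NULL; link = &(*link)->next) {
            if (*link == module) {
                *link = module->next;
                module->next = NULL;
                return true;
            }
        }
        return false;
    }

    Module* Lookup(const std::string& path) const
    {
        Module* module = fBuckets[_Index(path)];
        while (module != NULL && module->path != path)
            module = module->next;
        return module;
    }

private:
    static const size_t kBucketCount = 8;

    static size_t _Index(const std::string& path)
    {
        return std::hash<std::string>()(path) % kBucketCount;
    }

    Module* fBuckets[kBucketCount];
};

// Remove first, then change the key, then insert again.
static void
rename_module(ModuleTable& table, Module* module, const std::string& newPath)
{
    bool removed = table.Remove(module);
    assert(removed);
    (void)removed;
    module->path = newPath;
    table.Insert(module);
}
```

If the path were changed before the Remove() call, the lookup would usually search the wrong bucket and fail, leaving a stale link behind.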
Adrien Destugues
3395fdcd6a gcc4 build fix.
* offsetof is not allowed on non-POD types so we need to use
offset_of_member (gcc2 accepts offsetof, and C++11 relaxed the
constraints on where it is allowed so it should work there too)
* we have offset_of_member as a workaround until we switch to C++11,
move it from khash (which is soon to be removed) to list.h which is the
other place where it is used (for this one single call in our whole
codebase)

Also fix a typo in vfs.cpp.
2015-01-12 19:04:33 +01:00
Adrien Destugues
f9defd4526 VFS: migrate to BOpenHashTable. 2015-01-12 18:23:45 +01:00
Adrien Destugues
6235b4967b More useless inclusions of khash.h 2015-01-12 18:23:45 +01:00
Adrien Destugues
79b12613f5 legacy_drivers: convert to BOpenHashTable. 2015-01-12 18:23:44 +01:00
Adrien Destugues
887be0ac6a kernel/module: fully convert to BOpenHashTable 2015-01-12 18:23:44 +01:00
Adrien Destugues
c4718ea973 Missing std::nothrow on new
Forgot to add this when migrating to BOpenHashTable.
2015-01-12 09:46:40 +01:00
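For context, a non-throwing allocation with an explicit NULL check looks roughly like this. `Entry` and `create_entry` are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <new>
#include <stddef.h>

struct Entry {
    int value;
};

// With std::nothrow, a failed allocation yields NULL instead of throwing
// std::bad_alloc, so the caller must check the result.
static Entry*
create_entry(int value)
{
    Entry* entry = new(std::nothrow) Entry;
    if (entry == NULL)
        return NULL;

    entry->value = value;
    return entry;
}
```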
Rene Gollent
6e9704175e kernel: Style fix. 2015-01-09 19:49:52 -05:00
Rene Gollent
d05a5a70e0 kernel: Fix ELF hashtable iterator handling.
As a result of the refactoring for OpenHashTable, the iterator semantics
have changed a bit, such that the end of the table is no longer signalled
by the iterator returning NULL. This wasn't taken into account during
refactoring, which would lead to various places returning the last item
in the list in the case where no matching item was found, causing e.g.
drivers not to be loaded properly. This fixes the boot hang regressions
introduced in hrev48640.
2015-01-09 14:42:13 -05:00
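The class of bug described here can be sketched with a HasNext()/Next() style iterator. `Item`, `ItemIterator` and `find_item` are hypothetical stand-ins, not the kernel's types; the sketch only shows that "not found" has to be reported explicitly once the loop no longer ends on a NULL element.

```cpp
#include <cassert>
#include <stddef.h>

struct Item {
    int id;
};

// Hypothetical iterator: the end is signalled by HasNext(), not by
// Next() returning NULL.
class ItemIterator {
public:
    ItemIterator(Item* items, size_t count)
        : fItems(items), fCount(count), fIndex(0)
    {
    }

    bool HasNext() const { return fIndex < fCount; }
    Item* Next() { return &fItems[fIndex++]; }

private:
    Item* fItems;
    size_t fCount;
    size_t fIndex;
};

static Item*
find_item(Item* items, size_t count, int id)
{
    ItemIterator iterator(items, count);
    while (iterator.HasNext()) {
        Item* item = iterator.Next();
        if (item->id == id)
            return item;
    }

    // No match: return NULL rather than the last item visited.
    return NULL;
}
```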
Adrien Destugues
3b3cad8468 kernel elf: Fix Compare function
I forgot to change the function to return true on equality, instead of
returning the difference as khash required. Fixes a panic on boot.
2015-01-09 21:31:34 +01:00
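The difference between the two comparison conventions can be sketched as follows. The names are hypothetical and the definition struct is only loosely modeled on a hash table definition class; it is not the kernel's ELF code.

```cpp
#include <cassert>
#include <stddef.h>
#include <string.h>

struct Image {
    const char* name;
    Image* next;
};

// Difference-style comparison: 0 means "equal", as with strcmp().
static int
compare_difference_style(const Image* image, const char* key)
{
    return strcmp(image->name, key);
}

// Boolean-style comparison: true means "equal".
struct ImageHashDefinition {
    typedef const char* KeyType;
    typedef Image ValueType;

    bool Compare(KeyType key, const ValueType* value) const
    {
        return strcmp(value->name, key) == 0;
    }
};
```

Returning the strcmp() result directly from a bool function compiles without complaint but inverts the meaning: every non-matching entry would then count as a match.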
Adrien Destugues
271ac910a4 Remove useless includes of khash.h
* These files were already converted to BOpenHashTable.
* For #9552.
2015-01-09 18:09:12 +01:00
Adrien Destugues
6a89f8040f devfs: migrate to BOpenHashTable
For #9552.
2015-01-09 18:09:10 +01:00
Adrien Destugues
69ff01cb9e Migrate image hash table to BOpenHashTable.
For #9552.
2015-01-09 18:09:09 +01:00
Adrien Destugues
57f933d348 CID603224: missing break in parsedate.
Could lead to wrongly setting the TYPE_MINUTE flag for an invalid (>59)
number of minutes. Harmless, as that flag is never used.
For completeness, also set the flag for seconds (also never used).

Fixes #11552.
2014-12-18 15:55:47 +01:00
Puck Meerburg
c038f26da8 Import div.c patch by Tri-Edge AI
Signed-off-by: Jérôme Duval <jerome.duval@gmail.com>
2014-12-14 18:06:21 +01:00
Puck Meerburg
a5f30beaad Fix #7008: Add a64l and l64a from glibc, and add some missing definitions in wchar.h and stdlib.h
Signed-off-by: Jérôme Duval <jerome.duval@gmail.com>
2014-12-14 18:06:09 +01:00
Adrien Destugues
38ec030ace FileDevice: implement icon ioctls
Fixes #9320.
2014-12-02 09:04:56 +01:00
Adrien Destugues
56abf4aa37 Fix std::isnan and friends for gcc2.
gcc2 was relying on the c99 functions being there, but they are not in
the std namespace.
* Disable the C99 functions and macros in C++ mode
* Redefine them as inline functions in cmath in the std namespace.

Fixes #7396.
2014-11-27 10:58:49 +01:00
Jérôme Duval
f3e381dd0c bios_ia32: for correctness, add clobber memory for asm invlpg.
* generated code is the same for x86_gcc2 and x86_64.
* fixed TRACE build for mmu.cpp.
2014-11-17 20:17:30 +01:00
PulkoMandy
8068b64b5c Fix build with guarded heap on x86_64
* Type mismatch.
2014-11-14 12:55:50 +01:00
Adrien Destugues
db214549c5 guarded_heap: fix build (volatile + atomic ops)
Unfortunately, the package manager uses more kernel memory and it's not
possible to boot to the desktop with the guarded heap anymore.
2014-11-12 16:20:24 +01:00
Adrien Destugues
7b4084f717 reject partitions with negative offset
I had a KDL when trying to read an audio CD which apparently uses this
as a copy protection scheme.
I don't know if this is the right place to do this; the KDL would happen
further down when the intel partitioning system or bfs would try to
read data from the disk at offset -2048.
2014-11-11 17:13:03 +01:00
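A validity check of this kind might look like the sketch below. `is_valid_partition_range` is a hypothetical name, and the overflow check goes beyond what the commit message states; it is included only because it is the natural companion to the sign check.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical check: offsets and sizes are signed 64-bit values, so a
// corrupt or deliberately odd table can describe a negative offset.
static bool
is_valid_partition_range(int64_t offset, int64_t size)
{
    if (offset < 0 || size < 0)
        return false;

    // Reject ranges whose end (offset + size) would overflow.
    return offset <= INT64_MAX - size;
}
```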
Alexander von Gluck IV
57bc65034a Everything: Update lots of code to use B_COUNT_OF macro
* Likely not everything, but the obvious uses of B_COUNT_OF
2014-11-09 14:52:19 -06:00
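As an illustration of the idiom, the sketch below defines a local stand-in for the macro so that it compiles outside Haiku. The definition shown is the usual sizeof-ratio form and is an assumption here, not a copy of the system header; `kPrimes` and `sum_primes` are made up for the example.

```cpp
#include <cassert>
#include <stddef.h>

// Local stand-in, assumed to behave like the system macro.
#ifndef B_COUNT_OF
#   define B_COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))
#endif

static const int kPrimes[] = { 2, 3, 5, 7, 11 };

// The loop bound follows the array automatically when entries are added.
static int
sum_primes()
{
    int sum = 0;
    for (size_t i = 0; i < B_COUNT_OF(kPrimes); i++)
        sum += kPrimes[i];
    return sum;
}
```

Like any sizeof-based count, this only works on true arrays, not on pointers that an array has decayed into.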
Michael Lotz
9a6331459f kernel: Fix build with KDEBUG_LEVEL < 2.
The lock caller info isn't available in such a configuration.
2014-11-04 23:00:59 +01:00
François Revol
76b8f002e1 Implement lseek(SEEK_END) on devices
While the partitioning system does publish partitions as block
devices and report their size in stat(), the old BeOS-style
drivers have no means of reporting it this way.
So we fall back to ioctl(B_GET_GEOMETRY) to find out the size.
2014-11-04 20:47:04 +01:00
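The fallback amounts to deriving a byte size from the reported geometry. The sketch below assumes a geometry with four fields; `geometry_info` is a hypothetical stand-in for whatever structure the driver fills in, and the function names are made up.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical stand-in for the geometry a driver reports.
struct geometry_info {
    uint32_t bytes_per_sector;
    uint32_t sectors_per_track;
    uint32_t cylinder_count;
    uint32_t head_count;
};

// Multiply in 64 bits so that large devices do not overflow.
static int64_t
device_size_from_geometry(const geometry_info& geometry)
{
    return (int64_t)geometry.bytes_per_sector * geometry.sectors_per_track
        * geometry.cylinder_count * geometry.head_count;
}

// SEEK_END positions are relative to the device size.
static int64_t
seek_end_position(const geometry_info& geometry, int64_t offset)
{
    return device_size_from_geometry(geometry) + offset;
}
```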
François Revol
b7ff6340ae U-Boot: always initialize args.arguments_count 2014-11-04 16:01:16 +01:00
François Revol
361f5a857f ARM: OMAP3: dynamically allocate the framebuffer
It seems to work on overo at least, which has only 128MB by default in QEMU.
2014-11-03 21:06:20 +01:00
François Revol
1cac4300c3 ARM: Add an mmu_get_virtual_mapping() call to bootloader
Will be needed to figure out the framebuffer address
once we allocate it properly instead of hardcoding.
2014-11-03 20:49:01 +01:00
François Revol
cb3ea122d3 U-Boot: delay checking /chosen:bootargs after remapping FDT
This avoids having to copy the strings.
For now we disregard argv[] as it is not remapped before
being used in add_stage2_driver_settings() and is not used
by the linux entry point.

This makes the overo loader panic at the same place as
the beagle xm one now, even though it fails to display
anything with the default RAM size since we allocate
the framebuffer beyond 128MB...
2014-11-03 19:41:38 +01:00
Michael Lotz
eac94f5db9 kernel: Also push lock caller in acquire_spinlock_nocheck. 2014-11-02 00:04:28 +01:00
Michael Lotz
41418981f4 kernel: Sync panic messages across acquire_spinlock versions.
* Always include last caller and lock value on both UP and MP path.
* Change lock value printing to hex format, as 0xdeadbeef is more
  obvious than its decimal counterpart.
2014-11-02 00:04:27 +01:00
Jessica Hamilton
22ea34153f access: fix to be POSIX compliant 2014-11-02 08:51:24 +13:00
François Revol
564a073b01 ARM: move uEnv.txt content to BoardSetup file
That's really where it belongs. Not all boards will need it,
but for now it's always created.
2014-11-01 19:57:48 +01:00
François Revol
95e9515c4b U-Boot: drop the bind on the flash image action 2014-11-01 19:08:57 +01:00
François Revol
92fcf262ff ARM: Check for RAM size in FDT
We skip the check when we already have ranges inserted,
like from the raspberry Pi start code, and we fall back to
32MB at SDRAM_BASE if not found.
2014-11-01 18:53:48 +01:00
François Revol
8d8bda071f U-Boot: generate a separate uImage for the boot tgz as well
We need this when using the linux entry point.
2014-11-01 17:11:01 +01:00
François Revol
d1ebf9716d U-Boot: ARM: Add a linux entry point to asm shell code
While the NetBSD entry point is handy as we can use a single uImage
with all 3 blobs, it bypasses U-Boot's own patching of the FDT since
it's not visible to it, so we won't get the RAM size and other things
through it.
2014-11-01 17:09:09 +01:00
François Revol
5de5d59d78 U-Boot: move gUImage and gFDT back to BSS section
No need for this trick anymore.
2014-11-01 16:39:44 +01:00
François Revol
909a14bb55 U-Boot: introduce a start_gen() catch-all entry
So we can pass it all the optional stuff instead of playing tricks
to initialize them outside of BSS.
2014-11-01 16:39:44 +01:00
Michael Lotz
bf685cdf2e kernel: Fix missing reference release in CreateThreadEvent.
CreateThreadEvent::DoDPC() missed a reference release to balance the
acquired reference before queuing the DPC, resulting in the
CreateThreadEvent objects being leaked.

This also removes the destructor that tried to cancel the DPC. Since
the class is reference counted and only destroyed when the DPC has
run and released the last reference, this didn't make much sense.
2014-11-01 16:32:04 +01:00
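The acquire/release balance around a deferred call can be sketched like this. Everything here is hypothetical and single-threaded: `Event` and `DeferredQueue` are toy classes, the count is not atomic, and a real reference-counted object would delete itself when the count reaches zero.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy reference-counted object (not thread-safe, illustration only).
class Event {
public:
    Event() : fReferenceCount(1) {}

    void AcquireReference() { fReferenceCount++; }

    void ReleaseReference()
    {
        assert(fReferenceCount > 0);
        fReferenceCount--;
    }

    int CountReferences() const { return fReferenceCount; }

private:
    int fReferenceCount;
};

// Toy queue of callbacks that run later.
class DeferredQueue {
public:
    void Add(const std::function<void()>& callback)
    {
        fPending.push_back(callback);
    }

    void RunAll()
    {
        for (size_t i = 0; i < fPending.size(); i++)
            fPending[i]();
        fPending.clear();
    }

private:
    std::vector<std::function<void()> > fPending;
};

static void
queue_event(DeferredQueue& queue, Event& event)
{
    // Keep the event alive until the deferred callback has run.
    event.AcquireReference();
    queue.Add([&event]() {
        // ... the deferred work would happen here ...
        // Balance the reference taken before queuing.
        event.ReleaseReference();
    });
}
```

Leaving out the release in the callback is the leak pattern: the count never returns to its value before queuing.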
François Revol
1309cdade9 U-Boot: rework flash image rule to be more flexible
We can now specify arbitrary content and offsets for each.

Change the default block size to 1k.
2014-11-01 05:37:06 +01:00
François Revol
f680a1a723 U-Boot: skip flash-image targets if no U-Boot image is passed
When building flash images we want a U-Boot binary for now.
Testing for it avoids dd waiting for input on stdin
without leaving any clue.
2014-11-01 02:15:09 +01:00
François Revol
88d51506d0 Move ARM device tree files to an arch-specific subfolder
FDT are also used on PPC at least, and at least skeleton.dtsi
might clash since there is a different one for PPC.
2014-10-31 16:28:48 +01:00
Michael Lotz
6a80e6889a kernel: Fix missing reference to team/thread in signal events.
The signal to the team/thread is only actually sent in a deferred
procedure. To ensure that the team/thread stays valid between the DPC
being queued and it actually running, we need to acquire a reference.

Fixes #11390, where the DPC was run after the team was already
destroyed.
2014-10-31 16:16:37 +01:00
Ithamar R. Adema
5d8ce4733c ARM: u-boot: Generate DTB and include in uImage 2014-10-31 12:08:03 +01:00
Ithamar R. Adema
a52dd58d2d ARM: kernel: introduce SoC abstraction
This introduces InterruptController and HardwareTimer classes to
handle the SoC specific implementations of timers and ints for
the ARM platform.

These could be improved and moved to a more 'generic' level once
we're confident they are 'good enough'.

NOTE: The OMAP timer implementation is fully untested and probably
      completely non-functional....
2014-10-31 11:37:02 +01:00
Ithamar R. Adema
1628632584 ARM: u-boot: fixup FDT handling
If we find an FDT (either from uImage or otherwise) we make sure
we map it after mmu_init() and use kernel_args to pass it to the
kernel (so it is available at all times there).
2014-10-31 11:21:38 +01:00
Ithamar R. Adema
b794d1f947 ARM: platform: grab the FDT from the bootloader 2014-10-31 11:19:00 +01:00
Ithamar R. Adema
f4c28fe71f loader: make sure bfs debug output ends up in log
Use dprintf instead of printf so any debug output ends up in
bootloader log instead of only being displayed on-screen.
2014-10-31 10:53:28 +01:00
Henry Harrington
601b2f7eda vm: Try harder to allocate early physical pages.
* On UEFI, pages are allocated top-down; previously,
  VM would fail to allocate early pages due to
  running into pages allocated at the top and
  assume it had run out of pages to map.

Signed-off-by: Jessica Hamilton <jessica.l.hamilton@gmail.com>
2014-10-31 13:42:48 +13:00
Ingo Weinhold
dfb13a8716 Increase the size of the kernel FD table
With packagefs potentially opening quite a few packages the default of
256 slots is a bit tight. It's 4096 now, which should be safe for a
while, but we might want to consider resizing the table dynamically and
probably even switching to another algorithm for allocating the slots.

Should fix #11328.
2014-10-29 21:07:03 +01:00
Ingo Weinhold
6bbd25f071 Make vfs_resize_fd_table() accessible in the kernel
Also update some types from int to uint32.
2014-10-29 21:07:02 +01:00
Ingo Weinhold
7ca277b9ca vm_soft_fault(): remove unused wiredRange parameter 2014-10-29 12:37:25 +01:00
Ingo Weinhold
078a965f65 vm_soft_fault(): Avoid deadlock waiting for wired ranges
* VMArea::AddWaiterIfWired(): Replace the ignoreRange argument by a
flags argument and introduce (currently only) flag
IGNORE_WRITE_WIRED_RANGES. If specified, ranges wired for writing
are ignored. Ignoring just a single specified range doesn't cut it
in vm_soft_fault(), and there aren't any other users of that feature.
* vm_soft_fault(): When having to unmap a page of a lower cache, this
page cannot be wired for writing. So we can safely ignore all
write-wired ranges, instead of just our own. We even have to do that
in case there's another thread that concurrently tries to write-wire
the same page, since otherwise we'd deadlock waiting for each other.
2014-10-29 12:37:25 +01:00
Adrien Destugues
f26118f286 Change error code for already mounted partition to B_BAD_VALUE.
As Axel pointed out, B_BAD_DATA is not the correct code here. B_BUSY
could be used but I wanted a code different from the existing one for
"partition already being initialized".
2014-10-29 08:44:52 +01:00
Ingo Weinhold
8ef857d85c vm_soft_fault(): Avoid inconsistent state when seeing wired page
When we encounter a wired page that we'd have to unmap to map our newly
allocated one, we need to get rid of the latter before unlocking
everything and waiting for the wired page. Otherwise we'd leave things
in an inconsistent state (a page from an upper cache shadowing a mapped
page from a lower cache).
2014-10-29 02:36:09 +01:00
Ingo Weinhold
699b57307e VMAnonymousCache::_MergePagesSmallerConsumer(): Add ASSERT 2014-10-29 02:36:08 +01:00
Ingo Weinhold
9da590f73e Add vm_page_free_etc()
It additionally gets a vm_page_reservation* argument. If not NULL, the
page count of the reservation is incremented for the freed page.
2014-10-29 02:36:08 +01:00
Ingo Weinhold
70d3bd5592 vm_soft_fault(): Missing DEBUG_PAGE_ACCESS_END()
... in case we'd need to unmap a page that is wired.

Fixes the immediate issue of #10977. There's a problem remaining (as
discussed in comment 1): If two threads want to wire the same page at
the same time (which led to the assertion being triggered), they will
now deadlock, waiting for each other to remove the pre-registered
VMAreaWiredRange.
2014-10-29 02:36:08 +01:00
Michael Lotz
52d500e5b4 kernel: Workaround for double lock of spinlock in user timers.
The thread that is being [un]scheduled already has its time_lock locked
in {stop|continue}_cpu_timers(). When updating the TeamTimeUserTimer,
the team is asked for its cpu time. Team::CPUTime() then iterates the
threads of the team and locks the time_lock of the thread again.

This workaround passes a possibly locked thread through the relevant
functions so Team::CPUTime() can decide whether or not a thread it
iterates needs to be locked or not.

This works around #11032 and its duplicates #11314 and #11344.
2014-10-29 00:25:37 +01:00
Adrien Destugues
4ed39e6a62 disk device manager: check that partitions are unmounted before uninitializing.
when uninitializing a partition or a disk (removing the partition
table), check that all partitions from that table are unmounted, as they
are about to become invalid.

Fixes #8827.
2014-10-28 23:52:57 +01:00
PulkoMandy
6879e9df77 Tarfs: fix traces 2014-10-28 08:49:05 +01:00
Ithamar R. Adema
ed04ffb598 ARM: keep all pages we've mapped during kernel startup
Don't just keep the page directory, but also the actual allocated
pages for the pagetables we've created.
2014-10-26 23:43:35 +01:00
Ithamar R. Adema
a17ff8279b ARM: make sure we cleanup after the bootloader
The "2nd" assert that we always ran into was due to bootloader mappings
still being active after VM init. Turns out we missed a call in the
architecture specific code for cleaning this up.

Many thanks to Ingo for spending the time to figure this out!
2014-10-26 23:23:30 +01:00
Ingo Weinhold
4ce1f197fa x86 boot loader: check gKernelArgs.arch_args.pgtables overflow 2014-10-26 22:34:52 +01:00
Michael Lotz
831abecd6a kernel: Fix unbalanced release of sync object in FD select race.
When a file descriptor is closed between being selected and adding the
select info to its IO context, the select info needs to be cleaned up.
This is done by deselect_select_infos() which unconditionally also puts
the select_sync associated with the infos. In this special case we do
not yet hold a reference to the select_sync however, so avoid putting
the corresponding sync object.

Fixes #11098, #10763 and #10230.
2014-10-26 00:30:08 +02:00
Ithamar R. Adema
9c71c67140 ARM: Fix OMAP3 framebuffer divider setting
QEMU was crashing since when setting the DSS divider we were _clearing_
the TV divider, and QEMU did not check for a divide by zero.

This "fixes" the QEMU crash and gets us a working framebuffer on Beagle ;)
2014-10-25 14:49:51 -07:00
Axel Dörfler
5a95af70a2 vfs/{b|btr|package|b}fs/ext2/exfat: common access check.
* Added VFS helper function check_access_permissions() that combines
  several partially correct versions to the one true version (tm).
* All but BFS (since recently) missed the S_IXOTH for root on directories,
  and all but packagefs missed proper group handling.
2014-10-25 18:47:15 +02:00
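A simplified POSIX-style check covering the two points mentioned (execute for root, and group handling) might look like the sketch below. It is not the VFS helper itself: `may_access` and the constants are hypothetical, only a single group per caller is considered, and the rule that root gets execute on directories or on files with at least one execute bit is the conventional behaviour assumed here.

```cpp
#include <cassert>

// Requested access bits, laid out like one rwx triplet.
enum {
    kRead = 4,
    kWrite = 2,
    kExecute = 1
};

// Hypothetical sketch of a common access check.
static bool
may_access(int accessMode, unsigned mode, bool isDirectory,
    unsigned fileUser, unsigned fileGroup, unsigned user, unsigned group)
{
    unsigned permissions;
    if (user == 0) {
        // root: read and write always; execute on directories, or on
        // files that have at least one execute bit set.
        permissions = kRead | kWrite;
        if (isDirectory || (mode & 0111) != 0)
            permissions |= kExecute;
    } else if (user == fileUser) {
        permissions = (mode >> 6) & 7;
    } else if (group == fileGroup) {
        permissions = (mode >> 3) & 7;
    } else {
        permissions = mode & 7;
    }

    // Every requested bit must be granted.
    return (accessMode & ~permissions) == 0;
}
```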
Ithamar R. Adema
2ce0d69a7e ARM: fix bootloader's mmu_map_physical_memory size
When the address is not page aligned, not only adjust the address
to start mapping, but also take the "overflow" on the last page
into account.

This makes the bootloader boot again ;)
2014-10-25 09:43:15 -07:00
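The size adjustment can be illustrated with plain arithmetic. The sketch assumes 4 KiB pages and 32-bit addresses; `MappedRange` and `page_align_range` are hypothetical names, not the bootloader's.

```cpp
#include <cassert>
#include <stdint.h>

static const uint32_t kPageSize = 4096;     // assumed page size

struct MappedRange {
    uint32_t base;  // page-aligned start of the mapping
    uint32_t size;  // page-aligned length of the mapping
};

static MappedRange
page_align_range(uint32_t address, uint32_t size)
{
    // Distance of the address from the start of its page.
    uint32_t offset = address & (kPageSize - 1);

    MappedRange range;
    range.base = address - offset;
    // Add the offset back before rounding up, so the spill onto the
    // last page is covered as well.
    range.size = (size + offset + kPageSize - 1) & ~(kPageSize - 1);
    return range;
}
```

For example, 0x1000 bytes starting at 0x1800 touch two pages, so the mapping has to cover 0x2000 bytes from 0x1000; rounding the size alone would map only one page.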
Axel Dörfler
8efd5b7613 vfs: check the X permission on set cwd.
* When you change the current working directory, you actually
  should have the permission to enter that directory.
* This gives us a 0.04% better score on the perl test suite :-)
2014-10-25 15:57:38 +02:00
Michael Lotz
e9922e775f haiku_loader: Fix wrong size of gBootGDT on x86_64.
The BOOT_GDT_SEGMENT_COUNT was based on USER_DATA_SEGMENT on both
x86 and x86_64. However, on x86_64 the order of the segments is
different, leading to a too small gBootGDT array. Move the define to
the arch specific headers so they can be setup correctly in either case.
Also add a STATIC_ASSERT() to check that the descriptors fit into the
array.

Pointed out by CID 1210898.
2014-10-22 21:06:07 +02:00
Michael Lotz
368dd37798 runtime_loader: Fix missing include of util/kernel_cpp.h.
Due to the missing include, the builtin new and delete operators were
used in those two files instead of the ones from the include used
everywhere else in the runtime_loader.
2014-10-18 21:58:08 +02:00
Michael Lotz
8ea3e9126d Typo: Fix doubled "not" in comment. 2014-10-18 19:32:33 +02:00
Adrien Destugues
7554bc9a19 wctype: out of bound access in POSIX locale.
The POSIX locale has gLocaleRoster = NULL and relies on the non-wide
version of the implementation. However it doesn't check that the
characters are actually in range which leads to out of bound access and
crashes in __isctype.

Fixes #11322.
2014-10-06 16:54:31 +02:00
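The missing range check can be sketched with a table-driven classifier. `has_class` and its parameters are hypothetical; the sketch only shows that a wide character value must be tested against the table size before it is used as an index.

```cpp
#include <cassert>
#include <stddef.h>

// Hypothetical lookup: 'table' holds one class bitmask per narrow
// character; anything outside the table has no class at all.
static bool
has_class(const unsigned short* table, size_t tableSize,
    unsigned long character, unsigned short mask)
{
    if (character >= tableSize)
        return false;

    return (table[character] & mask) != 0;
}
```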
Lioncash
a4a9dade68 boot: Fix some always false conditions 2014-09-26 20:56:04 +02:00
Paweł Dziepak
9c5c599041 kernel: pagecache: provided buffers are not always in user memory
Source or destination buffers passed to pagecache functions may belong
to kernel memory (e.g. when the caller is packagefs). Because of that
we should tell vm_memcpy_{from, to}_physical() the truth, not assume that all
buffers are in user memory. That's important because user memory page fault
handlers cannot be nested and these functions may be used while handling
a page fault.

With high probability fixes #11246.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-25 21:57:32 +02:00
Jérôme Duval
c8990b0907 _user_wait_for_objects: remove redundant check. 2014-09-17 21:04:14 +02:00
Jessica Hamilton
f0b0d6578b Undo accidental file mode changes. 2014-09-15 16:38:30 +12:00
Paweł Dziepak
95e97463d2 kernel: add generic wrapper for accessing user memory
This patch adds user_access() which can be used to gracefully handle
page faults that may happen when accessing user memory. It is used
by arch_cpu_user{memcpy, memset, strlcpy}() to allow using optimized
functions from the standard library.

Currently only x64 uses this, but nothing really is arch specific here.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 22:39:07 +02:00
Paweł Dziepak
4582b6e3a3 libroot/x86_64: new memcpy implementation
This patch introduces a new memcpy() implementation that improves the
performance when the buffer is small. It was written for processors that
support ERMSB, but performs reasonably well on older CPUs as well.

The following benchmarks were done on Haswell i7 running Debian Jessie
with Linux 3.16.1. In each iteration 64MB buffer was copied, the
parameter "size" is the size of the buffer passed in a single call (i.e.
for "size: 2" memcpy() was called ~32 million times to copy the whole
64MB).

f - original implementation, g - new implementation, all buffers 16 byte
aligned

cpy, size:        8, f:    79971 µs, g:    20419 µs, ∆:   74.47%
cpy, size:       32, f:    42068 µs, g:    12159 µs, ∆:   71.10%
cpy, size:      128, f:    13408 µs, g:    10359 µs, ∆:   22.74%
cpy, size:      512, f:    10634 µs, g:    10433 µs, ∆:    1.89%
cpy, size:     1024, f:    10474 µs, g:    10536 µs, ∆:   -0.59%
cpy, size:     4096, f:     9419 µs, g:     8630 µs, ∆:    8.38%

f - glibc 2.19 implementation, g - new implementation, all buffers 16 byte
aligned

cpy, size:        8, f:    26299 µs, g:    20919 µs, ∆:   20.46%
cpy, size:       32, f:    11146 µs, g:    12159 µs, ∆:   -9.09%
cpy, size:      128, f:    10778 µs, g:    10354 µs, ∆:    3.93%
cpy, size:      512, f:    12291 µs, g:    10426 µs, ∆:   15.17%
cpy, size:     1024, f:    13923 µs, g:    10571 µs, ∆:   24.08%
cpy, size:     4096, f:    11770 µs, g:     8671 µs, ∆:   26.33%

f - glibc 2.19 implementation, g - new implementation, all buffers unaligned

cpy, size:       16, f:    13376 µs, g:    13009 µs, ∆:    2.74%
cpy, size:       32, f:    11130 µs, g:    12171 µs, ∆:   -9.35%
cpy, size:       64, f:    11017 µs, g:    11231 µs, ∆:   -1.94%
cpy, size:      128, f:    10884 µs, g:    10407 µs, ∆:    4.38%
cpy, size:      256, f:    10826 µs, g:    10106 µs, ∆:    6.65%
cpy, size:      512, f:    12354 µs, g:    10396 µs, ∆:   15.85%

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:52 +02:00
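The ∆ column in the tables above appears to be the relative time saved by g compared to f, that is (f - g) / f expressed as a percentage. This reading is inferred from the numbers, not stated in the commit. A sketch with a hypothetical function name:

```cpp
#include <cassert>

// Relative time saved by the new implementation (g) over the reference
// (f), in percent. Negative values mean g was slower.
static double
improvement_percent(double fMicroseconds, double gMicroseconds)
{
    assert(fMicroseconds > 0.0);
    return (fMicroseconds - gMicroseconds) / fMicroseconds * 100.0;
}
```

With f = 79971 µs and g = 20419 µs this gives about 74.47%, matching the first row of the first table.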
Paweł Dziepak
1d7b716f84 libroot/x86_64: new memset implementation
This patch introduces a new memset() implementation that improves the
performance when the buffer is small. It was written for processors that
support ERMSB, but performs reasonably well on older CPUs as well.

The following benchmarks were done on Haswell i7 running Debian Jessie
with Linux 3.16.1. In each iteration 64MB buffer was memset()ed, the
parameter "size" is the size of the buffer passed in a single call (i.e.
for "size: 2" memset() was called ~32 million times to memset the whole
64MB).

f - original implementation, g - new implementation, all buffers 16 byte
aligned

set, size:        8, f:    66885 µs, g:    17768 µs, ∆:   73.44%
set, size:       32, f:    17123 µs, g:     9163 µs, ∆:   46.49%
set, size:      128, f:     6677 µs, g:     6919 µs, ∆:   -3.62%
set, size:      512, f:    11656 µs, g:     7715 µs, ∆:   33.81%
set, size:     1024, f:     9156 µs, g:     7359 µs, ∆:   19.63%
set, size:     4096, f:     4936 µs, g:     5159 µs, ∆:   -4.52%

f - glibc 2.19 implementation, g - new implementation, all buffers 16 byte
aligned

set, size:        8, f:    19631 µs, g:    17828 µs, ∆:    9.18%
set, size:       32, f:     8545 µs, g:     9047 µs, ∆:   -5.87%
set, size:      128, f:     8304 µs, g:     6874 µs, ∆:   17.22%
set, size:      512, f:     7373 µs, g:     7486 µs, ∆:   -1.53%
set, size:     1024, f:     9007 µs, g:     7344 µs, ∆:   18.46%
set, size:     4096, f:     8169 µs, g:     5146 µs, ∆:   37.01%

Apparently, glibc uses SSE even for large buffers and therefore does not
take advantage of ERMSB:

set, size:    16384, f:     7007 µs, g:     3223 µs, ∆:   54.00%
set, size:    32768, f:     6979 µs, g:     2930 µs, ∆:   58.02%
set, size:    65536, f:     6907 µs, g:     2826 µs, ∆:   59.08%
set, size:   131072, f:     6919 µs, g:     2752 µs, ∆:   60.23%

The new implementation handles unaligned buffers quite well:

f - glibc 2.19 implementation, g - new implementation, all buffers unaligned

set, size:       16, f:    10045 µs, g:    10498 µs, ∆:   -4.51%
set, size:       32, f:     8590 µs, g:     9358 µs, ∆:   -8.94%
set, size:       64, f:     8618 µs, g:     8585 µs, ∆:    0.38%
set, size:      128, f:     8393 µs, g:     6893 µs, ∆:   17.87%
set, size:      256, f:     8042 µs, g:     7621 µs, ∆:    5.24%
set, size:      512, f:     9661 µs, g:     7738 µs, ∆:   19.90%

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:52 +02:00
Paweł Dziepak
718fd007a6 kernel/x86_64: clear xmm0-15 registers on syscall exit
As Alex pointed out we can leak possibly sensitive data in xmm registers
when returning from the kernel. To prevent that xmm0-15 are zeroed
before sysret or iret. The cost is negligible.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:52 +02:00
Paweł Dziepak
396b74228e kernel/x86_64: save fpu state at interrupts
The kernel is allowed to use fpu anywhere so we must make sure that
user state is not clobbered by saving fpu state at interrupt entry.
There is no need to do that in case of system calls since all fpu
data registers are caller saved.

We do not need, though, to save the whole fpu state at task switch
(again, thanks to calling convention). Only status and control
registers are preserved. This patch actually adds xmm0-15 register
to clobber list of task switch code, but the only reason of that is
to make sure that nothing bad happens inside the function that
executes that task switch. Inspection of the generated code shows
that no xmm registers are actually saved.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:52 +02:00
Paweł Dziepak
b41f281071 boot/x86_64: enable sse early
Enable SSE as a part of the "preparation of the environment to run any
C or C++ code" in the entry points of stage2 bootloader.

SSE2 is going to be used by memset() and memcpy().

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:52 +02:00
Paweł Dziepak
acad7bf64a kernel/x86_64: make sure stack is properly aligned in syscalls
Just following the path of least resistance and adding andq $~15, %rsp
where appropriate. That should also make things harder to break
when changing the amount of stuff placed on stack before calling the
actual syscall routine.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:52 +02:00
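The effect of `andq $~15, %rsp` is to round the stack pointer down to the next 16-byte boundary. The same arithmetic in C++, with a hypothetical function name:

```cpp
#include <cassert>
#include <stdint.h>

// Clears the low four bits, i.e. rounds down to a multiple of 16.
// The stack grows downwards, so rounding down never exposes memory
// that is already in use above the pointer.
static uint64_t
align_stack_down(uint64_t stackPointer)
{
    return stackPointer & ~(uint64_t)15;
}
```

Because the mask is applied regardless of how much was pushed before, the result stays aligned even if the amount of data placed on the stack changes later.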
Paweł Dziepak
f2f91078bd kernel/x86_64: remove memset and memcpy from commpage
There is absolutely no reason for these functions to be in commpage,
they don't do anything that involves the kernel in any way.

Additionally, this patch rewrites memset and memcpy in C++; the current
implementation is quite simple (though it may perform surprisingly
well when dealing with large buffers on cpus with ermsb). Better
versions are coming soon.

Signed-off-by: Paweł Dziepak <pdziepak@quarnos.org>
2014-09-14 19:16:52 +02:00