Commit Graph

1075 Commits

Author SHA1 Message Date
christos 00f17ebc18 Use defined constant instead of direct value (Etienne Brateau) 2021-10-28 15:09:08 +00:00
christos b0d97acfad Fix build with -Werror=array-parameter (Etienne Brateau) 2021-10-28 15:08:05 +00:00
andvar 50d9072672 remove duplicate "the" article in comments. 2021-10-04 21:02:39 +00:00
andvar a136e22ab6 fix various typos in comments, messages and documentation. 2021-09-19 10:34:06 +00:00
andvar 72e44f84cb fix typos in word "successfully", mainly s/succesfully/successfully/. 2021-09-16 21:29:41 +00:00
andvar 4ddb87935b s/aquire/acquire/ in comments, also one typo fix acqure->acquire. 2021-09-07 13:24:45 +00:00
christos 8f97cb72d8 remove lint exclusion 2021-08-30 12:52:32 +00:00
ryo 567a3a02e7 Improved the performance of kernel profiling on MULTIPROCESSOR, and made it possible to get profiling data for each CPU.
In the current implementation, locks are acquired at the entrance of the mcount
internal function, so the higher the number of cores, the more lock contention
occurs, making profiling in a MULTIPROCESSOR environment unusably slow.
Profiling buffers have been changed to be reserved for each CPU,
improving profiling performance in MP by several to several dozen times.

- Eliminated cpu_simple_lock in mcount internal function, using per-CPU buffers.
- Add ci_gmon member to struct cpu_info of each MP arch.
- Add kern.profiling.percpu node in sysctl tree.
- Add new -c <cpuid> option to kgmon(8) to specify the cpuid, like OpenBSD.
  For compatibility, if the -c option is not specified, the entire system can be
  operated as before, and the -p option will get the total profiling data for
  all CPUs.
2021-08-14 17:51:18 +00:00
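The per-CPU buffer idea described in this commit can be illustrated with a minimal userland sketch. It is not the kernel code: the names `NCPU_SKETCH`, `NBUCKETS`, `struct percpu_prof`, and the `prof_*` functions are hypothetical, and a real kernel would index the buffer through the current CPU's `struct cpu_info` rather than an explicit argument. The point is only that each CPU writes to its own buffer without a shared lock, and that a reader can report either one CPU's data or the sum over all CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU_SKETCH	4	/* hypothetical CPU count */
#define NBUCKETS	8	/* hypothetical number of profiling buckets */

/* One private buffer per CPU, so recording never takes a shared lock. */
struct percpu_prof {
	uint64_t hits[NBUCKETS];
};

static struct percpu_prof prof[NCPU_SKETCH];

/* Record one hit; only the owning CPU writes to its own buffer. */
static void
prof_record(unsigned cpu, unsigned bucket)
{
	assert(cpu < NCPU_SKETCH && bucket < NBUCKETS);
	prof[cpu].hits[bucket]++;
}

/* Read a single CPU's count (a per-CPU query). */
static uint64_t
prof_read_cpu(unsigned cpu, unsigned bucket)
{
	assert(cpu < NCPU_SKETCH && bucket < NBUCKETS);
	return prof[cpu].hits[bucket];
}

/* Sum the counts of all CPUs (the system-wide total). */
static uint64_t
prof_read_total(unsigned bucket)
{
	uint64_t sum = 0;

	assert(bucket < NBUCKETS);
	for (unsigned cpu = 0; cpu < NCPU_SKETCH; cpu++)
		sum += prof[cpu].hits[bucket];
	return sum;
}
```

In this sketch, `prof_read_cpu()` loosely corresponds to asking for one CPU's data and `prof_read_total()` to asking for the combined data of all CPUs.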
ryo 1979ff4ae2 don't include "opt_multiprocessor.h" inside an ifdef, so that "make depend" works properly. 2021-08-14 17:38:44 +00:00
andvar ebbc7028d3 fix typos in words "pointer" and s/fram /frame/ 2021-08-13 20:47:54 +00:00
skrll 1306a159ff Whitespace 2021-08-08 07:17:18 +00:00
andvar 5298fab779 s/overwriten/overwritten/ in comments. 2021-08-01 21:58:56 +00:00
andvar 31f72197e0 fix more typos, in the style of "found one in a file - check/fix them all". 2021-07-31 14:36:33 +00:00
skrll 65d55bcee1 As we're providing the legacy gcc __sync built-in functions for atomic
memory access we might as well get the memory barriers right...
From the gcc documentation:

In most cases, these built-in functions are considered a full barrier.
That is, no memory operand is moved across the operation, either forward
or backward. Further, instructions are issued as necessary to prevent the
processor from speculating loads across the operation and from queuing
stores after the operation.

type __sync_lock_test_and_set (type *ptr, type value, ...)

   This built-in function is not a full barrier, but rather an acquire
   barrier. This means that references after the operation cannot move to
   (or be speculated to) before the operation, but previous memory stores
   may not be globally visible yet, and previous memory loads may not yet
   be satisfied.

void __sync_lock_release (type *ptr, ...)

   This built-in function is not a full barrier, but rather a release
   barrier. This means that all previous memory stores are globally
   visible, and all previous memory loads have been satisfied, but
   following memory reads are not prevented from being speculated to
   before the barrier.
2021-07-29 10:29:05 +00:00
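The acquire/release pairing quoted from the gcc documentation is what makes these two built-ins usable as a simple spin lock. The following is a minimal sketch, assuming a gcc-compatible compiler; the type `spin_t` and the `spin_*` function names are hypothetical and are not part of the NetBSD sources.

```c
#include <assert.h>
#include <stddef.h>

typedef volatile int spin_t;	/* hypothetical lock word: 0 = free, 1 = held */

/* Try once to take the lock; returns 1 on success, 0 if already held. */
static int
spin_trylock(spin_t *l)
{
	assert(l != NULL);
	/* Atomically store 1 and return the old value (acquire barrier). */
	return __sync_lock_test_and_set(l, 1) == 0;
}

/* Spin until the lock is taken. */
static void
spin_lock(spin_t *l)
{
	while (!spin_trylock(l))
		continue;
}

/* Drop the lock: store 0 with release semantics. */
static void
spin_unlock(spin_t *l)
{
	assert(l != NULL);
	__sync_lock_release(l);
}
```

Accesses inside the critical section cannot move before the acquire in `spin_trylock()`, and all of them are complete before the store in `spin_unlock()` becomes visible.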
simonb 3fc2996b41 #define<tab> consistency. 2021-07-28 08:01:10 +00:00
skrll 8e8c0784cf Remove memory barriers from the atomic_ops(3) atomic operations. They're
not needed for correctness.

Add the correct memory barriers to the gcc legacy __sync built-in
functions for atomic memory access.  From the gcc documentation:

In most cases, these built-in functions are considered a full barrier.
That is, no memory operand is moved across the operation, either forward
or backward. Further, instructions are issued as necessary to prevent the
processor from speculating loads across the operation and from queuing
stores after the operation.

type __sync_lock_test_and_set (type *ptr, type value, ...)

   This built-in function is not a full barrier, but rather an acquire
   barrier. This means that references after the operation cannot move to
   (or be speculated to) before the operation, but previous memory stores
   may not be globally visible yet, and previous memory loads may not yet
   be satisfied.

void __sync_lock_release (type *ptr, ...)

   This built-in function is not a full barrier, but rather a release
   barrier. This means that all previous memory stores are globally
   visible, and all previous memory loads have been satisfied, but
   following memory reads are not prevented from being speculated to
   before the barrier.
2021-07-28 07:32:20 +00:00
skrll 6a2d1b5533 #include <sys/param.h> 2021-07-22 13:54:38 +00:00
skrll 5e911a385d s/ifdef _ARM_ARCH_6/if defined(_ARM_ARCH_6)/ for consistency. NFCI. 2021-07-10 06:53:40 +00:00
skrll 52728926ba One more s/pte/ptr/ 2021-07-06 08:31:41 +00:00
skrll 6788795c38 typo in comment s/pte/ptr/ 2021-07-05 08:50:31 +00:00
skrll 68a49f39f0 Fix the logic operation for atomic_nand_{8,16,32,64}
From the gcc docs, the operations are as follows (fetch-and-nand first, then nand-and-fetch):

 { tmp = *ptr; *ptr = ~(tmp & value); return tmp; }   // nand
 { tmp = ~(*ptr & value); *ptr = tmp; return *ptr; }   // nand

yes, this is really rather strange.
2021-07-04 06:55:47 +00:00
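The two nand forms quoted in this commit differ only in which value they return. A non-atomic reference sketch, written only to make the logic concrete, follows; the names `ref_fetch_and_nand` and `ref_nand_and_fetch` are hypothetical, and the real operations perform the same computation atomically.

```c
#include <assert.h>
#include <stdint.h>

/* Fetch-then-nand: store ~(old & value), return the OLD value. */
static uint32_t
ref_fetch_and_nand(uint32_t *ptr, uint32_t value)
{
	uint32_t tmp = *ptr;

	*ptr = ~(tmp & value);
	return tmp;
}

/* Nand-then-fetch: store ~(old & value), return the NEW value. */
static uint32_t
ref_nand_and_fetch(uint32_t *ptr, uint32_t value)
{
	uint32_t tmp = ~(*ptr & value);

	*ptr = tmp;
	return tmp;
}
```

Note that the stored result is `~(old & value)`, a true NAND, and not `~old & value`.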
skrll d1034f1a89 Whitespace 2021-06-29 06:28:07 +00:00
skrll ce1248f15e Whitespace 2021-06-28 09:00:45 +00:00
rillig 60e7bdf08f memmem: remove unreachable return statement 2021-05-16 09:43:39 +00:00
skrll 387fd596e3 Provide all the LSE operation functions. The use of LSE instructions is
currently disabled.
2021-04-27 09:14:24 +00:00
skrll a864f2cc52 Improve the membar_ops barriers - no need to use dsb and wait for
completion.  Also, we only need to act on the inner shareability domain.
2021-04-27 05:40:29 +00:00
skrll 85e6432cbe Add the appropriate memory barrier before the lock is cleared in
__sync_lock_release_{1,2,4,8}.  That is, all reads and writes in the inner
shareability domain complete before the lock-clearing store.
2021-04-26 21:40:21 +00:00
christos 742eb06965 use ${MACHINE_MIPS64} 2021-04-25 22:45:16 +00:00
skrll 10d97f11ed Trailing whitespace 2021-04-24 20:34:34 +00:00
skrll 8bba7313d2 Fix __sync_lock_release_4 to actually zeroise the whole 4bytes/32bits. 2021-04-24 20:29:04 +00:00
skrll fd9a2ad443 Do previous differently as the API is different. 2021-04-21 16:23:47 +00:00
skrll b5c783f5b6 Provide some more operations that are part of compiler lse.S. This is
incomplete, but at least covers all the atomic_swap ops and allows the
aa64 kernel to link with gcc 10.
2021-04-21 07:31:37 +00:00
simonb 9eee6d14e7 Add CVS ID line. 2021-04-19 01:12:10 +00:00
mrg 2d63425964 avoid redefinition warning for __OPTIMIZE_SIZE__. 2021-04-17 21:43:47 +00:00
simonb c47d88974b Use __register_t instead of uregister_t - this is available to all ports
and both userland and kernel.
2021-04-17 08:06:58 +00:00
simonb e857cfe928 Cast the fill value to unsigned char so that the "fill" value used for
full-word fills isn't garbage.
2021-04-17 06:02:35 +00:00
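Why the cast matters can be shown with a small sketch. It assumes a 64-bit fill word built by multiplication, which is one common way to replicate a byte and not necessarily what the committed code does; `fill_word` and `fill_word_nocast` are hypothetical names. Because memset() takes its fill value as an int, any bits above the low 8 leak into the replicated word unless they are cast away first.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate the low 8 bits of c into every byte of a 64-bit word. */
static uint64_t
fill_word(int c)
{
	/* The cast discards high bits and sign extension of c. */
	return (uint64_t)(unsigned char)c * UINT64_C(0x0101010101010101);
}

/* The same computation without the cast, for comparison only. */
static uint64_t
fill_word_nocast(int c)
{
	return (uint64_t)c * UINT64_C(0x0101010101010101);
}
```

With c = 0x1ab, `fill_word()` yields 0xab in every byte, while `fill_word_nocast()` yields a word whose bytes are not all 0xab.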
simonb 4e9c90673d Disable the larger/faster code path. While the optimised code path was
indeed quicker, it nonetheless failed to actually fill all the requested
memory with the specified value much of the time if a non-aligned start
address was used.
2021-04-17 05:57:11 +00:00
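For reference, a word-at-a-time fill that copes with a non-aligned start address typically has three phases: a byte-wise head up to the first aligned address, an aligned word-wise body, and a byte-wise tail. The sketch below is a generic illustration of that structure under the assumption of 8-byte words; it is not the disabled code path, and `sketch_memset` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static void *
sketch_memset(void *dst, int c, size_t len)
{
	unsigned char *p = dst;
	const unsigned char b = (unsigned char)c;
	const uint64_t w = (uint64_t)b * UINT64_C(0x0101010101010101);

	/* Head: single bytes until p is 8-byte aligned or len runs out. */
	while (len > 0 && ((uintptr_t)p & 7) != 0) {
		*p++ = b;
		len--;
	}
	/* Body: whole words; memcpy sidesteps aliasing concerns. */
	while (len >= sizeof(w)) {
		memcpy(p, &w, sizeof(w));
		p += sizeof(w);
		len -= sizeof(w);
	}
	/* Tail: whatever is left after the last full word. */
	while (len > 0) {
		*p++ = b;
		len--;
	}
	return dst;
}
```

Exercising such a routine over every start offset and a range of lengths, while checking the bytes on both sides of the filled region, is the kind of test that exposes a mishandled unaligned start.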
dholland 53166c520a arm bswap32: fix fatal typo in thumb code (PR 55854) 2020-12-11 09:02:33 +00:00
dholland 3818c9a287 arm bswap32: Improve the comments showing the byte flow.
It's confusing to use 1-4 for bytes 1-4 and then 0 for literal zero,
so use a-d for bytes 1-4.
2020-12-09 02:46:57 +00:00
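The a-d labelling convention mentioned in this commit can be illustrated in portable C. This is a sketch of the byte flow only, not the ARM assembly being commented; `sketch_bswap32` is a hypothetical name. The input bytes, most significant first, are a b c d, and 0 stands for a literal zero byte.

```c
#include <assert.h>
#include <stdint.h>

/* Input bytes (most significant first): a b c d.  Result: d c b a. */
static uint32_t
sketch_bswap32(uint32_t x)
{
	return (x << 24)			/* d 0 0 0 */
	    | ((x << 8) & 0x00ff0000u)		/* 0 c 0 0 */
	    | ((x >> 8) & 0x0000ff00u)		/* 0 0 b 0 */
	    | (x >> 24);			/* 0 0 0 a */
}
```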
skrll e3ee4da69b Use the correct barriers - all of membar_{sync,producer,consumer} have
less scope than before.

LGTM from riastradh
2020-10-13 21:22:12 +00:00
skrll cfd51b63e1 Remove memory barriers from the atomic ops macros in the same way as was
done for the other atomic ops earlier.
2020-10-13 21:17:35 +00:00
skrll 058bd28709 Define _ARM_ARCH_8 when __ARM_ARCH_8A (no trailing double underscore) is
defined, as that is the macro gcc defines.

__ARM_ARCH_8A__ (with trailing double underscore) seems to be a typo (or
maybe historical)
2020-10-11 16:22:02 +00:00
skrll 7cbf2902dd Comment nit 2020-10-07 07:31:47 +00:00
jakllsch aeb04dceb1 Re-do previous aarch64eb strlen fix more simply and correctly. 2020-09-09 14:49:27 +00:00
mrg ebde11d941 make some prototypes match the builtin properly. GCC 9 complains
with the old version, GCC 8 is happy with this version.

tested on sparc.
2020-09-07 00:52:19 +00:00
jakllsch ea3caf96e6 Fix a broken corner case of strlen()/strnlen() on aarch64eb
Previously a string such as "\x1\x1\x1\x1\x1\x1\x1" would count as
0 instead of 7 on BE.
2020-09-05 20:24:43 +00:00
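One plausible way such a corner case can arise in a word-at-a-time strlen() is sketched below. This is an assumption for illustration, not taken from the aarch64 assembly. The cheap zero-byte test `(x - 0x01..01) & ~x & 0x80..80` correctly tells whether a word contains a zero byte, but borrow propagation can also flag 0x01 bytes that sit above the lowest zero byte. On a big-endian load the first byte in memory is the most significant one, so those false flags come first. All names here are hypothetical, and `__builtin_clzll` assumes a gcc-compatible compiler.

```c
#include <assert.h>
#include <stdint.h>

#define ONES	UINT64_C(0x0101010101010101)
#define HIGHS	UINT64_C(0x8080808080808080)
#define LOWS	UINT64_C(0x7f7f7f7f7f7f7f7f)

/* Cheap test: nonzero iff x has a zero byte, but may over-flag 0x01 bytes. */
static uint64_t
zero_mask_cheap(uint64_t x)
{
	return (x - ONES) & ~x & HIGHS;
}

/* Exact test: sets 0x80 in precisely the bytes of x that are zero. */
static uint64_t
zero_mask_exact(uint64_t x)
{
	return ~(((x & LOWS) + LOWS) | x | LOWS);
}

/*
 * Index, in memory order, of the first flagged byte of a word that was
 * loaded big-endian (first byte in memory is the most significant).
 */
static unsigned
first_flagged_be(uint64_t mask)
{
	assert(mask != 0);
	return (unsigned)__builtin_clzll(mask) / 8;
}
```

For the string from the commit message, seven 0x01 bytes followed by the terminating NUL, the big-endian word is 0x0101010101010100. The cheap mask flags every byte, so the first flagged index is 0, while the exact mask flags only the last byte and gives 7.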
jakllsch d0f28ec00a Remove unused assembly source files 2020-09-03 16:45:49 +00:00
jakllsch 835e43960b Fix typo/pasteo in aarch64 clzdi2() END() 2020-09-02 15:43:06 +00:00
skrll e16659bb50 Part I of ad@'s performance improvements for aarch64
- Remove memory barriers from the atomic ops.  I don't understand why those
  are there.  Is it some architectural thing, or for a CPU bug, or just
  over-caution maybe?  They're not needed for correctness.

- Have unlikely conditional branches go forwards to help the static branch
  predictor.
2020-08-12 12:59:57 +00:00
skrll 76b3785162 More SYNC centralisation 2020-08-10 14:37:38 +00:00