dsl
c327a133c6
Significantly faster memcpy/memmove/bcopy and memset/bzero
2003-04-15 22:49:50 +00:00
bjh21
f34ba16c9c
NetBSD/acorn26 has used APCS-32 for years, so unifdef -U__APCS_26__.
2003-04-05 23:27:14 +00:00
matt
cc005c66db
Switch back to generic bzero/memset until new one is shown to work.
2003-02-25 20:15:02 +00:00
matt
97d38cdec2
Actually use bzero.S. Also fix bzero to use GET_CPUINFO
2003-02-24 07:14:17 +00:00
matt
05a4c83a70
Don't make memset.c since bzero.o has memset in addition to bzero.
2003-02-24 07:09:18 +00:00
fvdl
144469b350
Add strtoul.c
2002-11-25 00:55:22 +00:00
chris
f86ab1a63e
Sync arm asm libkern files with libc's asm files.
2002-11-23 14:29:29 +00:00
itohy
6e73936f81
Use assembly version of bzero() and memset().
2002-11-20 09:52:53 +00:00
itohy
5d1c87f395
Assembly version of bzero()/memset().
...
Written by SHIMIZU Ryo.
2002-11-20 09:51:52 +00:00
itohy
766d863c42
memcpy() and memmove() must return the first parameter.
...
Problem found by itohy, fixed by SHIMIZU Ryo.
2002-11-20 09:50:37 +00:00
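A minimal sketch of the return-value contract this fix restores: the C standard requires memcpy() and memmove() to return their first argument. The name sketch_memcpy is hypothetical, and the byte-at-a-time loop is only for illustration; it is not the assembly that was committed.

```c
#include <stddef.h>

/*
 * Hypothetical byte-wise copy. The point is the last line:
 * the original destination pointer is handed back to the caller.
 */
void *sketch_memcpy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len-- > 0)
		*d++ = *s++;

	return dst;	/* first parameter, not the advanced pointer */
}
```

Callers that chain on the result, such as `p = sketch_memcpy(buf, src, n)`, would break if the advanced pointer or an undefined register value were returned instead.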
rearnsha
6576c49b48
Add an assembler version of strcmp, based on example code from the ARM
...
ARM. As an example of the performance difference that this provides
a Dhrystone score on my Shark goes from 213k to 261k.
2002-11-16 18:27:40 +00:00
thorpej
7f74df5ef3
ABICALLS -> __ABICALLS__
2002-11-10 18:10:25 +00:00
chs
cab484e445
move includes to the top so that this builds in libc context too.
2002-10-29 04:40:55 +00:00
chs
c04f87a03e
remove setjmp/longjmp from libkern, they're not used.
2002-10-27 18:45:11 +00:00
chs
c5a350ef59
use %g5 instead of %g7 (since we want to use %g7 for the cpu_info pointer
...
in the kernel). resync libc and libkern versions of this file.
2002-10-27 18:41:27 +00:00
scw
03c573236d
Replace the SuperH memcpy() with homebrewed code. The former seems to have
...
a subtle failure mode which can result in corruption of memory outside the
bounds of the destination buffer.
2002-10-22 12:25:18 +00:00
scw
921743eed1
Fix a sign-extension botch for ILP32.
2002-10-19 08:54:23 +00:00
scw
0e1af8ca62
Doh. Bail out early if we're passed a zero-length buffer.
2002-10-19 08:53:45 +00:00
scw
99ad3a762b
Add native optimised assembler versions of some libkern routines.
...
The memcpy routine is courtesy of SuperH, with some tweaks by me.
XXX: There is room for further optimisation in some of these routines.
2002-10-17 11:53:32 +00:00
scw
42ca361622
Preserve and restore the caller's FP status register, and ensure
...
it contains a sane value while we're doing FP ops.
2002-09-28 10:33:59 +00:00
chs
2841e1341c
add strtoul.c, it's now used in MI code.
2002-09-21 17:45:16 +00:00
ragge
77d3833330
Need strtoul() also.
2002-09-19 17:37:32 +00:00
msaitoh
a991dcef11
Add __movstr_i4_{odd,even} for -m4.
...
Written by SHIMIZU Ryo.
2002-09-05 08:35:15 +00:00
itohy
f89823c1f8
Save 1-4 instructions on all cases except for the ret=0 case.
...
This is probably the last version from me. :)
You are welcome to speed it up, of course. :)
Here's a benchmark on SH-4 200MHz.
9.2% faster if all the cases occur evenly.
return value  C version     previous vers  this version  speed ratio
of ffs()      (ns/call) *1  (ns/call)      (ns/call) *2  (*1/*2)
------------  ------------  -------------  ------------  -----------
0             86            81             81            1.06
1             110           106            91            1.21
2             132           106            92            1.43
3             165           117            96            1.72
4             201           116            95            2.12
5             237           107            99            2.39
6             271           106            101           2.68
7             307           116            107           2.87
8             342           116            105           3.26
9             376           126            111           3.39
10            410           127            110           3.73
11            446           136            115           3.88
12            483           134            116           4.16
13            518           125            119           4.35
14            551           126            120           4.59
15            587           135            127           4.62
16            624           136            126           4.95
17            658           139            126           5.22
18            694           140            126           5.51
19            727           148            131           5.55
20            764           150            131           5.83
21            799           141            135           5.92
22            834           142            135           6.18
23            868           152            140           6.20
24            903           153            142           6.36
25            939           140            127           7.39
26            974           141            126           7.73
27            1009          152            131           7.70
28            1044          148            130           8.03
29            1080          141            136           7.94
30            1115          141            136           8.20
31            1151          151            141           8.16
32            1185          151            140           8.46
2002-09-01 13:14:53 +00:00
itohy
fa5465079f
Slightly improved version of ffs(3).
...
Partially from SHIMIZU Ryo <ryo@iij.ad.jp>. Thanks.
Some cases are slower, but most other cases are faster.
Here's a benchmark on SH-4 200MHz.
return value  C version     previous vers  this version     speed ratio
of ffs()      (ns/call) *1  (ns/call)      (ns/call) *2     (*1/*2)
------------  ------------  -------------  ---------------  -----------
0             86            86             81               1.06
1             110           86             106 *(slower)    1.04
2             132           86             106 *            1.25
3             165           105            117 *            1.41
4             201           104            116 *            1.73
5             237           111            107              2.21
6             271           111            106              2.56
7             307           126            116              2.65
8             342           125            116              2.95
9             376           122            126 *            2.98
10            410           121            127 *            3.23
11            446           139            136              3.28
12            483           140            134              3.60
13            518           146            125              4.14
14            551           146            126              4.37
15            587           161            135              4.35
16            624           162            136              4.59
17            658           141            139              4.73
18            694           142            140              4.96
19            727           160            148              4.91
20            764           161            150              5.09
21            799           167            141              5.67
22            834           167            142              5.87
23            868           181            152              5.71
24            903           181            153              5.90
25            939           146            140              6.71
26            974           146            141              6.91
27            1009          166            152              6.64
28            1044          165            148              7.05
29            1080          171            141              7.66
30            1115          171            141              7.91
31            1151          185            151              7.62
32            1185          186            151              7.85
2002-08-28 15:34:35 +00:00
itohy
6736303e13
Use assembly version of ffs(3).
2002-08-24 06:39:48 +00:00
itohy
85ce1de27f
Oops, SYSLIBC_SCCS -> LIBC_SCCS
2002-08-24 06:37:24 +00:00
itohy
70b5675025
Assembly version of ffs(3).
...
Confirmed to return the same value as that of the C version.
The results of a simple benchmark on SH-4 200MHz are shown below.
I think this shows acceptable performance.
return value  C version  this version  speed
of ffs()      (ns/call)  (ns/call)     ratio
------------  ---------  ------------  -----
0             86         86            1.00
1             110        86            1.27
2             132        86            1.53
3             165        105           1.57
4             201        104           1.93
5             237        111           2.13
6             271        111           2.44
7             307        126           2.43
8             342        125           2.73
9             376        122           3.08
10            410        121           3.38
11            446        139           3.20
12            483        140           3.45
13            518        146           3.54
14            551        146           3.77
15            587        161           3.64
16            624        162           3.85
17            658        141           4.66
18            694        142           4.88
19            727        160           4.54
20            764        161           4.74
21            799        167           4.78
22            834        167           4.99
23            868        181           4.79
24            903        181           4.98
25            939        146           6.43
26            974        146           6.67
27            1009       166           6.07
28            1044       165           6.32
29            1080       171           6.31
30            1115       171           6.52
31            1151       185           6.22
32            1185       186           6.37
2002-08-24 06:30:34 +00:00
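The C-version column in the benchmark above grows roughly linearly with the return value, which is the behaviour one would expect from a loop that tests one bit per iteration. A minimal sketch of such a loop follows, assuming the C version works along these lines; the name ffs_loop is hypothetical and the code is not taken from the tree.

```c
#include <stdint.h>

/*
 * Hypothetical bit-at-a-time ffs(): returns the 1-based index of the
 * least significant set bit, or 0 if no bit is set. The loop runs once
 * per zero bit below the first set bit, so cost rises with the result.
 */
int ffs_loop(uint32_t mask)
{
	int bit;

	if (mask == 0)
		return 0;

	for (bit = 1; (mask & 1) == 0; bit++)
		mask >>= 1;

	return bit;
}
```

For example, `ffs_loop(12)` shifts twice before finding a set bit and returns 3.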
thorpej
dafc960ed6
Local label fixup.
2002-08-17 19:00:26 +00:00
chris
d8ac0fb3aa
pull in ffs.S from libc for arm.
...
The main benefit is that ffs always runs in constant time.
2002-08-17 01:22:33 +00:00
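One well-known way to get a loop-free ffs() is a de Bruijn multiply and table lookup. The sketch below illustrates that general idea only; it is an assumption that ffs.S does something comparable, and the name ffs_debruijn is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical loop-free ffs(). Isolate the lowest set bit, multiply by
 * a 32-bit de Bruijn constant, and use the top five bits of the product
 * as an index into a 32-entry position table. Apart from the zero
 * check, the same instructions run for every input.
 */
int ffs_debruijn(uint32_t mask)
{
	static const uint8_t pos[32] = {
		 0,  1, 28,  2, 29, 14, 24,  3,
		30, 22, 20, 15, 25, 17,  4,  8,
		31, 27, 13, 23, 21, 19, 16,  7,
		26, 12, 18,  6, 11,  5, 10,  9
	};
	uint32_t lowest;

	if (mask == 0)
		return 0;

	lowest = mask & (0u - mask);	/* keep only the lowest set bit */
	return pos[(uint32_t)(lowest * 0x077CB531u) >> 27] + 1;
}
```

The result is 1-based to match ffs(3), so `ffs_debruijn(0x10)` returns 5.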
briggs
b98931f62e
Use .L prefix for all local labels.
2002-08-15 18:30:36 +00:00
matt
7c4618a9ce
cpu_info is not in spr0, but spr_g_0.
2002-07-30 06:10:46 +00:00
kent
6789db7962
Avoid redundant memory access.
2002-07-10 06:02:09 +00:00
scw
59474a8c82
NetBSD, meet the SH-5 cpu.
...
SH-5, meet NetBSD.
Let's hope this is the start of a long and fruitful relationship. :-)
This code, funded by Wasabi Systems, adds initial support for the
Hitachi SuperH(tm) SH-5 cpu architecture to NetBSD.
At the present time, NetBSD/evbsh5 only runs on a SH-5 core simulator
which has no simulated devices other than a simple console. However, it
is good enough to get to the "root device: " prompt.
Device driver support for Real SH-5 Hardware is in place, particularly for
supporting the up-coming Cayman evaluation board, and should be quite
easy to get running when the hardware is available.
There is no in-tree toolchain for this port at this time. Gcc-current has
rudimentary SH-5 support but it is known to be buggy. A working toolchain
was obtained from SuperH to facilitate this port. Gcc-current will be
fixed in due course.
The SH-5 architecture is fully 64-bit capable, although NetBSD/evbsh5 has
currently only been tested in 32-bit mode. It is bi-endian, via a boot-
time option and it also has an "SHcompact" mode in which it will execute
SH-[34] user-land instructions.
For more information on the SH-5, see www.superh.com. Suffice to say it
is *not* just another respin of the SH-[34].
2002-07-05 13:31:28 +00:00
bjh21
3763adaefd
Avoid leaving junk in the top half of R0 on return.
...
This fixes port-arm/17440.
2002-07-01 19:07:18 +00:00
fredette
e978777b86
Added hppa support to libkern.
2002-06-06 20:03:37 +00:00
martin
9f680534b0
Add strtoul.
2002-05-05 11:23:24 +00:00
ross
f98b9b43e8
Add strtoul.c
2002-04-24 16:56:36 +00:00
martin
22143f5a44
Add strtoul.c, otherwise kernels using "wi* at pcmcia?" do not work
...
anymore.
Why only four archs provide this is beyond me.
2002-04-16 06:36:02 +00:00
matt
cb520da5b3
Refresh from libc.
2002-03-28 00:46:08 +00:00
fredette
58830d68c5
Added brand-new integer multiply and divide support, used only
...
on the m68000.
2002-03-26 22:49:32 +00:00
matt
12810ed37d
Use size_t in prototype (so this will be LP64 clean for PPC64 someday).
...
Calculate len separately for icache & dcache in case each has different
cacheline widths. Make the code for both loops the same except for the
dcbst/icbi. Deal with sizes >=2GB properly (like that'll happen but ...)
2002-03-26 21:20:24 +00:00
fredette
d617871b0c
On the m68000, if and only if gcc doesn't seem to know
...
where libgcc.a is, fall back to one under DESTDIR.
2002-03-22 00:17:12 +00:00
dbj
f0658bdada
make compile with _STANDALONE
2002-03-18 05:10:58 +00:00
eeh
4c434f6210
Updated from libc.
2002-03-13 00:59:29 +00:00
matt
e2d6f22138
Add register prefixes to these.
2002-02-24 00:12:41 +00:00
matt
6cad4b795d
Upon further reflection, move udiv/urem to libkern and out of vax/vax.
2002-02-24 00:08:19 +00:00
ragge
f2d946a56e
blkset() used a register for set value that gets clobbered by movc5,
...
causing the set area to get unpredictable contents.
2002-02-19 21:46:17 +00:00
thorpej
2362fef9a8
Add __blkcpy() and __blkset() (renamed/modified from __blkclr()) to
...
libkern.
2002-02-10 22:04:51 +00:00
ross
e31435237d
sync
2002-01-24 00:45:22 +00:00