of about 3.5X on my 1333MHz Athlon (about 37MB/sec!) compared to the old
C versions.
We could boost the speed of the C versions on most other architectures with
des.inc files that set the compile-time flags (DES_PTR, DES_RISC1, DES_RISC2)
correctly; at the moment they aren't set at all.
routines from OpenSSL. Speeds up DSA significantly. A similar
gain should also be seen for RSA.
Before:
Doing 512 bit sign dsa's for 10s: 965 512 bit DSA signs in 9.97s
Doing 512 bit verify dsa's for 10s: 766 512 bit DSA verify in 9.93s
Doing 1024 bit sign dsa's for 10s: 276 1024 bit DSA signs in 9.99s
Doing 1024 bit verify dsa's for 10s: 217 1024 bit DSA verify in 9.93s
sign verify sign/s verify/s
dsa 512 bits 0.0103s 0.0130s 96.8 77.1
dsa 1024 bits 0.0362s 0.0458s 27.6 21.9
After:
Doing 512 bit sign dsa's for 10s: 3742 512 bit DSA signs in 9.88s
Doing 512 bit verify dsa's for 10s: 3065 512 bit DSA verify in 9.92s
Doing 1024 bit sign dsa's for 10s: 1357 1024 bit DSA signs in 9.99s
Doing 1024 bit verify dsa's for 10s: 1094 1024 bit DSA verify in 9.83s
sign verify sign/s verify/s
dsa 512 bits 0.0026s 0.0032s 378.7 309.0
dsa 1024 bits 0.0074s 0.0090s 135.8 111.3
Before:
Doing rmd160 for 3s on 8 size blocks: 778828 rmd160's in 3.00s
Doing rmd160 for 3s on 64 size blocks: 430214 rmd160's in 3.00s
Doing rmd160 for 3s on 256 size blocks: 182108 rmd160's in 3.00s
Doing rmd160 for 3s on 1024 size blocks: 55050 rmd160's in 3.00s
Doing rmd160 for 3s on 8192 size blocks: 7339 rmd160's in 3.00s
type 8 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
rmd160 2076.87k 9177.90k 15539.88k 18790.40k 20040.36k
After:
Doing rmd160 for 3s on 8 size blocks: 1084941 rmd160's in 3.00s
Doing rmd160 for 3s on 64 size blocks: 617966 rmd160's in 3.00s
Doing rmd160 for 3s on 256 size blocks: 267381 rmd160's in 2.99s
Doing rmd160 for 3s on 1024 size blocks: 82001 rmd160's in 3.00s
Doing rmd160 for 3s on 8192 size blocks: 10974 rmd160's in 3.00s
type 8 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
rmd160 2893.18k 13183.27k 22892.82k 27989.67k 29966.34k
Before:
Doing md5 for 3s on 8 size blocks: 1490796 md5's in 3.00s
Doing md5 for 3s on 64 size blocks: 895849 md5's in 3.00s
Doing md5 for 3s on 256 size blocks: 410807 md5's in 3.00s
Doing md5 for 3s on 1024 size blocks: 129416 md5's in 3.00s
Doing md5 for 3s on 8192 size blocks: 17527 md5's in 3.00s
type 8 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md5 3975.46k 19111.45k 35055.53k 44173.99k 47860.39k
After:
Doing md5 for 3s on 8 size blocks: 2041410 md5's in 3.00s
Doing md5 for 3s on 64 size blocks: 1345402 md5's in 3.00s
Doing md5 for 3s on 256 size blocks: 669827 md5's in 3.10s
Doing md5 for 3s on 1024 size blocks: 221744 md5's in 2.96s
Doing md5 for 3s on 8192 size blocks: 30685 md5's in 3.00s
type 8 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md5 5443.76k 28701.91k 56968.68k 76711.44k 83790.51k
and decrypt from OpenSSL. Right now we only build the 586 version,
but eventually we will be able to build the 686 version based on a
CPP flag defined as a result of using `cc -mcpu=pentiumpro'.
We don't build the assembly version of BF_cbc_encrypt(), as it would
have to be rewritten to be PIC.
Performance difference is quite noticeable.
Before:
Doing blowfish cbc for 3s on 8 size blocks: 2891026 blowfish cbc's in 2.97s
Doing blowfish cbc for 3s on 64 size blocks: 411766 blowfish cbc's in 3.10s
Doing blowfish cbc for 3s on 256 size blocks: 104721 blowfish cbc's in 3.00s
Doing blowfish cbc for 3s on 1024 size blocks: 26291 blowfish cbc's in 2.98s
Doing blowfish cbc for 3s on 8192 size blocks: 3290 blowfish cbc's in 3.10s
type 8 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
blowfish cbc 7787.28k 8755.16k 8936.19k 9034.22k 8954.05k
After:
Doing blowfish cbc for 3s on 8 size blocks: 4573792 blowfish cbc's in 3.10s
Doing blowfish cbc for 3s on 64 size blocks: 713440 blowfish cbc's in 2.99s
Doing blowfish cbc for 3s on 256 size blocks: 183125 blowfish cbc's in 3.00s
Doing blowfish cbc for 3s on 1024 size blocks: 46221 blowfish cbc's in 3.00s
Doing blowfish cbc for 3s on 8192 size blocks: 5787 blowfish cbc's in 3.00s
type 8 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
blowfish cbc 12156.26k 15270.96k 15626.67k 15776.77k 15802.37k