Difference between revisions of "RaspberryPiPerformance"

From eLinux.org
(openssl speed: include actual results table)
(Transferred to RPi Performance, intentionally blanking and redirecting)
 
==Linpack==
#REDIRECT [[RPi Performance]]
 
 
The ARM core was tested using the LINPACK benchmark from [http://www.netlib.org/benchmark/linpackc.new], built with gcc at -O3 (optimisation level 3) and run with an array size of 200.
 
 
 
With software floating point
 
 
 
<pre>
Memory required:  315K.

LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
       2   0.53  92.45%   1.89%   5.66%   5493.333
       4   1.07  92.52%   2.80%   4.67%   5385.621
       8   2.12  92.45%   2.36%   5.19%   5466.003
      16   4.24  92.45%   2.83%   4.72%   5438.944
      32   8.49  92.11%   2.71%   5.18%   5459.213
      64  16.98  92.05%   2.89%   5.06%   5452.440
</pre>
 
 
 
With hardware floating point (-mfloat-abi=softfp)
 
 
 
<pre>
Memory required:  315K.

LINPACK benchmark, Double precision.
Machine precision:  15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
       8   0.51  90.20%   3.92%   5.88%  22888.889
      16   1.02  89.22%   4.90%   5.88%  22888.889
      32   2.05  90.24%   3.41%   6.34%  22888.889
      64   4.08  91.42%   2.94%   5.64%  22829.437
     128   8.16  91.54%   2.94%   5.51%  22799.827
     256  16.31  91.35%   2.76%   5.89%  22903.800
</pre>
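
As a rough cross-check of the two tables above, the KFLOPS column can be recomputed from the other columns. The sketch below is not taken from the benchmark source. It assumes the program solves an n = 100 system (half the 200 array size), counts 2n³/3 + 2n² floating point operations per solve, and doubles the repetition count because rolled and unrolled runs are averaged. These assumptions were chosen because they reproduce the reported figures to within rounding; the function name is illustrative.

```python
def linpack_kflops(reps, total_s, dgefa_pct, dgesl_pct, n=100):
    """Estimate KFLOPS from one row of the LINPACK output table.

    Assumptions (not stated on this page): n is half the array size,
    and each repetition runs both a rolled and an unrolled solve.
    """
    ops = 2.0 * n**3 / 3.0 + 2.0 * n**2           # operations per solve
    solve_s = total_s * (dgefa_pct + dgesl_pct) / 100.0  # time excluding overhead
    return 2.0 * reps * ops / (1000.0 * solve_s)


# First row of each table above.
soft = linpack_kflops(2, 0.53, 92.45, 1.89)   # software floating point
hard = linpack_kflops(8, 0.51, 90.20, 3.92)   # -mfloat-abi=softfp

print(f"software float: {soft:.0f} KFLOPS")
print(f"hardware float: {hard:.0f} KFLOPS")
print(f"speed-up:       {hard / soft:.1f}x")
```

Because the table only shows times and percentages to two decimal places, the recomputed values differ slightly from the printed KFLOPS column.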
 
 
 
==Whetstone/Dhrystone==
 
 
 
Code for these tests can be found at [http://www.rowley.co.uk/arm/whet_dhry.zip].
 
 
 
All code was compiled with the gcc options -mfloat-abi=softfp -O3.
 
 
 
Whetstone
 
 
 
<pre>
Loops: 1000, Iterations: 10, Duration: 24 sec.
C Converted Double Precision Whetstones: 41.7 MIPS
</pre>
 
 
 
Dhrystone
 
 
 
<pre>
Microseconds for one run through Dhrystone: 1.2
Dhrystones per Second: 809061.5
</pre>
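
The two Dhrystone figures above are reciprocals of each other, and the rate is often restated in DMIPS. The sketch below shows the arithmetic. The divisor of 1757 Dhrystones per second (the conventional VAX 11/780 baseline for 1 DMIPS) is an assumption brought in from common practice, not something stated on this page, and the variable names are illustrative.

```python
# Conventional baseline: a VAX 11/780 scores 1757 Dhrystones/s = 1 DMIPS.
# This constant is an assumption, not part of the reported results.
VAX_11_780_DHRYSTONES_PER_SEC = 1757.0

dhrystones_per_sec = 809061.5  # reported above

# Time for one pass through the benchmark, in microseconds.
usec_per_run = 1e6 / dhrystones_per_sec

# Rate normalised to the VAX baseline.
dmips = dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC

print(f"{usec_per_run:.2f} microseconds per run")
print(f"{dmips:.0f} DMIPS (approximate)")
```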
 
 
 
Rebuilding the Whetstone test code with 'gcc -mfpu=vfp -mfloat-abi=softfp' gives better results:
 
 
 
<pre>
Loops: 1000, Iterations: 100, Duration: 106 sec.
C Converted Double Precision Whetstones: 94.3 MIPS
</pre>
 
 
 
However, most of the compute time is spent in the SQRT function, which for the above test was built without -mfpu=vfp. Using a library built with VFP gives the following, much improved result:
 
 
 
<pre>
Loops: 1000, Iterations: 100, Duration: 15 sec.
C Converted Double Precision Whetstones: 666.7 MIPS
</pre>
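
The three Whetstone scores above follow directly from the loop count, iteration count and duration printed with them. The sketch below assumes the score is computed as 100 × loops × iterations / duration (in KIPS) and then divided by 1000. That formula is an assumption that matches all three reported figures, and the function name is illustrative.

```python
def whetstone_mips(loops, iterations, duration_s):
    """Recompute the Whetstone score from the printed run parameters.

    Assumes KIPS = 100 * loops * iterations / duration, reported as MIPS.
    """
    kips = 100.0 * loops * iterations / duration_s
    return kips / 1000.0


runs = {
    "softfp, -O3":         (1000, 10, 24),
    "with -mfpu":          (1000, 100, 106),
    "VFP-enabled library": (1000, 100, 15),
}

for label, (loops, iterations, duration_s) in runs.items():
    print(f"{label:22s} {whetstone_mips(loops, iterations, duration_s):6.1f} MIPS")
```

On these figures, switching to the VFP-enabled library alone accounts for roughly a sevenfold improvement (106 s down to 15 s for the same work).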
 
 
 
==OpenSSL==
 
 
 
Results of running 'openssl speed':
 
 
 
<pre>
 
Doing md2 for 3s on 16 size blocks: 27716 md2's in 2.98s
 
Doing md2 for 3s on 64 size blocks: 17388 md2's in 2.99s
 
Doing md2 for 3s on 256 size blocks: 7322 md2's in 3.00s
 
Doing md2 for 3s on 1024 size blocks: 2173 md2's in 2.89s
 
Doing md2 for 3s on 8192 size blocks: 304 md2's in 2.99s
 
Doing md4 for 3s on 16 size blocks: 115369 md4's in 3.00s
 
Doing md4 for 3s on 64 size blocks: 115723 md4's in 3.00s
 
Doing md4 for 3s on 256 size blocks: 88908 md4's in 2.99s
 
Doing md4 for 3s on 1024 size blocks: 48620 md4's in 2.98s
 
Doing md4 for 3s on 8192 size blocks: 10258 md4's in 2.99s
 
Doing md5 for 3s on 16 size blocks: 70799 md5's in 2.98s
 
Doing md5 for 3s on 64 size blocks: 69896 md5's in 2.98s
 
Doing md5 for 3s on 256 size blocks: 56259 md5's in 3.00s
 
Doing md5 for 3s on 1024 size blocks: 33143 md5's in 3.00s
 
Doing md5 for 3s on 8192 size blocks: 7914 md5's in 2.99s
 
Doing hmac(md5) for 3s on 16 size blocks: 190400 hmac(md5)'s in 2.98s
 
Doing hmac(md5) for 3s on 64 size blocks: 163136 hmac(md5)'s in 3.00s
 
Doing hmac(md5) for 3s on 256 size blocks: 111608 hmac(md5)'s in 2.98s
 
Doing hmac(md5) for 3s on 1024 size blocks: 51076 hmac(md5)'s in 2.99s
 
Doing hmac(md5) for 3s on 8192 size blocks: 9286 hmac(md5)'s in 2.99s
 
Doing sha1 for 3s on 16 size blocks: 56948 sha1's in 3.00s
 
Doing sha1 for 3s on 64 size blocks: 51206 sha1's in 3.00s
 
Doing sha1 for 3s on 256 size blocks: 36283 sha1's in 2.99s
 
Doing sha1 for 3s on 1024 size blocks: 18403 sha1's in 2.99s
 
Doing sha1 for 3s on 8192 size blocks: 3584 sha1's in 2.98s
 
Doing sha256 for 3s on 16 size blocks: 127496 sha256's in 3.00s
 
Doing sha256 for 3s on 64 size blocks: 76127 sha256's in 2.99s
 
Doing sha256 for 3s on 256 size blocks: 34048 sha256's in 3.00s
 
Doing sha256 for 3s on 1024 size blocks: 10828 sha256's in 2.99s
 
Doing sha256 for 3s on 8192 size blocks: 1524 sha256's in 2.99s
 
Doing sha512 for 3s on 16 size blocks: 7691 sha512's in 3.00s
 
Doing sha512 for 3s on 64 size blocks: 7654 sha512's in 2.99s
 
Doing sha512 for 3s on 256 size blocks: 2717 sha512's in 2.99s
 
Doing sha512 for 3s on 1024 size blocks: 926 sha512's in 2.98s
 
Doing sha512 for 3s on 8192 size blocks: 130 sha512's in 3.01s
 
Doing rmd160 for 3s on 16 size blocks: 45651 rmd160's in 2.99s
 
Doing rmd160 for 3s on 64 size blocks: 39666 rmd160's in 2.99s
 
Doing rmd160 for 3s on 256 size blocks: 28201 rmd160's in 2.99s
 
Doing rmd160 for 3s on 1024 size blocks: 13908 rmd160's in 3.00s
 
Doing rmd160 for 3s on 8192 size blocks: 2733 rmd160's in 2.98s
 
Doing rc4 for 3s on 16 size blocks: 2739344 rc4's in 2.99s
 
Doing rc4 for 3s on 64 size blocks: 783949 rc4's in 2.98s
 
Doing rc4 for 3s on 256 size blocks: 203269 rc4's in 2.98s
 
Doing rc4 for 3s on 1024 size blocks: 51473 rc4's in 2.99s
 
Doing rc4 for 3s on 8192 size blocks: 6374 rc4's in 2.98s
 
Doing des cbc for 3s on 16 size blocks: 546219 des cbc's in 3.00s
 
Doing des cbc for 3s on 64 size blocks: 149992 des cbc's in 2.98s
 
Doing des cbc for 3s on 256 size blocks: 38552 des cbc's in 3.00s
 
Doing des cbc for 3s on 1024 size blocks: 9844 des cbc's in 3.00s
 
Doing des cbc for 3s on 8192 size blocks: 1229 des cbc's in 2.99s
 
Doing des ede3 for 3s on 16 size blocks: 213445 des ede3's in 2.97s
 
Doing des ede3 for 3s on 64 size blocks: 55158 des ede3's in 2.97s
 
Doing des ede3 for 3s on 256 size blocks: 13904 des ede3's in 2.97s
 
Doing des ede3 for 3s on 1024 size blocks: 3227 des ede3's in 2.74s
 
Doing des ede3 for 3s on 8192 size blocks: 441 des ede3's in 2.99s
 
Doing aes-128 cbc for 3s on 16 size blocks: 595070 aes-128 cbc's in 2.97s
 
Doing aes-128 cbc for 3s on 64 size blocks: 163409 aes-128 cbc's in 2.99s
 
Doing aes-128 cbc for 3s on 256 size blocks: 42375 aes-128 cbc's in 3.00s
 
Doing aes-128 cbc for 3s on 1024 size blocks: 10665 aes-128 cbc's in 2.99s
 
Doing aes-128 cbc for 3s on 8192 size blocks: 1338 aes-128 cbc's in 2.99s
 
Doing aes-192 cbc for 3s on 16 size blocks: 510290 aes-192 cbc's in 2.99s
 
Doing aes-192 cbc for 3s on 64 size blocks: 138844 aes-192 cbc's in 2.98s
 
Doing aes-192 cbc for 3s on 256 size blocks: 35894 aes-192 cbc's in 2.99s
 
Doing aes-192 cbc for 3s on 1024 size blocks: 9089 aes-192 cbc's in 3.00s
 
Doing aes-192 cbc for 3s on 8192 size blocks: 1132 aes-192 cbc's in 2.98s
 
Doing aes-256 cbc for 3s on 16 size blocks: 444002 aes-256 cbc's in 2.98s
 
Doing aes-256 cbc for 3s on 64 size blocks: 120882 aes-256 cbc's in 2.98s
 
Doing aes-256 cbc for 3s on 256 size blocks: 30963 aes-256 cbc's in 2.98s
 
Doing aes-256 cbc for 3s on 1024 size blocks: 7890 aes-256 cbc's in 2.99s
 
Doing aes-256 cbc for 3s on 8192 size blocks: 994 aes-256 cbc's in 2.98s
 
Doing aes-128 ige for 3s on 16 size blocks: 577263 aes-128 ige's in 2.99s
 
Doing aes-128 ige for 3s on 64 size blocks: 166651 aes-128 ige's in 2.98s
 
Doing aes-128 ige for 3s on 256 size blocks: 43055 aes-128 ige's in 2.98s
 
Doing aes-128 ige for 3s on 1024 size blocks: 10772 aes-128 ige's in 2.99s
 
Doing aes-128 ige for 3s on 8192 size blocks: 1306 aes-128 ige's in 2.99s
 
Doing aes-192 ige for 3s on 16 size blocks: 493664 aes-192 ige's in 2.99s
 
Doing aes-192 ige for 3s on 64 size blocks: 141065 aes-192 ige's in 2.99s
 
Doing aes-192 ige for 3s on 256 size blocks: 36340 aes-192 ige's in 2.99s
 
Doing aes-192 ige for 3s on 1024 size blocks: 9183 aes-192 ige's in 2.99s
 
Doing aes-192 ige for 3s on 8192 size blocks: 1108 aes-192 ige's in 2.99s
 
Doing aes-256 ige for 3s on 16 size blocks: 434801 aes-256 ige's in 2.98s
 
Doing aes-256 ige for 3s on 64 size blocks: 122980 aes-256 ige's in 2.99s
 
Doing aes-256 ige for 3s on 256 size blocks: 31594 aes-256 ige's in 2.99s
 
Doing aes-256 ige for 3s on 1024 size blocks: 7988 aes-256 ige's in 2.99s
 
Doing aes-256 ige for 3s on 8192 size blocks: 981 aes-256 ige's in 2.99s
 
Doing rc2 cbc for 3s on 16 size blocks: 525625 rc2 cbc's in 2.99s
 
Doing rc2 cbc for 3s on 64 size blocks: 140247 rc2 cbc's in 2.98s
 
Doing rc2 cbc for 3s on 256 size blocks: 35672 rc2 cbc's in 2.99s
 
Doing rc2 cbc for 3s on 1024 size blocks: 8987 rc2 cbc's in 2.99s
 
Doing rc2 cbc for 3s on 8192 size blocks: 1119 rc2 cbc's in 2.98s
 
Doing blowfish cbc for 3s on 16 size blocks: 1138316 blowfish cbc's in 2.99s
 
Doing blowfish cbc for 3s on 64 size blocks: 327400 blowfish cbc's in 2.99s
 
Doing blowfish cbc for 3s on 256 size blocks: 84685 blowfish cbc's in 2.99s
 
Doing blowfish cbc for 3s on 1024 size blocks: 21281 blowfish cbc's in 2.99s
 
Doing blowfish cbc for 3s on 8192 size blocks: 2606 blowfish cbc's in 2.98s
 
Doing cast cbc for 3s on 16 size blocks: 940793 cast cbc's in 2.97s
 
Doing cast cbc for 3s on 64 size blocks: 282189 cast cbc's in 3.00s
 
Doing cast cbc for 3s on 256 size blocks: 73868 cast cbc's in 2.98s
 
Doing cast cbc for 3s on 1024 size blocks: 18593 cast cbc's in 2.99s
 
Doing cast cbc for 3s on 8192 size blocks: 2285 cast cbc's in 2.99s
 
Doing 512 bit private rsa's for 10s: 726 512 bit private RSA's in 9.98s
 
Doing 512 bit public rsa's for 10s: 8359 512 bit public RSA's in 9.97s
 
Doing 1024 bit private rsa's for 10s: 158 1024 bit private RSA's in 10.03s
 
Doing 1024 bit public rsa's for 10s: 3643 1024 bit public RSA's in 9.99s
 
Doing 2048 bit private rsa's for 10s: 32 2048 bit private RSA's in 10.28s
 
Doing 2048 bit public rsa's for 10s: 1350 2048 bit public RSA's in 9.96s
 
Doing 4096 bit private rsa's for 10s: 6 4096 bit private RSA's in 10.83s
 
Doing 4096 bit public rsa's for 10s: 443 4096 bit public RSA's in 9.98s
 
Doing 512 bit sign dsa's for 10s: 852 512 bit DSA signs in 9.96s
 
Doing 512 bit verify dsa's for 10s: 734 512 bit DSA verify in 9.98s
 
Doing 1024 bit sign dsa's for 10s: 365 1024 bit DSA signs in 9.94s
 
Doing 1024 bit verify dsa's for 10s: 315 1024 bit DSA verify in 9.98s
 
Doing 2048 bit sign dsa's for 10s: 136 2048 bit DSA signs in 10.05s
 
Doing 2048 bit verify dsa's for 10s: 115 2048 bit DSA verify in 10.04s
 
OpenSSL 0.9.8o 01 Jun 2010
 
built on: Thu Aug 26 18:56:26 UTC 2010
 
options:bn(64,32) md2(int) rc4(ptr,int) des(idx,risc1,4,long) aes(partial) blowfish(idx)
 
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O2 -Wa,--noexecstack -g -Wall
 
available timing options: TIMES TIMEB HZ=100 [sysconf value]
 
timing function used: times
 
The 'numbers' are in 1000s of bytes per second processed.
 
type                 16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2                   148.81k      372.18k      624.81k      769.95k      832.90k
mdc2                    0.00         0.00         0.00         0.00         0.00
md4                   615.30k     2468.76k     7612.19k    16707.01k    28104.86k
md5                   380.13k     1501.12k     4800.77k    11312.81k    21682.77k
hmac(md5)            1022.28k     3480.23k     9587.80k    17492.25k    25441.78k
sha1                  303.72k     1092.39k     3106.50k     6302.57k     9852.39k
rmd160                244.29k      849.04k     2414.53k     4747.26k     7513.00k
rc4                 14658.70k    16836.49k    17462.03k    17628.21k    17522.08k
des cbc              2913.17k     3221.30k     3289.77k     3360.09k     3367.21k
des ede3             1149.87k     1188.59k     1198.46k     1206.00k     1208.25k
idea cbc                0.00         0.00         0.00         0.00         0.00
seed cbc                0.00         0.00         0.00         0.00         0.00
rc2 cbc              2812.71k     3012.02k     3054.19k     3077.82k     3076.12k
rc5-32/12 cbc           0.00         0.00         0.00         0.00         0.00
blowfish cbc         6091.32k     7007.89k     7250.62k     7288.21k     7163.88k
cast cbc             5068.25k     6020.03k     6345.71k     6367.64k     6260.44k
aes-128 cbc          3205.76k     3497.72k     3616.00k     3652.49k     3665.85k
aes-192 cbc          2730.65k     2981.88k     3073.20k     3102.38k     3111.86k
aes-256 cbc          2383.90k     2596.12k     2659.91k     2702.13k     2732.50k
camellia-128 cbc        0.00         0.00         0.00         0.00         0.00
camellia-192 cbc        0.00         0.00         0.00         0.00         0.00
camellia-256 cbc        0.00         0.00         0.00         0.00         0.00
sha256                679.98k     1629.47k     2905.43k     3708.32k     4175.45k
sha512                 41.02k      163.83k      232.63k      318.20k      353.81k
aes-128 ige          3089.03k     3579.08k     3698.68k     3689.14k     3578.18k
aes-192 ige          2641.68k     3019.45k     3111.38k     3144.95k     3035.70k
aes-256 ige          2334.50k     2632.35k     2705.04k     2735.69k     2687.74k

                  sign    verify    sign/s verify/s
rsa  512 bits 0.013747s 0.001193s     72.7    838.4
rsa 1024 bits 0.063481s 0.002742s     15.8    364.7
rsa 2048 bits 0.321250s 0.007378s      3.1    135.5
rsa 4096 bits 1.805000s 0.022528s      0.6     44.4

                  sign    verify    sign/s verify/s
dsa  512 bits 0.011690s 0.013597s     85.5     73.5
dsa 1024 bits 0.027233s 0.031683s     36.7     31.6
dsa 2048 bits 0.073897s 0.087304s     13.5     11.5
 
</pre>
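
The summary table in the output above is derived from the raw "Doing ..." lines: each figure is the operation count times the block size, divided by the elapsed time, in thousands of bytes per second. The sketch below shows that conversion for a single line. The regular expression and function name are illustrative and only cover the symmetric cipher and digest lines, not the RSA/DSA lines.

```python
import re

# Matches lines such as:
#   Doing md5 for 3s on 16 size blocks: 70799 md5's in 2.98s
LINE_RE = re.compile(
    r"Doing (?P<name>.+?) for \d+s on (?P<size>\d+) size blocks: "
    r"(?P<count>\d+) .+ in (?P<secs>[\d.]+)s"
)


def throughput_kbytes(line):
    """Return (algorithm, block size, thousands of bytes per second)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a block-cipher/digest timing line: {line!r}")
    size = int(m.group("size"))
    total_bytes = int(m.group("count")) * size
    rate = total_bytes / float(m.group("secs")) / 1000.0
    return m.group("name"), size, rate


name, size, rate = throughput_kbytes(
    "Doing md5 for 3s on 16 size blocks: 70799 md5's in 2.98s"
)
print(f"{name} @ {size} bytes: {rate:.2f}k")
```

Applied to the md5 16-byte line, this gives the same 380.13k that appears in the summary table.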
 
 
 
[[Category: RaspberryPi]]
 

Latest revision as of 09:08, 28 January 2012
