Difference between revisions of "RaspberryPiPerformance"
Revision as of 15:05, 27 October 2011
Linpack
The ARM has been tested using the Linpack benchmark from [1], built with gcc at -O3 (optimisation level 3) and run with an array size of 200.
With software floating point
Memory required: 315K.
LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps  Time(s)   DGEFA   DGESL  OVERHEAD     KFLOPS
       2     0.53  92.45%   1.89%     5.66%   5493.333
       4     1.07  92.52%   2.80%     4.67%   5385.621
       8     2.12  92.45%   2.36%     5.19%   5466.003
      16     4.24  92.45%   2.83%     4.72%   5438.944
      32     8.49  92.11%   2.71%     5.18%   5459.213
      64    16.98  92.05%   2.89%     5.06%   5452.440
Hardware floating point (-mfloat-abi=softfp)
Memory required: 315K.
LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps  Time(s)   DGEFA   DGESL  OVERHEAD     KFLOPS
       8     0.51  90.20%   3.92%     5.88%  22888.889
      16     1.02  89.22%   4.90%     5.88%  22888.889
      32     2.05  90.24%   3.41%     6.34%  22888.889
      64     4.08  91.42%   2.94%     5.64%  22829.437
     128     8.16  91.54%   2.94%     5.51%  22799.827
     256    16.31  91.35%   2.76%     5.89%  22903.800
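To put the two Linpack runs side by side, the short Python sketch below averages the KFLOPS column of each table and takes the ratio. The numbers are copied from the tables above; the function and variable names are illustrative only. On these figures the hardware floating point build comes out at roughly four times the software floating point one.

```python
from statistics import mean

# KFLOPS columns copied from the two Linpack runs above.
soft_float_kflops = [5493.333, 5385.621, 5466.003, 5438.944, 5459.213, 5452.440]
hard_float_kflops = [22888.889, 22888.889, 22888.889, 22829.437, 22799.827, 22903.800]


def speedup(baseline, improved):
    """Ratio of mean throughput: improved run over baseline run."""
    return mean(improved) / mean(baseline)


ratio = speedup(soft_float_kflops, hard_float_kflops)
print(f"software FP mean: {mean(soft_float_kflops) / 1000:.2f} MFLOPS")
print(f"hardware FP mean: {mean(hard_float_kflops) / 1000:.2f} MFLOPS")
print(f"speedup: {ratio:.1f}x")
```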
Whetstone/Dhrystone
Code for these tests can be found at http://www.rowley.co.uk/arm/whet_dhry.zip.
All code was compiled with gcc options -mfloat-abi=softfp -O3
Whetstone
Loops: 1000, Iterations: 10, Duration: 24 sec.
C Converted Double Precision Whetstones: 41.7 MIPS
Dhrystone
Microseconds for one run through Dhrystone: 1.2
Dhrystones per Second: 809061.5
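The two Dhrystone figures are reciprocals of each other, which the sketch below checks. It also converts the score to DMIPS by dividing by 1757, the Dhrystones-per-second figure conventionally attributed to the VAX 11/780; that divisor is an assumption brought in here and is not stated on this page, so treat the DMIPS value as indicative only.

```python
# Conventional DMIPS baseline (VAX 11/780); an assumption, not from this page.
VAX_11_780_DHRYSTONES_PER_SEC = 1757


def microseconds_per_run(dhrystones_per_sec):
    """Time for one pass through Dhrystone, in microseconds."""
    return 1_000_000 / dhrystones_per_sec


def dmips(dhrystones_per_sec):
    """Dhrystone MIPS relative to the assumed VAX 11/780 baseline."""
    return dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC


reported = 809061.5  # "Dhrystones per Second" from the output above
print(f"{microseconds_per_run(reported):.2f} us per run")
print(f"{dmips(reported):.0f} DMIPS (assuming the 1757 baseline)")
```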
Rebuilding the Whetstone test code with 'gcc -mfpu=vfp -mfloat-abi=softfp' gives better results:
Loops: 1000, Iterations: 100, Duration: 106 sec.
C Converted Double Precision Whetstones: 94.3 MIPS
However, most of the compute time is spent in the SQRT function, which for the above test came from a library built without -mfpu=vfp. Using a library built with VFP support gives a much improved result:
Loops: 1000, Iterations: 100, Duration: 15 sec.
C Converted Double Precision Whetstones: 666.7 MIPS
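Because the three Whetstone runs use different iteration counts and durations, it helps to see how the MIPS figure follows from them. The sketch below assumes the usual whetstone.c convention that the work done is 100 × loops × iterations thousand Whetstone instructions; this is an assumption about the test code, not something stated here, but it reproduces all three reported figures.

```python
def whetstone_mips(loops, iterations, seconds):
    """MIPS under the assumed convention KIPS = 100 * loops * iterations / seconds."""
    kips = 100 * loops * iterations / seconds
    return kips / 1000


# (description, loops, iterations, duration in seconds, MIPS reported above)
runs = [
    ("softfp ABI only", 1000, 10, 24, 41.7),
    ("test code built with VFP", 1000, 100, 106, 94.3),
    ("VFP library as well", 1000, 100, 15, 666.7),
]

for label, loops, iterations, seconds, reported in runs:
    computed = whetstone_mips(loops, iterations, seconds)
    print(f"{label}: computed {computed:.1f} MIPS, reported {reported} MIPS")
```

The last run is about sixteen times faster per iteration than the first (0.15 s against 2.4 s), which is where the jump from 41.7 to 666.7 MIPS comes from.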
OpenSSL
Results of running openssl speed:
Doing md2 for 3s on 16 size blocks: 27716 md2's in 2.98s
Doing md2 for 3s on 64 size blocks: 17388 md2's in 2.99s
Doing md2 for 3s on 256 size blocks: 7322 md2's in 3.00s
Doing md2 for 3s on 1024 size blocks: 2173 md2's in 2.89s
Doing md2 for 3s on 8192 size blocks: 304 md2's in 2.99s
Doing md4 for 3s on 16 size blocks: 115369 md4's in 3.00s
Doing md4 for 3s on 64 size blocks: 115723 md4's in 3.00s
Doing md4 for 3s on 256 size blocks: 88908 md4's in 2.99s
Doing md4 for 3s on 1024 size blocks: 48620 md4's in 2.98s
Doing md4 for 3s on 8192 size blocks: 10258 md4's in 2.99s
Doing md5 for 3s on 16 size blocks: 70799 md5's in 2.98s
Doing md5 for 3s on 64 size blocks: 69896 md5's in 2.98s
Doing md5 for 3s on 256 size blocks: 56259 md5's in 3.00s
Doing md5 for 3s on 1024 size blocks: 33143 md5's in 3.00s
Doing md5 for 3s on 8192 size blocks: 7914 md5's in 2.99s
Doing hmac(md5) for 3s on 16 size blocks: 190400 hmac(md5)'s in 2.98s
Doing hmac(md5) for 3s on 64 size blocks: 163136 hmac(md5)'s in 3.00s
Doing hmac(md5) for 3s on 256 size blocks: 111608 hmac(md5)'s in 2.98s
Doing hmac(md5) for 3s on 1024 size blocks: 51076 hmac(md5)'s in 2.99s
Doing hmac(md5) for 3s on 8192 size blocks: 9286 hmac(md5)'s in 2.99s
Doing sha1 for 3s on 16 size blocks: 56948 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 51206 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 36283 sha1's in 2.99s
Doing sha1 for 3s on 1024 size blocks: 18403 sha1's in 2.99s
Doing sha1 for 3s on 8192 size blocks: 3584 sha1's in 2.98s
Doing sha256 for 3s on 16 size blocks: 127496 sha256's in 3.00s
Doing sha256 for 3s on 64 size blocks: 76127 sha256's in 2.99s
Doing sha256 for 3s on 256 size blocks: 34048 sha256's in 3.00s
Doing sha256 for 3s on 1024 size blocks: 10828 sha256's in 2.99s
Doing sha256 for 3s on 8192 size blocks: 1524 sha256's in 2.99s
Doing sha512 for 3s on 16 size blocks: 7691 sha512's in 3.00s
Doing sha512 for 3s on 64 size blocks: 7654 sha512's in 2.99s
Doing sha512 for 3s on 256 size blocks: 2717 sha512's in 2.99s
Doing sha512 for 3s on 1024 size blocks: 926 sha512's in 2.98s
Doing sha512 for 3s on 8192 size blocks: 130 sha512's in 3.01s
Doing rmd160 for 3s on 16 size blocks: 45651 rmd160's in 2.99s
Doing rmd160 for 3s on 64 size blocks: 39666 rmd160's in 2.99s
Doing rmd160 for 3s on 256 size blocks: 28201 rmd160's in 2.99s
Doing rmd160 for 3s on 1024 size blocks: 13908 rmd160's in 3.00s
Doing rmd160 for 3s on 8192 size blocks: 2733 rmd160's in 2.98s
Doing rc4 for 3s on 16 size blocks: 2739344 rc4's in 2.99s
Doing rc4 for 3s on 64 size blocks: 783949 rc4's in 2.98s
Doing rc4 for 3s on 256 size blocks: 203269 rc4's in 2.98s
Doing rc4 for 3s on 1024 size blocks: 51473 rc4's in 2.99s
Doing rc4 for 3s on 8192 size blocks: 6374 rc4's in 2.98s
Doing des cbc for 3s on 16 size blocks: 546219 des cbc's in 3.00s
Doing des cbc for 3s on 64 size blocks: 149992 des cbc's in 2.98s
Doing des cbc for 3s on 256 size blocks: 38552 des cbc's in 3.00s
Doing des cbc for 3s on 1024 size blocks: 9844 des cbc's in 3.00s
Doing des cbc for 3s on 8192 size blocks: 1229 des cbc's in 2.99s
Doing des ede3 for 3s on 16 size blocks: 213445 des ede3's in 2.97s
Doing des ede3 for 3s on 64 size blocks: 55158 des ede3's in 2.97s
Doing des ede3 for 3s on 256 size blocks: 13904 des ede3's in 2.97s
Doing des ede3 for 3s on 1024 size blocks: 3227 des ede3's in 2.74s
Doing des ede3 for 3s on 8192 size blocks: 441 des ede3's in 2.99s
Doing aes-128 cbc for 3s on 16 size blocks: 595070 aes-128 cbc's in 2.97s
Doing aes-128 cbc for 3s on 64 size blocks: 163409 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 256 size blocks: 42375 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 1024 size blocks: 10665 aes-128 cbc's in 2.99s
Doing aes-128 cbc for 3s on 8192 size blocks: 1338 aes-128 cbc's in 2.99s
Doing aes-192 cbc for 3s on 16 size blocks: 510290 aes-192 cbc's in 2.99s
Doing aes-192 cbc for 3s on 64 size blocks: 138844 aes-192 cbc's in 2.98s
Doing aes-192 cbc for 3s on 256 size blocks: 35894 aes-192 cbc's in 2.99s
Doing aes-192 cbc for 3s on 1024 size blocks: 9089 aes-192 cbc's in 3.00s
Doing aes-192 cbc for 3s on 8192 size blocks: 1132 aes-192 cbc's in 2.98s
Doing aes-256 cbc for 3s on 16 size blocks: 444002 aes-256 cbc's in 2.98s
Doing aes-256 cbc for 3s on 64 size blocks: 120882 aes-256 cbc's in 2.98s
Doing aes-256 cbc for 3s on 256 size blocks: 30963 aes-256 cbc's in 2.98s
Doing aes-256 cbc for 3s on 1024 size blocks: 7890 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 8192 size blocks: 994 aes-256 cbc's in 2.98s
Doing aes-128 ige for 3s on 16 size blocks: 577263 aes-128 ige's in 2.99s
Doing aes-128 ige for 3s on 64 size blocks: 166651 aes-128 ige's in 2.98s
Doing aes-128 ige for 3s on 256 size blocks: 43055 aes-128 ige's in 2.98s
Doing aes-128 ige for 3s on 1024 size blocks: 10772 aes-128 ige's in 2.99s
Doing aes-128 ige for 3s on 8192 size blocks: 1306 aes-128 ige's in 2.99s
Doing aes-192 ige for 3s on 16 size blocks: 493664 aes-192 ige's in 2.99s
Doing aes-192 ige for 3s on 64 size blocks: 141065 aes-192 ige's in 2.99s
Doing aes-192 ige for 3s on 256 size blocks: 36340 aes-192 ige's in 2.99s
Doing aes-192 ige for 3s on 1024 size blocks: 9183 aes-192 ige's in 2.99s
Doing aes-192 ige for 3s on 8192 size blocks: 1108 aes-192 ige's in 2.99s
Doing aes-256 ige for 3s on 16 size blocks: 434801 aes-256 ige's in 2.98s
Doing aes-256 ige for 3s on 64 size blocks: 122980 aes-256 ige's in 2.99s
Doing aes-256 ige for 3s on 256 size blocks: 31594 aes-256 ige's in 2.99s
Doing aes-256 ige for 3s on 1024 size blocks: 7988 aes-256 ige's in 2.99s
Doing aes-256 ige for 3s on 8192 size blocks: 981 aes-256 ige's in 2.99s
Doing rc2 cbc for 3s on 16 size blocks: 525625 rc2 cbc's in 2.99s
Doing rc2 cbc for 3s on 64 size blocks: 140247 rc2 cbc's in 2.98s
Doing rc2 cbc for 3s on 256 size blocks: 35672 rc2 cbc's in 2.99s
Doing rc2 cbc for 3s on 1024 size blocks: 8987 rc2 cbc's in 2.99s
Doing rc2 cbc for 3s on 8192 size blocks: 1119 rc2 cbc's in 2.98s
Doing blowfish cbc for 3s on 16 size blocks: 1138316 blowfish cbc's in 2.99s
Doing blowfish cbc for 3s on 64 size blocks: 327400 blowfish cbc's in 2.99s
Doing blowfish cbc for 3s on 256 size blocks: 84685 blowfish cbc's in 2.99s
Doing blowfish cbc for 3s on 1024 size blocks: 21281 blowfish cbc's in 2.99s
Doing blowfish cbc for 3s on 8192 size blocks: 2606 blowfish cbc's in 2.98s
Doing cast cbc for 3s on 16 size blocks: 940793 cast cbc's in 2.97s
Doing cast cbc for 3s on 64 size blocks: 282189 cast cbc's in 3.00s
Doing cast cbc for 3s on 256 size blocks: 73868 cast cbc's in 2.98s
Doing cast cbc for 3s on 1024 size blocks: 18593 cast cbc's in 2.99s
Doing cast cbc for 3s on 8192 size blocks: 2285 cast cbc's in 2.99s
Doing 512 bit private rsa's for 10s: 726 512 bit private RSA's in 9.98s
Doing 512 bit public rsa's for 10s: 8359 512 bit public RSA's in 9.97s
Doing 1024 bit private rsa's for 10s: 158 1024 bit private RSA's in 10.03s
Doing 1024 bit public rsa's for 10s: 3643 1024 bit public RSA's in 9.99s
Doing 2048 bit private rsa's for 10s: 32 2048 bit private RSA's in 10.28s
Doing 2048 bit public rsa's for 10s: 1350 2048 bit public RSA's in 9.96s
Doing 4096 bit private rsa's for 10s: 6 4096 bit private RSA's in 10.83s
Doing 4096 bit public rsa's for 10s: 443 4096 bit public RSA's in 9.98s
Doing 512 bit sign dsa's for 10s: 852 512 bit DSA signs in 9.96s
Doing 512 bit verify dsa's for 10s: 734 512 bit DSA verify in 9.98s
Doing 1024 bit sign dsa's for 10s: 365 1024 bit DSA signs in 9.94s
Doing 1024 bit verify dsa's for 10s: 315 1024 bit DSA verify in 9.98s
Doing 2048 bit sign dsa's for 10s: 136 2048 bit DSA signs in 10.05s
Doing 2048 bit verify dsa's for 10s: 115 2048 bit DSA verify in 10.0
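
The raw counts above are easier to compare once converted to bytes per second (operations × block size ÷ elapsed time). The Python sketch below does this for a single symmetric-cipher or digest line of the output; the regular expression and the function name are illustrative and only cover the "size blocks" lines, not the RSA and DSA ones.

```python
import re

# Matches lines such as:
#   Doing aes-128 cbc for 3s on 8192 size blocks: 1338 aes-128 cbc's in 2.99s
LINE = re.compile(
    r"Doing (?P<name>.+?) for (?P<budget>\d+)s on (?P<size>\d+) size blocks: "
    r"(?P<count>\d+) .+? in (?P<elapsed>[\d.]+)s"
)


def throughput(line):
    """Return (algorithm, block size in bytes, bytes per second) for one line."""
    m = LINE.search(line)
    if m is None:
        raise ValueError(f"not a block-cipher/digest result line: {line!r}")
    size = int(m["size"])
    count = int(m["count"])
    elapsed = float(m["elapsed"])
    return m["name"], size, count * size / elapsed


sample = "Doing aes-128 cbc for 3s on 8192 size blocks: 1338 aes-128 cbc's in 2.99s"
name, size, bytes_per_sec = throughput(sample)
print(f"{name} @ {size} bytes: {bytes_per_sec / 1e6:.2f} MB/s")
```

For the sample line this works out to about 3.67 MB/s for AES-128-CBC on 8192-byte blocks.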