|
|
(3 intermediate revisions by 2 users not shown) |
Line 1: |
Line 1: |
− | =CPU=
| + | #REDIRECT [[RPi Performance]] |
− | ==Linpack==
| |
− | | |
− | The ARM CPU was tested with the Linpack benchmark from [http://www.netlib.org/benchmark/linpackc.new], built with gcc at -O3 (optimisation level 3) and run with an array size of 200.
| |
− | | |
− | With software floating point
| |
− | | |
− | ===Source===
| |
− | [http://www.netlib.org/benchmark/linpackc.new]
| |
− | | |
− | ===Compile/Run===
| |
− | <pre>
| |
− | cc -O3 -o linpack linpack.c -lm
| |
− | linpack.c: In function ‘main’:
| |
− | linpack.c:69: warning: return type of ‘main’ is not ‘int’
| |
− | ./linpack
| |
− | Enter array size (q to quit) [200]: 200
| |
− | </pre>
| |
− | | |
− | | |
− | ===Results===
| |
− | Software floating point ("crippled")
| |
− | <pre>
| |
− | Memory required: 315K.
| |
− | | |
− | LINPACK benchmark, Double precision.
| |
− | Machine precision: 15 digits.
| |
− | Array size 200 X 200.
| |
− | Average rolled and unrolled performance:
| |
− | | |
− | Reps Time(s) DGEFA DGESL OVERHEAD KFLOPS
| |
− | 2 0.53 92.45% 1.89% 5.66% 5493.333
| |
− | 4 1.07 92.52% 2.80% 4.67% 5385.621
| |
− | 8 2.12 92.45% 2.36% 5.19% 5466.003
| |
− | 16 4.24 92.45% 2.83% 4.72% 5438.944
| |
− | 32 8.49 92.11% 2.71% 5.18% 5459.213
| |
− | 64 16.98 92.05% 2.89% 5.06% 5452.440
| |
− | </pre>
| |
− | | |
− | Hardware floating point (-mfloat-abi=softfp)
| |
− | <pre>
| |
− | Memory required: 315K.
| |
− | LINPACK benchmark, Double precision.
| |
− | Machine precision: 15 digits.
| |
− | Array size 200 X 200.
| |
− | Average rolled and unrolled performance:
| |
− | | |
− | Reps Time(s) DGEFA DGESL OVERHEAD KFLOPS
| |
− | 8 0.51 90.20% 3.92% 5.88% 22888.889
| |
− | 16 1.02 89.22% 4.90% 5.88% 22888.889
| |
− | 32 2.05 90.24% 3.41% 6.34% 22888.889
| |
− | 64 4.08 91.42% 2.94% 5.64% 22829.437
| |
− | 128 8.16 91.54% 2.94% 5.51% 22799.827
| |
− | 256 16.31 91.35% 2.76% 5.89% 22903.800
| |
− | </pre>
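As a rough comparison of the two result tables above, the sketch below divides the KFLOPS figure from the longest run of each build. The variable names are illustrative, and the choice of the longest run as the representative figure is an assumption rather than something the page specifies.

```shell
#!/bin/sh
# KFLOPS figures copied from the longest run in each table above
# (64 reps for software floating point, 256 reps for the softfp/VFP build).
soft_kflops=5452.440
vfp_kflops=22903.800

# POSIX sh arithmetic is integer-only, so awk does the division.
speedup=$(awk -v a="$vfp_kflops" -v b="$soft_kflops" 'BEGIN { printf "%.2f", a / b }')
echo "Hardware floating point speedup: ${speedup}x"
```

On these numbers the hardware floating point build comes out at roughly four times the software floating point throughput.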
| |
− | | |
− | ==Whetstone/Dhrystone==
| |
− | | |
− | All code compiled with gcc options -mfloat-abi=softfp -O3
| |
− | | |
− | ===Source===
| |
− | Code for these tests can be found here http://www.rowley.co.uk/arm/whet_dhry.zip.
| |
− | If that link returns a 404, this code may be analogous: http://freespace.virgin.net/roy.longbottom/benchnt.zip
| |
− | | |
− | | |
− | ===Compile/Run===
| |
− | <pre>
| |
− | ?
| |
− | </pre>
| |
− | | |
− | | |
− | ===Results===
| |
− | Dhrystone
| |
− | <pre>
| |
− | Microseconds for one run through Dhrystone: 1.2
| |
− | | |
− | Dhrystones per Second: 809061.5
| |
− | </pre>
| |
− | | |
− | | |
− | Whetstone Crippled
| |
− | <pre>
| |
− | Loops: 1000, Iterations: 10, Duration: 24 sec.
| |
− | | |
− | C Converted Double Precision Whetstones: 41.7 MIPS
| |
− | </pre>
| |
− | | |
− | Rebuilding the Whetstone test code with 'gcc -mfpu=vfp -mfloat-abi=softfp' gives better results:
| |
− | <pre>
| |
− | | |
− | Loops: 1000, Iterations: 100, Duration: 106 sec.
| |
− | C Converted Double Precision Whetstones: 94.3 MIPS
| |
− | </pre>
| |
− | | |
− | However, the majority of compute time is spent in the SQRT function, which for the above test was built without -mfpu=vfp. Using a library built with VFP gives the following much improved result:
| |
− | <pre>
| |
− | Loops: 1000, Iterations: 100, Duration: 15 sec.
| |
− | C Converted Double Precision Whetstones: 666.7 MIPS
| |
− | </pre>
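The three Whetstone results above are consistent with MIPS = loops × iterations / (10 × duration in seconds). That relationship is inferred here from the reported figures, not taken from the benchmark source, so treat the sketch below as a sanity check only. The function name `whet_mips` is hypothetical.

```shell
#!/bin/sh
# whet_mips LOOPS ITERATIONS SECONDS
# Assumption: MIPS = loops * iterations / (10 * seconds), inferred from the
# three results on this page rather than from the benchmark source.
whet_mips() {
    awk -v l="$1" -v i="$2" -v s="$3" 'BEGIN { printf "%.1f\n", l * i / (10 * s) }'
}

whet_mips 1000 10 24     # first run (software floating point)
whet_mips 1000 100 106   # rebuilt with softfp, SQRT still without VFP
whet_mips 1000 100 15    # rebuilt with a VFP-enabled library
```

Because the last two runs use the same loop and iteration counts, the improvement between them is simply the ratio of the durations, 106 s to 15 s.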
| |
− | | |
− | ==OpenSSL==
| |
− | | |
− | ===Source===
| |
− | [http://www.openssl.org/source/]
| |
− | | |
− | ===Compile/Run===
| |
− | <pre>
| |
− | openssl version;
| |
− | openssl speed;
| |
− | </pre>
| |
− | | |
− | ===Results===
| |
− | <pre>
| |
− | OpenSSL 0.9.8o 01 Jun 2010
| |
− | built on: Thu Aug 26 18:56:26 UTC 2010
| |
− | options:bn(64,32) md2(int) rc4(ptr,int) des(idx,risc1,4,long) aes(partial) blowfish(idx)
| |
− | compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O2 -Wa,--noexecstack -g -Wall
| |
− | available timing options: TIMES TIMEB HZ=100 [sysconf value]
| |
− | timing function used: times
| |
− | The 'numbers' are in 1000s of bytes per second processed.
| |
− | type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
| |
− | md2 148.81k 372.18k 624.81k 769.95k 832.90k
| |
− | mdc2 0.00 0.00 0.00 0.00 0.00
| |
− | md4 615.30k 2468.76k 7612.19k 16707.01k 28104.86k
| |
− | md5 380.13k 1501.12k 4800.77k 11312.81k 21682.77k
| |
− | hmac(md5) 1022.28k 3480.23k 9587.80k 17492.25k 25441.78k
| |
− | sha1 303.72k 1092.39k 3106.50k 6302.57k 9852.39k
| |
− | rmd160 244.29k 849.04k 2414.53k 4747.26k 7513.00k
| |
− | rc4 14658.70k 16836.49k 17462.03k 17628.21k 17522.08k
| |
− | des cbc 2913.17k 3221.30k 3289.77k 3360.09k 3367.21k
| |
− | des ede3 1149.87k 1188.59k 1198.46k 1206.00k 1208.25k
| |
− | idea cbc 0.00 0.00 0.00 0.00 0.00
| |
− | seed cbc 0.00 0.00 0.00 0.00 0.00
| |
− | rc2 cbc 2812.71k 3012.02k 3054.19k 3077.82k 3076.12k
| |
− | rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00
| |
− | blowfish cbc 6091.32k 7007.89k 7250.62k 7288.21k 7163.88k
| |
− | cast cbc 5068.25k 6020.03k 6345.71k 6367.64k 6260.44k
| |
− | aes-128 cbc 3205.76k 3497.72k 3616.00k 3652.49k 3665.85k
| |
− | aes-192 cbc 2730.65k 2981.88k 3073.20k 3102.38k 3111.86k
| |
− | aes-256 cbc 2383.90k 2596.12k 2659.91k 2702.13k 2732.50k
| |
− | camellia-128 cbc 0.00 0.00 0.00 0.00 0.00
| |
− | camellia-192 cbc 0.00 0.00 0.00 0.00 0.00
| |
− | camellia-256 cbc 0.00 0.00 0.00 0.00 0.00
| |
− | sha256 679.98k 1629.47k 2905.43k 3708.32k 4175.45k
| |
− | sha512 41.02k 163.83k 232.63k 318.20k 353.81k
| |
− | aes-128 ige 3089.03k 3579.08k 3698.68k 3689.14k 3578.18k
| |
− | aes-192 ige 2641.68k 3019.45k 3111.38k 3144.95k 3035.70k
| |
− | aes-256 ige 2334.50k 2632.35k 2705.04k 2735.69k 2687.74k
| |
− | sign verify sign/s verify/s
| |
− | rsa 512 bits 0.013747s 0.001193s 72.7 838.4
| |
− | rsa 1024 bits 0.063481s 0.002742s 15.8 364.7
| |
− | rsa 2048 bits 0.321250s 0.007378s 3.1 135.5
| |
− | rsa 4096 bits 1.805000s 0.022528s 0.6 44.4
| |
− | sign verify sign/s verify/s
| |
− | dsa 512 bits 0.011690s 0.013597s 85.5 73.5
| |
− | dsa 1024 bits 0.027233s 0.031683s 36.7 31.6
| |
− | dsa 2048 bits 0.073897s 0.087304s 13.5 11.5
| |
− | </pre>
| |
− | | |
− | =GPU=
| |
− | The Raspberry Pi appears to handle an H.264 1080p movie played from USB to HDMI at a rate of at least 4MB/s.
| |
− | The Admin "JamesH" said it would handle "basically 1080p30, high profile, >40Mb/s."
| |
− | | |
− | | |
− | | |
− | ==3DMarkMobile ES 2.0==
| |
− | | |
− | ===Source===
| |
− | ?
| |
− | | |
− | ===Compile/Run===
| |
− | <pre>
| |
− | ?
| |
− | </pre>
| |
− | | |
− | ===Results===
| |
− | <pre>
| |
− | ?
| |
− | </pre>
| |
− | | |
− | [[Category: RaspberryPi]] | |
− | | |
− | =IO=
| |
− | | |
− | ==USB bus==
| |
− | *All IO shares the same bus, so the combined throughput of all IO cannot exceed the bus speed, an as yet hypothetical 60MB/s
| |
− | ==SD card==
| |
− | *TODO test
| |
− | ===Compile/Run===
| |
− | <pre>
| |
− | dd if=/dev/zero of=~/test.tmp bs=100K count=1024
| |
− | dd if=~/test.tmp of=/dev/null bs=100K count=1024
| |
− | rm ~/test.tmp
| |
− | </pre>
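The same write/read test can be wrapped in a small function, sketched here under the assumption of a Linux system with GNU dd. The function name `sd_dd_test` and its parameters are hypothetical. dd prints its own throughput figure on stderr once each copy finishes.

```shell
#!/bin/sh
# sd_dd_test FILE [COUNT]
# Writes COUNT blocks of 100 KiB to FILE, reads them back, then deletes FILE.
# FILE and COUNT are illustrative parameters, not part of the original test.
sd_dd_test() {
    file=$1
    count=${2:-1024}

    # conv=fsync makes dd flush the data to the card before it reports a
    # rate, so the write figure is less likely to reflect the page cache alone.
    dd if=/dev/zero of="$file" bs=100K count="$count" conv=fsync

    # The read may still be served from cache unless caches are dropped first
    # (as root on Linux: sync; echo 3 > /proc/sys/vm/drop_caches).
    dd if="$file" of=/dev/null bs=100K count="$count"

    rm -f "$file"
}
```

Calling `sd_dd_test "$HOME/test.tmp"` reproduces the 100 MiB test with the default count of 1024. The file must live on the SD card for the figures to mean anything.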
| |
− | ===Results===
| |
− | * Depends on the SD card used; see http://elinux.org/RaspberryPiBoardVerifiedPeripherals#SDHC_cards
| |
− | <pre>
| |
− | ?maybe 15MB/s?
| |
− | </pre>
| |
− | | |
− | ==NIC==
| |
− | *TODO test with wget, curl, etc
| |