Tests:R-CAR-RAVB-RX-Checksum-Offload
Revision as of 04:30, 14 September 2017
Kernel Version Configuration
RX Checksum Offload support for RAVB is currently available in a topic branch:
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas.git topic/ravb-rx-checksum-offload
The ARM64 defconfig was used. The following option was disabled to produce a kernel image small enough to boot in the environment used for testing.
- CONFIG_SOUND
The following option was also disabled because the subsystem in question appeared to fail to build in the net-next revision on which the topic/ravb-rx-checksum-offload branch is based, which was the latest net-next revision at the time.
- CONFIG_DRM
User Space Configuration
The tests described below require netperf to be installed both on the board being tested and on the host specified by the -H option when netperf is invoked on the board. netserver, which is part of the netperf package, should be running on the host.
Perf is used to record CPU usage during the test. For this reason perf needs to be installed on the board being tested.
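Before starting, it may help to confirm that the required tools are present. The following is a minimal sketch, not part of the original procedure; the function name missing_tools is hypothetical.

```shell
# missing_tools NAME...: print each NAME that is not found on PATH.
# (Hypothetical helper; prints nothing when every tool is available.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# On the board (perf_3.16 is the binary name used later on this page):
#   missing_tools netperf perf_3.16 ethtool
# On the host:
#   missing_tools netserver
```

Any name printed needs to be installed before running the tests.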
Hardware Environment
- Salvator-X/r8a7795 (Gen 3 R-Car H3 SoC) ES1.0
- Salvator-X/r8a7796 (Gen 3 R-Car M3-W SoC) ES1.0
The results shown below are from tests performed on the Salvator-X/r8a7796.
The Salvator-XS/r8a7795 gives the same results.
Verify RAVB RX Checksum Offload Support
Verify Driver Initialisation
Initialisation of RAVB can be checked by inspection of the output of dmesg.
# dmesg | grep ravb
[ 1.291370] libphy: ravb_mii: probed
[ 1.295837] ravb e6800000.ethernet eth0: Base address at 0xe6800000, 2e:09:0a:00:be:d8, IRQ 45.
[ 5.025952] ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
Verify Configurability of RX Checksum Offload
# ethtool -k eth0 | grep rx-checksum
rx-checksumming: on
# ethtool -K eth0 rx off
# ethtool -k eth0 | grep rx-checksum
rx-checksumming: off
# ethtool -K eth0 rx on
# ethtool -k eth0 | grep rx-checksum
rx-checksumming: on
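The state check above can also be scripted. The sketch below assumes the "rx-checksumming: on|off" line format shown in the ethtool -k output above; the function name rx_csum_state is hypothetical and is not part of ethtool.

```shell
# rx_csum_state: read `ethtool -k` output on stdin and print the
# rx-checksumming state ("on" or "off"). Hypothetical helper name.
rx_csum_state() {
    awk '$1 == "rx-checksumming:" { print $2; exit }'
}

# Example use on the board (assumes the interface is eth0):
#   ethtool -K eth0 rx off
#   [ "$(ethtool -k eth0 | rx_csum_state)" = "off" ] && echo "offload disabled"
```

Matching on the first field rather than grepping for a substring avoids picking up other feature lines.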
Run netperf TCP_MAERTS tests
When run on the board, this exercises RX on the RAVB by receiving a stream of TCP packets from the host.
Note that perf record writes to a file in /run. This directory was chosen because it is mounted as a tmpfs filesystem backed by memory. Writing to a file on an NFS mount instead would significantly distort the results collected.
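Whether /run really is tmpfs on a given board can be checked before recording. This is a minimal sketch that assumes the standard /proc/mounts field layout (device, mount point, type, ...); the function name fs_type_of is hypothetical.

```shell
# fs_type_of MOUNTPOINT [MOUNTS_FILE]: print the filesystem type of the
# last entry mounted at MOUNTPOINT. Reads /proc/mounts unless another
# file is given. Prints nothing if MOUNTPOINT is not a mount point.
fs_type_of() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }' \
        "${2:-/proc/mounts}"
}

# Example use on the board:
#   [ "$(fs_type_of /run)" = "tmpfs" ] || echo "warning: /run is not tmpfs"
```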
With RX Checksum Offload
# ethtool -K eth0 rx on
# ethtool -k eth0 | grep rx-checksum
rx-checksumming: on
# /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
enable_enobufs failed: getprotobyname
Recv   Send   Send
Socket Socket Message  Elapsed
Size   Size   Size     Time     Throughput
bytes  bytes  bytes    secs.    10^6bits/sec
87380  16384  16384    10.00    938.78
[ perf record: Woken up 14 times to write data ]
[ perf record: Captured and wrote 3.524 MB /run/perf.data (~153957 samples) ]
# perf_3.16 report -i /run/perf.data | head -20
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 75K of event 'cycles'
# Event count (approx.): 19704920110
#
# Overhead  Command          Shared Object      Symbol
# ........  ...............  .................  ....................................
#
    19.49%  ksoftirqd/0      [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     9.88%  ksoftirqd/0      [kernel.kallsyms]  [k] __pi_memcpy
     7.33%  ksoftirqd/0      [kernel.kallsyms]  [k] skb_put
     7.00%  ksoftirqd/0      [kernel.kallsyms]  [k] ravb_poll
     3.89%  ksoftirqd/0      [kernel.kallsyms]  [k] dev_gro_receive
     3.65%  netperf          [kernel.kallsyms]  [k] __arch_copy_to_user
     3.43%  swapper          [kernel.kallsyms]  [k] arch_cpu_idle
     2.77%  swapper          [kernel.kallsyms]  [k] tick_nohz_idle_enter
     1.85%  ksoftirqd/0      [kernel.kallsyms]  [k] __netdev_alloc_skb
     1.80%  swapper          [kernel.kallsyms]  [k] _raw_spin_unlock_irq
     1.64%  ksoftirqd/0      [kernel.kallsyms]  [k] __slab_alloc.isra.79
     1.62%  ksoftirqd/0      [kernel.kallsyms]  [k] __pi___inval_cache_range
Without RX Checksum Offload
# ethtool -K eth0 rx off
# ethtool -k eth0 | grep rx-checksum
rx-checksumming: off
# perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
enable_enobufs failed: getprotobyname
Recv   Send   Send
Socket Socket Message  Elapsed
Size   Size   Size     Time     Throughput
bytes  bytes  bytes    secs.    10^6bits/sec
87380  16384  16384    10.00    941.09
[ perf record: Woken up 14 times to write data ]
[ perf record: Captured and wrote 3.411 MB /run/perf.data (~149040 samples) ]
# perf_3.16 report -i /run/perf.data | head -20
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 73K of event 'cycles'
# Event count (approx.): 18682878466
#
# Overhead  Command          Shared Object      Symbol
# ........  .............    .................  ....................................
#
    17.50%  ksoftirqd/0      [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
    10.60%  ksoftirqd/0      [kernel.kallsyms]  [k] __pi_memcpy
     7.91%  ksoftirqd/0      [kernel.kallsyms]  [k] skb_put
     6.95%  ksoftirqd/0      [kernel.kallsyms]  [k] do_csum
     6.22%  ksoftirqd/0      [kernel.kallsyms]  [k] ravb_poll
     3.84%  ksoftirqd/0      [kernel.kallsyms]  [k] dev_gro_receive
     2.53%  netperf          [kernel.kallsyms]  [k] __arch_copy_to_user
     2.53%  swapper          [kernel.kallsyms]  [k] arch_cpu_idle
     2.27%  swapper          [kernel.kallsyms]  [k] tick_nohz_idle_enter
     1.90%  ksoftirqd/0      [kernel.kallsyms]  [k] __pi___inval_cache_range
     1.90%  ksoftirqd/0      [kernel.kallsyms]  [k] __netdev_alloc_skb
     1.52%  ksoftirqd/0      [kernel.kallsyms]  [k] __slab_alloc.isra.79
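The visible difference between the two reports is the do_csum entry (6.95% in ksoftirqd/0), which appears only when RX checksum offload is disabled. To compare runs without reading the tables by eye, the overhead column can be summed per command or per symbol. This is a minimal sketch that assumes the perf report text layout shown above (overhead, command, shared object, symbol type, symbol); the function name overhead_for is hypothetical.

```shell
# overhead_for COMMAND [SYMBOL]: read `perf report` text output on stdin
# and print the summed overhead (in percent) of the rows whose Command
# column equals COMMAND and, if SYMBOL is given, whose symbol name
# equals SYMBOL. Only rows present in the input are counted, so with
# `| head -20` the sum covers the top entries only.
overhead_for() {
    awk -v cmd="$1" -v sym="${2:-}" '
        /^#/ || NF < 5 { next }    # skip header, separator and blank lines
        $2 == cmd && (sym == "" || $5 == sym) {
            sub(/%/, "", $1)       # "6.95%" -> "6.95"
            total += $1
        }
        END { printf "%.2f\n", total }
    '
}

# Example use on the board:
#   perf_3.16 report -i /run/perf.data | overhead_for ksoftirqd/0 do_csum
```

An output of 0.00 for do_csum indicates the symbol did not appear in the input at all.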
Inspect Available Governors
# grep . */cpufreq/scaling_available_governors
cpu0/cpufreq/scaling_available_governors:conservative performance
cpu1/cpufreq/scaling_available_governors:conservative performance
cpu2/cpufreq/scaling_available_governors:conservative performance
cpu3/cpufreq/scaling_available_governors:conservative performance
Exercise CPUFreq Support
Change to cpu sysfs directory
# cd /sys/devices/system/cpu
Set Governor
The conservative governor will be used for this test.
# echo conservative > cpu0/cpufreq/scaling_governor
# grep . */cpufreq/scaling_governor
cpu0/cpufreq/scaling_governor:conservative
cpu1/cpufreq/scaling_governor:conservative
cpu2/cpufreq/scaling_governor:conservative
cpu3/cpufreq/scaling_governor:conservative
Observe CPU Frequency Changes
On an idle system:
- Check the frequency; it should be a low value
- Apply some load to the system
- Check the frequency again; it should be a higher value
- Wait until the system is idle again
- Check the frequency one last time; it should be reduced again
# grep . */cpufreq/scaling_cur_freq
cpu0/cpufreq/scaling_cur_freq:500000
cpu1/cpufreq/scaling_cur_freq:500000
cpu2/cpufreq/scaling_cur_freq:500000
cpu3/cpufreq/scaling_cur_freq:500000
# for i in $(seq 1000000); do :; done
# grep . */cpufreq/scaling_cur_freq
cpu0/cpufreq/scaling_cur_freq:1500000
cpu1/cpufreq/scaling_cur_freq:1500000
cpu2/cpufreq/scaling_cur_freq:1500000
cpu3/cpufreq/scaling_cur_freq:1500000
# sleep 5
# grep . */cpufreq/scaling_cur_freq
cpu0/cpufreq/scaling_cur_freq:500000
cpu1/cpufreq/scaling_cur_freq:500000
cpu2/cpufreq/scaling_cur_freq:500000
cpu3/cpufreq/scaling_cur_freq:500000
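The idle, load, idle sequence above can be wrapped in a small helper so that it does not depend on the current directory. This is a minimal sketch; the function name cur_freqs is hypothetical, and the values printed are in kHz as reported by scaling_cur_freq.

```shell
# cur_freqs [CPU_DIR]: print "cpuN <kHz>" for each CPU that exposes
# cpufreq/scaling_cur_freq under CPU_DIR (default /sys/devices/system/cpu).
cur_freqs() {
    dir="${1:-/sys/devices/system/cpu}"
    for f in "$dir"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue               # skip CPUs without cpufreq
        cpu="${f%/cpufreq/scaling_cur_freq}"  # strip the file suffix
        printf '%s %s\n' "${cpu##*/}" "$(cat "$f")"
    done
}

# Example use on the board, mirroring the steps above:
#   cur_freqs                                   # idle: expect low values
#   for i in $(seq 1000000); do :; done         # apply some load
#   cur_freqs                                   # expect higher values
#   sleep 5
#   cur_freqs                                   # idle again: expect low values
```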