Difference between revisions of "Tests:R-CAR-RAVB-RX-Checksum-Offload"

From eLinux.org
Revision as of 04:30, 14 September 2017

Kernel Version Configuration

RX Checksum Offload support for RAVB is currently available in a topic branch:

https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas.git topic/ravb-rx-checksum-offload

The ARM64 defconfig was used. The following option was disabled to produce a kernel image small enough to boot in the environment used for testing.

  • CONFIG_SOUND

The following option was also disabled because the subsystem in question appears to fail to build in the net-next revision on which the topic/ravb-rx-checksum-offload branch is based, which was the latest net-next revision at the time.

  • CONFIG_DRM

User Space Configuration

The tests described below require netperf to be installed on both the board being tested and the host specified by the -H option when netperf is invoked on the board. netserver, which is part of the netperf package, should be running on the host.

Perf is used to record CPU usage during the test. For this reason perf needs to be installed on the board being tested.

Hardware Environment

  • Salvator-X/r8a7795 (Gen 3 R-Car H3 SoC) ES1.0
  • Salvator-X/r8a7796 (Gen 3 R-Car M3-W SoC) ES1.0

The results shown below are from tests performed on the Salvator-X/r8a7796.
The Salvator-XS/r8a7795 gives the same results.

Verify RAVB RX Checksum Offload Support

Verify Driver Initialisation

Initialisation of RAVB can be checked by inspection of the output of dmesg.

# dmesg | grep ravb
[    1.291370] libphy: ravb_mii: probed
[    1.295837] ravb e6800000.ethernet eth0: Base address at 0xe6800000, 2e:09:0a:00:be:d8, IRQ 45.
[    5.025952] ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx

Verify Configurability of RX Checksum Offload

# ethtool -k eth0 | grep rx-checksum
rx-checksumming: on
# ethtool -K eth0 rx off
# ethtool -k eth0 | grep rx-checksum
rx-checksumming: off
# ethtool -K eth0 rx on
# ethtool -k eth0 | grep rx-checksum
rx-checksumming: on
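
The check above can also be scripted. The following is a minimal sketch, not part of the original test procedure: the helper name rx_csum_state is hypothetical, and it assumes the "rx-checksumming: on|off" line format shown in the transcript above.

```shell
#!/bin/sh
# Print the rx-checksumming state ("on" or "off") found in
# `ethtool -k` style output read from standard input.
# rx_csum_state is a hypothetical helper name used for illustration.
rx_csum_state() {
    awk '$1 == "rx-checksumming:" { print $2; exit }'
}

# Possible usage on the board (eth0 as in the transcript above):
#   ethtool -k eth0 | rx_csum_state
```

Reading from standard input keeps the helper independent of the interface name and allows it to be exercised on saved output.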

Run netperf TCP_MAERTS tests

When run on the board, this exercises RX using the RAVB by receiving a stream of TCP packets from the host.

Note that perf record writes to a file in /run. This directory was chosen because it is mounted as a tmpfs filesystem backed by memory. Writing to a file on an NFS partition instead significantly reduces the meaningfulness of the results collected.
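
Whether /run is in fact a tmpfs mount may differ between root filesystems, so it can be worth checking before recording. The following is a minimal sketch under that assumption: the helper name fs_type_of is hypothetical, and it expects input in the format of /proc/mounts.

```shell
#!/bin/sh
# Print the filesystem type of mount point $1, given /proc/mounts
# style input on standard input. The last matching entry wins,
# since a later mount shadows an earlier one on the same mount point.
# fs_type_of is a hypothetical helper name used for illustration.
fs_type_of() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }'
}

# Possible usage on the board:
#   [ "$(fs_type_of /run < /proc/mounts)" = tmpfs ] || echo "/run is not tmpfs" >&2
```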

With RX Checksum Offload

# ethtool -K eth0 rx on
# ethtool -k eth0 | grep rx-checksum
rx-checksumming: on
# /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
enable_enobufs failed: getprotobyname
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  16384  16384    10.00     938.78   
[ perf record: Woken up 14 times to write data ]
[ perf record: Captured and wrote 3.524 MB /run/perf.data (~153957 samples) ]

# perf_3.16 report -i /run/perf.data | head -20
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 75K of event 'cycles'
# Event count (approx.): 19704920110
#
# Overhead          Command      Shared Object                                Symbol
# ........  ...............  .................  ....................................
#
    19.49%      ksoftirqd/0  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore     
     9.88%      ksoftirqd/0  [kernel.kallsyms]  [k] __pi_memcpy                     
     7.33%      ksoftirqd/0  [kernel.kallsyms]  [k] skb_put                         
     7.00%      ksoftirqd/0  [kernel.kallsyms]  [k] ravb_poll                       
     3.89%      ksoftirqd/0  [kernel.kallsyms]  [k] dev_gro_receive                 
     3.65%          netperf  [kernel.kallsyms]  [k] __arch_copy_to_user             
     3.43%          swapper  [kernel.kallsyms]  [k] arch_cpu_idle                   
     2.77%          swapper  [kernel.kallsyms]  [k] tick_nohz_idle_enter            
     1.85%      ksoftirqd/0  [kernel.kallsyms]  [k] __netdev_alloc_skb              
     1.80%          swapper  [kernel.kallsyms]  [k] _raw_spin_unlock_irq            
     1.64%      ksoftirqd/0  [kernel.kallsyms]  [k] __slab_alloc.isra.79            
     1.62%      ksoftirqd/0  [kernel.kallsyms]  [k] __pi___inval_cache_range        
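
To compare runs it can help to pull the throughput figure out of the netperf output. The following is a minimal sketch, assuming the five-column result row shown above (receive socket size, send socket size, message size, elapsed time, throughput); the helper name netperf_throughput is hypothetical.

```shell
#!/bin/sh
# Print the throughput (10^6 bits/sec) from netperf stream-test output
# read from standard input. The result row is taken to be the only
# line with exactly five fields that starts with an integer.
# netperf_throughput is a hypothetical helper name used for illustration.
netperf_throughput() {
    awk 'NF == 5 && $1 ~ /^[0-9]+$/ && $5 ~ /^[0-9.]+$/ { print $5 }'
}

# Possible usage on the board (host address as in the transcript above):
#   netperf -t TCP_MAERTS -H 10.4.3.162 | netperf_throughput
```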

Without RX Checksum Offload

# ethtool -K eth0 rx off
# ethtool -k eth0 | grep rx-checksum
rx-checksumming: off
# perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
enable_enobufs failed: getprotobyname
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  16384  16384    10.00     941.09   
[ perf record: Woken up 14 times to write data ]
[ perf record: Captured and wrote 3.411 MB /run/perf.data (~149040 samples) ]

# perf_3.16 report -i /run/perf.data | head -20
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 73K of event 'cycles'
# Event count (approx.): 18682878466
#
# Overhead        Command      Shared Object                                Symbol
# ........  .............  .................  ....................................
#
    17.50%    ksoftirqd/0  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore     
    10.60%    ksoftirqd/0  [kernel.kallsyms]  [k] __pi_memcpy                     
     7.91%    ksoftirqd/0  [kernel.kallsyms]  [k] skb_put                         
     6.95%    ksoftirqd/0  [kernel.kallsyms]  [k] do_csum                         
     6.22%    ksoftirqd/0  [kernel.kallsyms]  [k] ravb_poll                       
     3.84%    ksoftirqd/0  [kernel.kallsyms]  [k] dev_gro_receive                 
     2.53%        netperf  [kernel.kallsyms]  [k] __arch_copy_to_user             
     2.53%        swapper  [kernel.kallsyms]  [k] arch_cpu_idle                   
     2.27%        swapper  [kernel.kallsyms]  [k] tick_nohz_idle_enter            
     1.90%    ksoftirqd/0  [kernel.kallsyms]  [k] __pi___inval_cache_range        
     1.90%    ksoftirqd/0  [kernel.kallsyms]  [k] __netdev_alloc_skb              
     1.52%    ksoftirqd/0  [kernel.kallsyms]  [k] __slab_alloc.isra.79          
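
In the report above do_csum accounts for 6.95% of samples, whereas it does not appear in the first 20 lines of the report taken with RX checksum offload enabled. The following is a minimal sketch for extracting the overhead of one symbol from such a report; the helper name symbol_overhead is hypothetical, and the parsing assumes the column layout shown above.

```shell
#!/bin/sh
# Print the overhead percentage (without the % sign) reported for
# symbol $1 in `perf report` text output read from standard input.
# Entries for the same symbol under different commands are summed.
# Prints 0 if the symbol is not listed.
# symbol_overhead is a hypothetical helper name used for illustration.
symbol_overhead() {
    awk -v sym="$1" '
        /^#/ { next }                                   # skip header comments
        $NF == sym { sub(/%$/, "", $1); total += $1 }   # strip "%" and accumulate
        END { print total + 0 }
    '
}

# Possible usage on the board:
#   perf_3.16 report -i /run/perf.data | symbol_overhead do_csum
```

Note that piping the report through head -20 first, as in the transcripts, hides symbols below the cutoff, so a result of 0 on truncated output only means the symbol is not among the top entries.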

Inspect Available Governors

# grep . */cpufreq/scaling_available_governors
cpu0/cpufreq/scaling_available_governors:conservative performance 
cpu1/cpufreq/scaling_available_governors:conservative performance 
cpu2/cpufreq/scaling_available_governors:conservative performance 
cpu3/cpufreq/scaling_available_governors:conservative performance 

Exercise CPUFreq Support

Change to cpu sysfs directory

# cd /sys/devices/system/cpu

Set Governor

The conservative governor will be used for this test.

# echo conservative > cpu0/cpufreq/scaling_governor

# grep . */cpufreq/scaling_governor
cpu0/cpufreq/scaling_governor:conservative
cpu1/cpufreq/scaling_governor:conservative
cpu2/cpufreq/scaling_governor:conservative
cpu3/cpufreq/scaling_governor:conservative
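
The governor check above can be turned into a pass/fail test. The following is a minimal sketch, assuming the "path:governor" lines produced by grep as shown; the helper name all_governors_are is hypothetical.

```shell
#!/bin/sh
# Succeed only if every "path:governor" line read from standard input
# names governor $1, and at least one line was read.
# all_governors_are is a hypothetical helper name used for illustration.
all_governors_are() {
    awk -F: -v want="$1" '
        { n++; if ($2 != want) bad++ }
        END { exit (n == 0 || bad > 0) }
    '
}

# Possible usage on the board, from /sys/devices/system/cpu:
#   grep . */cpufreq/scaling_governor | all_governors_are conservative \
#       && echo "all CPUs use the conservative governor"
```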

Observe CPU Frequency Changes

On an idle system:

  1. Check the frequency; it should be a low value
  2. Apply some load to the system
  3. Check the frequency again; it should be a higher value
  4. Wait until the system is once again idle
  5. Check the frequency one last time; it should be reduced again

# grep . */cpufreq/scaling_cur_freq
cpu0/cpufreq/scaling_cur_freq:500000
cpu1/cpufreq/scaling_cur_freq:500000
cpu2/cpufreq/scaling_cur_freq:500000
cpu3/cpufreq/scaling_cur_freq:500000

# for i in $(seq 1000000); do :; done
# grep . */cpufreq/scaling_cur_freq
cpu0/cpufreq/scaling_cur_freq:1500000
cpu1/cpufreq/scaling_cur_freq:1500000
cpu2/cpufreq/scaling_cur_freq:1500000
cpu3/cpufreq/scaling_cur_freq:1500000

# sleep 5
# grep . */cpufreq/scaling_cur_freq
cpu0/cpufreq/scaling_cur_freq:500000
cpu1/cpufreq/scaling_cur_freq:500000
cpu2/cpufreq/scaling_cur_freq:500000
cpu3/cpufreq/scaling_cur_freq:500000
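
The manual comparison of frequencies above can be scripted. The following is a minimal sketch, assuming the "path:frequency" lines produced by grep as shown, with frequencies in kHz as reported by cpufreq; the helper name max_cur_freq is hypothetical.

```shell
#!/bin/sh
# Print the highest frequency (kHz) among "path:frequency" lines
# read from standard input. Prints 0 if no lines were read.
# max_cur_freq is a hypothetical helper name used for illustration.
max_cur_freq() {
    awk -F: '$2 + 0 > max { max = $2 + 0 } END { print max + 0 }'
}

# Possible usage on the board, from /sys/devices/system/cpu:
#   idle=$(grep . */cpufreq/scaling_cur_freq | max_cur_freq)
#   for i in $(seq 1000000); do :; done
#   busy=$(grep . */cpufreq/scaling_cur_freq | max_cur_freq)
#   [ "$busy" -gt "$idle" ] && echo "frequency scaled up under load"
```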