Difference between revisions of "Tests:mmc-delay-refactor"
m |
(some more paragraphs) |
||
Line 24: | Line 24: | ||
A standard busybox is needed only. Commands used are ''ifconfig'', ''echo'' and ''cat'' | A standard busybox is needed only. Commands used are ''ifconfig'', ''echo'' and ''cat'' | ||
+ | |||
+ | = mmc_delay() behaviour = | ||
==before this task== | ==before this task== | ||
Line 78: | Line 80: | ||
kworker/1:1-908 [001] .... 20.927627: __delay: 8333 | kworker/1:1-908 [001] .... 20.927627: __delay: 8333 | ||
</pre> | </pre> | ||
+ | |||
+ | How to read this trace: Every line containing ''__delay'' is a call to udelay/mdelay and means busy-looping. | ||
+ | Every line starting with ''mmc_'' is a call to mmc_delay which gets inlined to its caller function, so we see varying names there. | ||
+ | We can see that mmc_delay is called with either 10ms or 1ms as an argument. Because HZ=100 in this kernel config, the 10ms | ||
+ | calls result in msleep() rather then mdelay(), so it is not busy-looping and we don't see the ''__delay'' printout. | ||
+ | For 1ms, things are different. We see a call to ''__delay'' which means here is now busy-looping involved. | ||
+ | Note: The very first ''__udelay: 0'' comes from the SDHI driver while waiting with udelay(1) for a register to be cleared. | ||
+ | This is expected behaviour. | ||
==after this task== | ==after this task== | ||
Line 102: | Line 112: | ||
# TASK-PID CPU# |||| TIMESTAMP FUNCTION | # TASK-PID CPU# |||| TIMESTAMP FUNCTION | ||
# | | | |||| | | | # | | | |||| | | | ||
− | |||
kworker/0:3-968 [000] .... 165.014474: __delay: 0 | kworker/0:3-968 [000] .... 165.014474: __delay: 0 | ||
kworker/0:3-968 [000] .... 165.015311: mmc_power_up.part.9: 10 ms | kworker/0:3-968 [000] .... 165.015311: mmc_power_up.part.9: 10 ms | ||
Line 131: | Line 140: | ||
kworker/0:3-968 [000] .... 165.273847: mmc_set_uhs_voltage: 1 ms | kworker/0:3-968 [000] .... 165.273847: mmc_set_uhs_voltage: 1 ms | ||
</pre> | </pre> | ||
+ | Here, we see a ver similar output as before with the exception that the ''__delay'' calls for the 1ms case are also gone. | ||
+ | This means there is no busy looping involved anymore in mmc_delay(). Neither for 10ms nor for 1ms. All delays are now done with either | ||
+ | msleep() or usleep_range() as suggested by the Kernel timer documentation. | ||
=Conclusion= | =Conclusion= | ||
+ | |||
+ | It could be shown that previously calls to ''mmc_delay'' caused busy-looping and burned CPU cycles. | ||
+ | It could also be shown a fix that changes this behaviour to delay mechanisms which do not use busy-looping. | ||
+ | The now involved sleeping methods give the Kernel a chance to use the time slices for another task. |
Revision as of 10:45, 14 November 2017
!DRAFT! This document describes how to test that the Linux MMC core does not use busy-looping in its mmc_delay() helper anymore.
Contents
Setup
Kernel Version and Configuration
Support for this feature is expected to land upstream in v4.16. It is currently available in a topic branch:
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git renesas/topic/mmc-delay-refactor
The following kernel options need to be set in addition to the defconfig:
CONFIG_FTRACE=y
Hardware Environment
Test were done on two boards
- Salvator-X/r8a7796 (R-Car M3-W SoC)
- Lager/r8a7790 (R-Car H2 SoC)
with some random microSD card (Lager) or some random SD card (Salvator-X).
Software Environment
A standard busybox is needed only. Commands used are ifconfig, echo and cat
mmc_delay() behaviour
before this task
Preparations
On the host, to get the base Kernel tree without the MMC core improvements:
$ git checkout renesas/topic/mmc-delay-refactor-without-fix
Testing
# cat /sys/kernel/debug/tracing/trace # tracer: nop # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | kworker/1:1-908 [001] .... 20.646434: __delay: 0 kworker/1:1-908 [001] .... 20.647390: mmc_power_up.part.9: 10 ms kworker/1:1-908 [001] .... 20.666645: mmc_power_up.part.9: 10 ms kworker/1:1-908 [001] .... 20.690323: mmc_go_idle: 1 ms kworker/1:1-908 [001] .... 20.691324: __delay: 8333 kworker/1:1-908 [001] .... 20.691499: mmc_go_idle: 1 ms kworker/1:1-908 [001] .... 20.692500: __delay: 8333 kworker/1:1-908 [001] .... 20.692717: mmc_go_idle: 1 ms kworker/1:1-908 [001] .... 20.693718: __delay: 8333 kworker/1:1-908 [001] .... 20.702239: mmc_go_idle: 1 ms kworker/1:1-908 [001] .... 20.703240: __delay: 8333 kworker/1:1-908 [001] .... 20.703415: mmc_go_idle: 1 ms kworker/1:1-908 [001] .... 20.704415: __delay: 8333 kworker/1:1-908 [001] .... 20.704632: mmc_go_idle: 1 ms kworker/1:1-908 [001] .... 20.705633: __delay: 8333 kworker/1:1-908 [001] .... 20.706582: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.727045: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.747044: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.767043: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.787042: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.807043: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.827042: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.847044: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.867045: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.887043: mmc_send_app_op_cond: 10 ms kworker/1:1-908 [001] .... 20.907365: mmc_set_uhs_voltage: 1 ms kworker/1:1-908 [001] .... 20.908365: __delay: 8333 kworker/1:1-908 [001] .... 20.908665: mmc_host_set_uhs_voltage: 10 ms kworker/1:1-908 [001] .... 20.926626: mmc_set_uhs_voltage: 1 ms kworker/1:1-908 [001] .... 20.927627: __delay: 8333
How to read this trace: Every line containing __delay is a call to udelay/mdelay and means busy-looping. Every line starting with mmc_ is a call to mmc_delay which gets inlined to its caller function, so we see varying names there. We can see that mmc_delay is called with either 10ms or 1ms as an argument. Because HZ=100 in this kernel config, the 10ms calls result in msleep() rather then mdelay(), so it is not busy-looping and we don't see the __delay printout. For 1ms, things are different. We see a call to __delay which means here is now busy-looping involved. Note: The very first __udelay: 0 comes from the SDHI driver while waiting with udelay(1) for a register to be cleared. This is expected behaviour.
after this task
Preparations
Now, the test with the full branch applied:
$ git checkout renesas/topic/mmc-delay-refactor
Testing
# cat /sys/kernel/debug/tracing/trace # tracer: nop # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | kworker/0:3-968 [000] .... 165.014474: __delay: 0 kworker/0:3-968 [000] .... 165.015311: mmc_power_up.part.9: 10 ms kworker/0:3-968 [000] .... 165.028043: mmc_power_up.part.9: 10 ms kworker/0:3-968 [000] .... 165.044455: mmc_go_idle: 1 ms kworker/0:3-968 [000] .... 165.045881: mmc_go_idle: 1 ms kworker/0:3-968 [000] .... 165.047354: mmc_go_idle: 1 ms kworker/0:3-968 [000] .... 165.057109: mmc_go_idle: 1 ms kworker/0:3-968 [000] .... 165.058537: mmc_go_idle: 1 ms kworker/0:3-968 [000] .... 165.060007: mmc_go_idle: 1 ms kworker/0:3-968 [000] .... 165.062206: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.075342: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.088474: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.101607: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.114741: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.127873: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.141006: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.154139: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.167271: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.180404: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.193537: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.206671: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.219803: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.232938: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.246071: mmc_send_app_op_cond: 10 ms kworker/0:3-968 [000] .... 165.259526: mmc_set_uhs_voltage: 1 ms kworker/0:3-968 [000] .... 165.261123: mmc_host_set_uhs_voltage: 10 ms kworker/0:3-968 [000] .... 165.273847: mmc_set_uhs_voltage: 1 ms
Here, we see a ver similar output as before with the exception that the __delay calls for the 1ms case are also gone. This means there is no busy looping involved anymore in mmc_delay(). Neither for 10ms nor for 1ms. All delays are now done with either msleep() or usleep_range() as suggested by the Kernel timer documentation.
Conclusion
It could be shown that previously calls to mmc_delay caused busy-looping and burned CPU cycles. It could also be shown a fix that changes this behaviour to delay mechanisms which do not use busy-looping. The now involved sleeping methods give the Kernel a chance to use the time slices for another task.