Difference between revisions of "OMAP Power Management"

From eLinux.org
(omap_hwmod, omap_device conversion)
(Device PM control via omap_device + omap_hwmod)
Line 280: Line 280:
* LWN article: http://lwn.net/Articles/347573/
* Kevin Hilman's talk from ELC 2010 in San Francisco: http://elinux.org/images/0/08/ELC-2010-Hilman-Runtime-PM.pdf
* Key features
Line 289: Line 290:
** implement platform specific runtime PM hooks for OMAP  
** runtime PM API used by '''all''' OMAP drivers
** Explore platform_bus notifiers for automatic init/idle/suspend of devices
* Current status
** (add overview of API from driver perspective)
** Proposed runtime PM implementation for OMAP available in <tt>pm-wip/runtime</tt> branch of Kevin's linux-omap-pm git tree.
=== Public Power management test framework ===

Revision as of 12:55, 28 June 2010

PM branch

The PM branch is a development branch of the linux-omap kernel, used to develop and stabilize the PM infrastructure for OMAP and to submit it upstream.

The maintainer of the PM branch is Kevin Hilman.


  • full-chip retention in idle and suspend
  • full-chip OFF in idle and suspend
  • idle PM via CPUidle
  • active PM via DVFS using CPUfreq
  • support for multiple OMAP3 boards

The latest, tested PM branch is available as a branch named 'pm' from the linux-omap-pm repository. This branch is also sync'd daily as the 'pm' branch of the main linux-omap repository.

OMAP PM today.png

Current version

Released: 25 June 2010

Important recent changes

  • rebased to latest omap/master (currently based on 2.6.35-rc2)
  • using completely re-written SmartReflex/Voltage layer from Thara
    • (some missing PMIC functionality compared to previous base)

Supported platforms (OMAP3 only)

The following platforms were tested using omap3_pm_defconfig with a busybox-based initramfs. Full-chip RET and OFF were tested in both idle and suspend:

  • 3430SDP
  • Beagle
  • Overo (Water + Tobi)
  • Nokia N900 (a.k.a. RX51)
  • Zoom2
  • KwikByte KBOC

Using the PM branch


By default, the OMAP is configured to hit full-chip retention in suspend.

# echo mem > /sys/power/state

Serial console activity or other configured wakeup sources (keypad, touchscreen) will trigger resume.

Optionally you can use a wake-up timer:

# echo '15' > /debug/pm_debug/wakeup_timer_seconds

Upon resume, you can use the powerdomain state statistics to check whether all powerdomains hit the desired state (cf. 'Debug info'):

# cat /debug/pm_debug/count

In addition, if any powerdomains did not hit the desired state, you will see a message on the console.
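
The suspend steps above can be strung together in one small helper. This is only a sketch and not part of the PM branch: the function name pm_suspend_test and the PM_DEBUG and PM_STATE variables are made up for illustration, and they default to the paths used on this page (debugfs mounted on /debug).

```shell
#!/bin/sh
# Hypothetical helper, not part of the PM branch.
# PM_DEBUG and PM_STATE are illustrative variables that default to the
# paths used on this page; override them if debugfs is mounted elsewhere.
PM_DEBUG="${PM_DEBUG:-/debug/pm_debug}"
PM_STATE="${PM_STATE:-/sys/power/state}"

# pm_suspend_test [seconds]
# Arm the wake-up timer, suspend, then dump the powerdomain statistics.
pm_suspend_test() {
    seconds="${1:-15}"
    # Arm the wake-up timer so the board resumes without user input.
    echo "$seconds" > "$PM_DEBUG/wakeup_timer_seconds" || return 1
    # Request suspend; on real hardware this write returns after resume.
    echo mem > "$PM_STATE" || return 1
    # Show the statistics so the powerdomain states can be checked.
    cat "$PM_DEBUG/count"
}
```

After sourcing the file, running "pm_suspend_test 20" as root suspends the board with a 20 second wake-up timer and prints the statistics once it resumes.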

Enabling system for hitting retention during idle

By default, the kernel will not try to hit retention or off while idle. To enable idle path to attempt deeper sleep states:

# echo 1 > /debug/pm_debug/sleep_while_idle

Then, wait for any inactivity timers to expire (such as the 5 second UART timer) and check the powerdomain transition statistics to see that transitions are happening:

# cat /debug/pm_debug/count

Enabling system for hitting OFF

By default, retention is the deepest sleep state attempted. To enable powerdomain transitions to off mode:

# echo 1 > /debug/pm_debug/enable_off_mode

In addition, to enable VDD1 and VDD2 to hit 0V:

# echo 1 > /debug/pm_debug/voltage_off_while_idle

Once again, after a suspend or after some idle time, use the powerdomain transition stats to check that transitions to off-mode are happening:

 # cat /debug/pm_debug/count
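
For repeated testing, the three debugfs switches described above can be set in one go. The following is a minimal sketch under the same assumptions as the commands on this page (debugfs mounted on /debug); the function name pm_enable_deep_idle and the PM_DEBUG variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of the PM branch.
# PM_DEBUG is an illustrative variable; it defaults to the path used on this page.
PM_DEBUG="${PM_DEBUG:-/debug/pm_debug}"

# Write 1 to each pm_debug switch named on this page:
# deeper sleep in idle, off mode, and 0V on VDD1/VDD2.
# Stops at the first write that fails.
pm_enable_deep_idle() {
    for knob in sleep_while_idle enable_off_mode voltage_off_while_idle; do
        echo 1 > "$PM_DEBUG/$knob" || return 1
    done
}
```

The switches are written in the order they are introduced above, so a failure on a later one leaves the earlier, shallower settings in place.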

DVFS: Dynamic Voltage and Frequency Scaling

By default, no DVFS transitions will occur because the CPUfreq 'userspace' governor is the default governor. This means that any DVFS transition must be triggered manually, either by a userspace application or through the CPUfreq sysfs interface (cf. 'CPUfreq kernel interface'). The 'ondemand' governor enables DVFS transitions based on CPU load.

Usage of the CPUfreq utils:

# cpufreq-info

Shows the current CPUfreq info: current governor, possible OPPs, current OPP ...

To change the current governor to e.g. 'userspace' or 'ondemand':

# cpufreq-set -g userspace
# cpufreq-set -g ondemand

Note: the corresponding governor support must be compiled in the kernel or as a module.

To change the frequency (with 'userspace' as the current governor):

# cpufreq-set -f 550000

The frequency is in kHz, as shown by cpufreq-info.
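
Since OPPs are usually quoted in MHz while cpufreq-set takes kHz, a small wrapper can do the conversion and select the 'userspace' governor first. This is an illustrative sketch only: set_freq_mhz and the CPUFREQ_SET variable are made-up names, and the wrapper uses nothing beyond the -g and -f options shown above.

```shell
#!/bin/sh
# Hypothetical wrapper around cpufreq-set.
# CPUFREQ_SET is an illustrative variable naming the command to run.
CPUFREQ_SET="${CPUFREQ_SET:-cpufreq-set}"

# set_freq_mhz <MHz>
# Switch to the 'userspace' governor, then request the frequency in kHz.
set_freq_mhz() {
    khz=$(( $1 * 1000 ))
    "$CPUFREQ_SET" -g userspace || return 1
    "$CPUFREQ_SET" -f "$khz"
}
```

For example, "set_freq_mhz 550" is equivalent to the "cpufreq-set -f 550000" command above, preceded by the governor change.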

Known Problems

  • Zoom2: serial console wakeups not working
    • Problem: on suspend, by default the serial driver will disable serial interrupts, thus disabling the GPIO IRQ needed for wakeup.
    • Fix: enable the wakeup feature for the tty used as console:
 # echo enabled > /sys/devices/platform/serial8250.3/tty/ttyS3/power/wakeup 
  • Root filesystem on MMC leads to crash when using off-mode.
    • There is currently no support for off-mode in the MMC driver.
  • GPIO module-level wakeups not always working
    • Background: GPIO wakeups can happen either via the GPIO module itself (module-level wakeups) or via IO pad wakeups if the CORE powerdomain is inactive, in retention or off.
    • If the IO pad wakeups are not enabled (either because CORE remains on, or because IO pad is not armed) GPIO wakeups may not happen unless the GPIO module-level wakeups are programmed correctly.
    • To ensure GPIO module wakeups are programmed correctly:
      • Enable GPIO IRQ for wakeup GPIO, including ISR. Use request_irq()
      • Ensure GPIO is edge-triggered. Only edge triggered GPIOs are wakeup capable (c.f. omap34xx TRM Sec.
        • the flags argument of request_irq() should have either IRQF_TRIGGER_FALLING, IRQF_TRIGGER_RISING or both.
      • Enable GPIO IRQ as wakeup source using enable_irq_wake(gpio_to_irq(<gpio>))
    • NOTE: It is very important that an interrupt handler be configured for the GPIO IRQ, even if it does nothing but return IRQ_HANDLED. This is because without an interrupt handler, the GPIO IRQ event will never be properly cleared and this can prevent the GPIO module from hitting retention or off on the next idle request (c.f. omap34xx TRM Sec.
  • GPIO wakeup works once, but prevents future retention
    • See NOTE just above

Advanced features for PM developers and power users

Debug info

First, mount the debug filesystem (debugfs)

# mount -t debugfs debugfs /debug

Show powerdomain state statistics and clockdomain active clocks

# cat /debug/pm_debug/count

Dump current PRCM registers

# cat /debug/pm_debug/registers/current

Dump PRCM register snapshot taken just before suspend (just before jump into SRAM idle code)

# cat /debug/pm_debug/registers/1

Dump PRCM register snapshot taken immediately after resume

# cat /debug/pm_debug/registers/2
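
Comparing the two snapshots shows which PRCM registers changed across a suspend/resume cycle. The sketch below simply runs diff on the two files; the function name pm_diff_snapshots and the PM_DEBUG variable are hypothetical, and nothing is assumed about the format of the register dumps.

```shell
#!/bin/sh
# Hypothetical helper, not part of the PM branch.
# PM_DEBUG is an illustrative variable; it defaults to the path used on this page.
PM_DEBUG="${PM_DEBUG:-/debug/pm_debug}"

# Compare the pre-suspend snapshot (1) with the post-resume snapshot (2).
# The exit status is that of diff: 0 if identical, 1 if they differ.
pm_diff_snapshots() {
    diff "$PM_DEBUG/registers/1" "$PM_DEBUG/registers/2"
}
```

Run it after a resume; lines that appear in the output are registers whose values differ between the two snapshots.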

UART wakeup and timeout options

By default, each of the on-chip OMAP UARTs is enabled as a wakeup source. In addition, each UART has a configurable inactivity timer (default 5 seconds) after which the UART clocks are allowed to be gated during idle or suspend.

For example, to disable the wakeup capability of UART1 (a.k.a. ttyS0):

 # echo disabled > /sys/devices/platform/serial8250.0/power/wakeup

And to change the inactivity timer to 10 seconds, instead of the default 5:

 # echo 10 > /sys/devices/platform/serial8250.0/sleep_timeout 

Note that you can cat these files under /sys as well to see the current values.
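
Both UART settings can be applied and read back together. This is a sketch that assumes the serial8250.N sysfs layout shown above; uart_pm_config and SYSFS_PLATFORM are hypothetical names introduced for the example.

```shell
#!/bin/sh
# Hypothetical helper, not part of the PM branch.
# SYSFS_PLATFORM is an illustrative variable; it defaults to the path used on this page.
SYSFS_PLATFORM="${SYSFS_PLATFORM:-/sys/devices/platform}"

# uart_pm_config <index> <enabled|disabled> <timeout-seconds>
# Set the wakeup capability and inactivity timer of serial8250.<index>,
# then print both values as read back from sysfs.
uart_pm_config() {
    dev="$SYSFS_PLATFORM/serial8250.$1"
    case "$2" in
        enabled|disabled) ;;
        *) echo "wakeup must be 'enabled' or 'disabled'" >&2; return 1 ;;
    esac
    echo "$2" > "$dev/power/wakeup" || return 1
    echo "$3" > "$dev/sleep_timeout" || return 1
    echo "wakeup=$(cat "$dev/power/wakeup") sleep_timeout=$(cat "$dev/sleep_timeout")"
}
```

For example, "uart_pm_config 0 disabled 10" applies the two settings from the commands above to UART1.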

UART PM Debugging Techniques

Debugging problems with OMAP UART driver wakeup and data transfer when Power Management is enabled can be quite tedious without a proper HW setup. An example of a setup (including both HW and SW changes) can be found in the OMAP_UART_pm_debugging page.

Misc. options

OPPs control

NOTE: OPP control via sysfs has been removed. Please use CPUfreq interfaces for DVFS.

SmartReflex control

NOTE: detailed information on SmartReflex can be found HERE

Enables SmartReflex autocompensation on VDD1 (Note: this feature can only be tested on ES3.1 silicon)

# echo 1 > /sys/power/sr_vdd1_autocomp

Disables SmartReflex autocompensation on VDD1

# echo 0 > /sys/power/sr_vdd1_autocomp

Enables SmartReflex autocompensation on VDD2 (Note: this feature can only be tested on ES3.1 silicon)

# echo 1 > /sys/power/sr_vdd2_autocomp

Disables SmartReflex autocompensation on VDD2

# echo 0 > /sys/power/sr_vdd2_autocomp

CPUfreq kernel interface

Although the cpufreq utils are the preferred way to use the DVFS feature, the cpufreq kernel interface has some more information available. The main entry point is in /sys/devices/system/cpu/cpu0/cpufreq.

To list the available governors:

# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors

To list the available frequencies:

# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies

To show the current frequency:

# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq

To change the current governor:

# echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

The 'stats' directory has info about the cpufreq transitions and the time spent in the various OPPs:

# cat /sys/devices/system/cpu/cpu0/cpufreq/stats/total_trans
# cat /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
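
The raw time_in_state counters are easier to read as percentages. The sketch below assumes the usual cpufreq-stats layout of one "<frequency in kHz> <time>" pair per line; only ratios are computed, so the time unit does not matter. The function name cpufreq_residency and the CPUFREQ_DIR variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of the PM branch.
# CPUFREQ_DIR is an illustrative variable; it defaults to the path used on this page.
CPUFREQ_DIR="${CPUFREQ_DIR:-/sys/devices/system/cpu/cpu0/cpufreq}"

# Print each frequency with its share of the total time spent in all OPPs.
# Assumes each time_in_state line is "<frequency in kHz> <time>".
cpufreq_residency() {
    awk '{ freq[NR] = $1; t[NR] = $2; total += $2 }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%d kHz %.1f%%\n", freq[i], (total > 0 ? 100 * t[i] / total : 0)
         }' "$CPUFREQ_DIR/stats/time_in_state"
}
```

Running it before and after a workload, for instance with the 'ondemand' governor selected, gives a quick view of how the time spent in each OPP shifts.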

PM code in Mainline

Here's a very crude outline of my plans for getting PM branch code to mainline. Unless otherwise stated, this has only been tested on OMAP3, although some care has been taken to not break OMAP2 in the process.

Currently in mainline (2.6.31)

  • clock framework and infrastructure
  • clockdomain and powerdomain core
  • full-chip retention in suspend
  • full-chip retention in idle

Merged in 2.6.32

  • misc. PM driver updates
    • SPI
  • PM debug infrastructure
  • OMAP PM layer
  • omap_hwmod/omap_device
  • twl4030 power support

Merged in 2.6.33

  • off-mode support
    • context save/restore support
  • CPUidle support
    • including off-mode C-states
  • Drivers
    • I2C driver off-mode support: re-init every transaction

Merged in 2.6.34

  • Large set of fixes from Nokia and others
  • core GPIO PM support
  • Misc. fixes

What's left in PM branch

The remaining parts in the PM branch are not ready for upstream in their current form, whether because of code quality, basic design problems, or because they are not ready to scale for OMAP3430 and OMAP4.

  • GPIO off-mode support (waiting GPIO hwmod conversion to rework and push upstream)
  • OPP layer
  • SRF
  • DVFS, CPUfreq
  • SmartReflex driver core (deps: OMAP PM layer, OPP layer)

Dropped (or to be removed) from PM branch

  • debug observability (debobs): has been dropped from PM branch for 2.6.33/34. It needs mux updates and misc. cleanups before going upstream. Since there are no further dependencies on PM branch, a new branch 'pm-debobs' has been created based on mainline where the author (or anyone interested) can do necessary cleanups.

OMAP PM future.png

Future directions

What's next is to get the remaining functionality of the PM branch into mainline. As pointed out above in the 'What's left in PM branch' section, those parts are not yet ready for mainline. To that end, the goal of this section is to lay out a rough plan of how to get those features done in a way that can be submitted upstream.

  • 3630, OMAP4 support
  • new OPP layer
  • device PM control

Device PM control via omap_device + omap_hwmod

Currently, device drivers do power management in a rather ad-hoc way: they use the clock framework API directly, in combination with manually setting device-specific PM registers (e.g. SYSCONFIG for various idle setting bits).

The goal of new device PM control is to have a standard, common, portable interface for device drivers to control PM. From a driver API point of view, there is a new single API: the Run-time PM API. Internally to the OMAP PM core, the implementation of the runtime PM API will use the new omap_device and omap_hwmod layers to implement device PM.

omap_hwmod, omap_device conversion

An important building block in converting to a common framework (runtime PM) for device PM is a common framework for all on-chip hardware blocks. This is available as the omap_device and omap_hwmod layers. These layers provide an abstraction so that all hardware IP blocks can be controlled using the same API. The runtime PM layer is then implemented as a thin layer on top of the omap_device API.

This implies that in order to have runtime PM support for a device, the underlying HW IP must be represented by an omap_hwmod and have a corresponding omap_device built for it. Then, using the runtime PM API from the driver will result in omap_device API calls to control the IP.

Run-time PM

Run-time PM is a recent development in the upstream kernel community. It provides an architecture-independent framework for doing runtime power management of IO devices. It also extends the platform_bus/platform_device infrastructure to allow arch-specific extensions of the platform_device.

  • Key features
    • architecture independent
    • only a framework, does nothing without platform specific hooks
  • Plans for use in linux-omap
    • OMAP-specific extension of platform_device: contains an omap_device
    • implement platform specific runtime PM hooks for OMAP
    • runtime PM API used by all OMAP drivers
  • Current status
    • Proposed runtime PM implementation for OMAP available in pm-wip/runtime branch of Kevin's linux-omap-pm git tree.

Public Power management test framework

Some commonly used power management utilities that make sense from an OMAP perspective are listed here.

Cpufreq utils

cpufreq utils for testing dynamic voltage and frequency scaling.

Maemo pm_test

pm-test plugin for Maemo says

 utility which tests that kernel and kernel modules works power management wise

This utility could be used to sanity test the power management impact on a system for suspend/restore and basic power features.