Work on performance utilities for ARM/MIPS


 * Summary: Work on performance utilities for ARM/MIPS


 * Proposer: Holger Hans Peter Freyther

Description
Linux has gained great performance utilities in recent years. These include OProfile, SystemTap with utrace, and perf for CPU usage analysis. For memory analysis, GNU Libc provides the LD_PRELOADable memusagestat, and GNOME has the memprof utility for looking into allocations.

The story for ARM and MIPS is not as good. In general, the tools that need a backtrace from userspace to be really useful (memprof, OProfile, perf) do not work well, because GCC with -O2 compiles without frame pointers. The OProfile backtrace support for ARM requires frame pointers in userspace, and MIPS has no OProfile backtrace support at all. Most distributions (Poky/Yocto, Debian, Ubuntu, Fedora, MeeGo) build their software without frame pointers, which makes good performance analysis painful because the software has to be recompiled first.

There are multiple work items here that could be of interest:


 * Make utrace work on ARM by implementing user_regset support and provide integration with Poky/Yocto.
 * Enable backtrace generation without requiring frame pointers by looking at the stack and finding old program counters.
 * Provide OProfile backtrace support on MIPS. The first version would rely on the frame pointer; a later version would use the same stack-scanning approach as on ARM.
 * Allow GNOME memprof to run over a socket and provide a client/server interface to remotely analyze an application.
 * Add ARM backtrace support to GNOME memprof.
 * Add MIPS backtrace support to GNOME memprof.
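
The stack-scanning idea from the list above can be sketched in a few lines. This is only a userspace simulation under stated assumptions, not kernel code and not what libunwind or gdb do: the names `scan_stack`, `is_arm_bl`, `text_ranges` and `read_insn` are hypothetical, the code assumes 32-bit ARM (A32) instructions, and it accepts a stack word as a return address only if it points into executable text and the instruction just before it is a BL. A real heuristic would need more checks (Thumb code, BLX, indirect calls).

```python
from typing import Callable, Iterable, List, Optional, Tuple


def is_arm_bl(insn: int) -> bool:
    """True for an A32 BL (branch with link): bits 27..24 are 0b1011."""
    return (insn & 0x0F000000) == 0x0B000000


def in_text(addr: int, text_ranges: Iterable[Tuple[int, int]]) -> bool:
    """True if addr lies inside one of the [start, end) executable ranges."""
    return any(start <= addr < end for start, end in text_ranges)


def scan_stack(
    stack_words: Iterable[int],
    text_ranges: List[Tuple[int, int]],
    read_insn: Callable[[int], Optional[int]],
    max_frames: int = 16,
) -> List[int]:
    """Collect stack words that look like saved return addresses.

    stack_words is ordered from the current stack pointer upwards.
    read_insn(addr) returns the 32-bit instruction at addr, or None.
    """
    frames: List[int] = []
    for word in stack_words:
        if len(frames) >= max_frames:
            break
        # A32 instructions are 4-byte aligned.
        if word % 4 != 0:
            continue
        # Both the candidate and the call site before it must be code.
        if not in_text(word, text_ranges) or not in_text(word - 4, text_ranges):
            continue
        insn = read_insn(word - 4)
        if insn is not None and is_arm_bl(insn):
            frames.append(word)
    return frames
```

The check on the preceding instruction is what keeps stale code pointers and function pointers stored on the stack from being reported as frames; without it the scan would return every word that happens to point into text.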

Scope
[utrace]
 * The ptrace/user_regset implementation and testing (checking that ptrace and gdb still work, especially for VFP, iWMMXt and other co-processors) should take about two weeks. After this, the utrace patch should be selectable for ARM.
 * There is likely another week for review and for reworking the patch for upstream inclusion.
 * Adding SystemTap/utrace to the Yocto kernel and the Poky build system is less than a day of work.

[ARM oprofile]
 * A heuristic needs to be implemented to detect the previous frames. One can look at libunwind and gdb to see their criteria for the heuristic and implement it for the kernel. A working patch ready for upstream should be achievable within 10 days; checking whether such a heuristic is feasible in the kernel should take less than a day.

[MIPS oprofile]
 * The first part, adding basic backtrace support, should be done within three days of work. The heuristic (the same approach as on ARM) might be more difficult to get right because of LE/BE systems and different ABIs.
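
To illustrate why byte order and ABI matter for the frame-pointer based first version, here is a minimal sketch of a frame-pointer chain walk over a raw memory snapshot. Everything in it is an assumption made for illustration: the function names are hypothetical, and the frame layout (saved frame pointer at `fp`, return address one word above it) is a simplified generic layout, not the actual MIPS or ARM ABI layout, which differs per ABI and compiler.

```python
from typing import List, Optional


def read_word(mem: bytes, base: int, addr: int,
              word_size: int, byteorder: str) -> Optional[int]:
    """Read one word at addr from a snapshot that starts at address base."""
    off = addr - base
    if off < 0 or off + word_size > len(mem):
        return None
    return int.from_bytes(mem[off:off + word_size], byteorder)


def walk_frame_pointers(mem: bytes, base: int, fp: int,
                        word_size: int = 4, byteorder: str = "little",
                        max_frames: int = 32) -> List[int]:
    """Follow the saved-fp chain and return the return addresses found.

    Assumed (hypothetical) layout: [fp] holds the caller's fp and
    [fp + word_size] holds the return address.
    """
    pcs: List[int] = []
    while len(pcs) < max_frames:
        saved_fp = read_word(mem, base, fp, word_size, byteorder)
        ret = read_word(mem, base, fp + word_size, word_size, byteorder)
        if saved_fp is None or ret is None:
            break
        pcs.append(ret)
        # The stack grows down, so a caller's frame sits at a higher
        # address; anything else means end of chain or corruption.
        if saved_fp <= fp:
            break
        fp = saved_fp
    return pcs
```

The same walk runs unchanged on a big-endian snapshot by passing `byteorder="big"`, and on a 64-bit ABI by passing `word_size=8`; the frame layout itself is the part that would have to be written per ABI.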

[memprof]
 * Client/Server work should take less than a week.
 * MIPS and ARM backtrace support should not take more than a day each.
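
As a rough idea of what the client/server split could look like on the wire, the sketch below encodes one allocation event as a fixed header followed by the backtrace. This is a hypothetical format invented for illustration, not memprof's existing protocol: the record layout, the `ALLOC`/`FREE` constants and the function names are all assumptions. Network byte order is used so that a little-endian or big-endian target and the analysis host agree on the encoding.

```python
import struct
from typing import List, Tuple

# Hypothetical record header: kind (1 byte), address (8), size (8),
# number of backtrace frames (2), all in network byte order.
HEADER = struct.Struct("!BQQH")

ALLOC, FREE = 1, 2  # hypothetical event kinds

Event = Tuple[int, int, int, List[int]]


def encode_event(kind: int, address: int, size: int,
                 backtrace: List[int]) -> bytes:
    """Serialize one event: header followed by one 64-bit word per frame."""
    header = HEADER.pack(kind, address, size, len(backtrace))
    frames = struct.pack("!%dQ" % len(backtrace), *backtrace)
    return header + frames


def decode_event(data: bytes) -> Tuple[Event, int]:
    """Parse one event from the start of data.

    Returns the event and the number of bytes consumed, so a reader can
    advance through a stream of concatenated events.
    """
    kind, address, size, nframes = HEADER.unpack_from(data, 0)
    frames = struct.unpack_from("!%dQ" % nframes, data, HEADER.size)
    consumed = HEADER.size + 8 * nframes
    return (kind, address, size, list(frames)), consumed
```

On the target, the preloaded library would write `encode_event(...)` records to a connected socket; the host side would read the stream and call `decode_event` repeatedly, using the consumed byte count to find the next record.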

Contractor Candidates
I would be interested in working on some/all of these items myself.

Related work

 * SystemTap    - http://sourceware.org/systemtap/
 * OProfile     - http://oprofile.sf.net
 * GNOME memprof - http://git.gnome.org/browse/memprof/
 * utrace/ARM  - http://osdir.com/ml/linux.kernel.utrace/2008-03/msg00012.html