Mainline Android logger project

This page is for managing information related to the "Mainline Android Logger" project of the CE Workgroup. This is part of the Android Mainlining Project.

Android logger issues
This section describes some attributes of the Android logger code, which are relevant for mainlining the code into Linux. Let's use a modified SWOT (Strengths, Weaknesses, Opportunities, Threats) analysis for strategic planning to mainline this code.

To find the logger's strengths and weaknesses, lets research and provide information and hard numbers for how it compares with existing logging alternatives.

Alternatives
What are the alternatives to the Android logger?
 * logbuf (the kernel log buffer)
 * apparently, systemd uses this
 * syslog
 * "the journal" - a new logging system for systemd (by Lennart Poettering)
 * see systemd journal for a good discussion of problems with syslog
 * Neil Brown's log-pipe idea
 * I thought this one was interesting. See: Neil Brown's message about this on LKML
 * Arnd Bergmann's tmpfs idea
 * See this message

What are the pros and cons of each system (see feature matrix below)

Existence
Does the system or proposed alternative currently exist?

How much work would it be to create it?

How much work would it be to test it to confirm that it had the same semantics as current Android logger?

Performance
How much overhead does each logging system have?

How long does it take to write 1 million messages, for each of the above systems?

How many context switches are required to log a single message?

How long is each message, on average?

How many data copies are performed for each message?

What is the average time to write each message, from the caller's perspective?

Size
What are the space requirements for each logging system?

what is the code footprint in the kernel?

What is the data footprint in the kernel?

What is the code footprint in user-space?

What is the data footprint in user-space?

Are any other modules are required solely to support logging?

What is the size required in persistent storage?

Can the size be limited (at compile-time?, at boot-time?, at run-time?)

Can the size be adjusted (at boot-time?, at run-time?)

Given default sizes, and expected logging rates, how much time does each system record?

Maintainability
How much does each system need to be maintained?

How many changes have been made to the each system in the last 3 years?

Are any significant changes expected in the future?

Configurability
What things can be configured at compile-time and run-time for each system?

Features
What features does each of the log systems have?

Extension questions
Here are some questions about how the systems could be integrated:


 * What features could each log system have, if extended?
 * Would extension to an existing system interfere with it's current primary purpose?
 * For each change contemplated, can it be implemented to minimize the impact to existing code?

Barriers to entry as is (Weaknesses)
Does logger duplicate functionality that exists elsewhere? Does it make sense to extend an existing system, rather than implement a new system?

What are barriers to entry:
 * use of ioctl?
 * hardcoded number of log channels
 * other style issues?
 * longterm maintainer?

Attractive features of logger as is (Strengths)
What are the differentiating features of logger, compared to alternatives?


 * minimal context switches for logging
 * use of ring buffer automatically limits log size
 * log is in memory (no cost to store unused log messages)
 * all messages are tagged
 * all messages have priority
 * all messages have timestamp
 * user-space program to filter messages by tag
 * channel selection by user-space policy

Discussions
This section has notes about discussions with the community kernel developers and with Google engineers, with regards to mainlining this code

Google engineer requests/questions

 * Google doesn't want to change their class libraries or debug system
 * These have already shipped to developers and are integrated into other tools (eclipse log viewer)
 * the Android system and over 200,000 applications already use the existing classes
 * (So, can changes be made "under the hood" without changing the existing user interface?)


 * Google requests that any changes submitted to mainline also be placed in their kernel repository (or Gerrit review queue), so they can see them when they do merges and are not caught off guard.
 * I'm not sure the detailed steps required to perform this


 * Question: Why did Google write their own code, instead of using syslog?
 * Was it simply expedience, lack of familiarity with syslog, or are there specific reasons they wrote a new system (missing features, etc.)?

Community feedback/input/questions

 * Greg KH - "At worse case, all of the code could go away and the normal logging interface be used, after it would be fixed up to handle the special needs that warranted the creation of the android logger code in the first place."

The advantages of logger over logbuf are:
 * Alessio - In my opinion syslog shouldn't be compared with neither Android's logger nor logbuf because:
 * 1) Buggy syslog implementation (or future changes) could shutdown the entire log infrastructure (logger and logbuf aren't supposed to change a lot in future)
 * 2) Logbuf and logger don't require writeable file system access which is mandatory for syslog. A no-writeable files-system (or a file-system which allow only few writes) is a common situation in embedded world.
 * 3) Logbuf and logger don't require maintenance work (no rotate, compress or purge old log files). A mandatory feature in consumer devices.
 * 1) Logger's logs are shared among a unknown number of user space programs
 * 2) logbuf are writable only by kernel itself
 * 3) Logger's log are separated: An wild user space program couldn't fill up all logs.
 * 4) Write path in logger is optimized over all else

December 2011 LKML discussion
Here is the thread:
 * on lkml.indiana.edu
 * on lkml.org

Notes on specific issues:

[1] seems redundant with logbug/syslog(2)

 * [1] logbuf/syslog(2) code seems to do this now (Greg KH, Kay Sievers)
 * see systemd usage, which appears to coalesce stuff into /dev/kmsg
 * adding warts to logbuf doesn't seem like a great idea, it's already got realtime issues that were never solved
 * but integration with user and kernel messages could be nice
 * integration also requires some oddities
 * do you convert the log message fields (pid and priority) to structured text, and parse them out again later
 * note that systemd added support for priorities and/or tags to logbuf (Kay Seivers) see
 * the ioctl for logger looks a lot like syslog(2), any way to consolidate?
 * you'd have to add channels to syslog(2), if you wanted to have separate channels

[2] hardcoded names and number of logs in user-space and the kernel

 * [2] hardcoded names and number of logs in user-space and the kernel (David Brown, Neil Brown, Andrew Morton)
 * no argument there (Brian Swetland)

[3] logbuf filesystem idea
If you created a 'logbuf' filesystem that used libfs to provide a single directory in which privileged processes could create files then you wouldn't need the kernel to "know" the allowed logs: radio, events, main, system. The size could be set by ftruncate (by privileged used again) rather than being hardcoded.
 * [3] logbuf filesystem idea (Neil Brown)

You would define 'read' and 'write' much like you currently do to create a list of datagrams in a circular buffer and replace the ioctls by more standard interfaces:

LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field LOGGER_GET_LOG_LEN would use 'stat' and the st_size field LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl LOGGER_FLUSH_LOG could use ftruncate


 * this fixes (2) by allowing arbitrary logs created by user space
 * unfortunately, the log channels don't exist until after user space is up a while
 * maybe could hardcode a mechanism for a default system log at startup, with other to follow
 * Greg KH also asks why we need a new interface for this rather than logbuf/syslog(2)

[4] features, requirements and wishlist
consume too much memory systemwide The goals behind the logger driver have been: - keep userland and kernel logging separate (so that spammy userland logging doesn't make us lose critical kernel logs or the other way round) - make log writing very inexpensive -- avoid having to pass messages between processes (more critical on ARM9 platforms where this implied extra cache flushing), avoid having to make several syscalls to write a log message (getting time of day, etc), and so on - make log writing reliable -- don't trust userland to report its timestamp, PID, or to correctly format the datagrams, etc - allow a log watching process (logcat) to easily pull data from all logs at once - avoid committing a vast amount of memory to logging - try to prevent clients from spamming each other out of log space (only successful on a coarse granularity right now with the main/system/radio/events logs) - ensure logs are not lost at the moment an app crashes
 * [4] features, requirements and wishlist (Brian Swetland)
 * avoid hardcoding the names and sizes of the logs
 * adjust permissions so that some apps can only read back their own log messages (security by pid or by channel?)
 * allow logs to survive reboot
 * allow an app to create it's own log, up to a specified size limit, but in a way that wouldn't

[6] separate log channels are not needed

 * [6] separate log channels are not needed or desirable (Kay Sievers)
 * for early boot and debug, you want interleaving

[7] write cost is not a problem

 * [7] write cost is not a problem (Kay Sievers, David Lang)
 * I have no numbers, but suspect there is no problem

[8] ttyprintk driver

 * [8] userspace printk tty (ttyprintk) driver (Greg KH, Kay Sievers mentioned it)
 * I never heard of it - need to do some research I guess
 * see linux/drivers/char/ttyprintk.c
 * Feature was added in August 2010 (kernel version 2.6.36?)
 * Results: ttyprintk allows user-space to submit messages to the kernel log buffer
 * ttyprintk features:
 * convert CR to LF,
 * adds '[U]' tag in front of each message
 * splits long messages into fragments no longer than 508 bytes
 * (How is this different from /dev/kmsg???)
 * purpose appears to be to automatically grab messages from programs stdout that would normally be written to the system console
 * see thread about "console logging detour via printk" here: http://lkml.indiana.edu/hypermail/linux/kernel/1005.0/00009.html
 * I'm not sure how this is installed. (maybe with console=ttyprintk on the command line?)

Alan Cox said (explaining why a user-space solution wasn't sufficient): On a lot of embedded systems you don't have all the stuff Fedora carts around. No modules, initrds, magic front end processes, graphical startup daemons etc, all of which work to produce that feature IFF you have pty support in your kernel, and for the current code also glibc. Here is some sample code which should cause console messages to go to the printk log buffer: /*Send console output through ttyprintk.*/ if ((fd = open("/dev/ttyprintk", O_WRONLY)) > 0) { ioctl(fd, TIOCCONS, NULL); close(fd); }
 * See http://www.kerneltrap.org/mailarchive/linux-kernel/2010/8/25/4611115 for Alan Cox's justification for this, compared to systemd on Fedora 14, or Kay Seiver's 'exec /dev/kmsg 2>&1'
 * CAUTION: 'exec /dev/kmsg 2>&1' crashes the Xserver on my Fedora 14 system (frowand)
 * answering the question - what's different about this from /dev/kmsg
 * also: http://www.kerneltrap.org/mailarchive/linux-kernel/2010/8/25/4611254
 * A user-space program can do this message transfer (see Kay's thing above), but /dev/kmsg can't be a console because it's missing some ioctls that are expected for a tty device
 * this appears to be targetted at deeply embedded, and solely at interleaving klog and console messages (not separate channels like logger)
 * How to use /dev/ttyprintk:
 * use 'setconsole /dev/ttyprintk < /dev/console' (see man setconsole(8))
 * write your own program to redirect this with a tty ioctl (see man tty_ioctl(4), section TIOCCONS)

[9] no need to switch to something new
Having a bunch of people suggest how we can just rewrite our userspace to get something similar but not really the same as the functionality we already have registers as "big waste of our time."
 * [9] Google doesn't see a need to invest in switching (Brian Swetland)

Replacing a tiny, self-contained, and reliable driver that's worked for us for five-ish years with a pile of userspace surgery is uninteresting to me.

[10] Kay Siever's wishlist

 * [10] Kay Siever's wishlist - kmsg with structured data (Kay Sievers - who else?)
 * supports structured data, that can be used with current tools
 * each record has timestamp, log facility, log level, string (no pid)
 * timestamp converted on output (not saved as text in log)
 * record dictionary (key=value pairs)
 * single channel
 * systemd has ASCII-like stream format for structured data
 * on top of existing logbuf code

[11] avoid malicious log overruns

 * [11] avoid log spam (Brian Swetland)
 * logger could use separate channel per pid
 * systemd journal maintains separate journal per uid (Kay)
 * rate-limiting per uid or gid, is one way to handle this
 * this adds complexity, but consumes less space than having per-process logs

[12] per-process log security

 * [12] allow for eventual per-process log security (Brian Swetland)
 * logger could use multiple channels
 * systemd journal maintains separate journal per uid (Kay)

[13] code cleanups for current driver

 * [13] Specific code fixes in current driver (Andrew Morton, see https://lkml.org/lkml/2011/12/28/165)
 * [13.1] ** MAINLINED in 3.4 - logger_offset macro depends on local 'log' variable
 * [13.2] ** MAINLINED in 3.4 - should document endian-ness of numeric values in buffer
 * [13.3] ** MAINLINED in 3.4 - do get_entry_len more neatly (possibly using packed_struct.h)
 * [13.4] ** FIXED in 3.4 - use get_unaligned instead of current get_entry_len code??
 * [13.5] ** MAINLINED in 3.4 - change order of prepare_to_wait and mutex_lock lines in logger_read
 * [13.6] ** MAINLINED in 3.4 - make clock_interval function more obvious
 * [13.7] don't make clock_interval explicitly inline (gcc should figure it out)
 * [13.8] ** MAINLINED in 3.4 - in do_write_log_from_user, update log->w_off ??
 * [13.9] resolve pid, tgid confusion
 * [13.10] consider access control for logs, especially in container context
 * [13.11] ** MAINLINED in 3.4 - does reader->list need to be locked, in logger_release?
 * [13.12] get rid of ioctl?
 * [13.13] get rid of hardcoded minor names in get_log_from_minor

Project task list

 * Get needed hardware
 * Make sure contractors have all required source code:
 * AOSP, Android kernels, and logger code
 * Identify and review code before submission
 * identify possible mainline objections to code, and plan how to respond
 * compare with alternatives
 * compare and contrast logger code with kernel log buffer (e.g. dmesg vs. logcat)
 * performance, features, size,
 * compare and contrast logger code with syslog (syslogd, /var/log)
 * Submit code to LKML
 * Respond to feedback received
 * Add logger code to busybox?
 * Should this be done first?
 * log
 * logwrapper
 * logsync
 * What about logd_write.c code
 * included in bionic - should be included in busybox directly, because it's missing in glibc?
 * what about udev?
 * should mainline Linux automatically create logger devices nodes?
 * they are created by 'init' in Android systems

Volunteers

 * Sisir Koppaka

History
The proposals that led to this project are at:
 * First proposal: Mainline Android kernel functionality
 * Second proposal (more focused on just the logger): Mainline Android logger