Fast Kernel Decompression

This page has notes about faster kernel decompression.

Description
Currently, the method used to compress the kernel is gzip. However, other compression and decompression methods exist which may allow improvements in kernel decompression (and hence startup) performance.

This page documents Sony's investigation of UCL compression/decompression performance, for possible use in speeding up bootup time on an embedded device. In our testing UCL decompressed a sample file system image 43% faster than gunzip, and a sample kernel image 28% faster than gunzip.

The UCL web page states:
 * UCL is written in ANSI C. Both the source code and the compressed data format are designed to be portable across platforms.
 * UCL implements a number of algorithms with the following features:
   * Decompression is simple and *very* fast.
   * Requires no memory for decompression.
   * The decompressors can be squeezed into less than 200 bytes of code.
   * Focuses on compression levels for generating pre-compressed data which achieve a quite competitive compression ratio.
   * Allows you to dial up extra compression at a speed cost in the compressor. The speed of the decompressor is not reduced.
   * Algorithm is thread safe.
   * Algorithm is lossless.
 * UCL supports in-place decompression.
 * UCL and the UCL algorithms and implementations are distributed under the terms of the GNU General Public License (GPL). Special licenses for commercial and other applications are available by contacting the author.

Another method of speeding up the kernel load phase of bootup is to use DMA Copy Of Kernel On Startup.

How to implement or use
Get UCL from the following URL and use the sample command "uclpack":

http://www.oberhumer.com/opensource/ucl/download/ucl-1.03.tar.gz

Untar the file, build it, and use the sample command "uclpack", located at ucl-1.03/examples/uclpack in the untarred source tree.
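A sketch of those steps, assuming a standard configure-and-make build of the 1.03 tarball (the uclpack flags shown, -9 for the highest standard compression level and -d for decompression, are taken from its usage text; verify them against your copy of the source):

```shell
# Fetch and unpack the UCL 1.03 source (URL from above)
wget http://www.oberhumer.com/opensource/ucl/download/ucl-1.03.tar.gz
tar xzf ucl-1.03.tar.gz
cd ucl-1.03

# Configure and build; the example programs (including uclpack) are built as well
./configure
make

# Compress a kernel image, then decompress it and verify the round trip
examples/uclpack -9 vmlinux vmlinux.ucl
examples/uclpack -d vmlinux.ucl vmlinux.out
cmp vmlinux vmlinux.out
```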

Expected Improvement
The case study below shows the performance improvement from decompressing a sample file system image and a sample kernel image with UCL instead of gzip.

Resources

 * UCL can be obtained at: http://www.oberhumer.com/opensource/ucl/

Projects

 * lzop ( http://www.lzop.org/ ; Wikipedia: lzop ) uses the miniLZO implementation of the Lempel-Ziv-Oberhumer ( Wikipedia: LZO ) algorithm. It has a reputation for extremely fast decompression and a tiny decompressor, at the cost of larger compressed files; it is reported to decompress faster and compress better than Lempel-Ziv Ross Williams ( Wikipedia: LZRW ), and to be faster than memcpy on some machines.
 * UPX, the Ultimate Packer for eXecutables ( Wikipedia: UPX ; http://upx.sf.net/ ), uses the UCL algorithm.
 * gzip ( Wikipedia: gzip )
 * bzip2 ( Wikipedia: bzip2 ; http://www.bzip.org/ ) has the reputation of giving smaller compressed files and about the same decompression time as gzip (but longer compression times)

[Are there other compressors with better decompression performance than gzip?]

Case 1
For this case study, we compiled both uclpack and gzip for the PowerPC platform. We then ran the programs on the target platform, compressing and decompressing two file images: an initrd file system image and a Linux kernel image (originally uncompressed).

The size and performance results from running these commands are in the tables below.
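As an illustration of the measurement method (not the original benchmark), the sketch below compresses a generated sample file with gzip and times the decompression step with the shell's time keyword; once uclpack is built, the analogous commands are "uclpack -9" and "uclpack -d". The file name and size here are placeholders, not the images used in this study:

```shell
# Generate a compressible sample file (placeholder for the initrd or kernel image)
seq 1 100000 > sample.img

# Compress at the maximum level, keeping the original for later comparison
gzip -9 -c sample.img > sample.img.gz

# Time decompression -- this is the figure the case study compares across tools
time gzip -d -c sample.img.gz > sample.out

# Verify the round trip
cmp sample.img sample.out && echo "round trip OK"
```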


 * Hardware: PPC440GP, 300 MHz
 * Kernel version: the Linux kernel running on the target was 2.6.11; the kernel image that was compressed was Linux 2.4.20
 * Configuration: see the results tables for the parameters passed to gzip and uclpack
 * Time without change: [put that here]
 * Time with change: [put that here]