BeleniX is an OpenSolaris LiveCD Distribution

BeleniX is the result of the free time hackery of several folks and is a community effort.
Any feedback, bug reports, help with documentation and fixes are welcome.

This site is hosted at Genunix.ORG and Sarovar.ORG.

Note on Licensing


BeleniX is released under the CDDL license version 1.

However, each piece of software in BeleniX is covered by its own license (e.g. GPL, LGPL, BSD).

* On The Fly Decompression - How it works

* Introduction

From version 0.4 onwards BeleniX includes an On The Fly Decompression feature that allows the filesystem on the CDROM to be compressed. The data is decompressed as and when the OS and applications request disk blocks. Compressing data on the CDROM has two benefits:

  1. Compression enables more software to be crammed onto the CDROM. BeleniX uses zlib at the maximum compression level, which allows 1.8 GB of data to be put on one CD.
  2. Compression reduces access time, since more data is transferred into RAM per I/O operation. It also minimizes seeking of the CDROM drive head, which is an expensive operation: the same data is now physically stored in a much smaller surface area on the CD, so the head has to move less. This reduces the time taken to boot and to start applications.

BeleniX uses a fairly simple compression technique. The zlib decompression code is already available in OpenSolaris as the zmod kernel module, which simplified the task of implementing this feature. Since we are only concerned with reading data off the CDROM, the compression is read-only: one cannot write to the compressed data once it has been generated.

The lofi(7D) kernel module was modified to implement compression. lofi is a pseudo loopback block device module that enables a file to be viewed and accessed as a block device. So one can have a filesystem in a file if it is managed via lofi. Lofi is commonly used in OpenSolaris to mount ISO images and see their contents.

In BeleniX the entire contents of "/usr", which resides on the CDROM, are compressed. The steps that BeleniX uses to generate and use the compressed file are:

  1. Copy all the required files to "/usr" in a staging area.
  2. Generate an ISO filesystem (hsfs) image of this "/usr". Hsfs (High Sierra Filesystem) is used since it is lightweight.
  3. Compress this ISO image using a custom utility that generates a specially formatted compressed file.
  4. Include this compressed file in the final bootable ISO image of BeleniX.
  5. While BeleniX boots, the CDROM is mounted. Subsequently the compressed file is added as a block device via lofi.
  6. This pseudo block device is then mounted onto "/usr" as an hsfs filesystem.


* The Technique

The basic idea here is as follows:

  1. Split the given file into fixed-size segments (typically 64K).
  2. Compress each segment individually and store the results sequentially in another file.
  3. Record the starting offset of each compressed segment in an array.
  4. Finally, copy a header, the array from step 3 (also called the index) and the compressed segments from step 2, in that order, into the final compressed file.
  5. The header contains the following: an 8-byte signature, the segment size, the number of segments (which also determines the array size), and the uncompressed size of the last segment, since the file size may not be a multiple of the segment size.
  6. The array that follows the header is used as an index to get to individual compressed segments. The size of an individual compressed segment is derived by subtracting its start offset from the start offset of the next segment. The array therefore contains an extra sentinel entry at the end, which avoids a special check for the last segment.
[Figure: Compressed file format]
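
The layout described above can be illustrated with a minimal Python sketch. This is not the custom BeleniX utility; it is only an approximation under stated assumptions. The signature value, the 32-bit width of the three header fields and the values of the one-byte per-segment flag are hypothetical, chosen for illustration. The 64K segment size, the 63K threshold, the sentinel index entry, the network byte order and the zero-based offsets are taken from the description on this page.

```python
import struct
import zlib

SIGNATURE = b"SKETCH01"        # hypothetical 8-byte signature
SEG_SIZE = 64 * 1024           # fixed segment size, a power of two
THRESHOLD = 63 * 1024          # keep the compressed form only below this size
FLAG_RAW, FLAG_ZLIB = 0, 1     # hypothetical values of the one-byte segment header


def compress_image(data: bytes, seg_size: int = SEG_SIZE,
                   threshold: int = THRESHOLD) -> bytes:
    """Return header + index + segments for the given uncompressed data."""
    segments = []
    offsets = [0]              # offsets are relative to the first segment
    for start in range(0, len(data), seg_size):
        chunk = data[start:start + seg_size]
        packed = zlib.compress(chunk, 9)          # maximum compression level
        if len(packed) < threshold:
            stored = bytes([FLAG_ZLIB]) + packed
        else:
            stored = bytes([FLAG_RAW]) + chunk    # little gain: store as-is
        segments.append(stored)
        # The final value appended here becomes the sentinel entry.
        offsets.append(offsets[-1] + len(stored))

    nsegs = len(segments)
    last_size = len(data) - (nsegs - 1) * seg_size if nsegs else 0
    # "!" selects network (big-endian) byte order for header and index.
    header = SIGNATURE + struct.pack("!III", seg_size, nsegs, last_size)
    index = struct.pack("!%dI" % len(offsets), *offsets)
    return header + index + b"".join(segments)
```

Because the index holds one entry more than the number of segments, the stored length of segment n is always `offsets[n + 1] - offsets[n]`, with no special case for the last segment.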


The following steps are taken by the modified lofi module to enable reading from a compressed file:

  1. When lofiadm(1M) is used to add a file to be managed via lofi, it ultimately results in a call to lofi_map_file in the lofi module. This function does all the preprocessing necessary to open the file and initialize data structures, including a faked-up disk geometry.
  2. In addition to the above, the modified lofi also reads the first 8 bytes of the file and checks for a signature.
  3. If a proper signature is found, it reads the header components into memory and initializes the data structures. It also reads the entire index (array) into memory. This is currently a series of uint32_t integers. The 4GB addressing range provided by uint32_t is enough to store up to 12GB of data in one 4GB compressed file, assuming a compression ratio of about 3:1. The array does not occupy much kernel memory: a compressed file whose uncompressed size is 2GB has 2GB / 64KB = 32768 segments, and at 4 bytes per entry the index needs about 128 KB.
  4. When lofi receives a request to read #X bytes from offset #N of a device backed by a compressed file, it first computes the numbers of the first and last segments that contain the requested data. Since the uncompressed segments are of fixed size, this is easily done via integer division and modulus. As an added optimization the segment size is required to be a power of 2, so bitwise operations can be used instead of division and modulus.
  5. The file offsets and ranges of the compressed segments are extracted from the index array, and the start offset is aligned on a disk block boundary (512 bytes at present). These bytes are then read into memory via segmap_fault in lofi_mapped_rdwr.
  6. The data read in at step #5 contains all the compressed segments required. Lofi then loops through the compressed segments in memory and uncompresses them one by one. Once a segment is uncompressed, the portion of it that contains part of the requested data is appended to the buffer provided to lofi by the caller. This computation is a little tricky, as the required data may begin in the middle of one uncompressed segment and end in the middle of another.
  7. Once the loop ends, lofi returns to the caller. Each segment stored in the compressed file has a one-byte segment header as its first byte. This is currently used to indicate whether the segment is compressed at all. A segment is stored compressed only if compression reduces its size below a certain threshold, currently set at 63K. This avoids the overhead of decompressing segments where there is little gain from compression.
  8. The offset values in the index array and the header values are stored on disk in network byte order. In addition, the on-disk array records the first segment as beginning at offset 0 and subsequent segments at offsets relative to it. When the array is read into memory, each offset is adjusted by adding the header and index sizes.
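
The read path in steps 3 to 8 can be sketched in user-space Python as follows. This is only an illustration of the logic, not the lofi kernel code. It works on an in-memory byte string, so the 512-byte block alignment and the segmap_fault call are left out. The class name, the signature value, the 32-bit header fields and the flag values are hypothetical assumptions.

```python
import struct
import zlib

SIGNATURE = b"SKETCH01"        # hypothetical 8-byte signature
HEADER_FMT = "!8sIII"          # assumed layout: signature, seg size, count, last size
FLAG_RAW, FLAG_ZLIB = 0, 1     # hypothetical values of the one-byte segment header


class CompressedImage:
    """Read-only view of a compressed file held in memory (hypothetical helper)."""

    def __init__(self, blob: bytes):
        sig, self.seg_size, self.nsegs, last_size = struct.unpack_from(HEADER_FMT, blob, 0)
        if sig != SIGNATURE:
            raise ValueError("signature not found")
        if self.seg_size == 0 or self.seg_size & (self.seg_size - 1):
            raise ValueError("segment size must be a power of two")
        self.shift = self.seg_size.bit_length() - 1   # replaces division
        self.mask = self.seg_size - 1                 # replaces modulus
        hdr_len = struct.calcsize(HEADER_FMT)
        count = self.nsegs + 1                        # includes the sentinel entry
        raw_index = struct.unpack_from("!%dI" % count, blob, hdr_len)
        data_start = hdr_len + 4 * count
        # On-disk offsets start at 0; add the header and index sizes.
        self.index = [off + data_start for off in raw_index]
        self.size = (self.nsegs - 1) * self.seg_size + last_size if self.nsegs else 0
        self.blob = blob

    def _segment(self, n: int) -> bytes:
        stored = self.blob[self.index[n]:self.index[n + 1]]
        payload = stored[1:]
        return zlib.decompress(payload) if stored[0] == FLAG_ZLIB else payload

    def read(self, offset: int, length: int) -> bytes:
        """Return up to `length` uncompressed bytes starting at `offset`."""
        if length <= 0 or offset >= self.size:
            return b""
        end = min(offset + length, self.size)
        first = offset >> self.shift
        last = (end - 1) >> self.shift
        out = bytearray()
        for n in range(first, last + 1):
            seg = self._segment(n)
            lo = offset & self.mask if n == first else 0
            hi = ((end - 1) & self.mask) + 1 if n == last else len(seg)
            out += seg[lo:hi]       # only the requested part of this segment
        return bytes(out)
```

The `lo` and `hi` bounds handle the tricky case from step 6: a request may start in the middle of the first segment and end in the middle of the last one, while every segment in between is copied whole.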

[Figure: Reading from compressed and normal files via lofi]

[Figure: Decompression during lofi read]