PDF and JBIG2: Working With the Benefits of Both

Mar 24, 2010

Amyuni Technologies Blog
Ever since PDF files began carrying images, file size has been an issue. As people added larger and larger images to their PDFs, the files took longer to transmit and filled up storage space faster. Luckily, image compression technologies were never far behind to help alleviate these problems. One such technology is JBIG2, a bi-level (black and white) image compression format introduced by the Joint Bi-level Image Experts Group.

Unlike previous bi-level image compression formats (including its predecessor, JBIG1), JBIG2 uses an intelligent algorithm to achieve its compression ratios. In short, the algorithm first searches for and recognizes similar groups of pixels within an image. Then it creates symbols to represent the common or repeated shapes it has found and stores them in a table. In lossless mode, this process does not affect image quality, and the resulting documents can be one quarter to one fifth of their original size.
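
The steps above can be sketched in a few lines of Python. This is a toy illustration, not the actual JBIG2 codec: it assumes the page has already been cut into small fixed-size tiles, it only matches tiles that are exactly identical (so nothing is lost), and the names `build_symbol_table` and `rebuild` are hypothetical. A real encoder segments connected shapes such as letters and then entropy-codes the result.

```python
from typing import Dict, List, Tuple

Tile = Tuple[str, ...]  # one small bitmap: rows of '0'/'1' characters


def build_symbol_table(tiles: List[Tile]) -> Tuple[List[Tile], List[int]]:
    """Store each distinct tile once and describe the page as table indices."""
    table: List[Tile] = []            # the symbol dictionary
    index_of: Dict[Tile, int] = {}    # tile -> position in the table
    refs: List[int] = []              # one index per tile on the page
    for tile in tiles:
        if tile not in index_of:
            index_of[tile] = len(table)
            table.append(tile)
        refs.append(index_of[tile])
    return table, refs


def rebuild(table: List[Tile], refs: List[int]) -> List[Tile]:
    """Reverse the process by looking every index up in the dictionary."""
    return [table[i] for i in refs]


# Two hypothetical 3x3 glyphs standing in for scanned letters.
GLYPH_A = ("010", "111", "101")
GLYPH_B = ("110", "111", "110")
page = [GLYPH_A, GLYPH_B, GLYPH_A, GLYPH_A, GLYPH_B]

table, refs = build_symbol_table(page)
print(len(table), refs)              # 2 [0, 1, 0, 0, 1]
print(rebuild(table, refs) == page)  # True
```

Five tiles are stored as two symbols plus five small indices, which is where the savings come from on pages full of repeated letter shapes.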

Whom Does JBIG2 Serve?

Government, judicial, and medical sectors are examples of PDF-intensive workplaces where even a modest adoption of the JBIG2 format in existing workflows can help IT managers reap noticeable cost savings later on. PDF files that contain JBIG2-compressed images are not only easier to send and share; they are also easier to store, they display rapidly online, and they are OCR-ready.

The following table outlines some of JBIG2’s main features:

JBIG2 Feature: Higher compression rates than its predecessors (e.g., JBIG1, TIFF G3, and G4). File size reductions of 90% or more are possible.
Benefit: Reduction in storage space and transmission bandwidth.
Example: With JBIG2 compression, a 78 MB uncompressed 500-page PDF document would see its file size drop to 12.7 MB. An equivalent TIFF file would be approx. 15.8 MB.

JBIG2 Feature: Lossy and lossless compression methods.
Benefit: Lossy compression yields a higher compression rate without any perceivable information loss.
Example: A pass to clean dots and artifacts from a scanned document can help JBIG2 compression, because more of the page can then be coded as simple white areas.

JBIG2 Feature: The use of symbol dictionaries.
Benefit:
• A dictionary can be reused for the compression of other images within the same document.
• Eventually, one symbol dictionary could be used to recognize the text in the image. It contains the building blocks of a possible OCR procedure to help rebuild font information (if lost).
Example:
• Unique to one PDF document, a global JBIG2 stream can contain a dictionary of symbols used for all the pages of the document.
• Once the dictionary is built, software attempts to recognize letters and build legible text from them.

JBIG2 Feature: The use of arithmetic and Huffman coding schemes for bit representation.
Benefit: Huffman coding takes less page memory and compresses and decompresses faster than arithmetic coding. Arithmetic coding is slower and uses more memory, but yields better compression results.
Example: JBIG2 supports both the Huffman and the arithmetic coding algorithms for image structure information such as encoding schemes, references, indexes, sizes, offsets, and popular symbol identities.

JBIG2 Feature: ITU-T T.6 facsimile coding schemes and coding control functions for Group 4 facsimile, activated by an MMR (Modified Modified READ, where READ stands for Relative Element Address Designate) flag.
Benefit: Use of the latest facsimile logic for the compression of building block images.
Example: Any image leaf can be coded using MMR logic. In addition, a symbol in a dictionary or a whole page can be found in the JBIG2 stream as an MMR image.

JBIG2 Feature: Striped-page compression.
Benefit: JBIG2 can compress uninterrupted image flows.
Example: Under specific circumstances, if a scanner sends image information without a page cut, a JBIG2 stream can still take the data and compress it.

JBIG2 Feature: Most PDF viewers support reading JBIG2 (PDF version 1.4 and higher).
Benefit: JBIG2 technology can be easily integrated into the PDF’s established technologies.
Example: Most of the PDF documents produced by high-end scanners with professional drivers are compressed with JBIG2 technologies.
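
Inside a PDF file, a JBIG2-compressed image is stored as a stream whose dictionary names the `JBIG2Decode` filter, which was introduced with PDF 1.4. The Python sketch below counts occurrences of that name in the raw bytes of a file as a rough check. It is only a heuristic: the helper name `count_jbig2_filters` and the sample bytes are made up for illustration, and a plain byte scan can miss unusually encoded names or match the same bytes inside unrelated data.

```python
import re

# Name of the PDF stream filter used for JBIG2 data (PDF 1.4 and later).
JBIG2_FILTER = re.compile(rb"/JBIG2Decode\b")


def count_jbig2_filters(pdf_bytes: bytes) -> int:
    """Count how often the /JBIG2Decode filter name appears in raw PDF bytes."""
    return len(JBIG2_FILTER.findall(pdf_bytes))


# Hypothetical fragment of a PDF image dictionary, for illustration only.
sample = (
    b"5 0 obj\n"
    b"<< /Type /XObject /Subtype /Image /Width 2480 /Height 3508\n"
    b"   /ColorSpace /DeviceGray /BitsPerComponent 1\n"
    b"   /Filter /JBIG2Decode /Length 0 >>\n"
    b"stream\n\nendstream\nendobj\n"
)
print(count_jbig2_filters(sample))  # 1
```

To check a real file, read it in binary mode with `open(path, "rb").read()` and pass the resulting bytes to the function.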

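The 500-page example above quotes three sizes: 78 MB uncompressed, 12.7 MB with JBIG2, and approximately 15.8 MB as TIFF. The short Python calculation below turns them into percentage reductions. The function name `reduction_percent` is just an illustrative choice, and the only inputs are the figures quoted in the example.

```python
def reduction_percent(original_mb: float, compressed_mb: float) -> float:
    """Share of the original size that compression removed, in percent."""
    return (1 - compressed_mb / original_mb) * 100


UNCOMPRESSED_MB = 78.0  # sizes quoted in the 500-page example
JBIG2_MB = 12.7
TIFF_MB = 15.8

print(f"JBIG2 vs uncompressed: {reduction_percent(UNCOMPRESSED_MB, JBIG2_MB):.1f}% smaller")
print(f"TIFF vs uncompressed:  {reduction_percent(UNCOMPRESSED_MB, TIFF_MB):.1f}% smaller")
print(f"JBIG2 vs TIFF:         {reduction_percent(TIFF_MB, JBIG2_MB):.1f}% smaller")
```

With the quoted figures, this works out to roughly 84% for JBIG2 and 80% for TIFF against the uncompressed file, so the JBIG2 version is about 20% smaller than the TIFF one.
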
JBIG2: Smaller Things are Easier to Handle

Amyuni Technologies has been carefully following the evolution of JBIG2 ever since PDF began supporting the format. The company first included JBIG2 decoding (decompression) capabilities in its PDF Creator and PDF Converter products.

Now, with the upcoming 4.5 releases of these products, Amyuni Technologies is extending its JBIG2 support to include encoding (compression) in addition to OCR capabilities. Whether for PDF integration or for publication and distribution purposes, end users and developers will be able to take advantage of JBIG2’s powerful black-and-white compression capabilities.

Dany Amiouny is the CTO for Amyuni Technologies
www.amyuni.com