Friday, September 05, 2008

Image Compression

Block Transform Coding:

image_compression

Transformation: DCT has advantage over other transformations such as DFT. It packs the most information into the fewest coefficients, less computational complexity.

JPEG: It defines 3 different coding systems:
(1) lossy baseline coding system,
(2) an extend system for greater compression, higher precision,
(3) a lossless independent coding system for reversible compression.


Usually we talk about lossy baseline coding system: The compression is performed in 3 sequential steps:

  • DCT computation,
  • quantization, and
  • variable-length coding.

The image is first subdivided in to pixel blocks of 8x8, then its 64 pixels are level-shifted by subtracting the quantity 2^(k-1), (if image is 256 or 2^8, subtract -2^7 or -128) where 2^k is the maximum number of intensity levels. The 2D DCT of the block is then computed, quantized in accordance with t(u, v) = round [T(u,v) / Z(u,v)], where
     [ Z(0,0)   Z(0,1)            ...      Z(0, n-1) ]
     [ Z(1,0)   ...                 ...        ...        ]
Z = [ ...                                        ...        ]
     [  ...                                        ...        ]
     [Z(n-1,0)   Z(n-1,1)      ...    Z(n-1,n-1)  ] 

a typical normalization matrix is
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
...........................
72 92 95 98 112 100 103 99

then reordered using zigzag pattern to form a 1D sequence of quantized coefficients. So the 1D reordered array generated under the zigzag pattern is arranged qualitatively according to increasing spatial frequency, the non-zero AC coefficients are coded using a variable-length code that defines the coefficient values and number of preceding zeros. The DC coefficient is usually using Huffman code. Default coding tables and quantization arrays (Q-table, Code-book) are provided for both color and monochrome processing.

For example, an 8x8 subimage with the JPEG baseline standard:
52 55 61 66 70 61 64 73
63 59 66 90 109 85 49 72
62 59 68 113 144 104 66 73
............................
87 79 69 68 65 76 78 94

The resulting shifted array is:
-76 -73 -67 -62 -58 -67 -64 -55
-65 -69 ..........................
...................................
-41 -49 -59 .....................

After DCT:
-415 -29 -62 25 55 -20 -1 3
7 -21 -62 ....................
...............................
-1 -1 -1 -2 -1 -1 0 -1

Use JPEG recommended normalization array, DC coefficient is computed as t(0,0) = round[T(0,0)/Z(0,0)] = -415/16 = -26, all the coefficients are:
-26 -3 -6 2 2 0 0 0
1 -2 -4 0 0 0 0 0
-3 1 5 -1 -1 0 0 0
-4 1 2 -1 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

when the coefficient are reordered with zigzag ordering pattern, the resulting 1D coefficient sequence is :
[-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB]
A special End-Of-Block Huffman code word is provided to indicate that the remainder of coefficients in a reordered sequence are zeros.

Decompress: to decompress a JPEG compressed subimage, the decoder must first recreate the normalized transform coefficient that lead to the compressed bit stream. Because Huffman coded binary sequence is instantaneous and uniquely decodable, the step is easily accomplished in a simple lookup table manner.

Here is the regenerated array of quantized coefficients:
-26 -3 -6 2 2 0 0 0
1 -2 -4 0 0 0 0 0
-3 1 5 -1 -1 0 0 0
-4 1 2 -1 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
After denomalization with T'(u,v) = t(u,v) * Z(u,v):
-416 -33 -60 32 48 0 0  0
12 -24 -56 0 0 0 0 0
............................
0 0 0 0 0 0 0 0
where T'(0,0)=t(0,0)*Z(0,0) = -26*16 = -416

After iDCT:
-70 -64 -61 -64 -69 -66 -58 -50
-72 -73 -61 -39 -30 -40 -54 -59
-69 -78 -58 -9  ................
..................................

level shifing each inverse transformed pixel by 2^7 (+128):
58 64 67 64 59 62 70 78
56 55 67 89 98 88 74 69
60 50 70 119 ............
...........................

The differences between the original and reconstructed subimage are a result of the lossy nature of the JPEG compression and decompression process:
-6 -9 -6 2 11 -1 -6 -5
7 4 -1 1 11 -3 -5 3
2 9 -2 ...............
......................
Calculate root-mean-square error

JPEG2000 (band, DWT-wavelet ...). DCT is block based and get DC in every block. DWT makes the important information together and have lots of 0s, so it is easier to compress.

1 comment: