Many transforms have been proposed for image and video compression, and the most popular tend to fall into two categories: block-based and image-based. Examples of block-based transforms include the Karhunen–Loeve Transform (KLT), Singular Value Decomposition (SVD) and the ever-popular Discrete Cosine Transform (DCT).
Image-based transforms operate on an entire image or frame (or a large section of the image known as a ‘tile’). The most popular image transform is the Discrete Wavelet Transform (DWT or just ‘wavelet’). Image transforms such as the DWT have been shown to outperform block transforms for still image compression, but they tend to have higher memory requirements.
The DCT and the DWT both feature in MPEG-4.
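DCT: For a typical image block, the transform concentrates most of the signal energy into a small number of coefficients. Removing the coefficients with insignificant magnitudes (for example, by quantisation) allows the image data to be represented with fewer coefficient values, at the expense of some loss of quality.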
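The effect of quantisation on DCT coefficients can be illustrated with a small sketch. It assumes an 8x8 block, an orthonormal DCT-II and a uniform quantiser with a single step size; the test block, the step size and all function names are hypothetical choices made for illustration and are not taken from any standard.

```python
import math

N = 8  # assumed block size for this illustration


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix A, used as Y = A X A^T."""
    rows = []
    for i in range(n):
        c = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        rows.append([c * math.cos((2 * j + 1) * i * math.pi / (2 * n))
                     for j in range(n)])
    return rows


def matmul(p, q):
    """Plain matrix product of two lists of lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_dct(block):
    a = dct_matrix(len(block))
    return matmul(matmul(a, block), transpose(a))


def inverse_dct(coeffs):
    a = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(a), coeffs), a)


def quantise(coeffs, step):
    """Uniform quantiser: coefficients smaller than step/2 become zero."""
    return [[round(c / step) * step for c in row] for row in coeffs]


# Hypothetical smooth test block: a gentle ramp in x and y.
block = [[100 + 2 * x + 3 * y for x in range(N)] for y in range(N)]

coeffs = forward_dct(block)
quantised = quantise(coeffs, step=10)

# Number of coefficients that survive quantisation.
nonzero = sum(1 for row in quantised for c in row if c != 0)

# Reconstruct from the reduced coefficient set and measure the loss.
recon = inverse_dct(quantised)
max_err = max(abs(recon[y][x] - block[y][x])
              for y in range(N) for x in range(N))

print("non-zero coefficients:", nonzero, "of", N * N)
print("maximum reconstruction error:", max_err)
```

Increasing the step size in this sketch zeroes more coefficients and raises the reconstruction error, which is the trade-off between compression and quality described above.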
DWT: Each row of a 2D image is filtered with a low-pass and a high-pass filter (Lx and Hx) and the output of each filter is down-sampled by a factor of two to produce the intermediate images L and H. L is the original image low-pass filtered and down-sampled in the x-direction and H is the original image high-pass filtered and down-sampled in the x-direction. Next, each column of these new images is filtered with low- and high-pass filters (Ly and Hy) and down-sampled by a factor of two to produce four sub-images (LL, LH, HL and HH). These four ‘sub-band’ images can be combined to create an output image with the same number of samples as the original.
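The row-then-column filtering described above can be sketched for a single decomposition level. This example assumes the Haar filter pair (average and difference of adjacent samples) because it is the simplest possible choice; it is not the filter specified by any particular standard, and the function names are hypothetical. The image width and height are assumed to be even.

```python
def analyse_1d(signal):
    """Haar low-pass and high-pass filtering, each down-sampled by two."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def transpose(m):
    return [list(col) for col in zip(*m)]


def filter_rows(image):
    """Apply the 1D analysis filters to every row of an image."""
    lows, highs = [], []
    for row in image:
        lo, hi = analyse_1d(row)
        lows.append(lo)
        highs.append(hi)
    return lows, highs


def dwt2_one_level(image):
    """One level of 2D decomposition, returning (LL, LH, HL, HH)."""
    # Step 1: filter and down-sample each row (x-direction) to get L and H.
    L, H = filter_rows(image)
    # Step 2: filter and down-sample each column (y-direction).
    # Column filtering is done as row filtering on the transposed image.
    LL_t, LH_t = filter_rows(transpose(L))
    HL_t, HH_t = filter_rows(transpose(H))
    return transpose(LL_t), transpose(LH_t), transpose(HL_t), transpose(HH_t)


# Hypothetical 4x4 test image made of four flat 2x2 regions.
image = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]

LL, LH, HL, HH = dwt2_one_level(image)
total_samples = sum(len(band) * len(band[0]) for band in (LL, LH, HL, HH))

print(LL)             # [[10.0, 20.0], [30.0, 40.0]]
print(total_samples)  # 16, the same as the 4x4 input
```

Because each 2x2 region of this test image is flat, all of the detail sub-bands (LH, HL and HH) are zero and the LL sub-band is a half-resolution copy of the image.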
The input to an MPEG-4 Visual encoder and the output of a decoder is a video sequence in 4:2:0, 4:2:2 or 4:4:4 progressive or interlaced format.
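As a rough illustration of how the three sampling formats differ, the sketch below computes the luma and chroma plane sizes for one frame. It assumes the usual meaning of the format names (chroma halved horizontally and vertically in 4:2:0, halved horizontally only in 4:2:2, full resolution in 4:4:4) and frame dimensions that are even; the function names and the CIF example resolution are illustrative choices.

```python
# Horizontal and vertical chroma sub-sampling divisors for each format.
CHROMA_DIVISORS = {
    "4:2:0": (2, 2),
    "4:2:2": (2, 1),
    "4:4:4": (1, 1),
}


def plane_sizes(width, height, fmt):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for one frame."""
    div_x, div_y = CHROMA_DIVISORS[fmt]
    return (width, height), (width // div_x, height // div_y)


def samples_per_frame(width, height, fmt):
    """Total samples in one frame: one luma plane plus two chroma planes."""
    (lw, lh), (cw, ch) = plane_sizes(width, height, fmt)
    return lw * lh + 2 * cw * ch


# Example: a CIF-sized frame (352x288 luma samples).
for fmt in ("4:2:0", "4:2:2", "4:4:4"):
    print(fmt, plane_sizes(352, 288, fmt), samples_per_frame(352, 288, fmt))
```

For the CIF example, 4:2:0 requires half as many samples per frame as 4:4:4 (152064 against 304128), with 4:2:2 in between at 202752.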