The major feature is object-based compression, in which each Video Object (VO) can be encoded and decoded separately.
Current MPEG-4 video texture coding is still based on a combination of motion-compensated prediction (MC) and transform coding (TC). MC removes the interframe (temporal) redundancy and TC removes the intraframe (spatial) redundancy. MC predictive coding of a video sequence relies on motion estimation (ME). The basic ME method is still block matching: for every block in the current frame, find the best-matched block in the previous frame. The block size is selected adaptively: either one 16x16 block or four 8x8 blocks.
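The block-matching search and the 16x16 versus four 8x8 choice described above can be sketched as follows. This is a minimal illustration, not the procedure of any particular encoder: it assumes frames stored as lists of rows of luminance values, an exhaustive (full) search, the sum of absolute differences (SAD) as the matching criterion, and a hypothetical `penalty` that biases the decision against sending four vectors. The function names, the search range, and the penalty value are all assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block
    of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def block_match(cur, ref, bx, by, n=16, search=4):
    """Full-search block matching: return (dx, dy, sad) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def choose_mode(cur, ref, bx, by, search=4, penalty=128):
    """Pick one 16x16 vector or four 8x8 vectors for a 16x16 block.

    `penalty` is an illustrative bias for the cost of the extra vectors,
    not a value taken from the standard.
    """
    one = block_match(cur, ref, bx, by, 16, search)
    four = [block_match(cur, ref, bx + ox, by + oy, 8, search)
            for oy in (0, 8) for ox in (0, 8)]
    if sum(v[2] for v in four) + penalty < one[2]:
        return "4x8x8", [(dx, dy) for dx, dy, _ in four]
    return "16x16", [(one[0], one[1])]
```

Without some bias the four-vector mode would never lose, because four independent searches can always reuse the single 16x16 vector; that is why the sketch adds a penalty before comparing the two costs.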
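The algorithm for video texture coding is based on an 8x8 DCT with MC. The DCT is performed on each luminance and chrominance block, whereas motion estimation is performed only on luminance blocks; the chrominance motion vectors are derived from the luminance vectors.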
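As a rough illustration of the transform step, the sketch below computes the prediction error of an 8x8 block and applies an orthonormal 2-D DCT-II to it. It assumes the block is inter-coded, so that the DCT operates on the difference between the current block and its motion-compensated prediction; the function names and the direct (unoptimized) formula are choices made here for clarity, and quantization and entropy coding are left out.

```python
import math

N = 8  # block size used by the texture coder


def residual(cur_block, pred_block):
    """Prediction error: current block minus motion-compensated prediction."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur_block, pred_block)]


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers).

    out[u][v] holds vertical frequency u and horizontal frequency v.
    """
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out
```

A flat block concentrates all of its energy in the DC coefficient `out[0][0]`, which is the property that makes the transform useful for removing spatial redundancy.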
The input video sequence is first decomposed into separate VOs. These VOs are then encoded into separate bitstreams so that the user can access and manipulate the video sequence in the bitstream domain. An instance of a VO at a given time is called a VO plane (VOP). The bitstream also contains composition information that indicates where and when each VOP is to be displayed.
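One way to picture the relationship between VOs, VOPs, and the composition information is the small sketch below. The `VOP` record and the `compose` function are hypothetical, introduced only for illustration: the sketch assumes each VOP carries a display time, a position in the scene, a binary shape mask, and a texture of the same size, and that later VOPs in the list are painted over earlier ones.

```python
from dataclasses import dataclass


@dataclass
class VOP:
    vo_id: int     # which video object this plane belongs to
    time: int      # when the plane is displayed (frame index)
    x: int         # where it is displayed: top-left corner in the scene
    y: int
    shape: list    # binary mask, 1 = pixel belongs to the object
    texture: list  # pixel values, same dimensions as `shape`


def compose(vops, time, width, height, background=0):
    """Paint every VOP scheduled for `time` onto a blank scene, in list order."""
    scene = [[background] * width for _ in range(height)]
    for vop in vops:
        if vop.time != time:
            continue
        for row, (mask_row, tex_row) in enumerate(zip(vop.shape, vop.texture)):
            for col, (inside, value) in enumerate(zip(mask_row, tex_row)):
                sy, sx = vop.y + row, vop.x + col
                # Only pixels inside the object's shape are drawn.
                if inside and 0 <= sy < height and 0 <= sx < width:
                    scene[sy][sx] = value
    return scene
```

Because each object is kept separate until composition, dropping, moving, or replacing one VO only requires changing the list passed to `compose`, without touching the other objects.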
The encoder consists mainly of two parts: shape coding and texture coding of the input VOP. The decoder consists mainly of three parts: shape, motion, and texture decoding. The bitstream is first demultiplexed into shape, motion, and texture information.
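How the three decoded components might come together for a single block can be sketched as below. This is only an assumption-laden illustration: it presumes the shape, motion vector, and texture residual have already been demultiplexed and decoded, skips the bitstream syntax entirely, and uses a hypothetical `reconstruct_block` helper with 8-bit pixel clipping.

```python
def reconstruct_block(ref, mv, residual, shape, bx, by):
    """Rebuild one block of a VOP from decoded motion, texture, and shape data.

    ref      -- previously reconstructed frame (list of rows)
    mv       -- decoded motion vector (dx, dy)
    residual -- decoded prediction error for the block
    shape    -- decoded binary mask for the block, 1 = inside the object
    (bx, by) -- top-left corner of the block in the frame
    """
    dx, dy = mv
    n = len(residual)
    out = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            if shape[y][x]:  # pixels outside the object are left at 0
                pred = ref[by + dy + y][bx + dx + x]
                # Motion-compensated prediction plus residual, clipped to 8 bits.
                out[y][x] = min(255, max(0, pred + residual[y][x]))
    return out
```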