Intra prediction in an I-frame does not try to predict where the current block will be in the next I-frame (which may not arrive until n seconds later). It works entirely within the current I-frame: no motion is involved, so there is no need to search for and match motion blocks, which means less computation. It is used in both encoding and decoding. The prediction block is formed from previously encoded and reconstructed blocks, and it is subtracted from the current block before encoding. The basic predictive coding looks like:
In this figure, X^_{i-1} is the reconstructed (or decoded) frame and the predictor is the intra prediction. For each 4x4 block, the encoder tries the 9 prediction modes, each built from samples of X^_{i-1}, and compares every candidate with the 4x4 block at the same position in the current frame X_i using the sum of absolute errors (SAE). The best-matching candidate is the prediction X^_i. The difference (error) between X_i and X^_i is sent through the encoding process (DCT, quantization, ...) to produce the coefficients. These coefficients then go through the decoding process (inverse quantization, IDCT, ...) to produce the decoded error, which is added to X^_i to get the next reconstructed frame. That reconstruction then plays the role of X^_{i-1} in the figure above. [update: 6/2/2010]
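The loop described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: a plain uniform quantizer stands in for the real DCT and quantization stages, blocks are flat lists of samples, and the names `encode_block`, `reconstruct_block`, and `qstep` are made up for this example rather than taken from any codec.

```python
def encode_block(current, prediction, qstep=4):
    """Quantize the prediction error (current - prediction) for one block.

    Assumption: a uniform quantizer replaces the DCT + quantization stages.
    """
    return [round((c - p) / qstep) for c, p in zip(current, prediction)]


def reconstruct_block(levels, prediction, qstep=4):
    """Rebuild the block the way the decoder would: prediction + decoded error."""
    return [max(0, min(255, p + lv * qstep)) for lv, p in zip(levels, prediction)]


current = [100, 104, 96, 100]       # samples of the current block (X_i)
prediction = [100, 100, 100, 100]   # prediction built from reconstructed data (X^_i)

levels = encode_block(current, prediction)
reconstructed = reconstruct_block(levels, prediction)
print(levels)         # quantized error that would be sent on
print(reconstructed)  # what both encoder and decoder keep as reference
```

The point of the sketch is that the encoder runs the decoder's reconstruction step itself, so both sides predict from exactly the same reconstructed samples.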
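An I-frame (intra-coded frame, also called a key frame) has one property that matters most: it is a "random access unit". It can be decoded without reference to any other frame, so if you want to jump to 2:00 in a 5-minute video, decoding can start from an I-frame near that point.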
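To illustrate random access, seeking to a timestamp usually means starting decoding at the latest key frame at or before it. The sketch below assumes a sorted list of key-frame timestamps in seconds; the function name `seek_keyframe` and the 10-second key-frame spacing are hypothetical choices for the example.

```python
from bisect import bisect_right


def seek_keyframe(keyframe_times, target):
    """Return the latest key-frame time <= target (times sorted, in seconds)."""
    idx = bisect_right(keyframe_times, target)
    if idx == 0:
        raise ValueError("no key frame at or before the requested time")
    return keyframe_times[idx - 1]


# Hypothetical 5-minute video with a key frame every 10 seconds.
keyframes = list(range(0, 300, 10))
print(seek_keyframe(keyframes, 120))  # request lands exactly on a key frame
print(seek_keyframe(keyframes, 125))  # request falls between two key frames
```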
For luma data, 4x4 or 16x16 blocks are used. The samples above and to the left of the current block have already been encoded and reconstructed, so they can be used as the prediction reference. There are 9 prediction modes for a 4x4 block and 4 modes for a 16x16 block. The Sum of Absolute Errors (SAE) gives the prediction error for each mode, and the mode with the smallest SAE is the best match.
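A minimal sketch of SAE-based mode selection for one 4x4 luma block follows. It covers only three of the nine 4x4 modes (vertical, horizontal, DC) and assumes the neighbors above and to the left are both available. The function names are illustrative, not from any reference encoder.

```python
def predict_4x4(mode, above, left):
    """Build a 4x4 prediction from 4 reconstructed samples above and 4 to the left."""
    if mode == "vertical":      # copy the row above downwards
        return [list(above) for _ in range(4)]
    if mode == "horizontal":    # copy the left column rightwards
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # rounded mean of the 8 neighboring samples
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unsupported mode: {mode}")


def sae(block, pred):
    """Sum of Absolute Errors between two 4x4 blocks (lists of rows)."""
    return sum(abs(b - p) for rb, rp in zip(block, pred) for b, p in zip(rb, rp))


def best_mode(block, above, left, modes=("vertical", "horizontal", "dc")):
    """Return (mode with the smallest SAE, dict of SAE per mode)."""
    costs = {m: sae(block, predict_4x4(m, above, left)) for m in modes}
    return min(costs, key=costs.get), costs


# A block made of vertical stripes: the vertical mode should match it exactly.
block = [[10, 20, 30, 40] for _ in range(4)]
above = [10, 20, 30, 40]
left = [10, 10, 10, 10]
mode, costs = best_mode(block, above, left)
print(mode, costs)
```

A real encoder would also weigh the cost of signaling the chosen mode, which this sketch leaves out.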
Chroma data uses 8x8 blocks with 4 prediction modes. Each 8x8 chroma block is predicted from the chroma samples above and/or to the left that have previously been encoded and reconstructed.