CS4624 Text - Ch 6
Please read this chapter very carefully! It covers important material in a clear and well organized fashion. The discussion below emphasizes the key points of each section.
See Study Questions.
See Course Notes for further explanation and for the most crucial points.
6 Digital Video and Image Compression
6.1 Evaluating a Compression System
- amount/degree of compression
- image quality
- speed of compression/decompression
6.1.1 How Much Compression?
- Use the same resolution for input and output to allow comparison.
- Measure the compressed file in bpp (bits per pixel).
- Ex.: 2 bpp for a 256x240 image with a 15 KB bitstream (15 x 1024 x 8 bits / 61,440 pixels)
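The bpp figure in the example can be checked with a few lines of arithmetic. This is a minimal sketch; the function name is illustrative, and it assumes 1 KB = 1024 bytes.

```python
def bits_per_pixel(width, height, compressed_bytes):
    """Return the average number of compressed bits spent per pixel."""
    return compressed_bytes * 8 / (width * height)


# The example from the notes: 256x240 pixels and a 15 KB bitstream
# (assuming 1 KB = 1024 bytes).
example_bpp = bits_per_pixel(256, 240, 15 * 1024)
print(example_bpp)  # 2.0
```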
6.1.2 How Good is the Picture?
(Subjective evaluation is ultimately necessary.)
6.1.3 How Fast Does it Compress or Decompress?
- Videoconferencing requires symmetric operation: compression and decompression must both run in real time.
- Capture must keep up with input, unless from tape.
- Offline compression gives best quality.
- Decompression should play back in real-time.
6.1.4 What Hardware and Software Does it Take?
(Compression requires a lot of both; hard-wired implementations should preferably follow a standard.)
6.2 Redundancy and Visibility
- Spatial: regions of same color, lines duplicated, patterns
- Temporal: between frames (if motion is slow)
- Color: eye's insensitivity allows subsampling
6.3 Video Compression Techniques
- Long researched, now possible because of VLSI and fast microprocessors with software codecs.
- Techniques can be applied to each color component.
- Adaptivity may improve compression through local optimization.
- Output bitstream must be analyzable, often in layers that lead to various groups, blocks, and eventually frames.
6.3.1 Simple Compression Techniques
- Truncation: drop low-order bits; 16 bpp is adequate for original color images
- CLUT (color lookup table): 256 colors gives 8 bpp, but higher quality is often wanted
- RL (run-length coding): good for cartoons, CLUT images, graphic images
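Run-length coding is the easiest of these techniques to sketch. The example below is illustrative only: the function names and the [value, count] pair layout are assumptions, not a format from the text. It shows why images with long runs of identical pixels (cartoons, CLUT images) compress well.

```python
def rle_encode(pixels):
    """Collapse a sequence of pixel values into [value, run_length] pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([p, 1])       # start a new run
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


row = [7, 7, 7, 7, 3, 3, 9, 9, 9, 9, 9, 9]
encoded = rle_encode(row)
print(encoded)  # [[7, 4], [3, 2], [9, 6]]
```

Twelve pixels become three pairs here; a noisy photographic row with no repeated neighbors would instead grow, which is why the technique suits graphic images.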
6.3.2 Interpolative Techniques
- Color subsampling (MPEG uses 2:1, 2:1 while DVI uses 4:1, 4:1) inverted by 2-D interpolation
- Bi-directional (MPEG B-frame) coding inverted by frame interpolation
6.3.3 Predictive Techniques
(Save previous item --- frame/line/pixel --- and use difference in the value from it to build up the next item.)
6.3.3a DPCM (Differential)
- Item = pixel
- Value = amplitude
- Overloads if there are sudden changes in value, e.g., black to white
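The overload behavior can be seen in a small sketch. This is a simplified illustration, not a production coder: the clamp value `max_step` and the function names are assumptions, and the limited difference range stands in for a fixed number of bits per difference.

```python
def dpcm_encode(samples, max_step=15):
    """Encode each sample as a clamped difference from the decoder's prediction."""
    prediction = 0
    diffs = []
    for s in samples:
        d = max(-max_step, min(max_step, s - prediction))
        diffs.append(d)
        prediction += d               # track what the decoder will reconstruct
    return diffs


def dpcm_decode(diffs):
    """Rebuild samples by accumulating the transmitted differences."""
    value = 0
    out = []
    for d in diffs:
        value += d
        out.append(value)
    return out


smooth = [10, 12, 15, 14, 13]         # small changes: reconstructed exactly
edge = [0, 0, 255, 255, 255]          # black-to-white jump: slope overload

print(dpcm_decode(dpcm_encode(smooth)))  # [10, 12, 15, 14, 13]
print(dpcm_decode(dpcm_encode(edge)))    # [0, 0, 15, 30, 45]
```

On the edge, the decoder can only climb by `max_step` per pixel, so the sharp transition is smeared across many pixels.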
6.3.3b ADPCM
- Adaptive: changes in step size are coded into bitstream
- Error: may creep in, so need to restart on occasion
- Not often used for video, since it achieves only about 2:1 before artifacts become intolerable.
6.3.3c Other Predictive Techniques
(See MPEG P-frames that are predicted from previous I or P frame.)
6.3.4 Transform Coding Techniques
- Reversible: has an inverse
- Blocks: usually 8x8 "bundle of data"
- Creates an alternate representation taking less space
6.3.5 A Simple Transform Example
- Consider a 2x2 array
- Simple transform and inverse transform equations illustrate benefits from predictive coding.
- Useful transforms require larger blocks and extensive computation.
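The notes do not reproduce the equations, so the sketch below assumes one plausible 2x2 transform of this kind: keep the first pixel and store the other three as differences from it. The exact equations in the text may differ; the point is only that the transform is reversible and that the difference terms are small for smooth image regions.

```python
def forward(a, b, c, d):
    """Assumed 2x2 transform: one base value plus three differences from it."""
    return (a, b - a, c - a, d - a)


def inverse(x0, x1, x2, x3):
    """Exact inverse of the assumed transform above."""
    return (x0, x1 + x0, x2 + x0, x3 + x0)


block = (120, 122, 119, 121)          # four neighboring pixel values
coeffs = forward(*block)
print(coeffs)                         # (120, 2, -1, 1)
print(inverse(*coeffs))               # (120, 122, 119, 121)
```

Only the first output needs the full pixel range; the three differences fit in far fewer bits, which is the same benefit predictive coding exploits.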
6.3.5a DCT
- An 8x8 block yields one DC coefficient (0 spatial freq.) and 63 AC coefficients.
- Amplitudes give 2-D spatial freq. components.
- Quantization parameters allocate bits among the coefficients so that the loss is not easily visible.
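A direct, unoptimized sketch of the 8x8 DCT followed by uniform quantization illustrates the DC/AC split. Two assumptions are made here: the orthonormal scaling (one common convention) and a single quantizer step for every coefficient. Real codecs use fast algorithms and per-coefficient step tables.

```python
import math

N = 8


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an NxN block (direct O(N^4) form, for clarity)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round to an integer."""
    return [[round(value / step) for value in row] for row in coeffs]


flat = [[100] * N for _ in range(N)]  # a block of constant brightness
coeffs = dct_2d(flat)
quantized = quantize(coeffs, 16)
print(round(coeffs[0][0]))            # 800: all the energy is in the DC term
print(quantized[0][0])                # 50
```

For this constant block all 63 AC coefficients are zero (up to floating-point error), so after quantization only one nonzero value remains to be coded.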
6.3.6 Statistical Coding
- Huffman codes use fewer bits for common strings.
- Computation is required for statistical analysis yielding codes.
- Space is needed for code tables, in bitstream if done adaptively.
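The statistical analysis step can be sketched as follows: count symbol frequencies, then repeatedly merge the two least frequent groups. This is a minimal illustration that computes code lengths only; the function name and the sample data are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries are (weight, tie_breaker, {symbol: depth}); the unique
    # tie_breaker keeps Python from ever comparing the dictionaries.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


data = "aaaaaaaabbbbccd"
lengths = huffman_code_lengths(data)
total_bits = sum(lengths[s] for s in data)
print(lengths["a"], lengths["d"])     # 1 3
print(total_bits)                     # 25
```

A fixed 2-bit code for the four symbols would need 30 bits for the same 15 symbols. The table mapping symbols to codes must also reach the decoder, which is the space cost mentioned above.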
6.3.7 Motion Video Compression Techniques
- CD-ROM systems need a new image about every 33 ms (30 fps), stored in 5 KB (256x240, 2/3 bpp).
- Achieving that requires interframe coding, such as motion compensation.
- Most applications require high-quality compressed video, so they use asymmetric methods with extensive compression calculations.
6.3.7a Motion Compensation
- Divide image into blocks.
- Determine motion vector for each block that has moved.
- Use difference = 0 for other blocks.
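The three steps above can be sketched with an exhaustive block-matching search. Everything specific here is an assumption made for illustration: the 4x4 block size, the +/-2 pixel search range, the sum-of-absolute-differences cost, and the convention that the vector points from the current block to its source in the previous frame.

```python
def sad(prev, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a shifted previous block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total


def best_motion_vector(prev, cur, bx, by, size=4, search=2):
    """Search +/- `search` pixels for the best-matching block in the previous frame."""
    height, width = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = sad(prev, cur, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + size <= width
                      and 0 <= by + dy and by + dy + size <= height)
            if not inside:
                continue                      # candidate falls outside the frame
            cost = sad(prev, cur, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


def make_frame(left, top, width=12, height=12, size=4):
    """Black frame containing one textured square at (left, top)."""
    frame = [[0] * width for _ in range(height)]
    for y in range(size):
        for x in range(size):
            frame[top + y][left + x] = 100 + 10 * y + x
    return frame


prev_frame = make_frame(left=3, top=4)
cur_frame = make_frame(left=5, top=4)         # the square moved 2 pixels right
vector, cost = best_motion_vector(prev_frame, cur_frame, bx=5, by=4)
print(vector, cost)                           # (-2, 0) 0
```

The moved block is described by a vector and a zero residual, and an unchanged background block such as the one at (0, 0) gets vector (0, 0) with difference 0.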
6.4 Standardization of Algorithms
- CCITT Rec. T.4 (1980) is Group 3 FAX
- ISO and IEC have a joint committee, JTC1, whose groups include the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG)
- While DVI used a hardware standard, JPEG and MPEG are bitstream standards
6.5 JPEG
6.5.1 JPEG - Objectives
- Objectives: state-of-the-art, parameterizable, handling any type of image, reasonable computation requirements
- 4 modes of operation: sequential (scan order), progressive (coarse to fine passes), lossless, hierarchical (multiple resolutions)
- Baseline sequential is all that is supported by many implementations.
6.5.2 JPEG - Architectures
- Lossless done as predictor and statistical encoder.
- Hierarchical done at the start with a hierarchical control for filtering and subsampling.
- Progressive has image buffer before statistical encoding, to read out different portions of coefficients.
6.5.3 JPEG - DCT Encoding and Quantization
- Quantized 8 bits/entry, using table for that type of image.
- Zig-zag ordering occurs before statistical coding.
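Zig-zag ordering walks the 8x8 coefficient block along its anti-diagonals, so low-frequency coefficients come first and the mostly zero high-frequency ones cluster at the end. A minimal sketch, with illustrative function names:

```python
def zigzag_order(n=8):
    """Return (row, col) index pairs of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):                # s = row + col picks one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)             # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block of coefficients into zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order()
print(order[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

After quantization the tail of the scanned sequence is typically a long run of zeros, which the statistical coder can represent very compactly.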
6.5.4 JPEG - Statistical Coding
- Baseline specifies Huffman coding.
- Arithmetic coding takes more computation, less space, no table.
6.5.5 JPEG - Predictive Lossless Coding
- 2:1 compression
- 7 different kinds of prediction allowed regarding how to use nearby pixels to predict the next.
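The seven predictors combine the left neighbor (a), the neighbor above (b), and the neighbor above-left (c). The sketch below lists them as they are commonly given for lossless JPEG; the function name is illustrative, and the integer halving is written as a right shift.

```python
def predict(mode, a, b, c):
    """Predict a pixel from a = left, b = above, c = above-left (modes 1 to 7)."""
    predictors = {
        1: a,
        2: b,
        3: c,
        4: a + b - c,
        5: a + ((b - c) >> 1),
        6: b + ((a - c) >> 1),
        7: (a + b) >> 1,
    }
    return predictors[mode]


a, b, c = 100, 104, 98                # neighboring pixel values
actual = 105                          # the pixel being coded
predictions = [predict(m, a, b, c) for m in range(1, 8)]
print(predictions)                    # [100, 104, 98, 106, 103, 105, 102]
print(actual - predict(4, a, b, c))   # -1, the residual that gets entropy coded
```

Only the residual is passed to the statistical encoder; since the decoder can form the same prediction from already-decoded neighbors, the reconstruction is exact.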
6.5.6 JPEG - Performance
- 1.5-2 bpp: like original
- 0.75-1.5 bpp: excellent
- 0.5-0.75 bpp: good to very good
- 0.25-0.5 bpp: moderate to good
6.6 ITU-T Recommendation H.261 (p*64)
- Videoconferencing standard approved in 1990.
- p in [1,30] for multiples of 64 kbps
6.6.1 Objectives
- For 525 or 625 line TVs
- 40 Kbps to 2Mbps
- Support bidirectional/unidirectional, error correction, switched multipoint
6.6.2 CIF - Common Intermediate Format
- Y, CB, CR at 8 bits each; 2:1 subsampling in each direction; 30 fps
- CIF is 352x288
- QCIF is 176x144
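A rough calculation shows how much compression these formats demand. The sketch assumes QCIF is 176x144 (half of CIF in each dimension), 8 bits per sample, chrominance subsampled 2:1 both ways, 30 fps, and 64 kbps = 64,000 bits/s; the function names are illustrative.

```python
def raw_bits_per_second(width, height, fps=30, bits=8):
    """Uncompressed rate for Y plus CB and CR subsampled 2:1 in each direction."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)   # CB and CR planes
    return (luma + chroma) * bits * fps


def h261_rate(p):
    """Channel rate in bits/s for p*64 kbps, with p in [1, 30]."""
    if not 1 <= p <= 30:
        raise ValueError("p must be between 1 and 30")
    return p * 64_000


cif = raw_bits_per_second(352, 288)
qcif = raw_bits_per_second(176, 144)
print(cif, qcif)                       # 36495360 9123840
print(round(cif / h261_rate(30), 1))   # 19.0
print(round(qcif / h261_rate(1), 1))   # 142.6
```

Even at the top rate (p = 30), CIF needs roughly 19:1 compression, and QCIF on a single 64 kbps channel needs well over 100:1, which is why frame rate and block dropping are also used to control the data rate.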
6.6.3 Coding Algorithm
- 16x16 macroblocks containing 4 luminance and 2 color-difference blocks 8x8
- Drop blocks when need to reduce data rates.
- INTRA mode uses DCT; INTER mode uses prediction from previous picture.
6.7 MPEG
- Motion JPEG can work with hardware supporting 30fps
- But more compression results from MPEG
6.7.1 Objectives
- Technical: for 1-1.5Mbps, with chipsets supporting real-time decompression and later compression
- User: random access, VCR-like operations, playable in windows, editable
- Performance: AV sync maintained, error-correction, decompression delay controllable
6.7.2 Architecture
- I, P, B frames
- Transmission order varies for forward or reverse playback
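Because a B frame is interpolated from a reference frame on each side, the later reference must reach the decoder before the B frames that depend on it. The sketch below shows the reordering for forward playback only; the frame labels and function name are illustrative, and the handling of trailing B frames is a simplification.

```python
def transmission_order(display):
    """Reorder frames so each B frame follows both reference frames it depends on."""
    out = []
    pending_b = []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)   # hold until the next reference frame is sent
        else:                         # I or P: a reference frame
            out.append(frame)
            out.extend(pending_b)
            pending_b = []
    out.extend(pending_b)             # trailing B frames with no later reference
    return out


display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7"]
sent = transmission_order(display)
print(sent)  # ['I1', 'P4', 'B2', 'B3', 'P7', 'B5', 'B6']
```

The decoder buffers the reference frames and restores display order on output, which is one source of the decompression delay mentioned in the objectives.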
6.7.3 Bitstream Syntax (skim)
6.7.4 Performance
(CD-ROM 1.2Mbps, 352x240, 30fps, similar to VHS quality)
6.7.5 MPEG-2 and -4
- MPEG-2 for 2-15Mbps, can support HDTV, scalable down to MPEG-1 and H.261
- MPEG-2 audio supports many channels and also allows lower sample rates
- MPEG-4 will use more expensive algorithms for higher compression, mobile multimedia
6.8 DVI
- Special hardware, co-processors, first shown in 1987
- For x86, then Mac, then Indeo software
6.8.1 DVI Motion Video Compression
6.8.2 DVI Production-Level Compression
- Asymmetric
- Digitize, filter, chrominance subsample, then 90:1 processing time on 64 node parallel processor
6.8.2a PLV Performance
- 30fps gives visual averaging that hides some artifacts
- 15 KB for a reference frame, 5120 bytes per frame on average
- Multidimensional tradeoff: video frame rate, decompression processing time, compressed bytes/frame, image cropping
6.8.3 DVI Real-Time Compression
- Symmetric
- Uses DVI boards, can take more space/frame since saved on fast hard drive
- SMPTE time codes allow replacement of RTV with PLV for final version
Copyright 1996 Edward A. Fox