Preface xvii
About the Author xix
About the Notation xxi
Ⅰ REPRESENTATION OF DIGITAL VIDEO 1
1 BASICS OF VIDEO 1
1.1 Analog Video 1
1.1.1 Analog Video Signal 2
1.1.2 Analog Video Standards 4
1.1.3 Analog Video Equipment 8
1.2 Digital Video 9
1.2.1 Digital Video Signal 9
1.2.2 Digital Video Standards 11
1.2.3 Why Digital Video? 14
1.3 Digital Video Processing 16
2 TIME-VARYING IMAGE FORMATION MODELS 19
2.1 Three-Dimensional Motion Models 20
2.1.1 Rigid Motion in the Cartesian Coordinates 20
2.1.2 Rigid Motion in the Homogeneous Coordinates 26
2.1.3 Deformable Motion 27
2.2 Geometric Image Formation 28
2.2.1 Perspective Projection 28
2.2.2 Orthographic Projection 30
2.3 Photometric Image Formation 32
2.3.1 Lambertian Reflectance Model 32
2.3.2 Photometric Effects of 3-D Motion 33
2.4 Observation Noise 33
2.5 Exercises 34
3 SPATIO-TEMPORAL SAMPLING 36
3.1 Sampling for Analog and Digital Video 37
3.1.1 Sampling Structures for Analog Video 37
3.1.2 Sampling Structures for Digital Video 38
3.2 Two-Dimensional Rectangular Sampling 40
3.2.1 2-D Fourier Transform Relations 41
3.2.2 Spectrum of the Sampled Signal 42
3.3 Two-Dimensional Periodic Sampling 43
3.3.1 Sampling Geometry 44
3.3.2 2-D Fourier Transform Relations in Vector Form 44
3.3.3 Spectrum of the Sampled Signal 46
3.4 Sampling on 3-D Structures 46
3.4.1 Sampling on a Lattice 47
3.4.2 Fourier Transform on a Lattice 47
3.4.3 Spectrum of Signals Sampled on a Lattice 49
3.4.4 Other Sampling Structures 51
3.5 Reconstruction from Samples 53
3.5.1 Reconstruction from Rectangular Samples 53
3.5.2 Reconstruction from Samples on a Lattice 55
3.6 Exercises 56
4 SAMPLING STRUCTURE CONVERSION 57
4.1 Sampling Rate Change for 1-D Signals 58
4.1.1 Interpolation of 1-D Signals 58
4.1.2 Decimation of 1-D Signals 62
4.1.3 Sampling Rate Change by a Rational Factor 64
4.2 Sampling Lattice Conversion 66
4.3 Exercises 70
Ⅱ TWO-DIMENSIONAL MOTION ESTIMATION 72
5 OPTICAL FLOW METHODS 72
5.1 2-D Motion vs. Apparent Motion 72
5.1.1 2-D Motion 73
5.1.2 Correspondence and Optical Flow 74
5.2 2-D Motion Estimation 76
5.2.1 The Occlusion Problem 78
5.2.2 The Aperture Problem 78
5.2.3 Two-Dimensional Motion Field Models 79
5.3 Methods Using the Optical Flow Equation 81
5.3.1 The Optical Flow Equation 81
5.3.2 Second-Order Differential Methods 82
5.3.3 Block Motion Model 83
5.3.4 Horn and Schunck Method 84
5.3.5 Estimation of the Gradients 85
5.3.6 Adaptive Methods 86
5.4 Examples 88
5.5 Exercises 93
6 BLOCK-BASED METHODS 95
6.1 Block-Motion Models 95
6.1.1 Translational Block Motion 96
6.1.2 Generalized/Deformable Block Motion 97
6.2 Phase-Correlation Method 99
6.2.1 The Phase-Correlation Function 99
6.2.2 Implementation Issues 100
6.3 Block-Matching Method 101
6.3.1 Matching Criteria 102
6.3.2 Search Procedures 104
6.4 Hierarchical Motion Estimation 106
6.5 Generalized Block-Motion Estimation 109
6.5.1 Postprocessing for Improved Motion Compensation 109
6.5.2 Deformable Block Matching 109
6.6 Examples 112
6.7 Exercises 115
7 PEL-RECURSIVE METHODS 117
7.1 Displaced Frame Difference 118
7.2 Gradient-Based Optimization 119
7.2.1 Steepest-Descent Method 120
7.2.2 Newton-Raphson Method 120
7.2.3 Local vs. Global Minima 121
7.3 Steepest-Descent-Based Algorithms 121
7.3.1 Netravali-Robbins Algorithm 122
7.3.2 Walker-Rao Algorithm 123
7.3.3 Extension to the Block Motion Model 124
7.4 Wiener-Estimation-Based Algorithms 125
7.5 Examples 127
7.6 Exercises 129
8 BAYESIAN METHODS 130
8.1 Optimization Methods 130
8.1.1 Simulated Annealing 131
8.1.2 Iterated Conditional Modes 134
8.1.3 Mean Field Annealing 135
8.1.4 Highest Confidence First 135
8.2 Basics of MAP Motion Estimation 136
8.2.1 The Likelihood Model 137
8.2.2 The Prior Model 137
8.3 MAP Motion Estimation Algorithms 139
8.3.1 Formulation with Discontinuity Models 139
8.3.2 Estimation with Local Outlier Rejection 146
8.3.3 Estimation with Region Labeling 147
8.4 Examples 148
8.5 Exercises 150
Ⅲ THREE-DIMENSIONAL MOTION ESTIMATION AND SEGMENTATION 152
9 METHODS USING POINT CORRESPONDENCES 152
9.1 Modeling the Projected Displacement Field 153
9.1.1 Orthographic Displacement Field Model 153
9.1.2 Perspective Displacement Field Model 154
9.2 Methods Based on the Orthographic Model 155
9.2.1 Two-Step Iteration Method from Two Views 155
9.2.2 An Improved Iterative Method 157
9.3 Methods Based on the Perspective Model 158
9.3.1 The Epipolar Constraint and Essential Parameters 158
9.3.2 Estimation of the Essential Parameters 159
9.3.3 Decomposition of the E-Matrix 161
9.3.4 Algorithm 164
9.4 The Case of 3-D Planar Surfaces 165
9.4.1 The Pure Parameters 165
9.4.2 Estimation of the Pure Parameters 166
9.4.3 Estimation of the Motion and Structure Parameters 166
9.5 Examples 168
9.5.1 Numerical Simulations 168
9.5.2 Experiments with Two Frames of Miss America 173
9.6 Exercises 175
10 OPTICAL FLOW AND DIRECT METHODS 177
10.1 Modeling the Projected Velocity Field 177
10.1.1 Orthographic Velocity Field Model 178
10.1.2 Perspective Velocity Field Model 178
10.1.3 Perspective Velocity vs. Displacement Models 179
10.2 Focus of Expansion 180
10.3 Algebraic Methods Using Optical Flow 181
10.3.1 Uniqueness of the Solution 182
10.3.2 Affine Flow 182
10.3.3 Quadratic Flow 183
10.3.4 Arbitrary Flow 184
10.4 Optimization Methods Using Optical Flow 186
10.5 Direct Methods 187
10.5.1 Extension of Optical Flow-Based Methods 187
10.5.2 Tsai-Huang Method 188
10.6 Examples 190
10.6.1 Numerical Simulations 191
10.6.2 Experiments with Two Frames of Miss America 194
10.7 Exercises 196
11 MOTION SEGMENTATION 198
11.1 Direct Methods 200
11.1.1 Thresholding for Change Detection 200
11.1.2 An Algorithm Using Mapping Parameters 201
11.1.3 Estimation of Model Parameters 203
11.2 Optical Flow Segmentation 204
11.2.1 Modified Hough Transform Method 205
11.2.2 Segmentation for Layered Video Representation 206
11.2.3 Bayesian Segmentation 207
11.3 Simultaneous Estimation and Segmentation 209
11.3.1 Motion Field Model 210
11.3.2 Problem Formulation 210
11.3.3 The Algorithm 212
11.3.4 Relationship to Other Algorithms 213
11.4 Examples 214
11.5 Exercises 217
12 STEREO AND MOTION TRACKING 219
12.1 Motion and Structure from Stereo 219
12.1.1 Still-Frame Stereo Imaging 220
12.1.2 3-D Feature Matching for Motion Estimation 222
12.1.3 Stereo-Motion Fusion 224
12.1.4 Extension to Multiple Motion 227
12.2 Motion Tracking 229
12.2.1 Basic Principles 229
12.2.2 2-D Motion Tracking 232
12.2.3 3-D Rigid Motion Tracking 235
12.3 Examples 239
12.4 Exercises 241
Ⅳ VIDEO FILTERING 245
13 MOTION COMPENSATED FILTERING 245
13.1 Spatio-Temporal Fourier Spectrum 246
13.1.1 Global Motion with Constant Velocity 247
13.1.2 Global Motion with Acceleration 249
13.2 Sub-Nyquist Spatio-Temporal Sampling 250
13.2.1 Sampling in the Temporal Direction Only 250
13.2.2 Sampling on a Spatio-Temporal Lattice 251
13.2.3 Critical Velocities 252
13.3 Filtering Along Motion Trajectories 254
13.3.1 Arbitrary Motion Trajectories 255
13.3.2 Global Motion with Constant Velocity 256
13.3.3 Accelerated Motion 256
13.4 Applications 258
13.4.1 Motion-Compensated Noise Filtering 258
13.4.2 Motion-Compensated Reconstruction Filtering 258
13.5 Exercises 260
14 NOISE FILTERING 262
14.1 Intraframe Filtering 263
14.1.1 LMMSE Filtering 264
14.1.2 Adaptive (Local) LMMSE Filtering 267
14.1.3 Directional Filtering 269
14.1.4 Median and Weighted Median Filtering 270
14.2 Motion-Adaptive Filtering 270
14.2.1 Direct Filtering 271
14.2.2 Motion-Detection Based Filtering 272
14.3 Motion-Compensated Filtering 272
14.3.1 Spatio-Temporal Adaptive LMMSE Filtering 274
14.3.2 Adaptive Weighted Averaging Filter 275
14.4 Examples 277
14.5 Exercises 277
15 RESTORATION 283
15.1 Modeling 283
15.1.1 Shift-Invariant Spatial Blurring 284
15.1.2 Shift-Varying Spatial Blurring 285
15.2 Intraframe Shift-Invariant Restoration 286
15.2.1 Pseudo Inverse Filtering 286
15.2.2 Constrained Least Squares and Wiener Filtering 287
15.3 Intraframe Shift-Varying Restoration 289
15.3.1 Overview of the POCS Method 290
15.3.2 Restoration Using POCS 291
15.4 Multiframe Restoration 292
15.4.1 Cross-Correlated Multiframe Filter 294
15.4.2 Motion-Compensated Multiframe Filter 295
15.5 Examples 295
15.6 Exercises 296
16 STANDARDS CONVERSION 302
16.1 Down-Conversion 304
16.1.1 Down-Conversion with Anti-Alias Filtering 305
16.1.2 Down-Conversion without Anti-Alias Filtering 305
16.2 Practical Up-Conversion Methods 308
16.2.1 Intraframe Filtering 309
16.2.2 Motion-Adaptive Filtering 314
16.3 Motion-Compensated Up-Conversion 317
16.3.1 Basic Principles 317
16.3.2 Global-Motion-Compensated De-interlacing 322
16.4 Examples 323
16.5 Exercises 329
17 SUPERRESOLUTION 331
17.1 Modeling 332
17.1.1 Continuous-Discrete Model 332
17.1.2 Discrete-Discrete Model 335
17.1.3 Problem Interrelations 336
17.2 Interpolation-Restoration Methods 336
17.2.1 Intraframe Methods 337
17.2.2 Multiframe Methods 337
17.3 A Frequency Domain Method 338
17.4 A Unifying POCS Method 341
17.5 Examples 343
17.6 Exercises 346
Ⅴ STILL IMAGE COMPRESSION 348
18 LOSSLESS COMPRESSION 348
18.1 Basics of Image Compression 349
18.1.1 Elements of an Image Compression System 349
18.1.2 Information Theoretic Concepts 350
18.2 Symbol Coding 353
18.2.1 Fixed-Length Coding 353
18.2.2 Huffman Coding 354
18.2.3 Arithmetic Coding 357
18.3 Lossless Compression Methods 360
18.3.1 Lossless Predictive Coding 360
18.3.2 Run-Length Coding of Bit-Planes 363
18.3.3 Ziv-Lempel Coding 364
18.4 Exercises 366
19 DPCM AND TRANSFORM CODING 368
19.1 Quantization 368
19.1.1 Nonuniform Quantization 369
19.1.2 Uniform Quantization 370
19.2 Differential Pulse Code Modulation 373
19.2.1 Optimal Prediction 374
19.2.2 Quantization of the Prediction Error 375
19.2.3 Adaptive Quantization 376
19.2.4 Delta Modulation 377
19.3 Transform Coding 378
19.3.1 Discrete Cosine Transform 380
19.3.2 Quantization/Bit Allocation 381
19.3.3 Coding 383
19.3.4 Blocking Artifacts in Transform Coding 385
19.4 Exercises 385
20 STILL IMAGE COMPRESSION STANDARDS 388
20.1 Bilevel Image Compression Standards 389
20.1.1 One-Dimensional RLC 389
20.1.2 Two-Dimensional RLC 391
20.1.3 The JBIG Standard 393
20.2 The JPEG Standard 394
20.2.1 Baseline Algorithm 395
20.2.2 JPEG Progressive 400
20.2.3 JPEG Lossless 401
20.2.4 JPEG Hierarchical 401
20.2.5 Implementations of JPEG 402
20.3 Exercises 403
21 VECTOR QUANTIZATION, SUBBAND CODING AND OTHER METHODS 404
21.1 Vector Quantization 404
21.1.1 Structure of a Vector Quantizer 405
21.1.2 VQ Codebook Design 408
21.1.3 Practical VQ Implementations 408
21.2 Fractal Compression 409
21.3 Subband Coding 411
21.3.1 Subband Decomposition 411
21.3.2 Coding of the Subbands 414
21.3.3 Relationship to Transform Coding 414
21.3.4 Relationship to Wavelet Transform Coding 415
21.4 Second-Generation Coding Methods 415
21.5 Exercises 416
Ⅵ VIDEO COMPRESSION 419
22 INTERFRAME COMPRESSION METHODS 419
22.1 Three-Dimensional Waveform Coding 420
22.1.1 3-D Transform Coding 420
22.1.2 3-D Subband Coding 421
22.2 Motion-Compensated Waveform Coding 424
22.2.1 MC Transform Coding 424
22.2.2 MC Vector Quantization 425
22.2.3 MC Subband Coding 426
22.3 Model-Based Coding 426
22.3.1 Object-Based Coding 427
22.3.2 Knowledge-Based and Semantic Coding 428
22.4 Exercises 429
23 VIDEO COMPRESSION STANDARDS 432
23.1 The H.261 Standard 432
23.1.1 Input Image Formats 433
23.1.2 Video Multiplex 434
23.1.3 Video Compression Algorithm 435
23.2 The MPEG-1 Standard 440
23.2.1 Features 440
23.2.2 Input Video Format 441
23.2.3 Data Structure and Compression Modes 441
23.2.4 Intraframe Compression Mode 443
23.2.5 Interframe Compression Modes 444
23.2.6 MPEG-1 Encoder and Decoder 447
23.3 The MPEG-2 Standard 448
23.3.1 MPEG-2 Macroblocks 449
23.3.2 Coding Interlaced Video 450
23.3.3 Scalable Extensions 452
23.3.4 Other Improvements 453
23.3.5 Overview of Profiles and Levels 454
23.4 Software and Hardware Implementations 455
24 MODEL-BASED CODING 457
24.1 General Object-Based Methods 457
24.1.1 2-D/3-D Rigid Objects with 3-D Motion 458
24.1.2 2-D Flexible Objects with 2-D Motion 460
24.1.3 Affine Transformations with Triangular Meshes 462
24.2 Knowledge-Based and Semantic Methods 464
24.2.1 General Principles 465
24.2.2 MBASIC Algorithm 470
24.2.3 Estimation Using a Flexible Wireframe Model 471
24.3 Examples 478
25 DIGITAL VIDEO SYSTEMS 486
25.1 Videoconferencing 487
25.2 Interactive Video and Multimedia 488
25.3 Digital Television 489
25.3.1 Digital Studio Standards 490
25.3.2 Hybrid Advanced TV Systems 491
25.3.3 All-Digital TV 493
25.4 Low-Bitrate Video and Videophone 497
25.4.1 The ITU Recommendation H.263 498
25.4.2 The ISO MPEG-4 Requirements 499
APPENDICES 502
A MARKOV AND GIBBS RANDOM FIELDS 502
A.1 Definitions 502
A.1.1 Markov Random Fields 503
A.1.2 Gibbs Random Fields 504
A.2 Equivalence of MRF and GRF 505
A.3 Local Conditional Probabilities 506
B BASICS OF SEGMENTATION 508
B.1 Thresholding 508
B.1.1 Finding the Optimum Threshold(s) 509
B.2 Clustering 510
B.3 Bayesian Methods 512
B.3.1 The MAP Method 513
B.3.2 The Adaptive MAP Method 515
B.3.3 Vector Field Segmentation 516
C KALMAN FILTERING 518
C.1 Linear State-Space Model 518
C.2 Extended Kalman Filtering 520