1 Introduction 1
1.1 Summary 8
1.2 Exercises 8
1.3 References 9
2 The digitized image and its properties 10
2.1 Basic concepts 10
2.1.1 Image functions 10
2.1.2 The Dirac distribution and convolution 13
2.1.3 The Fourier transform 13
2.1.4 Images as a stochastic process 15
2.1.5 Images as linear systems 17
2.2 Image digitization 18
2.2.1 Sampling 18
2.2.2 Quantization 22
2.2.3 Color images 23
2.3 Digital image properties 27
2.3.1 Metric and topological properties of digital images 27
2.3.2 Histograms 32
2.3.3 Visual perception of the image 33
2.3.4 Image quality 35
2.3.5 Noise in images 35
2.4 Summary 37
2.5 Exercises 38
2.6 References 40
3 Data structures for image analysis 42
3.1 Levels of image data representation 42
3.2 Traditional image data structures 43
3.2.1 Matrices 43
3.2.2 Chains 45
3.2.3 Topological data structures 47
3.2.4 Relational structures 48
3.3 Hierarchical data structures 49
3.3.1 Pyramids 49
3.3.2 Quadtrees 51
3.3.3 Other pyramidical structures 52
3.4 Summary 53
3.5 Exercises 54
3.6 References 55
4 Image pre-processing 57
4.1 Pixel brightness transformations 58
4.1.1 Position-dependent brightness correction 58
4.1.2 Gray-scale transformation 59
4.2 Geometric transformations 62
4.2.1 Pixel co-ordinate transformations 63
4.2.2 Brightness interpolation 65
4.3 Local pre-processing 68
4.3.1 Image smoothing 69
4.3.2 Edge detectors 77
4.3.3 Zero-crossings of the second derivative 83
4.3.4 Scale in image processing 88
4.3.5 Canny edge detection 90
4.3.6 Parametric edge models 93
4.3.7 Edges in multi-spectral images 94
4.3.8 Other local pre-processing operators 94
4.3.9 Adaptive neighborhood pre-processing 98
4.4 Image restoration 102
4.4.1 Degradations that are easy to restore 105
4.4.2 Inverse filtration 106
4.4.3 Wiener filtration 106
4.5 Summary 108
4.6 Exercises 111
4.7 References 118
5 Segmentation 123
5.1 Thresholding 124
5.1.1 Threshold detection methods 127
5.1.2 Optimal thresholding 128
5.1.3 Multi-spectral thresholding 131
5.1.4 Thresholding in hierarchical data structures 133
5.2 Edge-based segmentation 134
5.2.1 Edge image thresholding 135
5.2.2 Edge relaxation 137
5.2.3 Border tracing 142
5.2.4 Border detection as graph searching 148
5.2.5 Border detection as dynamic programming 158
5.2.6 Hough transforms 163
5.2.7 Border detection using border location information 173
5.2.8 Region construction from borders 174
5.3 Region-based segmentation 176
5.3.1 Region merging 177
5.3.2 Region splitting 181
5.3.3 Splitting and merging 181
5.3.4 Watershed segmentation 186
5.3.5 Region growing post-processing 188
5.4 Matching 190
5.4.1 Matching criteria 191
5.4.2 Control strategies of matching 193
5.5 Advanced optimal border and surface detection approaches 194
5.5.1 Simultaneous detection of border pairs 194
5.5.2 Surface detection 199
5.6 Summary 205
5.7 Exercises 210
5.8 References 216
6 Shape representation and description 228
6.1 Region identification 232
6.2 Contour-based shape representation and description 235
6.2.1 Chain codes 236
6.2.2 Simple geometric border representation 237
6.2.3 Fourier transforms of boundaries 240
6.2.4 Boundary description using segment sequences 242
6.2.5 B-spline representation 245
6.2.6 Other contour-based shape description approaches 248
6.2.7 Shape invariants 249
6.3 Region-based shape representation and description 254
6.3.1 Simple scalar region descriptors 254
6.3.2 Moments 259
6.3.3 Convex hull 262
6.3.4 Graph representation based on region skeleton 267
6.3.5 Region decomposition 271
6.3.6 Region neighborhood graphs 272
6.4 Shape classes 273
6.5 Summary 274
6.6 Exercises 276
6.7 References 279
7 Object recognition 290
7.1 Knowledge representation 291
7.2 Statistical pattern recognition 297
7.2.1 Classification principles 298
7.2.2 Classifier setting 300
7.2.3 Classifier learning 303
7.2.4 Cluster analysis 307
7.3 Neural nets 308
7.3.1 Feed-forward networks 310
7.3.2 Unsupervised learning 312
7.3.3 Hopfield neural nets 313
7.4 Syntactic pattern recognition 315
7.4.1 Grammars and languages 317
7.4.2 Syntactic analysis, syntactic classifier 319
7.4.3 Syntactic classifier learning, grammar inference 321
7.5 Recognition as graph matching 323
7.5.1 Isomorphism of graphs and sub-graphs 324
7.5.2 Similarity of graphs 328
7.6 Optimization techniques in recognition 328
7.6.1 Genetic algorithms 330
7.6.2 Simulated annealing 333
7.7 Fuzzy systems 336
7.7.1 Fuzzy sets and fuzzy membership functions 336
7.7.2 Fuzzy set operators 338
7.7.3 Fuzzy reasoning 339
7.7.4 Fuzzy system design and training 343
7.8 Summary 344
7.9 Exercises 347
7.10 References 354
8 Image understanding 362
8.1 Image understanding control strategies 364
8.1.1 Parallel and serial processing control 364
8.1.2 Hierarchical control 364
8.1.3 Bottom-up control strategies 365
8.1.4 Model-based control strategies 366
8.1.5 Combined control strategies 367
8.1.6 Non-hierarchical control 371
8.2 Active contour models-snakes 374
8.3 Point distribution models 380
8.4 Pattern recognition methods in image understanding 390
8.4.1 Contextual image classification 392
8.5 Scene labeling and constraint propagation 397
8.5.1 Discrete relaxation 398
8.5.2 Probabilistic relaxation 400
8.5.3 Searching interpretation trees 404
8.6 Semantic image segmentation and understanding 404
8.6.1 Semantic region growing 406
8.6.2 Genetic image interpretation 408
8.7 Hidden Markov models 417
8.8 Summary 423
8.9 Exercises 426
8.10 References 428
9 3D vision, geometry, and radiometry 441
9.1 3D vision tasks 442
9.1.1 Marr's theory 444
9.1.2 Other vision paradigms: Active and purposive vision 446
9.2 Geometry for 3D vision 448
9.2.1 Basics of projective geometry 448
9.2.2 The single perspective camera 449
9.2.3 An overview of single camera calibration 453
9.2.4 Calibration of one camera from a known scene 455
9.2.5 Two cameras, stereopsis 457
9.2.6 The geometry of two cameras; the fundamental matrix 460
9.2.7 Relative motion of the camera; the essential matrix 462
9.2.8 Fundamental matrix estimation from image point correspondences 464
9.2.9 Applications of epipolar geometry in vision 466
9.2.10 Three and more cameras 471
9.2.11 Stereo correspondence algorithms 476
9.2.12 Active acquisition of range images 483
9.3 Radiometry and 3D vision 486
9.3.1 Radiometric considerations in determining gray-level 486
9.3.2 Surface reflectance 490
9.3.3 Shape from shading 494
9.3.4 Photometric stereo 498
9.4 Summary 499
9.5 Exercises 501
9.6 References 502
10 Use of 3D vision 508
10.1 Shape from X 508
10.1.1 Shape from motion 508
10.1.2 Shape from texture 515
10.1.3 Other shape from X techniques 517
10.2 Full 3D objects 519
10.2.1 3D objects, models, and related issues 519
10.2.2 Line labeling 521
10.2.3 Volumetric representation, direct measurements 523
10.2.4 Volumetric modeling strategies 525
10.2.5 Surface modeling strategies 527
10.2.6 Registering surface patches and their fusion to get a full 3D model 529
10.3 3D model-based vision 535
10.3.1 General considerations 535
10.3.2 Goad's algorithm 537
10.3.3 Model-based recognition of curved objects from intensity images 541
10.3.4 Model-based recognition based on range images 543
10.4 2D view-based representations of a 3D scene 544
10.4.1 Viewing space 544
10.4.2 Multi-view representations and aspect graphs 544
10.4.3 Geons as a 2D view-based structural representation 545
10.4.4 Visualizing 3D real-world scenes using stored collections of 2D views 546
10.5 Summary 551
10.6 Exercises 552
10.7 References 553
11 Mathematical morphology 559
11.1 Basic morphological concepts 559
11.2 Four morphological principles 561
11.3 Binary dilation and erosion 563
11.3.1 Dilation 563
11.3.2 Erosion 565
11.3.3 Hit-or-miss transformation 568
11.3.4 Opening and closing 568
11.4 Gray-scale dilation and erosion 569
11.4.1 Top surface, umbra, and gray-scale dilation and erosion 570
11.4.2 Umbra homeomorphism theorem, properties of erosion and dilation, opening and closing 573
11.4.3 Top hat transformation 574
11.5 Skeletons and object marking 576
11.5.1 Homotopic transformations 576
11.5.2 Skeleton, maximal ball 576
11.5.3 Thinning, thickening, and homotopic skeleton 578
11.5.4 Quench function, ultimate erosion 581
11.5.5 Ultimate erosion and distance functions 584
11.5.6 Geodesic transformations 585
11.5.7 Morphological reconstruction 586
11.6 Granulometry 589
11.7 Morphological segmentation and watersheds 590
11.7.1 Particle segmentation, marking, and watersheds 590
11.7.2 Binary morphological segmentation 592
11.7.3 Gray-scale segmentation, watersheds 594
11.8 Summary 595
11.9 Exercises 597
11.10 References 598
12 Linear discrete image transforms 600
12.1 Basic theory 600
12.2 Fourier transform 602
12.3 Hadamard transform 604
12.4 Discrete cosine transform 605
12.5 Wavelets 606
12.6 Other orthogonal image transforms 608
12.7 Applications of discrete image transforms 609
12.8 Summary 613
12.9 Exercises 617
12.10 References 619
13 Image data compression 621
13.1 Image data properties 622
13.2 Discrete image transforms in image data compression 623
13.3 Predictive compression methods 624
13.4 Vector quantization 629
13.5 Hierarchical and progressive compression methods 630
13.6 Comparison of compression methods 631
13.7 Other techniques 632
13.8 Coding 633
13.9 JPEG and MPEG image compression 634
13.9.1 JPEG-still image compression 634
13.9.2 MPEG-full-motion video compression 636
13.10 Summary 637
13.11 Exercises 640
13.12 References 641
14 Texture 646
14.1 Statistical texture description 649
14.1.1 Methods based on spatial frequencies 649
14.1.2 Co-occurrence matrices 651
14.1.3 Edge frequency 653
14.1.4 Primitive length (run length) 655
14.1.5 Laws texture energy measures 656
14.1.6 Fractal texture description 657
14.1.7 Other statistical methods of texture description 659
14.2 Syntactic texture description methods 660
14.2.1 Shape chain grammars 661
14.2.2 Graph grammars 663
14.2.3 Primitive grouping in hierarchical textures 664
14.3 Hybrid texture description methods 666
14.4 Texture recognition method applications 667
14.5 Summary 668
14.6 Exercises 670
14.7 References 672
15 Motion analysis 679
15.1 Differential motion analysis methods 682
15.2 Optical flow 685
15.2.1 Optical flow computation 686
15.2.2 Global and local optical flow estimation 689
15.2.3 Optical flow computation approaches 690
15.2.4 Optical flow in motion analysis 693
15.3 Analysis based on correspondence of interest points 696
15.3.1 Detection of interest points 696
15.3.2 Correspondence of interest points 697
15.3.3 Object tracking 700
15.4 Kalman filters 708
15.4.1 Example 709
15.5 Summary 710
15.6 Exercises 712
15.7 References 714
16 Case studies 722
16.1 An optical music recognition system 722
16.2 Automated image analysis in cardiology 727
16.2.1 Robust analysis of coronary angiograms 730
16.2.2 Knowledge-based analysis of intra-vascular ultrasound 733
16.3 Automated identification of airway trees 738
16.4 Passive surveillance 744
16.5 References 750
Index 755