1 Introduction 1
Mathematical Formulation 2
Example: A Transportation Problem 4
Continuous versus Discrete Optimization 4
Constrained and Unconstrained Optimization 6
Global and Local Optimization 6
Stochastic and Deterministic Optimization 7
Optimization Algorithms 7
Convexity 8
Notes and References 9
2 Fundamentals of Unconstrained Optimization 10
2.1 What Is a Solution? 13
Recognizing a Local Minimum 15
Nonsmooth Problems 18
2.2 Overview of Algorithms 19
Two Strategies: Line Search and Trust Region 19
Search Directions for Line Search Methods 21
Models for Trust-Region Methods 26
Scaling 27
Rates of Convergence 28
R-Rates of Convergence 29
Notes and References 30
Exercises 30
3 Line Search Methods 34
3.1 Step Length 36
The Wolfe Conditions 37
The Goldstein Conditions 41
Sufficient Decrease and Backtracking 41
3.2 Convergence of Line Search Methods 43
3.3 Rate of Convergence 46
Convergence Rate of Steepest Descent 47
Quasi-Newton Methods 49
Newton’s Method 51
Coordinate Descent Methods 53
3.4 Step-Length Selection Algorithms 55
Interpolation 56
The Initial Step Length 58
A Line Search Algorithm for the Wolfe Conditions 58
Notes and References 61
Exercises 62
4 Trust-Region Methods 64
Outline of the Algorithm 67
4.1 The Cauchy Point and Related Algorithms 69
The Cauchy Point 69
Improving on the Cauchy Point 70
The Dogleg Method 71
Two-Dimensional Subspace Minimization 74
Steihaug’s Approach 75
4.2 Using Nearly Exact Solutions to the Subproblem 77
Characterizing Exact Solutions 77
Calculating Nearly Exact Solutions 78
The Hard Case 82
Proof of Theorem 4.3 84
4.3 Global Convergence 87
Reduction Obtained by the Cauchy Point 87
Convergence to Stationary Points 89
Convergence of Algorithms Based on Nearly Exact Solutions 93
4.4 Other Enhancements 94
Scaling 94
Non-Euclidean Trust Regions 96
Notes and References 97
Exercises 97
5 Conjugate Gradient Methods 100
5.1 The Linear Conjugate Gradient Method 102
Conjugate Direction Methods 102
Basic Properties of the Conjugate Gradient Method 107
A Practical Form of the Conjugate Gradient Method 111
Rate of Convergence 112
Preconditioning 118
Practical Preconditioners 119
5.2 Nonlinear Conjugate Gradient Methods 120
The Fletcher-Reeves Method 120
The Polak-Ribière Method 121
Quadratic Termination and Restarts 122
Numerical Performance 124
Behavior of the Fletcher-Reeves Method 124
Global Convergence 127
Notes and References 131
Exercises 132
6 Practical Newton Methods 134
6.1 Inexact Newton Steps 136
6.2 Line Search Newton Methods 139
Line Search Newton-CG Method 139
Modified Newton’s Method 141
6.3 Hessian Modifications 142
Eigenvalue Modification 143
Adding a Multiple of the Identity 144
Modified Cholesky Factorization 145
Gershgorin Modification 150
Modified Symmetric Indefinite Factorization 151
6.4 Trust-Region Newton Methods 154
Newton-Dogleg and Subspace-Minimization Methods 154
Accurate Solution of the Trust-Region Problem 155
Trust-Region Newton-CG Method 156
Preconditioning the Newton-CG Method 157
Local Convergence of Trust-Region Newton Methods 159
Notes and References 162
Exercises 162
7 Calculating Derivatives 164
7.1 Finite-Difference Derivative Approximations 166
Approximating the Gradient 166
Approximating a Sparse Jacobian 169
Approximating the Hessian 173
Approximating a Sparse Hessian 174
7.2 Automatic Differentiation 176
An Example 177
The Forward Mode 178
The Reverse Mode 179
Vector Functions and Partial Separability 183
Calculating Jacobians of Vector Functions 184
Calculating Hessians: Forward Mode 185
Calculating Hessians: Reverse Mode 187
Current Limitations 188
Notes and References 189
Exercises 189
8 Quasi-Newton Methods 192
8.1 The BFGS Method 194
Properties of the BFGS Method 199
Implementation 200
8.2 The SR1 Method 202
Properties of SR1 Updating 205
8.3 The Broyden Class 207
Properties of the Broyden Class 209
8.4 Convergence Analysis 211
Global Convergence of the BFGS Method 211
Superlinear Convergence of BFGS 214
Convergence Analysis of the SR1 Method 218
Notes and References 219
Exercises 220
9 Large-Scale Quasi-Newton and Partially Separable Optimization 222
9.1 Limited-Memory BFGS 224
Relationship with Conjugate Gradient Methods 227
9.2 General Limited-Memory Updating 229
Compact Representation of BFGS Updating 230
SR1 Matrices 232
Unrolling the Update 232
9.3 Sparse Quasi-Newton Updates 233
9.4 Partially Separable Functions 235
A Simple Example 236
Internal Variables 237
9.5 Invariant Subspaces and Partial Separability 240
Sparsity vs. Partial Separability 242
Group Partial Separability 243
9.6 Algorithms for Partially Separable Functions 244
Exploiting Partial Separability in Newton’s Method 244
Quasi-Newton Methods for Partially Separable Functions 245
Notes and References 247
Exercises 248
10 Nonlinear Least-Squares Problems 250
10.1 Background 253
Modeling, Regression, Statistics 253
Linear Least-Squares Problems 256
10.2 Algorithms for Nonlinear Least-Squares Problems 259
The Gauss-Newton Method 259
The Levenberg-Marquardt Method 262
Implementation of the Levenberg-Marquardt Method 264
Large-Residual Problems 266
Large-Scale Problems 269
10.3 Orthogonal Distance Regression 271
Notes and References 273
Exercises 274
11 Nonlinear Equations 276
11.1 Local Algorithms 281
Newton’s Method for Nonlinear Equations 281
Inexact Newton Methods 284
Broyden’s Method 286
Tensor Methods 290
11.2 Practical Methods 292
Merit Functions 292
Line Search Methods 294
Trust-Region Methods 298
11.3 Continuation/Homotopy Methods 304
Motivation 304
Practical Continuation Methods 306
Notes and References 310
Exercises 311
12 Theory of Constrained Optimization 314
Local and Global Solutions 316
Smoothness 317
12.1 Examples 319
A Single Equality Constraint 319
A Single Inequality Constraint 321
Two Inequality Constraints 324
12.2 First-Order Optimality Conditions 327
Statement of First-Order Necessary Conditions 327
Sensitivity 330
12.3 Derivation of the First-Order Conditions 331
Feasible Sequences 332
Characterizing Limiting Directions: Constraint Qualifications 336
Introducing Lagrange Multipliers 339
Proof of Theorem 12.1 341
12.4 Second-Order Conditions 342
Second-Order Conditions and Projected Hessians 348
Convex Programs 350
12.5 Other Constraint Qualifications 351
12.6 A Geometric Viewpoint 354
Notes and References 357
Exercises 358
13 Linear Programming: The Simplex Method 362
Linear Programming 364
13.1 Optimality and Duality 366
Optimality Conditions 366
The Dual Problem 367
13.2 Geometry of the Feasible Set 370
Basic Feasible Points 370
Vertices of the Feasible Polytope 372
13.3 The Simplex Method 374
Outline of the Method 374
Finite Termination of the Simplex Method 377
A Single Step of the Method 378
13.4 Linear Algebra in the Simplex Method 379
13.5 Other (Important) Details 383
Pricing and Selection of the Entering Index 383
Starting the Simplex Method 386
Degenerate Steps and Cycling 389
13.6 Where Does the Simplex Method Fit? 391
Notes and References 392
Exercises 393
14 Linear Programming: Interior-Point Methods 394
14.1 Primal-Dual Methods 396
Outline 396
The Central Path 399
A Primal-Dual Framework 401
Path-Following Methods 402
14.2 A Practical Primal-Dual Algorithm 404
Solving the Linear Systems 408
14.3 Other Primal-Dual Algorithms and Extensions 409
Other Path-Following Methods 409
Potential-Reduction Methods 409
Extensions 410
14.4 Analysis of Algorithm 14.2 411
Notes and References 416
Exercises 417
15 Fundamentals of Algorithms for Nonlinear Constrained Optimization 420
Initial Study of a Problem 422
15.1 Categorizing Optimization Algorithms 423
15.2 Elimination of Variables 426
Simple Elimination for Linear Constraints 427
General Reduction Strategies for Linear Constraints 430
The Effect of Inequality Constraints 434
15.3 Measuring Progress: Merit Functions 434
Notes and References 437
Exercises 438
16 Quadratic Programming 440
An Example: Portfolio Optimization 442
16.1 Equality-Constrained Quadratic Programs 443
Properties of Equality-Constrained QPs 444
16.2 Solving the KKT System 447
Direct Solution of the KKT System 448
Range-Space Method 449
Null-Space Method 450
A Method Based on Conjugacy 452
16.3 Inequality-Constrained Problems 453
Optimality Conditions for Inequality-Constrained Problems 454
Degeneracy 455
16.4 Active-Set Methods for Convex QP 457
Specification of the Active-Set Method for Convex QP 461
An Example 463
Further Remarks on the Active-Set Method 465
Finite Termination of the Convex QP Algorithm 466
Updating Factorizations 467
16.5 Active-Set Methods for Indefinite QP 470
Illustration 472
Choice of Starting Point 474
Failure of the Active-Set Method 475
Detecting Indefiniteness Using the LBL^T Factorization 475
16.6 The Gradient-Projection Method 476
Cauchy Point Computation 477
Subspace Minimization 480
16.7 Interior-Point Methods 481
Extensions and Comparison with Active-Set Methods 484
16.8 Duality 484
Notes and References 485
Exercises 486
17 Penalty, Barrier, and Augmented Lagrangian Methods 490
17.1 The Quadratic Penalty Method 492
Motivation 492
Algorithmic Framework 494
Convergence of the Quadratic Penalty Function 495
17.2 The Logarithmic Barrier Method 500
Properties of Logarithmic Barrier Functions 500
Algorithms Based on the Log-Barrier Function 505
Properties of the Log-Barrier Function and Framework 17.2 507
Handling Equality Constraints 509
Relationship to Primal-Dual Methods 510
17.3 Exact Penalty Functions 512
17.4 Augmented Lagrangian Method 513
Motivation and Algorithm Framework 513
Extension to Inequality Constraints 516
Properties of the Augmented Lagrangian 518
Practical Implementation 521
17.5 Sequential Linearly Constrained Methods 523
Notes and References 525
Exercises 526
18 Sequential Quadratic Programming 528
18.1 Local SQP Method 530
SQP Framework 531
Inequality Constraints 533
IQP vs. EQP 534
18.2 Preview of Practical SQP Methods 534
18.3 Step Computation 536
Equality Constraints 536
Inequality Constraints 538
18.4 The Hessian of the Quadratic Model 539
Full Quasi-Newton Approximations 540
Hessian of Augmented Lagrangian 541
Reduced-Hessian Approximations 542
18.5 Merit Functions and Descent 544
18.6 A Line Search SQP Method 547
18.7 Reduced-Hessian SQP Methods 548
Some Properties of Reduced-Hessian Methods 549
Update Criteria for Reduced-Hessian Updating 550
Changes of Bases 551
A Practical Reduced-Hessian Method 552
18.8 Trust-Region SQP Methods 553
Approach I: Shifting the Constraints 555
Approach II: Two Elliptical Constraints 556
Approach III: Sℓ1QP (Sequential ℓ1 Quadratic Programming) 557
18.9 A Practical Trust-Region SQP Algorithm 560
18.10 Rate of Convergence 563
Convergence Rate of Reduced-Hessian Methods 565
18.11 The Maratos Effect 567
Second-Order Correction 570
Watchdog (Nonmonotone) Strategy 571
Notes and References 573
Exercises 574
A Background Material 576
A.1 Elements of Analysis, Geometry, Topology 577
Topology of the Euclidean Space R^n 577
Continuity and Limits 580
Derivatives 581
Directional Derivatives 583
Mean Value Theorem 584
Implicit Function Theorem 585
Geometry of Feasible Sets 586
Order Notation 591
Root-Finding for Scalar Equations 592
A.2 Elements of Linear Algebra 593
Vectors and Matrices 593
Norms 594
Subspaces 597
Eigenvalues, Eigenvectors, and the Singular-Value Decomposition 598
Determinant and Trace 599
Matrix Factorizations: Cholesky, LU, QR 600
Sherman-Morrison-Woodbury Formula 605
Interlacing Eigenvalue Theorem 605
Error Analysis and Floating-Point Arithmetic 606
Conditioning and Stability 608
References 611
Index 625