CHAPTER Ⅰ FEEDBACK CONTROL AND THE CALCULUS OF VARIATIONS 13
1.1 Introduction 13
1.2 Mathematical description of a physical system 13
1.3 Parenthetical 15
1.4 Hereditary influences 16
1.5 Criteria of performance 17
1.6 Terminal control 18
1.7 Control process 18
1.8 Feedback control 19
1.9 An alternate concept 20
1.10 Feedback control as a variational problem 21
1.11 The scalar variational problem 22
1.12 Discussion 24
1.13 Relative minimum versus absolute minimum 24
1.14 Nonlinear differential equations 26
1.15 Two-point boundary value problems 27
1.16 An example of multiplicity of solutions 29
1.17 Non-analytic criteria 30
1.18 Terminal control and implicit variational problems 31
1.19 Constraints 32
1.20 Linearity 34
1.21 Summing up 35
Bibliography and discussion 36
CHAPTER Ⅱ DYNAMICAL SYSTEMS AND TRANSFORMATIONS 41
2.1 Introduction 41
2.2 Functions of initial values 41
2.3 The principle of causality 42
2.4 The basic functional equation 42
2.5 Continuous version 43
2.6 The functional equations satisfied by the elementary functions 43
2.7 The matrix exponential 44
2.8 Transformations and iteration 44
2.9 Carleman's linearization 45
2.10 Functional equations and maximum range 46
2.11 Vertical motion—Ⅰ 46
2.12 Vertical motion—Ⅱ 47
2.13 Maximum altitude 47
2.14 Maximum range 48
2.15 Multistage processes and differential equations 48
Bibliography and discussion 48
CHAPTER Ⅲ MULTISTAGE DECISION PROCESSES AND DYNAMIC PROGRAMMING 51
3.1 Introduction 51
3.2 Multistage decision processes 51
3.3 Discrete deterministic multistage decision processes 52
3.4 Formulation as a conventional maximization problem 53
3.5 Markovian-type processes 54
3.6 Dynamic programming approach 55
3.7 A recurrence relation 56
3.8 The principle of optimality 56
3.9 Derivation of recurrence relation 57
3.10 “Terminal” control 57
3.11 Continuous deterministic processes 58
3.12 Discussion 59
Bibliography and comments 59
CHAPTER Ⅳ DYNAMIC PROGRAMMING AND THE CALCULUS OF VARIATIONS 61
4.1 Introduction 61
4.2 The calculus of variations as a multistage decision process 62
4.3 Geometric interpretation 62
4.4 Functional equations 63
4.5 Limiting partial differential equations 64
4.6 The Euler equations and characteristics 65
4.7 Discussion 66
4.8 Direct derivation of Euler equation 66
4.9 Discussion 67
4.10 Discrete processes 67
4.11 Functional equations 68
4.12 Minimum of maximum deviation 69
4.13 Constraints 70
4.14 Structure of optimal policy 70
4.15 Bang-bang control 72
4.16 Optimal trajectory 73
4.17 The brachistochrone 74
4.18 Numerical computation of solutions of differential equations 76
4.19 Sequential computation 77
4.20 An example 78
Bibliography and comments 79
CHAPTER Ⅴ COMPUTATIONAL ASPECTS OF DYNAMIC PROGRAMMING 85
5.1 Introduction 85
5.2 The computational process—Ⅰ 86
5.3 The computational process—Ⅱ 87
5.4 The computational process—Ⅲ 88
5.5 Expanding grid 88
5.6 The computational process—Ⅳ 88
5.7 Obtaining the solution from the numerical results 89
5.8 Why is dynamic programming better than straightforward enumeration? 90
5.9 Advantages of dynamic programming approach 90
5.10 Absolute maximum versus relative maximum 91
5.11 Initial value versus two-point boundary value problems 91
5.12 Constraints 91
5.13 Non-analyticity 92
5.14 Implicit variational problems 92
5.15 Approximation in policy space 93
5.16 The curse of dimensionality 94
5.17 Sequential search 95
5.18 Sensitivity analysis 95
5.19 Numerical solution of partial differential equations 95
5.20 A simple nonlinear hyperbolic equation 96
5.21 The equation $f_T = g_1 + g_2 f_c + g_3 f_c^2$ 97
Bibliography and comments 98
CHAPTER Ⅵ THE LAGRANGE MULTIPLIER 100
6.1 Introduction 100
6.2 Integral constraints 101
6.3 Lagrange multiplier 102
6.4 Discussion 103
6.5 Several constraints 103
6.6 Discussion 104
6.7 Motivation for the Lagrange multiplier 105
6.8 Geometric motivation 106
6.9 Equivalence of solution 108
6.10 Discussion 109
Bibliography and discussion 110
CHAPTER Ⅶ TWO-POINT BOUNDARY VALUE PROBLEMS 111
7.1 Introduction 111
7.2 Two-point boundary value problems 112
7.3 Application of dynamic programming techniques 113
7.4 Fixed terminal state 113
7.5 Fixed terminal state and constraint 115
7.6 Fixed terminal set 115
7.7 Internal conditions 116
7.8 Characteristic value problems 116
Bibliography and comments 117
CHAPTER Ⅷ SEQUENTIAL MACHINES AND THE SYNTHESIS OF LOGICAL SYSTEMS 119
8.1 Introduction 119
8.2 Sequential machines 119
8.3 Information pattern 120
8.4 Ambiguity 121
8.5 Functional equations 121
8.6 Limiting case 122
8.7 Discussion 122
8.8 Minimum time 123
8.9 The coin-weighing problem 123
8.10 Synthesis of logical systems 124
8.11 Description of problem 124
8.12 Discussion 125
8.13 Introduction of a norm 125
8.14 Dynamic programming approach 125
8.15 Minimum number of stages 125
8.16 Medical diagnosis 126
Bibliography and discussion 126
CHAPTER Ⅸ UNCERTAINTY AND RANDOM PROCESSES 129
9.1 Introduction 129
9.2 Uncertainty 130
9.3 Sour grapes or truth? 131
9.4 Probability 132
9.5 Enumeration of equally likely possibilities 132
9.6 The frequency approach 134
9.7 Ergodic theory 136
9.8 Random variables 136
9.9 Continuous stochastic variable 137
9.10 Generation of random variables 138
9.11 Stochastic process 138
9.12 Linear stochastic sequences 138
9.13 Causality and the Markovian property 139
9.14 Chapman-Kolmogoroff equations 140
9.15 The forward equations 141
9.16 Diffusion equations 142
9.17 Expected values 142
9.18 Functional equations 144
9.19 An application 145
9.20 Expected range and altitude 145
Bibliography and comments 146
CHAPTER Ⅹ STOCHASTIC CONTROL PROCESSES 152
10.1 Introduction 152
10.2 Discrete stochastic multistage decision processes 152
10.3 The optimization problem 153
10.4 What constitutes an optimal policy? 154
10.5 Two particular stochastic control processes 155
10.6 Functional equations 155
10.7 Discussion 156
10.8 Terminal control 156
10.9 Implicit criteria 157
10.10 A two-dimensional process with implicit criterion 157
Bibliography and discussion 158
CHAPTER Ⅺ MARKOVIAN DECISION PROCESSES 160
11.1 Introduction 160
11.2 Limiting behavior of Markov processes 160
11.3 Markovian decision processes—Ⅰ 161
11.4 Markovian decision processes—Ⅱ 162
11.5 Steady-state behavior 162
11.6 The steady-state equation 163
11.7 Howard's iteration scheme 164
11.8 Linear programming and sequential decision processes 164
Bibliography and comments 165
CHAPTER Ⅻ QUASILINEARIZATION 167
12.1 Introduction 167
12.2 Continuous Markovian decision processes 168
12.3 Approximation in policy space 168
12.4 Systems 170
12.5 The Riccati equation 170
12.6 Extensions 171
12.7 Monotone approximation in the calculus of variations 171
12.8 Computational aspects 172
12.9 Two-point boundary value problems 173
12.10 Partial differential equations 174
Bibliography and comments 175
CHAPTER ⅩⅢ STOCHASTIC LEARNING MODELS 176
13.1 Introduction 176
13.2 A stochastic learning model 176
13.3 Functional equations 177
13.4 Analytic and computational aspects 177
13.5 A stochastic learning model—Ⅱ 177
13.6 Inverse problem 178
Bibliography and comments 178
CHAPTER ⅩⅣ THE THEORY OF GAMES AND PURSUIT PROCESSES 180
14.1 Introduction 180
14.2 A two-person process 181
14.3 Multistage process 181
14.4 Discussion 182
14.5 Borel-von Neumann theory of games 182
14.6 The min-max theorem of von Neumann 184
14.7 Discussion 184
14.8 Computational aspects 185
14.9 Card games 185
14.10 Games of survival 185
14.11 Control processes as games against nature 186
14.12 Pursuit processes—minimum time to capture 187
14.13 Pursuit processes—minimum miss distance 189
14.14 Pursuit processes—minimum miss distance within a given time 189
14.15 Discussion 190
Bibliography and discussion 190
CHAPTER ⅩⅤ ADAPTIVE PROCESSES 194
15.1 Introduction 194
15.2 Uncertainty revisited 195
15.3 Reprise 198
15.4 Unknown—Ⅰ 199
15.5 Unknown—Ⅱ 200
15.6 Unknown—Ⅲ 200
15.7 Adaptive processes 201
Bibliography and comments 201
CHAPTER ⅩⅥ ADAPTIVE CONTROL PROCESSES 203
16.1 Introduction 203
16.2 Information pattern 205
16.3 Basic assumptions 207
16.4 Mathematical formulation 208
16.5 Functional equations 208
16.6 From information patterns to distribution functions 209
16.7 Feedback control 209
16.8 Functional equations 210
16.9 Further structural assumptions 210
16.10 Reduction from functionals to functions 211
16.11 An illustrative example—deterministic version 211
16.12 Stochastic version 212
16.13 Adaptive version 213
16.14 Sufficient statistics 215
16.15 The two-armed bandit problem 215
Bibliography and discussion 216
CHAPTER ⅩⅦ SOME ASPECTS OF COMMUNICATION THEORY 219
17.1 Introduction 219
17.2 A model of a communication process 220
17.3 Utility a function of use 221
17.4 A stochastic allocation process 221
17.5 More general processes 222
17.6 The efficient gambler 222
17.7 Dynamic programming approach 223
17.8 Utility of a communication channel 224
17.9 Time-dependent case 224
17.10 Correlation 225
17.11 M-signal channels 225
17.12 Continuum of signals 227
17.13 Random duration 228
17.14 Adaptive processes 228
Bibliography and comments 230
CHAPTER ⅩⅧ SUCCESSIVE APPROXIMATION 232
18.1 Introduction 232
18.2 The classical method of successive approximations 233
18.3 Application to dynamic programming 234
18.4 Approximation in policy space 235
18.5 Quasilinearization 236
18.6 Application of the preceding ideas 236
18.7 Successive approximations in the calculus of variations 237
18.8 Preliminaries on differential equations 239
18.9 A terminal control process 240
18.10 Differential-difference equations and retarded control 241
18.11 Quadratic criteria 241
18.12 Successive approximations once more 243
18.13 Successive approximations—Ⅱ 244
18.14 Functional approximation 244
18.15 Simulation techniques 246
18.16 Quasi-optimal policies 246
18.17 Non-zero sum games 247
18.18 Discussion 248
Bibliography and discussion 249