高性能计算机上的数值线性 英文影印版PDF电子书下载
- 电子书积分:12 积分如何计算积分?
- 作 者:(美)冬格拉等著
- 出 版 社:北京:清华大学出版社
- 出版年份:2011
- ISBN:9787302244998
- 页数:345 页
1 High-Performance Computing 1
1.1 Trends in Computer Design 1
1.2 Traditional Computers and Their Limitations 2
1.3 Parallelism within a Single Processor 3
1.3.1 Multiple Functional Units 3
1.3.2 Pipelining 3
1.3.3 Overlapping 4
1.3.4 RISC 5
1.3.5 VLIW 6
1.3.6 Vector Instructions 7
1.3.7 Chaining 7
1.3.8 Memory-to-Memory and Register-to-Register Organizations 8
1.3.9 Register Set 9
1.3.10 Stripmining 9
1.3.11 Reconfigurable Vector Registers 10
1.3.12 Memory Organization 10
1.4 Data Organization 11
1.4.1 Main Memory 12
1.4.2 Cache 14
1.4.3 Local Memory 15
1.5 Memory Management 15
1.6 Parallelism through Multiple Pipes or Multiple Processors 18
1.7 Message Passing 19
1.8 Virtual Shared Memory 21
1.8.1 Routing 21
1.9 Interconnection Topology 22
1.9.1 Crossbar Switch 23
1.9.2 Timeshared Bus 23
1.9.3 Ring Connection 24
1.9.4 Mesh Connection 24
1.9.5 Hypercube 25
1.9.6 Multi-staged Network 26
1.10 Programming Techniques 26
1.11 Trends:Network-Based Computing 29
2 Overview of Current High-Performance Computers 31
2.1 Supercomputers 31
2.2 RISC-Based Processors 34
2.3 Parallel Processors 35
3 Implementation Details and Overhead 39
3.1 Parallel Decomposition and Data Dependency Graphs 39
3.2 Synchronization 42
3.3 Load Balancing 43
3.4 Recurrence 44
3.5 Indirect Addressing 46
3.6 Message Passing 47
3.6.1 Performance Prediction 49
3.6.2 Message-Passing Standards 50
3.6.3 Routing 55
4 Performance:Analysis,Modeling,and Measurements 57
4.1 Amdahl's Law 58
4.1.1 Simple Case of Amdahl's Law 58
4.1.2 General Form of Amdahl's Law 59
4.2 Vector Speed and Vector Length 59
4.3 Amdahl's Law—Parallel Processing 60
4.3.1 A Simple Model 61
4.3.2 Gustafson's Model 63
4.4 Examples of(r∞,n1/2)-values for Various Computers 64
4.4.1 CRAY J90 and CRAY T90 (One Processor) 65
4.4.2 General Observations 66
4.5 LINPACK Benchmark 66
4.5.1 Description of the Benchmark 66
4.5.2 Calls to the BLAS 67
4.5.3 Asymptotic Performance 67
5 Building Blocks in Linear Algebra 71
5.1 Basic Linear Algebra Subprograms 71
5.1.1 Level 1 BLAS 72
5.1.2 Level 2 BLAS 73
5.1.3 Level 3 BLAS 74
5.2 Levels of Parallelism 76
5.2.1 Vector Computers 76
5.2.2 Parallel Processors with Shared Memory 78
5.2.3 Parallel-Vector Computers 78
5.2.4 Clusters Computing 78
5.3 Basic Factorizations of Linear Algebra 79
5.3.1 Point Algorithm:Gaussian Elimination with Partial Piv-oting 79
5.3.2 Special Matrices 80
5.4 Blocked Algorithms:Matrix-Vector and Matrix-Matrix Versions 83
5.4.1 Right-Looking Algorithm 85
5.4.2 Left-Looking Algorithm 86
5.4.3 Crout Algorithm 87
5.4.4 Typical Performance of Blocked LU Decomposition 88
5.4.5 Blocked Symmetric Indefinite Factorization 89
5.4.6 Typical Performance of Blocked Symmetric Indefinite Fac-torization 91
5.5 Linear Least Squares 92
5.5.1 Householder Method 93
5.5.2 Blocked Householder Method 94
5.5.3 Typical Performance of the Blocked Householder Factor-ization 95
5.6 Organization of the Modules 95
5.6.1 Matrix-Vector Product 96
5.6.2 Matrix-Matrix Product 97
5.6.3 Typical Performance for Parallel Processing 98
5.6.4 Benefits 98
5.7 LAPACK 99
5.8 ScaLAPACK 100
5.8.1 The Basic Linear Algebra Communication Subprograms(BLACS) 101
5.8.2 PBLAS 102
5.8.3 ScaLAPACK Sample Code 103
6 Direct Solution of Sparse Linear Systems 107
6.1 Introduction to Direct Methods for Sparse Linear Systems 111
6.1.1 Four Approaches 111
6.1.2 Description of Sparse Data Structure 112
6.1.3 Manipulation of Sparse Data Structures 113
6.2 General Sparse Matrix Methods 115
6.2.1 Fill-in and Sparsity Ordering 115
6.2.2 Indirect Addressing—Its Effect and How to Avoid It 118
6.2.3 Comparison with Dense Codes 120
6.2.4 Other Approaches 121
6.3 Methods for Symmetric Matrices and Band Systems 123
6.3.1 The Clique Concept in Gaussian Elimination 124
6.3.2 Further Comments on Ordering Schemes 126
6.4 Frontal Methods 126
6.4.1 Frontal Methods—Link to Band Methods and Numerical Pivoting 128
6.4.2 Vector Performance 129
6.4.3 Parallel Implementation of Frontal Schemes 130
6.5 Multifrontal Methods 131
6.5.1 Performance on Vector Machines 135
6.5.2 Performance on RISC Machines 136
6.5.3 Performance on Parallel Machines 137
6.5.4 Exploitation of Structure 142
6.5.5 Unsymmetric Multifrontal Methods 143
6.6 Other Approaches for Exploitation of Parallelism 144
6.7 Software 145
6.8 Brief Summary 147
7 Krylov Subspaces:Projection 149
7.1 Notation 149
7.2 Basic Iteration Methods:Richardson Iteration,Power Method 150
7.3 Orthogonal Basis(Arnoldi,Lanczos) 153
8 Iterative Methods for Linear Systems 157
8.1 Krylov Subspace Solution Methods:Basic Principles 157
8.1.1 The Ritz-Galerkin Approach:FOM and CG 158
8.1.2 The Minimum Residual Approach:GMRES and MINRES 159
8.1.3 The Petrov-Galerkin Approach:Bi-CG and QMR 159
8.1.4 The Minimum Error Approach:SYMMLQ and GMERR 161
8.2 Iterative Methods in More Detail 162
8.2.1 The CG Method 163
8.2.2 Parallelism in the CG Method:General Aspects 165
8.2.3 Parallelism in the CG Method:Communication Overhead 166
8.2.4 MINRES 168
8.2.5 Least Squares CG 170
8.2.6 GMRES and GMRES(m) 172
8.2.7 GMRES with Variable Preconditioning 175
8.2.8 Bi-CG and QMR 179
8.2.9 CGS 182
8.2.10 Bi-CGSTAB 184
8.2.11 Bi-CGSTAB(e)and Variants 186
8.3 Other Issues 189
8.4 How to Test Iterative Methods 191
9 Preconditioning and Parallel Preconditioning 195
9.1 Preconditioning and Parallel Preconditioning 195
9.2 The Purpose of Preconditioning 195
9.3 Incomplete LU Decompositions 199
9.3.1 Efficient Implementations of ILU(0) Preconditioning 203
9.3.2 General Incomplete Decompositions 204
9.3.3 Variants of ILU Preconditioners 207
9.3.4 Some General Comments on ILU 208
9.4 Some Other Forms of Preconditioning 209
9.4.1 Sparse Approximate Inverse(SPAI) 209
9.4.2 Polynomial Preconditioning 211
9.4.3 Preconditioning by Blocks or Domains 211
9.4.4 Element by Element Preconditioners 213
9.5 Vector and Parallel Implementation of Preconditioners 215
9.5.1 Partial Vectorization 215
9.5.2 Reordering the Unknowns 217
9.5.3 Changing the Order of Computation 219
9.5.4 Some Other Vectorizable Preconditioners 222
9.5.5 Parallel Aspects of Reorderings 225
9.5.6 Experiences with Parallelism 227
10 Linear Eigenvalue Problems Ax=λx 231
10.1 Theoretical Background and Notation 231
10.2 Single-Vector Methods 232
10.3 The QR Algorithm 234
10.4 Subspace Projection Methods 235
10.5 The Arnoldi Factorization 237
10.6 Restarting the Arnoldi Process 239
10.6.1 Explicit Restarting 239
10.7 Implicit Restarting 240
10.8 Lanczos'Method 243
10.9 Harmonic Ritz Values and Vectors 245
10.10 Other Subspace Iteration Methods 246
10.11 Davidson's Method 248
10.12 The Jacobi-Davidson Iteration Method 249
10.12.1 JDQR 252
10.13 Eigenvalue Software:ARPACK,P_ARPACK 253
10.13.1 Reverse Communication Interface 254
10.13.2 Parallelizing ARPACK 255
10.13.3 Data Distribution of the Arnoldi Factorization 256
10.14 Message Passing 259
10.15 Parallel Performance 260
10.16 Availability 261
10.17 Summary 261
11 The Generalized Eigenproblem 263
11.1 Arnoldi/Lanczos with Shift-Invert 263
11.2 Alternatives to Arnoldi/Lanczos with Shift-Invert 265
11.3 The Jacobi-Davidson QZ Algorithm 266
11.4 The Jacobi-Davidson QZ Method:Restart and Deflation 268
11.5 Parallel Aspects 271
A Acquiring Mathematical Software 273
A.1 netlib 273
A.1.1 Mathematical Software 275
A.2 Mathematical Software Libraries 275
B Glossary 277
C Level 1,2,and 3 BLAS Quick Reference 291
D Operation Counts for Various BLAS and Decompositions 295
Bibliography 301
Index 329
- 《计算机网络与通信基础》谢雨飞,田启川编著 2019
- 《大学计算机实验指导及习题解答》曹成志,宋长龙 2019
- 《联吡啶基钌光敏染料的结构与性能的理论研究》李明霞 2019
- 《线性代数简明教程》刘国庆,赵剑,石玮编著 2019
- 《计算机辅助平面设计》吴轶博主编 2019
- 《计算机组成原理解题参考 第7版》张基温 2017
- 《云计算节能与资源调度》彭俊杰主编 2019
- 《FDS火灾数值模拟》李胜利,李孝斌编著 2019
- 《Helmholtz方程的步进计算方法研究》李鹏著 2019
- 《微笑 影印本》N.达列基作 1947
- 《断陷湖盆比较沉积学与油气储层》赵永胜等著 1996
- 《SQL与关系数据库理论》(美)戴特(C.J.Date) 2019
- 《魔法销售台词》(美)埃尔默·惠勒著 2019
- 《看漫画学钢琴 技巧 3》高宁译;(日)川崎美雪 2019
- 《优势谈判 15周年经典版》(美)罗杰·道森 2018
- 《社会学与人类生活 社会问题解析 第11版》(美)James M. Henslin(詹姆斯·M. 汉斯林) 2019
- 《海明威书信集:1917-1961 下》(美)海明威(Ernest Hemingway)著;潘小松译 2019
- 《迁徙 默温自选诗集 上》(美)W.S.默温著;伽禾译 2020
- 《上帝的孤独者 下 托马斯·沃尔夫短篇小说集》(美)托马斯·沃尔夫著;刘积源译 2017
- 《巴黎永远没个完》(美)海明威著 2017
- 《大学计算机实验指导及习题解答》曹成志,宋长龙 2019
- 《指向核心素养 北京十一学校名师教学设计 英语 七年级 上 配人教版》周志英总主编 2019
- 《大学生心理健康与人生发展》王琳责任编辑;(中国)肖宇 2019
- 《大学英语四级考试全真试题 标准模拟 四级》汪开虎主编 2012
- 《大学英语教学的跨文化交际视角研究与创新发展》许丽云,刘枫,尚利明著 2020
- 《北京生态环境保护》《北京环境保护丛书》编委会编著 2018
- 《复旦大学新闻学院教授学术丛书 新闻实务随想录》刘海贵 2019
- 《大学英语综合教程 1》王佃春,骆敏主编 2015
- 《大学物理简明教程 下 第2版》施卫主编 2020
- 《指向核心素养 北京十一学校名师教学设计 英语 九年级 上 配人教版》周志英总主编 2019