PART 1 DIGITAL LOGIC AND FINITE STATE MACHINES 3
CHAPTER 1 DIGITAL LOGIC FUNDAMENTALS 3
1.1 Boolean Algebra 4
1.1.1 Basic Functions 4
1.1.2 Manipulating Boolean Functions 6
1.2 Basic Combinatorial Logic 13
1.3 More Complex Combinatorial Components 16
1.3.1 Multiplexers 16
1.3.2 Decoders 18
1.3.3 Encoders 20
1.3.4 Comparators 20
1.3.5 Adders and Subtracters 23
1.3.6 Memory 27
1.4 Combinatorial Circuit Designs 29
1.4.1 BCD to 7-segment Decoder 30
1.4.2 Data Sorter 31
PRACTICAL PERSPECTIVE: WHY LEDS ARE USUALLY ACTIVE LOW 32
1.5 Basic Sequential Components 34
1.6 More Complex Sequential Components 39
1.6.1 Counters 39
1.6.2 Shift Registers 41
1.7 REAL WORLD EXAMPLE: PROGRAMMABLE LOGIC DEVICES 43
1.8 Summary 46
Problems 47
CHAPTER 2 INTRODUCTION TO FINITE STATE MACHINES 51
2.1 State Diagrams and State Tables 52
HISTORICAL PERSPECTIVE: FINITE STATE MACHINES AND MICROPROCESSORS 53
2.2 Mealy and Moore Machines 56
2.3 Designing State Diagrams 58
2.3.1 Modulo 6 Counter 58
2.3.2 String Checker 60
2.3.3 Toll Booth Controller 61
PRACTICAL PERSPECTIVE: DIFFERENT MODELS FOR THE SAME PROBLEM 157
2.4 From State Diagram to Implementation 66
2.4.1 Assigning State Values 66
2.4.2 Mealy and Moore Machine Implementations 68
2.4.3 Generating the Next State 70
2.4.4 Generating System Outputs 74
2.4.5 An Alternative Design 77
2.4.6 The Eight-State String Checker 80
2.5 REAL WORLD EXAMPLE: PRACTICAL CONSIDERATIONS 83
2.5.1 Unused States 83
2.5.2 Asynchronous Designs 86
2.5.3 Machine Conversion 91
2.6 Summary 92
Problems 93
PART 2 COMPUTER ORGANIZATION AND ARCHITECTURE 103
CHAPTER 3 INSTRUCTION SET ARCHITECTURES 103
3.1 Levels of Programming Languages 104
3.1.1 Language Categories 105
3.1.2 Compiling and Assembling Programs 106
PRACTICAL PERSPECTIVE: JAVA APPLETS - A DIFFERENT WAY OF PROCESSING PROGRAMS 109
3.2 Assembly Language Instructions 110
3.2.1 Instruction Types 110
3.2.2 Data Types 112
3.2.3 Addressing Modes 113
3.2.4 Instruction Formats 115
3.3 Instruction Set Architecture Design 119
3.4 A Relatively Simple Instruction Set Architecture 121
3.5 REAL WORLD EXAMPLE: THE 8085 MICROPROCESSOR INSTRUCTION SET ARCHITECTURE 128
3.5.1 The 8085 Microprocessor Register Set 128
HISTORICAL PERSPECTIVE: INTEL'S EARLY MICROPROCESSORS 129
3.5.2 The 8085 Microprocessor Instruction Set 130
3.5.3 A Simple 8085 Program 134
3.5.4 Analyzing the 8085 Instruction Set Architecture 136
3.6 Summary 137
Problems 138
CHAPTER 4 INTRODUCTION TO COMPUTER ORGANIZATION 141
4.1 Basic Computer Organization 142
4.1.1 System Buses 142
4.1.2 Instruction Cycles 143
PRACTICAL PERSPECTIVE: THE PERIPHERAL COMPONENT INTERCONNECT BUS 144
4.2 CPU Organization 146
4.3 Memory Subsystem Organization and Interfacing 148
4.3.1 Types of Memory 149
4.3.2 Internal Chip Organization 151
4.3.3 Memory Subsystem Configuration 152
HISTORICAL PERSPECTIVE: THE VON NEUMANN AND HARVARD ARCHITECTURES 157
4.3.4 Multibyte Data Organization 157
4.3.5 Beyond the Basics 158
4.4 I/O Subsystem Organization and Interfacing 159
4.5 A Relatively Simple Computer 162
4.6 REAL WORLD EXAMPLE: AN 8085-BASED COMPUTER 166
HISTORICAL PERSPECTIVE: THE SOJOURNER ROVER 170
4.7 Summary 171
Problems 172
CHAPTER 5 REGISTER TRANSFER LANGUAGES 175
5.1 Micro-Operations and Register Transfer Language 176
5.2 Using RTL to Specify Digital Systems 184
5.2.1 Specification of Digital Components 185
5.2.2 Specification and Implementation of Simple Systems 186
5.3 More Complex Digital Systems and RTL 190
5.3.1 Modulo 6 Counter 190
5.3.2 Toll Booth Controller 192
5.4 REAL WORLD EXAMPLE: VHDL - VHSIC HARDWARE DESCRIPTION LANGUAGE 199
PRACTICAL PERSPECTIVE: HARDWARE DESCRIPTION LANGUAGES 200
5.4.1 VHDL Syntax 200
5.4.2 VHDL Design with a High Level of Abstraction 203
5.4.3 VHDL Design with a Low Level of Abstraction 207
5.5 Summary 209
PRACTICAL PERSPECTIVE: SOME ADVANCED CAPABILITIES OF VHDL 210
Problems 211
CHAPTER 6 CPU DESIGN 214
6.1 Specifying a CPU 214
6.2 Design and Implementation of a Very Simple CPU 216
6.2.1 Specifications for a Very Simple CPU 216
6.2.2 Fetching Instructions from Memory 217
PRACTICAL PERSPECTIVE: WHY A CPU INCREMENTS PC DURING THE FETCH CYCLE 218
6.2.3 Decoding Instructions 219
6.2.4 Executing Instructions 219
6.2.5 Establishing Required Data Paths 221
6.2.6 Design of a Very Simple ALU 226
6.2.7 Designing the Control Unit Using Hardwired Control 227
6.2.8 Design Verification 232
6.3 Design and Implementation of a Relatively Simple CPU 233
6.3.1 Specifications for a Relatively Simple CPU 234
6.3.2 Fetching and Decoding Instructions 236
6.3.3 Executing Instructions 237
6.3.4 Establishing Data Paths 242
6.3.5 Design of a Relatively Simple ALU 245
6.3.6 Designing the Control Unit Using Hardwired Control 247
6.3.7 Design Verification 250
6.4 Shortcomings of the Simple CPUs 251
6.4.1 More Internal Registers and Cache 251
HISTORICAL PERSPECTIVE: STORAGE IN INTEL MICROPROCESSORS 252
6.4.2 Multiple Buses Within the CPU 253
6.4.3 Pipelined Instruction Processing 253
6.4.4 Larger Instruction Sets 253
6.4.5 Subroutines and Interrupts 256
6.5 REAL WORLD EXAMPLE: INTERNAL ARCHITECTURE OF THE 8085 MICROPROCESSOR 256
6.6 Summary 256
Problems 259
CHAPTER 7 MICROSEQUENCER CONTROL UNIT DESIGN 267
7.1 Basic Microsequencer Design 268
7.1.1 Microsequencer Operations 268
7.1.2 Microinstruction Formats 270
7.2 Design and Implementation of a Very Simple Microsequencer 272
7.2.1 The Basic Layout 272
7.2.2 Generating the Correct Sequence and Designing the Mapping Logic 273
7.2.3 Generating the Micro-Operations Using Horizontal Microcode 275
7.2.4 Generating the Micro-Operations Using Vertical Microcode 277
PRACTICAL PERSPECTIVE: NANOINSTRUCTIONS 282
7.2.5 Directly Generating the Control Signals from the Microcode 283
7.3 Design and Implementation of a Relatively Simple Microsequencer 285
7.3.1 Modifying the State Diagram 285
7.3.2 Designing the Sequencing Hardware and Microcode 285
7.3.3 Completing the Design Using Horizontal Microcode 291
7.4 Reducing the Number of Microinstructions 294
7.4.1 Microsubroutines 294
7.4.2 Microcode Jumps 298
7.5 Microprogrammed Control vs. Hardwired Control 300
7.5.1 Complexity of the Instruction Set 300
7.5.2 Ease of Modification 301
7.5.3 Clock Speed 301
7.6 REAL WORLD EXAMPLE: A (MOSTLY) MICROCODED CPU: THE PENTIUM PROCESSOR 301
HISTORICAL PERSPECTIVE: HOW THE PENTIUM GOT ITS NAME 303
7.7 Summary 304
Problems 304
CHAPTER 8 COMPUTER ARITHMETIC 308
8.1 Unsigned Notation 309
8.1.1 Addition and Subtraction 310
8.1.2 Multiplication 314
8.1.3 Division 323
8.2 Signed Notation 334
8.2.1 Signed-Magnitude Notation 334
8.2.2 Signed-Two's Complement Notation 339
8.3 Binary Coded Decimal 340
8.3.1 BCD Numeric Format 341
8.3.2 Addition and Subtraction 341
8.3.3 Multiplication and Division 344
8.4 Specialized Arithmetic Hardware 348
HISTORICAL PERSPECTIVE: COPROCESSORS 348
8.4.1 Pipelining 349
8.4.2 Lookup Tables 351
HISTORICAL PERSPECTIVE: THE PENTIUM FLOATING POINT BUG 352
8.4.3 Wallace Trees 353
8.5 Floating Point Numbers 358
8.5.1 Numeric Format 358
8.5.2 Numeric Characteristics 359
8.5.3 Addition and Subtraction 361
8.5.4 Multiplication and Division 366
8.6 REAL WORLD EXAMPLE: THE IEEE 754 FLOATING POINT STANDARD 369
8.6.1 Formats 369
8.6.2 Denormalized Values 371
8.7 Summary 371
Problems 372
CHAPTER 9 MEMORY ORGANIZATION 376
9.1 Hierarchical Memory Systems 376
9.2 Cache Memory 378
9.2.1 Associative Memory 378
9.2.2 Cache Memory with Associative Mapping 380
9.2.3 Cache Memory with Direct Mapping 383
9.2.4 Cache Memory with Set-Associative Mapping 385
PRACTICAL PERSPECTIVE: MAPPING STRATEGIES IN CURRENT CPUS 387
9.2.5 Replacing Data in the Cache 388
9.2.6 Writing Data to the Cache 390
9.2.7 Cache Performance 391
9.3 Virtual Memory 396
9.3.1 Paging 396
9.3.2 Segmentation 405
9.3.3 Memory Protection 408
9.4 Beyond the Basics of Cache and Virtual Memory 410
9.4.1 Beyond the Basics of Cache Memory 410
PRACTICAL PERSPECTIVE: CACHE HIERARCHY IN THE ITANIUM MICROPROCESSOR 411
9.4.2 Beyond the Basics of Virtual Memory 411
9.5 REAL WORLD EXAMPLE: MEMORY MANAGEMENT IN A PENTIUM/WINDOWS PERSONAL COMPUTER 413
9.6 Summary 414
Problems 415
CHAPTER 10 INPUT/OUTPUT ORGANIZATION 422
10.1 Asynchronous Data Transfers 422
10.1.1 Source-Initiated Data Transfer 423
10.1.2 Destination-Initiated Data Transfer 425
10.1.3 Handshaking 427
10.2 Programmed I/O 430
10.2.1 New Instructions 434
10.2.2 New Control Signals 435
10.2.3 New States and RTL Code 435
10.2.4 Modify the CPU Hardware for the New Instruction 435
10.2.5 Make Sure Other Instructions Still Work 437
10.3 Interrupts 438
10.3.1 Transferring Data Between the CPU and I/O Devices 438
10.3.2 Types of Interrupts 440
10.3.3 Processing Interrupts 441
10.3.4 Interrupt Hardware and Priority 443
10.3.5 Implementing Interrupts Inside the CPU 449
10.4 Direct Memory Access 452
10.4.1 Incorporating Direct Memory Access (DMA) into a Computer System 452
10.4.2 DMA Transfer Modes 455
10.4.3 Modifying the CPU to Work with DMA 456
10.5 I/O Processors 458
PRACTICAL PERSPECTIVE: THE i960 I/O PROCESSOR WITH BUILT-IN DMA 461
10.6 Serial Communication 462
10.6.1 Serial Communication Basics 462
10.6.2 Universal Asynchronous Receiver/Transmitters (UARTs) 466
10.7 REAL WORLD EXAMPLE: SERIAL COMMUNICATION STANDARDS 467
10.7.1 The RS-232-C Standard 468
PRACTICAL PERSPECTIVE: THE RS-422 SERIAL STANDARD 469
10.7.2 The Universal Serial Bus Standard 470
10.8 Summary 471
Problems 472
PART 3 ADVANCED TOPICS 479
CHAPTER 11 REDUCED INSTRUCTION SET COMPUTING 479
11.1 RISC Rationale 479
11.1.1 Fixed Length Instructions 481
11.1.2 Limited Loading and Storing Instructions Access Memory 481
11.1.3 Fewer Addressing Modes 481
11.1.4 Instruction Pipeline 481
PRACTICAL PERSPECTIVE: ADDRESSING MODES IN THE POWERPC 750 RISC CPU 482
11.1.5 Large Number of Registers 482
11.1.6 Hardwired Control Unit 482
11.1.7 Delayed Loads and Branches 482
11.1.8 Speculative Execution of Instructions 483
11.1.9 Optimizing Compiler 483
11.1.10 Separate Instruction and Data Streams 483
11.2 RISC Instruction Sets 483
11.3 Instruction Pipelines and Register Windows 486
11.3.1 Instruction Pipelines 487
11.3.2 Register Windowing and Renaming 491
PRACTICAL PERSPECTIVE: REGISTER WINDOWING AND REGISTER RENAMING IN REAL-WORLD CPUS 494
11.4 Instruction Pipeline Conflicts 494
11.4.1 Data Conflicts 495
11.4.2 Branch Conflicts 498
11.5 RISC vs. CISC 504
11.6 REAL WORLD EXAMPLE: THE ITANIUM MICROPROCESSOR 506
11.7 Summary 509
Problems 509
CHAPTER 12 INTRODUCTION TO PARALLEL PROCESSING 514
12.1 Parallelism in Uniprocessor Systems 515
12.2 Organization of Multiprocessor Systems 519
12.2.1 Flynn's Classification 519
12.2.2 System Topologies 521
12.2.3 MIMD System Architectures 524
PRACTICAL PERSPECTIVE: THE WORLD'S LARGEST MULTICOMPUTER? 526
PRACTICAL PERSPECTIVE: THE BLUE GENE COMPUTER 527
12.3 Communication in Multiprocessor Systems 528
12.3.1 Fixed Connections 528
12.3.2 Reconfigurable Connections 529
12.3.3 Routing on Multistage Interconnection Networks 534
12.4 Memory Organization in Multiprocessor Systems 540
12.4.1 Shared Memory 540
12.4.2 Cache Coherence 542
12.5 Multiprocessor Operating Systems and Software 547
12.6 Parallel Algorithms 549
12.6.1 Parallel Bubble Sort 549
12.6.2 Parallel Matrix Multiplication 551
12.7 Alternative Parallel Architectures 554
12.7.1 Dataflow Computing 555
12.7.2 Systolic Arrays 559
12.7.3 Neural Networks 562
12.8 Summary 564
Problems 564
INDEX 569