Hadoop: The Definitive Guide, 4th Edition (English) PDF E-book Download

Author: Tom White (US)
Publisher: Nanjing: Southeast University Press
Year of publication: 2015
ISBN: 9787564159177
Pages: 730

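Book description: With the fourth edition of this comprehensive guide, you will learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. The book is ideal for programmers who want to analyze datasets of any size and for administrators who want to set up and run Hadoop clusters. In this new edition, which covers Hadoop 2, author Tom White has added new chapters on YARN and on several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You will learn about recent changes to Hadoop and explore case studies of Hadoop in healthcare systems and genomics data processing.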

Table of Contents: Hadoop: The Definitive Guide, 4th Edition (English)

Part I. Hadoop Fundamentals 3

1. Meet Hadoop 3

Data! 3

Data Storage and Analysis 5

Querying All Your Data 6

Beyond Batch 7

Comparison with Other Systems 8

Relational Database Management Systems 8

Grid Computing 10

Volunteer Computing 11

A Brief History of Apache Hadoop 12

What's in This Book? 15

2. MapReduce 19

A Weather Dataset 19

Data Format 19

Analyzing the Data with Unix Tools 21

Analyzing the Data with Hadoop 22

Map and Reduce 22

Java MapReduce 24

Scaling Out 30

Data Flow 30

Combiner Functions 34

Running a Distributed MapReduce Job 37

Hadoop Streaming 37

Ruby 37

Python 40

3. The Hadoop Distributed Filesystem 43

The Design of HDFS 43

HDFS Concepts 45

Blocks 45

Namenodes and Datanodes 46

Block Caching 47

HDFS Federation 48

HDFS High Availability 48

The Command-Line Interface 50

Basic Filesystem Operations 51

Hadoop Filesystems 53

Interfaces 54

The Java Interface 56

Reading Data from a Hadoop URL 57

Reading Data Using the FileSystem API 58

Writing Data 61

Directories 63

Querying the Filesystem 63

Deleting Data 68

Data Flow 69

Anatomy of a File Read 69

Anatomy of a File Write 72

Coherency Model 74

Parallel Copying with distcp 76

Keeping an HDFS Cluster Balanced 77

4. YARN 79

Anatomy of a YARN Application Run 80

Resource Requests 81

Application Lifespan 82

Building YARN Applications 82

YARN Compared to MapReduce 1 83

Scheduling in YARN 85

Scheduler Options 86

Capacity Scheduler Configuration 88

Fair Scheduler Configuration 90

Delay Scheduling 94

Dominant Resource Fairness 95
