Natural Language Processing with Python (English Edition) PDF Download

  • Authors: Bird (UK), Klein (UK), and Loper (US)
  • Publisher: Nanjing: Southeast University Press
  • Year of publication: 2010
  • ISBN: 9787564122614
  • Pages: 482
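Book description: This book offers a highly accessible introduction to natural language processing, a field that covers a range of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you will learn to write Python programs that work with large collections of unstructured text. You will also access richly annotated datasets using a comprehensive range of linguistic data structures, and come to understand the main algorithms for analyzing the content and structure of written communication.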

1. Language Processing and Python 1

1.1 Computing with Language: Texts and Words 1

1.2 A Closer Look at Python: Texts as Lists of Words 10

1.3 Computing with Language: Simple Statistics 16

1.4 Back to Python: Making Decisions and Taking Control 22

1.5 Automatic Natural Language Understanding 27

1.6 Summary 33

1.7 Further Reading 34

1.8 Exercises 35

2. Accessing Text Corpora and Lexical Resources 39

2.1 Accessing Text Corpora 39

2.2 Conditional Frequency Distributions 52

2.3 More Python: Reusing Code 56

2.4 Lexical Resources 59

2.5 WordNet 67

2.6 Summary 73

2.7 Further Reading 73

2.8 Exercises 74

3. Processing Raw Text 79

3.1 Accessing Text from the Web and from Disk 80

3.2 Strings: Text Processing at the Lowest Level 87

3.3 Text Processing with Unicode 93

3.4 Regular Expressions for Detecting Word Patterns 97

3.5 Useful Applications of Regular Expressions 102

3.6 Normalizing Text 107

3.7 Regular Expressions for Tokenizing Text 109

3.8 Segmentation 112

3.9 Formatting: From Lists to Strings 116

3.10 Summary 121

3.11 Further Reading 122

3.12 Exercises 123

4. Writing Structured Programs 129

4.1 Back to the Basics 130

4.2 Sequences 133

4.3 Questions of Style 138

4.4 Functions: The Foundation of Structured Programming 142

4.5 Doing More with Functions 149

4.6 Program Development 154

4.7 Algorithm Design 160

4.8 A Sample of Python Libraries 167

4.9 Summary 172

4.10 Further Reading 173

4.11 Exercises 173

5. Categorizing and Tagging Words 179

5.1 Using a Tagger 179

5.2 Tagged Corpora 181

5.3 Mapping Words to Properties Using Python Dictionaries 189

5.4 Automatic Tagging 198

5.5 N-Gram Tagging 202

5.6 Transformation-Based Tagging 208

5.7 How to Determine the Category of a Word 210

5.8 Summary 213

5.9 Further Reading 214

5.10 Exercises 215

6. Learning to Classify Text 221

6.1 Supervised Classification 221

6.2 Further Examples of Supervised Classification 233

6.3 Evaluation 237

6.4 Decision Trees 242

6.5 Naive Bayes Classifiers 245

6.6 Maximum Entropy Classifiers 250

6.7 Modeling Linguistic Patterns 254

6.8 Summary 256

6.9 Further Reading 256

6.10 Exercises 257

7. Extracting Information from Text 261

7.1 Information Extraction 261

7.2 Chunking 264

7.3 Developing and Evaluating Chunkers 270

7.4 Recursion in Linguistic Structure 277

7.5 Named Entity Recognition 281

7.6 Relation Extraction 284

7.7 Summary 285

7.8 Further Reading 286

7.9 Exercises 286

8. Analyzing Sentence Structure 291

8.1 Some Grammatical Dilemmas 292

8.2 What's the Use of Syntax? 295

8.3 Context-Free Grammar 298

8.4 Parsing with Context-Free Grammar 302

8.5 Dependencies and Dependency Grammar 310

8.6 Grammar Development 315

8.7 Summary 321

8.8 Further Reading 322

8.9 Exercises 322

9. Building Feature-Based Grammars 327

9.1 Grammatical Features 327

9.2 Processing Feature Structures 337

9.3 Extending a Feature-Based Grammar 344

9.4 Summary 356

9.5 Further Reading 357

9.6 Exercises 358

10. Analyzing the Meaning of Sentences 361

10.1 Natural Language Understanding 361

10.2 Propositional Logic 368

10.3 First-Order Logic 372

10.4 The Semantics of English Sentences 385

10.5 Discourse Semantics 397

10.6 Summary 402

10.7 Further Reading 403

10.8 Exercises 404

11. Managing Linguistic Data 407

11.1 Corpus Structure: A Case Study 407

11.2 The Life Cycle of a Corpus 412

11.3 Acquiring Data 416

11.4 Working with XML 425

11.5 Working with Toolbox Data 431

11.6 Describing Language Resources Using OLAC Metadata 435

11.7 Summary 437

11.8 Further Reading 437

11.9 Exercises 438

Afterword: The Language Challenge 441

Bibliography 449

NLTK Index 459

General Index 463