Building Natural Language Generation Systems (《自然语言生成系统的建造》) PDF Download

  • Authors: Ehud Reiter (UK) and Robert Dale (Australia)
  • Publisher: Beijing: Peking University Press
  • Year of publication: 2010
  • ISBN: 9787301171547
  • Pages: 248
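Book description: This book explains how to build a natural language generation system. A natural language generation system is a computer software system that uses methods and techniques from artificial intelligence and computational linguistics to automatically generate understandable natural language text; such text may stand alone or form part of a multimedia document. Taking some non-linguistic representation as its input, the system uses linguistic knowledge and knowledge of the application domain to automatically produce documents, reports, manuals, help messages, and other kinds of text.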

1 Introduction 1

1.1 The Research Perspective 2

1.1.1 Differences between NL Generation and NL Understanding 2

1.1.2 Sharing Knowledge between Generation and Understanding 3

1.2 The Applications Perspective 4

1.2.1 Computer as Authoring Aid 5

1.2.2 Computer as Author 6

1.2.3 Uses of NLG Technology 6

1.3 Some Example NLG Systems 7

1.3.1 WEATHERREPORTER 8

1.3.2 FOG 9

1.3.3 IDAS 12

1.3.4 MODELEXPLAINER 14

1.3.5 PEBA 15

1.3.6 STOP 16

1.4 A Short History of NLG 19

1.5 The Structure of This Book 20

1.6 Further Reading 21

2 Natural Language Generation in Practice 23

2.1 Introduction 23

2.2 When Are NLG Techniques Appropriate? 24

2.2.1 Text versus Graphics 25

2.2.2 Natural Language Generation versus Mail Merge 26

2.2.3 Natural Language Generation versus Human Authoring 28

2.3 Using a Corpus to Determine User Requirements 30

2.3.1 Assembling an Initial Corpus of Output Texts 31

2.3.2 Analysing the Information Content of Corpus Texts 33

2.3.3 Creating a Target Text Corpus 35

2.4 Evaluating NLG Systems 37

2.5 Fielding NLG Systems 38

2.6 Further Reading 40

3 The Architecture of a Natural Language Generation System 41

3.1 Introduction 41

3.2 The Inputs and Outputs of Natural Language Generation 42

3.2.1 Language as Goal-Driven Communication 42

3.2.2 The Inputs to Natural Language Generation 43

3.2.3 The Output of Natural Language Generation 46

3.3 An Informal Characterisation of the Architecture 47

3.3.1 An Overview of the Architecture 47

3.3.2 Content Determination 50

3.3.3 Document Structuring 51

3.3.4 Lexicalisation 52

3.3.5 Referring Expression Generation 55

3.3.6 Aggregation 56

3.3.7 Linguistic Realisation 57

3.3.8 Structure Realisation 59

3.4 The Architecture and Its Representations 59

3.4.1 Broad Structure and Terminology 59

3.4.2 Messages 61

3.4.3 The Document Planner 63

3.4.4 Document Plans 64

3.4.5 Microplanning 65

3.4.6 Text Specifications 66

3.4.7 Phrase Specifications 67

3.4.8 Surface Realisation 71

3.5 Other Architectures 72

3.5.1 Different Representations and Modularisations 72

3.5.2 Different Architectures: Integrated Systems 76

3.6 Further Reading 77

4 Document Planning 79

4.1 Introduction 79

4.1.1 What Document Planning Is About 79

4.1.2 The Inputs and Outputs of Document Planning 80

4.1.3 A WEATHERREPORTER Example 82

4.2 Representing Information in the Domain 83

4.2.1 What's in a Domain Model? 86

4.2.2 Domain Modelling for WEATHERREPORTER 87

4.2.3 Implementing Domain Models 88

4.2.4 Defining Messages 89

4.2.5 Determining the Degree of Abstraction in Messages 91

4.2.6 A Methodology for Domain Modelling and Message Definition 94

4.3 Content Determination 95

4.3.1 Aspects of Content Determination 96

4.3.2 Deriving Content Determination Rules 98

4.3.3 Implementing Content Determination 100

4.4 Document Structuring 101

4.4.1 Discourse Relations 102

4.4.2 Implementation: Schemas 104

4.4.3 Implementation: Bottom-up Techniques 107

4.4.4 A Comparison of Approaches 109

4.4.5 Knowledge Acquisition 110

4.5 Document Planner Architecture 110

4.6 Further Reading 112

5 Microplanning 114

5.1 Introduction 115

5.1.1 Why Do We Need Microplanning? 115

5.1.2 What's Involved in Microplanning? 116

5.1.3 The Inputs and Outputs of Microplanning 117

5.1.4 The Architecture of a Microplanner 122

5.2 Lexicalisation 124

5.2.1 Simple Lexicalisation 124

5.2.2 Simple Lexical Choice 126

5.2.3 Contextual and Pragmatic Influences on Lexical Choice 128

5.2.4 Expressing Discourse Relations 129

5.2.5 Fine-Grained Lexicalisation 130

5.3 Aggregation 132

5.3.1 Mechanisms for Sentence Formation 133

5.3.2 Choosing between Possible Aggregations 140

5.3.3 Order of Presentation 142

5.3.4 Paragraph Formation 143

5.4 Generating Referring Expressions 144

5.4.1 The Nature of the Problem 144

5.4.2 Forms of Referring Expressions and Their Uses 145

5.4.3 Requirements for Referring Expression Generation 146

5.4.4 Generating Pronouns 149

5.4.5 Generating Subsequent References 152

5.5 Limitations and Other Approaches 156

5.6 Further Reading 157

6 Surface Realisation 159

6.1 Introduction 159

6.2 Realising Text Specifications 162

6.3 Varieties of Phrase Specifications 164

6.3.1 Skeletal Propositions 165

6.3.2 Meaning Specifications 166

6.3.3 Lexicalised Case Frames 168

6.3.4 Abstract Syntactic Structures 168

6.3.5 Canned Text 169

6.3.6 Orthographic Strings 170

6.3.7 Summary 170

6.4 KPML 171

6.4.1 An Overview 171

6.4.2 The Input to KPML 171

6.4.3 Using Systemic Grammar for Linguistic Realisation 176

6.4.4 Summary 179

6.5 SURGE 179

6.5.1 An Overview 179

6.5.2 The Input to SURGE 180

6.5.3 Functional Unification Grammar 182

6.5.4 Linguistic Realisation via Unification 183

6.6 REALPRO 186

6.6.1 An Overview 186

6.6.2 The Input to REALPRO 187

6.6.3 Meaning-Text Theory 189

6.6.4 How REALPRO Works 190

6.6.5 Summary 191

6.7 Choosing a Realiser 192

6.8 Bidirectional Grammars 194

6.9 Further Reading 196

7 Beyond Text Generation 198

7.1 Introduction 198

7.2 Typography 201

7.2.1 The Uses of Typography 201

7.2.2 Typography in NLG Systems 203

7.2.3 Implementing Typographic Awareness 206

7.3 Integrating Text and Graphics 208

7.3.1 The Automatic Generation of Graphical Objects 209

7.3.2 Choosing a Medium 210

7.3.3 Commonalities between Text and Graphics 213

7.3.4 Implementing Text and Graphics Integration 214

7.4 Hypertext 216

7.4.1 Hypertext and Its Uses 216

7.4.2 Implementing Hypertext-based NLG Systems 219

7.5 Speech Output 221

7.5.1 The Benefits of Speech Output 221

7.5.2 Text-to-Speech Systems 222

7.5.3 Implementing Concept-to-Speech 225

7.6 Further Reading 227

Appendix: NLG Systems Mentioned in This Book 229

References 231

Index 243

List of Figures

1.1 The Macquarie weather summary for February 1995 8

1.2 Data input for the WEATHERREPORTER system 10

1.3 FOG input: a weather system over Canada 11

1.4 Some example forecasts from FOG 11

1.5 Some example texts from IDAS 13

1.6 An example MODELEXPLAINER input: an object-oriented class model 14

1.7 An example description produced by MODELEXPLAINER from the model in Figure 1.6 15

1.8 A WWW page generated by PEBA 16

1.9 A letter generated by the prototype STOP system 18

2.1 An example human-authored text in the WEATHERREPORTER domain 32

2.2 An example daily weather record 32

2.3 An example target text in the WEATHERREPORTER domain 36

3.1 Modules and tasks 49

3.2 The weather summary for February 1995 50

3.3 The weather summary for August 1995 50

3.4 The weather summary for January 1996 51

3.5 The structure of the weather summary in Figure 3.2 53

3.6 Using typography to indicate discourse structure 54

3.7 An NLG system architecture 60

3.8 A definition of some message types in WEATHERREPORTER 62

3.9 A MonthlyTemperatureMsg message 62

3.10 A MonthlyRainfallMsg message 63

3.11 A simple document plan representation 64

3.12 A simple text specification representation 67

3.13 Some phrase specification representations 68

3.14 An abstract syntactic representation for The month had some rainy days 70

3.15 A lexicalised case frame for The month had some rainy days 70

3.16 Combining canned text and abstract syntactic structures 71

3.17 Variations in terminology 72

3.18 A proto-phrase specification representation of the MonthlyRainfallMsg message 73

4.1 A knowledge base fragment in PEBA 81

4.2 A daily weather record 82

4.3 The document plan corresponding to the text in Figure 4.5 83

4.4 A document plan 84

4.5 The weather summary for July 1996 84

4.6 Example MonthlyTemperatureMsg and RainEventMsg messages from the document plan shown in Figure 4.3 85

4.7 The correspondence of temperature values to categories 91

4.8 The definition of TemperatureSpellMsg 91

4.9 A TemperatureSpellMsg message 92

4.10 A message defined as a string 92

4.11 A message specified as a month and a textual attribute 93

4.12 A corpus-based procedure for domain modelling and message definition 95

4.13 A corpus-based procedure for identifying content determination rules 99

4.14 The RST definition of Elaboration 103

4.15 PEBA's Compare-And-Contrast schema 105

4.16 A simple set of schemas for WEATHERREPORTER 106

4.17 A bottom-up discourse structuring algorithm 108

4.18 Bottom-up construction of a document plan 109

4.19 Different combinations of document structuring and content determination 111

5.1 A simple weather summary 118

5.2 The top-level document plan for the text in Figure 5.1 118

5.3 The first constituent of the document plan corresponding to the text in Figure 5.1 118

5.4 The second constituent of the document plan corresponding to the text in Figure 5.1 119

5.5 A schematic representation of the document plan in Figures 5.2-5.4 119

5.6 The top-level text specification corresponding to the text in Figure 5.1 120

5.7 The phrase specification for Sentence1 in Figure 5.6 (The month was slightly warmer than average, with the average number of rain days) 120

5.8 The phrase specification for Sentence2 in Figure 5.6 (Heavy rain fell on the 27th and 28th) 121

5.9 A blackboard architecture for a microplanner 122

5.10 A simple microplanner architecture 123

5.11 Simplified microplanning 124

5.12 A simple template for RainEventMsgs 125

5.13 A simple message 125

5.14 The proto-phrase specification produced by applying the template to a message 125

5.15 A simple template for RainSpellMessages 126

5.16 An algorithm for lexicalising spells 127

5.17 A template proto-phrase specification for on the ith 127

5.18 A template proto-phrase specification for on the ith and jth 127

5.19 A decision tree for realising the Contrast relation 129

5.20 The weather summary for July 1996 without aggregation 132

5.21 A proto-phrase specification for Heavy rain fell on the 28th 134

5.22 The result of a simple conjunction (Heavy rain fell on the 27th and 28th) 135

5.23 The result of a shared-participant conjunction (Heavy rain fell on the 27th and 28th) 138

5.24 The result of a shared-structure conjunction (Heavy rain fell on the 27th and 28th) 139

5.25 Some of Scott and de Souza's heuristics for aggregation 142

5.26 Some example rules from the AECMA Simplified English guide 142

5.27 A conservative pronoun-generation algorithm 151

5.28 Distinguishing descriptions 153

5.29 An algorithm for producing distinguishing descriptions 154

5.30 Constructing a referring expression 155

6.1 A simple PEBA text specification 160

6.2 A surface form with mark-up annotations for the PEBA text 160

6.3 The PEBA text as displayed by the presentation system 161

6.4 The logical structure specification in LaTeX form 164

6.5 The logical structure specification in Word RTF form 164

6.6 A skeletal proposition as an AVM 166

6.7 A meaning specification 167

6.8 A lexicalised case frame 168

6.9 An abstract syntactic structure 169

6.10 A canned text structure 169

6.11 An orthographic string structure 170

6.12 An input to KPML, from which KPML produces March had some rainy days 172

6.13 An AVM representation of the structure in Figure 6.12 172

6.14 A more complex SPL expression (The month was cool and dry with the average number of rain days) 173

6.15 An SPL which exploits a middle model (March had some rainy days) 174

6.16 The mood system for the English clause 176

6.17 A chooser for deciding which determiner should be used 177

6.18 Some realisation statement operators 178

6.19 Realisation statements in a network 178

6.20 An input to SURGE, from which SURGE produces March had some rainy days 180

6.21 An AVM representation of the structure in Figure 6.20 181

6.22 A more complex input to SURGE (The month was cool and dry with the average number of rain days) 182

6.23 An FD for a simple sentence (John sells the car) 184

6.24 A simple FUF grammar 184

6.25 The result of initial unification of the input FD with the grammar 185

6.26 The FD in Figure 6.23 after unification with the grammar in Figure 6.24 186

6.27 The DSyntS for John helps Mary 187

6.28 Variations of the DSyntS in Figure 6.27 187

6.29 A DSyntS for March had some rainy days 188

6.30 An AVM representation of the structure in Figure 6.29 188

6.31 A more complex DSyntS input to REALPRO (The month was cool and dry with the average number of rain days) 188

6.32 Common relations in MTT's DSyntS 189

6.33 Some simple REALPRO grammar rules 191

6.34 Stages of realisation in REALPRO 192

7.1 An example output from PEBA 199

7.2 Using typeface variation for emphasis 202

7.3 Using a labelled list structure to aid information access 203

7.4 Information expressed typographically as a decision tree 204

7.5 Information expressed typographically via a table 204

7.6 Information expressed compactly 205

7.7 Declarative mark-up annotations 205

7.8 Physical mark-up annotations 205

7.9 A tabular presentation of aggregated information 207

7.10 Multimodal output from the WIP system 208

7.11 Graphics-only output from the WIP system 209

7.12 Text-only output from the WIP system 209

7.13 The document plan for the output in Figure 7.10 215

7.14 The architecture of a text-to-speech system 222

7.15 SABLE mark-ups for controlling speech synthesis 224