Constructing an Automated Scoring Model for Chinese Students' English Essays (《中国学生英语作文自动评分模型的构建》)

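  • Author: Liang Maocheng (梁茂成)
  • Publisher: Foreign Language Teaching and Research Press, Beijing
  • Year of publication: 2011
  • ISBN: 9787513504997
  • Pages: 291
About the book: This book studies how to build a statistical model for the automated scoring of Chinese students' English essays, and examines how well a range of textual features in those essays predict essay scores.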

Part One Introduction 1

Introducing the Study 2

0.1 Introductory remarks 2

0.2 Need for this study 3

0.2.1 Theoretical considerations 3

0.2.2 Practical considerations 7

0.3 Description of the study 10

0.4 Organization of the study 11

0.5 Summary 12

Part Two Literature Review 13

Chapter 1 A Review of Existing Computer-Assisted Essay Scoring Systems 14

1.1 Introduction 14

1.2 Key concepts 14

1.2.1 Computer-assisted essay scoring 14

1.2.2 EFL writing assessment 16

1.3 Existing computer-assisted essay scoring systems 17

1.3.1 Project Essay Grade (PEG): A form-focused system 17

1.3.2 Intelligent Essay Assessor (IEA): A content-focused system 20

1.3.3 E-rater: A hybrid system with a modular structure 22

1.3.4 An appraisal of the three existing systems 25

1.4 Lessons from existing essay scoring systems 28

1.5 Summary 31

Chapter 2 Studies on Measures of Writing Quality 33

2.1 Introduction 33

2.2 Measures of writing quality in the literature 33

2.2.1 Measures of the quality of language 34

2.2.2 Measures of the quality of content and organization 51

2.3 An overview of the measures in the literature 57

2.4 A conceptual model for the computer-assisted scoring of EFL essays 61

2.5 Proposed measures of EFL writing quality 65

2.5.1 Proposed measures of the quality of language in EFL writing 65

2.5.2 Proposed measures of the quality of content in EFL writing 69

2.5.3 Proposed measures of the quality of organization in EFL writing 71

2.6 Summary 75

Part Three Methodology 77

Chapter 3 Research Questions and Data Preparation 78

3.1 Introduction 78

3.2 Research questions 78

3.3 The corpus 80

3.4 The rating scheme 82

3.4.1 Selecting a rating scale 82

3.4.2 The revised rating scale 84

3.4.3 The evaluation of content 87

3.4.4 The weighting scheme 90

3.5 Rating 91

3.5.1 Rater selection 92

3.5.2 Rater training 92

3.5.3 The rating sessions 93

3.6 Score reliability 94

3.7 Summary 96

Chapter 4 Text Analysis and Statistical Analysis 97

4.1 Introduction 97

4.2 Tools 97

4.3 Essay feature extraction 99

4.3.1 Language features 100

4.3.2 Content features 103

4.3.3 Organizational features 110

4.4 Data analysis 111

4.4.1 Correlation analysis 111

4.4.2 Multiple regression analysis 112

4.4.3 Stages of data analysis 113

4.5 Summary 117

Part Four Results and Discussion 119

Chapter 5 Identifying Predictors of EFL Writing Quality 120

5.1 Introduction 120

5.2 Linguistic features and writing quality 120

5.2.1 Fluency and writing quality 123

5.2.2 Complexity of language and writing quality 126

5.2.3 Measures of linguistic idiomaticity and appropriateness 138

5.3 Results of content analysis 144

5.3.1 Results of Latent Semantic Analysis 145

5.3.2 Procedural vocabulary and essay score 149

5.4 Essay organization and writing quality 151

5.4.1 Paragraphing and writing quality 152

5.4.2 Discourse conjuncts and writing quality 159

5.4.3 Demonstratives, pronouns, connectives and writing quality 159

5.5 Power of the predictors proposed in this study 159

5.6 Summary 161

Chapter 6 A Statistical Model for Computer-Assisted Essay Scoring 164

6.1 Introduction 164

6.2 Diagnosing the preliminary model 165

6.3 The refined model 168

6.4 Predictors and aspects of writing quality measured 172

6.4.1 Predictors in the language module 173

6.4.2 Predictors in the content module 178

6.4.3 Predictors in the organization module 181

6.4.4 Interdependence of the modules 183

6.5 Implementing the model 185

6.6 Summary 187

Chapter 7 Validating the Model 188

7.1 Introduction 188

7.2 Cross-validating the model 188

7.3 Reliability of computer scores in cross-validation 191

7.3.1 Aspects of reliability 191

7.3.2 Consistency estimates 193

7.3.3 Consensus estimates 195

7.4 Double cross-validation 198

7.4.1 Constructing the model 198

7.4.2 Model statistics and estimated equation 199

7.5 Reliability of computer scores in double cross-validation 201

7.6 Comparison with existing essay scoring systems 204

7.6.1 Comparison with PEG 205

7.6.2 Comparison with IEA 208

7.6.3 Comparison with E-rater 212

7.7 Summary 214

Part Five Conclusion 215

Chapter 8 Conclusion 216

8.1 Major findings 216

8.1.1 A model for the computer-assisted scoring of EFL essays 216

8.1.2 Predictors of EFL writing quality 220

8.2 Limitations of the study 223

8.3 Future work 224

References 226

Appendices 249

Appendix Ⅰ PEG's proxes and their beta values (Page 1968) 249

Appendix Ⅱ Page's (1995) model and variables 251

Appendix Ⅲ Argument weight 253

Appendix Ⅳ Examples of good openings and endings 255

Appendix Ⅴ Scoring table (Organization & Content) 256

Appendix Ⅵ Scoring table (Language) 257

Appendix Ⅶ List of stopwords 258

Appendix Ⅷ Lemma list (excerpt) 262

Appendix Ⅸ List of content words 266

Appendix Ⅹ Sample essays 283

Appendix Ⅺ POS-tagged samples 286

List of Tables

Chapter 1

Table 1.1 Comparison of strengths and weaknesses of existing essay scoring systems 26

Table 1.2 Approaches and measured constructs 28

Chapter 2

Table 2.1 Measures of writing quality in previous studies 58

Chapter 3

Table 3.1 Comparison of holistic and analytic scales (from Weigle 2002) 83

Table 3.2 Jacobs et al.'s (1981) scale: Aspects of quality and their emphasis 85

Table 3.3 Modified scheme: Aspects of writing quality 86

Table 3.4 Aspects of writing quality and their emphasis in the revised scale 91

Table 3.5 Inter-rater correlations (Training set) 95

Table 3.6 Mean and standard deviation of scores (Training set) 95

Table 3.7 Inter-rater correlations (Validation set) 95

Table 3.8 Mean and standard deviation of scores (Validation set) 95

Chapter 4

Table 4.1 Directly extracted language features 100

Table 4.2 Computed language features 100

Chapter 5

Table 5.1 Measures of the quality of language explored 122

Table 5.2 Correlations between fluency measures and essay scores 123

Table 5.3 Correlations between general lexical features and essay scores 127

Table 5.4 Correlations between TTR, Index of Guiraud and essay scores 130

Table 5.5 Correlations between the number of words in VFP lists and essay scores 131

Table 5.6 Correlation between uncommon-common word ratio and essay scores 134

Table 5.7 Correlations between measures of syntactic complexity and essay scores 135

Table 5.8 Examples of recurrent word combinations 140

Table 5.9 Correlation between the number of RWCs and essay scores 140

Table 5.10 Correlations between the use of prepositions, the use of the definite article and essay scores 143

Table 5.11 Correlations between standard SVD measures, revised SVD measures and essay scores 146

Table 5.12 Correlation between the number of PV items and essay scores 149

Table 5.13 Correlation between paragraphing and essay scores 154

Table 5.14 Categories of discourse conjuncts 156

Table 5.15 Correlation between discourse conjuncts and essay scores 157

Table 5.16 Power of the predictors proposed in this study 160

Table 5.17 Variables and aspects of writing quality they measure 161

Chapter 6

Table 6.1 Summary for the preliminary model 165

Table 6.2 Problematic variables in the model 167

Table 6.3 Predicting power of the model 168

Table 6.4 Predictors and their beta weights 170

Table 6.5 Predictors in the language module 173

Table 6.6 Predicting power of the language module 176

Table 6.7 Coefficients of predictors in the language module 177

Table 6.8 Predictors in the content module 178

Table 6.9 Predicting power of the content module 180

Table 6.10 Coefficients of predictors in the content module 180

Table 6.11 Predictors in the organization module 181

Table 6.12 Predicting power of the organization module 182

Table 6.13 Coefficients of predictors in the organization module 183

Table 6.14 The unique power of the content module 183

Table 6.15 The unique power of the organization module 184

Table 6.16 The unique power of the language module 185

Chapter 7

Table 7.1 Pearson correlations between human raters and the computer 194

Table 7.2 Cronbach's alpha coefficients 195

Table 7.3 Exact agreement between human raters and computer 196

Table 7.4 Exact-plus-adjacent agreement between human raters and computer 197

Table 7.5 A summary of reliability estimates 198

Table 7.6 Model summary (double cross-validation) 199

Table 7.7 Regression coefficients (double cross-validation) 200

Table 7.8 Consistency coefficients of reliability (double cross-validation) 202

Table 7.9 Consensus estimates of reliability (double cross-validation) 203

Table 7.10 Pearson correlations and exact-plus-adjacent agreement 205

Table 7.11 Reliability of PEG's first experiment (from Page 2003) 205

Table 7.12 Reliability of PEG's NAEP experiment (Page 1994) 206

Table 7.13 Reliability of PEG's Praxis experiment (from Page 2003) 207

Table 7.14 Major experiments with PEG 207

Table 7.15 Representative experiments with LSA 210

Table 7.16 E-rater's mean agreement with human raters (Burstein et al. 1998a) 212

Table 7.17 E-rater's reliability reported in Burstein et al. (2001) 213

Chapter 8

Table 8.1 A list of reconfirmed predictors 221

Table 8.2 Revised predictors and their correlations with essay quality 222

List of Figures

Chapter 1

Figure 1.1 Modularity in the computer-assisted essay scoring model 30

Chapter 2

Figure 2.1 Relationship between the No. of types and the No. of tokens in a text 44

Figure 2.2 A conceptual model for the computer-assisted scoring of EFL essays 62

Chapter 4

Figure 4.1 Term-by-document matrix 105

Figure 4.2 Weighted term-by-document matrix 106

Figure 4.3 Singular Value Decomposition (SVD) 107

Figure 4.4 Matrix reconstruction 107

Figure 4.5 Reconstructed matrix 108

Figure 4.6 The reference in the revised approach of LSA 109

Figure 4.7 Flow chart of the model-training stage 113

Figure 4.8 Flow chart of the cross-validation phase 114

Figure 4.9 Flow chart of the double cross-validation phase 116

Chapter 5

Figure 5.1 Relationship between the number of paragraphs and essay scores 152

Chapter 6

Figure 6.1 Relationship between the standardized predicted value and the dependent variable 169

Figure 6.2 Estimated Equation 1 172

Figure 6.3 Implementing the model 187

Chapter 7

Figure 7.1 Computing essay scores 190

Figure 7.2 Variables and computer-generated scores 191

Figure 7.3 Estimated equation 2 (double cross-validation) 201