Chapter 1 Introduction 1
1.1 The Nature of the Problem 1
1.2 Defining the Problem 3
1.3 Purpose of Investigations 4
1.4 Significance of the Study 6
1.5 Organization of Thesis 7
Chapter 2 English Teaching and Language Testing in China 10
2.1 English Education in China 11
2.1.1 Student Population 11
2.1.2 Teacher Resources 12
2.1.3 Instructional Time for Language Learning 12
2.1.4 Management System 13
2.1.5 Textbooks and Teaching Resources 14
2.1.6 Trends of Development 14
2.2 Communicative Language Teaching and the National English Curriculum Standards 15
2.2.1 Communicative Language Teaching 15
2.2.2 The National English Curriculum Standards 16
2.3 Current Problems in English Teaching in Schools 19
2.3.1 The Curriculum Objectives and Language Assessment 19
2.3.2 Curriculum, Teaching and Testing: Problems 21
2.4 Task-Based Approaches to the Problems 26
2.4.1 Task-Based Language Teaching: Challenges 27
2.4.2 Task-Based Assessment 30
2.5 Summary 32
Chapter 3 Literature Review: Language Testing, Task-Based Assessment, and Task Difficulty 34
3.1 Language Testing: Issues and Problems 35
3.2 Communicative Language Testing 37
3.2.1 Basic Issues 37
3.2.2 Communicative Tests: Models 42
3.3 Task-Based Approaches and Performance Assessment 46
3.3.1 Some Research Questions in Task-Based Approaches 48
3.3.2 Task-Based Approaches to Testing 49
3.3.3 Cognitive Dimensions: Information-Processing Approaches 61
3.3.4 Implications for Language Assessment 67
3.4 Assessing Task Difficulty 70
3.4.1 Defining Task Difficulty 71
3.4.2 Sequencing Tasks: Rationale and Task Difficulty Factors 73
3.4.3 Designing Tasks 80
3.5 Task Difficulty Matrix 82
3.5.1 The Norris et al. and Brown et al. Studies 82
3.5.2 Problematizing Norris et al.'s Matrix 84
3.6 Rater Training and Objective Measures 86
3.6.1 Task Difficulty Matrix: Rater Training 86
3.6.2 Measuring Task Performance: The Discourse Analysis Measures 90
Chapter 4 Research Methods 95
4.1 Research Design 96
4.2 Method 101
4.2.1 Participants 104
4.2.2 Effective Tasks 108
4.2.3 Establishing the Difficulty of Tasks 111
4.2.4 Qualitative Analyses 115
4.2.5 Rater Training 118
4.2.6 Measuring Student Written Performance 119
4.3 Data Summary and Data Analysis 122
4.3.1 TBA Reliability 123
4.3.2 TBA Validity 123
4.3.3 Data Analysis 126
Chapter 5 Task Difficulty Matrix I: Evolving the IPO Task Difficulty Matrix 130
5.1 Application of Norris et al.'s Matrix 131
5.1.1 Study One: Application of Norris et al.'s Matrix 131
5.1.2 Studies Two, Three, and Four: Applications of Norris et al.'s Matrix 136
5.2 The Input-Processing-Output Task Difficulty Matrix 156
5.2.1 Establishing the Matrix: Input-Processing-Output by Content-Form-Support 156
5.2.2 Study Five: Applying the IPO-CFS Task Difficulty Framework 159
5.2.3 Study Six: Refining the Task Difficulty Framework: IPO by Extended CFS 168
5.2.4 Study Seven: Refining the Task Difficulty Framework: IPO by Extended CFS 175
5.2.5 Self-Report Written Data 182
5.2.6 Study Eight: Refining the Task Difficulty Framework: Integrated IPO by ILPS 190
5.3 Task Difficulty Component Analysis 198
5.3.1 Component Analysis and the Refined IPO Matrix 198
5.3.2 Study Nine: Refining the Task Difficulty Framework: IPO by Reduced ILPS 202
5.3.3 Study Ten: Refining the Task Difficulty Framework: IPO by Reduced ILPS 204
Chapter 6 IPO Task Difficulty Matrix II: Dimensions and Components 217
6.1 The Construction of the Task Difficulty Matrix 218
6.1.1 Definitions of Dimensions and Their Characteristics 220
6.1.2 Definitions of the Task Components in Operational Terms 222
6.2 A Comparison between Brown et al.'s Matrix and the IPO Task Difficulty Matrix 236
6.2.1 Similarities 236
6.2.2 Differences 238
6.3 The Original Research Questions 246
6.3.1 Research Question 1 246
6.3.2 Research Question 2 247
6.3.3 Research Question 3 248
6.3.4 Research Question 4 250
6.4 Summary 251
Chapter 7 Rater Training for IPO Task Difficulty Matrix 253
7.1 Rationale 254
7.2 Study 11: A Pilot Study for Rater Training 256
7.2.1 Materials and Raters 256
7.2.2 Procedures 257
7.2.3 Results 258
7.3 Study 12: Establishing Expert Ratings 261
7.3.1 Tasks and Instruments 261
7.3.2 Experts and Procedures 264
7.3.3 Results 265
7.4 Study 13: Rater Training 270
7.4.1 Raters and Materials 271
7.4.2 Procedures 272
7.4.3 Results 277
7.5 Discussion and Implications 285
7.5.1 Discussion: Rater Training and Standardization 285
7.5.2 Implications for Rater Training 286
Chapter 8 Discourse Measures for Student Performance 288
8.1 Research Methods 291
8.1.1 Participants 291
8.1.2 Writing Tasks 292
8.1.3 Task Instructions and Formats 295
8.1.4 Discourse Analysis Measures 297
8.1.5 Setting and Procedures 304
8.2 Results and Analysis 304
8.2.1 Discourse Measure Results 304
8.2.2 Discourse Measure Correlations 308
8.2.3 Analytical Rating Results and Discourse Measures 310
8.2.4 Students' Perceptions of the Writing Tasks 317
8.3 Discussion: The Impact of Conditions on Discourse Measures 321
8.4 Conclusions 322
Chapter 9 Conclusions and Implications 326
9.1 Summary of Research Findings 326
9.2 Implications, Reflections, and Future Research 328
9.2.1 Tasks and Task-Based Assessment 328
9.2.2 Language Teaching and Learning 329
9.2.3 Reflections and Limitations 331
9.2.4 Suggestions for Future Research 335
9.3 Conclusions 336
References 340
Appendices 363
Table 2.1 The NECS Goals 17
Table 3.1 Two dimensions underlying the study of tasks 54
Table 3.2 Feuerstein's Cognitive Map 63
Table 3.3 Feuerstein et al.'s three aspects of the learning phase 65
Table 3.4 Checklist for task description 81
Table 3.5 Task difficulty matrix for prototypical tasks: ALP 83
Table 4.1 Research overview 99
Table 4.2 Summary of research participants and instruments 103
Table 4.3 Raters in the 13 studies 105
Table 4.4 Six tasks from Norris et al. with tentative difficulty levels 109
Table 4.5 Input-Processing-Output Task Difficulty Matrix for TBA 113
Table 4.6 Analytical Writing Scales 119
Table 4.7 Discourse Measures (Writing) 121
Table 4.8 Specifications for Authenticity 125
Table 4.9 The framework of studies related to the research questions 127
Table 5.1 Task difficulty matrix for prototypical tasks: ALP 132
Table 5.2 Modified task difficulty matrix 137
Table 5.3 Questionnaire for teachers: Judging difficulty levels of tasks 138
Table 5.4 Six tasks from Norris et al. with different difficulty levels 141
Table 5.5 Norris et al.'s estimations and teachers' ratings in Study Two 142
Table 5.6 Norris et al.'s estimations and teachers' ratings and rankings in Study Three 146
Table 5.7 A comparison of teachers' average ratings in Study Two and Study Three 147
Table 5.8 Eight tasks of low difficulty levels 150
Table 5.9 Norris et al.'s estimations and teachers' ratings and rankings in Study Four 152
Table 5.10 Summary of the task difficulty factors 154
Table 5.11 IPO-CFS task difficulty rating framework 158
Table 5.12 Average ratings for the two sets of 24 tasks 162
Table 5.13 A range of 24 tasks across 8 themes at different difficulty levels 169
Table 5.14 Rating results of the 24 tasks on the IPO-CFS task difficulty framework 172
Table 5.15 Factors reported to affect task difficulty 174
Table 5.16 Task difficulty: IPO by CFMS 176
Table 5.17 Extended IPO-CFS task difficulty matrix 177
Table 5.18 Total averages for the 24 tasks in Study Seven 179
Table 5.19 Total inter-rater correlations of the 24 tasks in Study Seven 180
Table 5.20 E's estimations of the five tasks 192
Table 5.21 Fifteen prototypical tasks 193
Table 5.22 Total averages for the 15 tasks in Study Eight 194
Table 5.23 Inter-rater correlations of the 15 tasks in Study Eight 195
Table 5.24 Total inter-rater correlations in Study Nine 203
Table 5.25 Nine tasks for the refined IPO task difficulty matrix 205
Table 6.1 IPO task difficulty matrix 219
Table 6.2 IPO Task Difficulty Matrix for TBA 232
Table 6.3 Example task characteristic codings 240
Table 6.4 Teachers' and students' perceptions of factors that affect task difficulty 249
Table 7.1 Nine tasks for the refined IPO task difficulty matrix 256
Table 7.2 Total inter-rater correlations of the nine tasks 259
Table 7.3 Component correlations and descriptive statistics 259
Table 7.4 Twelve tasks for rater training 262
Table 7.5 Study 12: Basic statistics 265
Table 7.6 Expert rating correlations 268
Table 7.7 Tasks 1-6 (NECS themes: School life; Weather) 268
Table 7.8 Tasks 7-12 (NECS themes: Topical issues; Literature) 268
Table 7.9 Component correlations and medians 269
Table 7.10 Twelve writing tasks 272
Table 7.11 Format of training and standardization session 272
Table 7.12 Basic statistics in Study 13 277
Table 7.13 Study 13: Rater training rating correlations 279
Table 7.14 Inter-rater correlations of the first 6 tasks 279
Table 7.15 Inter-rater correlations of the second 6 tasks 280
Table 7.16 Medians of the component correlations 281
Table 7.17 A comparison between trainee ratings and expert ratings 282
Table 8.1 Potential effects of task features on learner production 289
Table 8.2 Summary of participants in Study 14 292
Table 8.3 A range of 6 tasks across 5 themes at different difficulty levels 293
Table 8.4 Developmental Measures (Writing) 298
Table 8.5 Analytical Writing Scales 301
Table 8.6 Questionnaire on task instructions, topics, time and support 303
Table 8.7 Average of discourse measure results 305
Table 8.8 Discourse measure correlations 309
Table 8.9 Correlations between the two markers on the 6 tasks: Fluency, accuracy, and complexity 311
Table 8.10 Expert ratings, student performance ratings, and discourse measures 312
Table 8.11 Correlations between expert ratings,student performance,and discourse measures 313
Table 8.12 Focus group interview summary 318
Table 8.13 Results of the questionnaire 320
Table 8.14 Task dimensions and task characteristics 322
Figure 2.1 The structure of curriculum goals 18
Figure 3.1 Components of language competence 44
Figure 3.2 Components of communicative language ability in communicative language use 44
Figure 3.3 Task-Based Performance and Language Testing: The Skehan Model 45
Figure 4.1 The holistic task difficulty vertical line 114
Figure 5.1 The holistic task difficulty vertical line 144
Figure 5.2 Feuerstein's seven parameters and IPO-CFS task difficulty matrix 157
Figure 5.3 IPO figure: Mental process of task raters 188
Figure 7.1 The 12 tasks and their difficulty levels 263
Figure 7.2 Expert ratings of the 12 writing tasks 266
Figure 7.3 Comparison of expert holistic and analytical ratings 267
Figure 7.4 Sequence of the holistic and the analytical ratings 267
Figure 7.5 Rater training data results 278
Figure 7.6 A comparison between trainee ratings and expert ratings 282
Figure 7.7 Rating difference between experts and trainees 283
Figure 7.8 Sequence of the expert and the trainee ratings 284
Figure 8.1 The difficulty levels of the 6 writing tasks 293
Figure 8.2 Relationship between expert estimates and student performance 314
Figure 8.3 Relationship between expert estimates and language complexity 314
Figure 8.4 Relationship between language complexity and student performance 315
Appendix A General Objectives in the Chinese National English Curriculum Standards 363
Appendix B Research Overview 366
Appendix C Analytical Writing Scales 370
Appendix D Six Prototypical Tasks from Norris et al. and the Five Related NECS Themes 372
Appendix E The 48 NECS Theme-Based Tasks 374
Appendix F Part I: Task Demands and Task Features 377
Part II: Task Requirements for the Nine Pairs of NECS-Based Tasks 381
Appendix G A Range of 24 NECS Theme-Based Tasks at Three Generated Difficulty Levels 383
Appendix H A Synthesis of Cognitive Skills with Examples 385
Appendix I Reasons for the 24 Tasks Being Easy or Difficult 389
Appendix J IPO-CFS Task Difficulty Matrix for Task-Based Assessment 391
Appendix K Task Samples in the Self-Written Reports 393
Appendix L IPO Task Difficulty Matrix for TBA:Fifteen Tasks 395
Appendix M Part I: IPO Task Difficulty Matrix for Task-Based Assessment 396
Part II: Task-Based Assessment 400
Part III: IPO Task-Based Difficulty Matrix for Task-Based Assessment 401
Appendix N The 12 NECS Theme-Based Tasks for Rater Training 402
Appendix O Questionnaire on Task Instructions, Topics, Time and Support 415
Appendix P Measures of Fluency,Accuracy,and Complexity of Each Task 416