Original article | International Journal of Progressive Education 2017, Vol. 13(1) 136-152
pp. 136 - 152 | Manu. Number: ijpe.2017.008
Published online: February 01, 2017
The aim of this study is to examine the variability and reliability of the scores that EFL instructors assign to EFL compositions of different quality, together with the instructors’ rating behaviors. The study used a mixed-methods design. Quantitative data were collected from EFL instructors’ ratings of 30 compositions at three quality levels, scored with a holistic rubric. Qualitative think-aloud protocol data were collected concurrently from a sub-sample of raters. The generalizability theory (G-theory) approach was used to analyze the quantitative data. The results showed that the raters’ scores deviated most for very low-level and mid-range compositions and were more consistent for very high-level compositions. The reliability of the ratings of high-quality papers (G coefficient = .87, phi coefficient = .79) was higher than that obtained for mid-range and low-quality compositions, indicating that high-quality papers can be rated more reliably. The think-aloud protocol analysis indicated that the raters attended to different aspects of the compositions at each of the three quality levels. Implications for performance assessment practice are discussed.
Keywords: Inexpert raters, generalizability theory, variability of ratings, writing assessment.
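For readers unfamiliar with the two reliability indices reported in the abstract, the following is a minimal sketch of how a G coefficient (for relative decisions) and a phi coefficient (for absolute decisions) can be estimated. It assumes a fully crossed, single-facet design in which every rater scores every composition once; the abstract does not state the study's exact design or software, so this is an illustration of the general technique and not the authors' analysis. The function name `g_and_phi` and the sample scores are hypothetical.

```python
def g_and_phi(scores, n_raters_decision=None):
    """Estimate G and phi for a crossed compositions x raters design.

    scores: list of rows, one row per composition, one column per rater.
    n_raters_decision: number of raters assumed in the decision study
    (defaults to the number of raters in the data).
    Assumption: one score per cell, no missing data.
    """
    n_p = len(scores)          # compositions (objects of measurement)
    n_r = len(scores[0])       # raters (the single facet)
    n_dec = n_raters_decision or n_r

    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(row[j] for row in scores) / n_p for j in range(n_r)]

    # Sums of squares from a two-way ANOVA without replication
    ss_p = n_r * sum((m - grand) ** 2 for m in p_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in r_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Variance components; negative estimates are set to zero
    var_res = max(ms_res, 0.0)
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_r = max((ms_r - ms_res) / n_p, 0.0)

    rel_error = var_res / n_dec              # relative error variance
    abs_error = (var_r + var_res) / n_dec    # absolute error variance

    g = var_p / (var_p + rel_error) if var_p + rel_error > 0 else 0.0
    phi = var_p / (var_p + abs_error) if var_p + abs_error > 0 else 0.0
    return g, phi


# Hypothetical data: 3 compositions scored by 2 raters
g, phi = g_and_phi([[1, 2], [3, 4], [5, 6]])
print(round(g, 3), round(phi, 3))
```

In this toy data set the second rater is uniformly one point more lenient than the first. That leniency leaves the rank order of the compositions untouched, so it does not lower G, but it does count as error for phi, which is why phi is never larger than G.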