The Use of Open-ended Questions in Large-Scale Tests for Selection: Generalizability and Dependability

Hakan Atılgan, Elif Kübra Demir, Tuncay Ogretmen & Tahsin Oğuz Başokçu

What level of reliability can be achieved when open-ended questions are used in large-scale selection tests has become a critical question. The first aim of the present study is to determine the reliability obtained when test-takers' answers to open-ended short-answer questions are scored by experts in such tests. The second aim is to show how reliability changes with the number of items and raters, and how many items and raters are required to reach a sufficient degree of reliability. The study group consisted of 443 8th-grade students from three secondary schools located in three different towns of the city of Izmir. These students took a test of 20 open-ended short-answer questions that was developed within the scope of the study. Four experienced teachers rated the students' answers independently of one another. The analyses used G theory's fully crossed two-facet design, p × i × r, with students (p), items (i) and raters (r). The analyses estimated the generalizability coefficient and a dependability coefficient of Φ = 0.855, and it was concluded that well-educated raters can score open-ended short-answer questions with an adequate level of consistency.

Keywords: Large-Scale Tests, Open-Ended Question, Generalizability Theory, Rater Reliability, Generalizability, Dependability
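
The question of how reliability changes with the number of items and raters can be illustrated with a decision (D) study calculation for a fully crossed random p × i × r design. The sketch below uses the standard G theory formulas for the generalizability coefficient and the dependability coefficient Φ. The class name, function name and variance component values are hypothetical and chosen only for illustration; they are not the estimates obtained in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarianceComponents:
    """Variance components of a fully crossed random p x i x r design."""
    p: float      # persons (object of measurement)
    i: float      # items
    r: float      # raters
    pi: float     # person x item interaction
    pr: float     # person x rater interaction
    ir: float     # item x rater interaction
    pir_e: float  # person x item x rater interaction confounded with error


def d_study(vc: VarianceComponents, n_items: int, n_raters: int) -> tuple[float, float]:
    """Return (generalizability coefficient, Phi) for a given number of items and raters."""
    # Relative error variance: only interactions that involve persons.
    rel_error = (vc.pi / n_items
                 + vc.pr / n_raters
                 + vc.pir_e / (n_items * n_raters))
    # Absolute error variance: relative error plus item, rater and item x rater effects.
    abs_error = (rel_error
                 + vc.i / n_items
                 + vc.r / n_raters
                 + vc.ir / (n_items * n_raters))
    g_coef = vc.p / (vc.p + rel_error)
    phi = vc.p / (vc.p + abs_error)
    return g_coef, phi


if __name__ == "__main__":
    # Hypothetical variance components, NOT the values estimated in the study.
    vc = VarianceComponents(p=0.040, i=0.030, r=0.001,
                            pi=0.120, pr=0.001, ir=0.002, pir_e=0.020)
    for n_items in (10, 20, 30):
        for n_raters in (1, 2, 4):
            g_coef, phi = d_study(vc, n_items, n_raters)
            print(f"items={n_items:2d} raters={n_raters} G={g_coef:.3f} Phi={phi:.3f}")
```

Because the absolute error variance includes every term of the relative error variance, Φ can never exceed the generalizability coefficient, and both coefficients rise as items or raters are added.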

