Abstract:
Equating test scores is an important issue in large-scale testing. Almost all standardized tests have several forms which vary in difficulty, and new forms are written and added every year. When the items in different forms of a test vary in difficulty, direct comparison of test-takers who have taken different forms and are at the same ability level is not possible; hence, the issue of test fairness arises. In such situations there is a need for equating test scores so that standards can be maintained from year to year. That is, the scores must be adjusted for the difficulty of the test forms, and a scaled score reported that is comparable across all forms of the test. In this study, two forms of a reading comprehension test were equated and the consistency of pass/fail decisions was investigated under two conditions: with and without equating. Concurrent common-item equating with the one-parameter Item Response Theory (IRT) model was used to equate the two test forms. Results showed that the lack of equating leads to unfair pass/fail decisions. The implications for high-stakes large-scale testing are then discussed.
Machine-generated summary:
"JELS, Vol. 1, No. 3, Spring 2010, 113-128, IAUCTB. Test Score Equating and Fairness in Language Assessment. Purya Baghaei, Assistant Professor of Applied Linguistics, Islamic Azad University, Mashad Branch, Iran.
Testing companies normally administer their assessments more than once during a year, and for security reasons they cannot use the same test form over different administrations.
In order to exclude the second explanation and make sure that the observed trends are a valid reflection of changes in students’ abilities over time, the assessment mechanism needs to equate scores from different test forms and adjust for variations in form difficulty.
One of the most important properties of IRT is that if the data fit the IRT model, estimates of item difficulty and person ability parameters are independent of the sample of persons and the test form used for item analysis and person measurement.
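The parameter invariance described above follows from the form of the one-parameter (Rasch) model named in the abstract, in which the probability of a correct response depends only on the difference between person ability and item difficulty. The following minimal Python sketch illustrates that formula; the function name and sample values are chosen here for illustration only.

```python
import math

def rasch_probability(theta, b):
    """Probability of a correct response under the one-parameter
    (Rasch) IRT model: P = exp(theta - b) / (1 + exp(theta - b)),
    where theta is person ability and b is item difficulty,
    both expressed on the same logit scale."""
    return math.exp(theta - b) / (1 + math.exp(theta - b))

# A person whose ability equals the item's difficulty has a 50%
# chance of success, no matter which test form the item appears in:
print(rasch_probability(0.5, 0.5))   # 0.5
# Higher ability relative to difficulty raises the probability:
print(rasch_probability(1.5, 0.5) > 0.5)   # True
```

Because the probability depends only on the gap theta - b, difficulty and ability estimates can, when the data fit the model, be placed on one common scale regardless of which sample or form produced them.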
[Figure 2: Data setup for common item equating]
A concurrent common-item equating design was used to place the items and persons from the two tests on the same scale, so that the abilities of the persons who had taken the two different test forms could be compared.
Thus, the procedure allows the comparison of the difficulty estimates of the items in the two test forms and the ability estimates of the persons who have taken the two forms on a common scale."
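The data setup behind concurrent common-item equating can be sketched as follows: responses from both forms are stacked into a single matrix whose columns are the union of all items, with the shared anchor items linking the two groups and items a person never saw treated as missing rather than wrong. The item labels and response patterns below are invented purely for illustration.

```python
# Hypothetical forms: C1 and C2 are the common anchor items
# that appear on both forms and link the two scales.
FORM_A_ITEMS = ["A1", "A2", "C1", "C2"]
FORM_B_ITEMS = ["C1", "C2", "B1", "B2"]
ALL_ITEMS = ["A1", "A2", "C1", "C2", "B1", "B2"]

def to_row(items_taken, responses):
    """Map one person's scored responses (1 = correct, 0 = wrong)
    onto the combined item set, leaving not-administered items
    as missing (None) rather than incorrect."""
    scored = dict(zip(items_taken, responses))
    return [scored.get(item) for item in ALL_ITEMS]

# One examinee from each form:
row_a = to_row(FORM_A_ITEMS, [1, 0, 1, 1])
row_b = to_row(FORM_B_ITEMS, [1, 0, 0, 1])

print(row_a)  # [1, 0, 1, 1, None, None]
print(row_b)  # [None, None, 1, 0, 0, 1]
```

Calibrating this single combined matrix in one run places all item difficulties and all person abilities on one common scale, which is what permits the cross-form comparisons described above.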