Wednesday, March 04, 2015, 10:15 AM – 11:15 AM (PST)


Assessing Conversion Equivalence Between Computer and Paper Pencil Testing


Practice Area Division(s): Education, Clinical
Topic: Testing, Measurement, and Psychometrics

Measurement conditions play a vital role in assessment. In this study, measurement conditions refer to the situation in which the same set of items is administered in different modes: computer-based test (CBT) and paper-and-pencil test (PPT). When a test is administered in different modes, the question of score equity arises: Do items function the same way when they are administered on computer vs. on paper?

The issue of score equity between CBT and PPT has been extensively studied, yet no consensus has been reached. Some studies found no mode effects, whereas others detected them. While the debate continues, none of the earlier studies examined the practical equivalence of conversions: when mode effects are detected, should separate conversions be reported for CBT and PPT, or should a common conversion be reported?

The purpose of this study is to assess the practical equivalence of conversions, and to determine when it is defensible to report a separate conversion for the same form administered under CBT vs. PPT.

Data Collection: Equivalent Groups Design
Data will be collected from an admission test. Test takers will be randomly assigned to two groups: one group takes the CBT, and the other takes the PPT. The scores obtained under CBT will be linked to those obtained under PPT using an equivalent groups design.

Analysis Plan
Comparison of Item Statistics
Item statistics, such as P+ (proportion correct) and r-bis (biserial correlation), can be compared for individual items across modes. The correlation of item difficulty between the two modes will be examined as well.
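
A minimal sketch of the item-level comparison, assuming responses are scored 0/1 with one row per test taker. The function names and the toy data are hypothetical and serve only to illustrate P+, r-bis, and the cross-mode correlation of item difficulty; they are not the study's data or software.

```python
from statistics import NormalDist, mean

def p_plus(responses):
    """Proportion correct (P+) for each item; rows are test takers, columns are items."""
    n = len(responses)
    return [sum(row[j] for row in responses) / n for j in range(len(responses[0]))]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def r_bis(item, total):
    """Biserial correlation of a 0/1 item with the total score.

    Computed from the point-biserial: r_bis = r_pb * sqrt(p * q) / y,
    where y is the standard normal ordinate at the cut that splits p and q.
    Requires 0 < p < 1.
    """
    p = mean(item)
    q = 1 - p
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return pearson(item, total) * (p * q) ** 0.5 / y

# Hypothetical toy data: 4 test takers x 3 items per mode.
cbt = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0]]
ppt = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]

p_cbt, p_ppt = p_plus(cbt), p_plus(ppt)
difficulty_r = pearson(p_cbt, p_ppt)  # agreement of item difficulty across modes

# r-bis for the first CBT item against the CBT total score.
item1 = [row[0] for row in cbt]
totals = [sum(row) for row in cbt]
item1_rbis = r_bis(item1, totals)
```

Items whose P+ or r-bis differ noticeably between modes, or a low correlation of item difficulty, would be candidates for closer inspection.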

Comparison of Form Equivalence
We also need to evaluate form equivalence at the test level: Will test difficulty change enough to result in different equating functions? We will link the CBT form to the PPT form. If the difference between the linking function and the identity function is large enough (0.5) and not due to sampling error (outside ±2 standard errors of equating, SEE), mode effects are suspected to exist.
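
The check against the identity function can be sketched as follows. The abstract does not name the linking method, so this example assumes a simple linear (mean-sigma) link purely for illustration. The SEE values are hypothetical inputs supplied by the caller, and reading "large enough (0.5)" as "at least 0.5" is also an assumption.

```python
from statistics import mean, pstdev

def linear_link(cbt_scores, ppt_scores):
    """Return a linear function placing CBT scores on the PPT scale.

    Assumption: linear (mean-sigma) linking under an equivalent groups design.
    """
    mu_x, sd_x = mean(cbt_scores), pstdev(cbt_scores)
    mu_y, sd_y = mean(ppt_scores), pstdev(ppt_scores)
    slope = sd_y / sd_x
    return lambda x: mu_y + slope * (x - mu_x)

def flag_mode_effect(link, see_by_score, threshold=0.5):
    """Score points where |link(x) - x| is at least `threshold`
    and also falls outside the +/- 2 SEE band."""
    flagged = []
    for x, see in see_by_score.items():
        diff = link(x) - x
        if abs(diff) >= threshold and abs(diff) > 2 * see:
            flagged.append(x)
    return flagged

# Hypothetical total scores for the two randomly equivalent groups.
cbt_scores = [10, 12, 14, 16, 18]
ppt_scores = [11, 13, 15, 17, 19]
link = linear_link(cbt_scores, ppt_scores)

# Hypothetical SEE at selected score points (in practice estimated,
# e.g. analytically or by bootstrap).
see = {10: 0.3, 14: 0.2, 18: 0.6}
flagged = flag_mode_effect(link, see)
```

Both conditions are required here: a difference of 0.5 or more that still lies within ±2 SEE is treated as indistinguishable from sampling error.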

Assessing the Practical Equivalence of Conversions
If the two comparisons above suggest that mode effects are likely to exist, we will assess whether it is defensible to report a separate conversion. If the choice of conversion would affect the reported scores of more than half of the test takers (e.g., unrounded scores differ by at least half of the reported score unit), using a separate conversion for the altered conditions should be considered.
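
The "more than half of the test takers" criterion can be illustrated with a short sketch. The raw-score frequencies, the two conversion tables, and the reported score unit of 1 are all hypothetical values chosen for the example.

```python
def proportion_affected(freq, conv_separate, conv_common, score_unit=1.0):
    """Share of test takers whose unrounded converted scores differ by at
    least half a reported score unit between the two conversions.

    freq: raw score -> number of test takers at that raw score
    conv_separate, conv_common: raw score -> unrounded converted score
    """
    total = sum(freq.values())
    affected = sum(
        n for raw, n in freq.items()
        if abs(conv_separate[raw] - conv_common[raw]) >= score_unit / 2
    )
    return affected / total

# Hypothetical raw-score distribution and conversions.
freq = {0: 10, 1: 30, 2: 40, 3: 20}
conv_common = {0: 10.0, 1: 20.0, 2: 30.0, 3: 40.0}
conv_separate = {0: 10.0, 1: 20.6, 2: 30.7, 3: 40.2}

share = proportion_affected(freq, conv_separate, conv_common)
use_separate = share > 0.5  # criterion from the text: more than half affected
```

Weighting by the number of test takers at each raw score matters: a large conversion difference at a sparsely populated score point contributes little to the share.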

PRESENTERS:
Jinghua Liu, Secondary School Admission Test Board
Linda Cook, Lakeview Consulting