Uncommon Measures Revisited

 

Practice Area Division(s): Education

Topic: Test Administration and Delivery Models

Session Type: Breakout

In 1999, the National Research Council published Uncommon Measures: Equivalence and Linkage Among Educational Tests. The report was motivated by the late-1990s debate between those who favored Voluntary National Tests as a means of assessing the educational progress of students across the nation and those who believed that statistical linkages among existing tests could serve that purpose. The volume examined the feasibility of linking the results of commercial and state tests in order to compare an individual student’s achievement with national and international benchmarks, as well as with the achievement of students in other places. The executive summary of that report (Feuer, Holland, Green, Bertenthal, & Hemphill, 1999) includes the following conclusions:

  1. Comparing the full array of currently administered commercial and state achievement tests to one another, through the development of a single equivalency or linking scale, is not feasible. (p. 4)

  2. Reporting individual student scores from the full array of state and commercial achievement tests on the NAEP scale and transforming individual scores on these various tests and assessments into the NAEP achievement levels are not feasible. (p. 4)

  3. Under limited conditions it may be possible to calculate a linkage between two tests, but multiple factors affect the validity of inferences drawn from the linked scores. These factors include the content, format, and margins of error of the tests; the intended and actual uses of the tests; and the consequences attached to the results of the tests. When tests differ on any of these factors, some limited interpretations of the linked results may be defensible while others would not. (p. 5)

  4. Links between most existing tests and NAEP, for the purpose of reporting individual students' scores on the NAEP scale and in terms of the NAEP achievement levels, will be problematic. Unless the test to be linked to NAEP is very similar to NAEP in content, format, and uses, the resulting linkage is likely to be unstable and potentially misleading. (p. 5)

Over the last decade and a half, the practice of linking test score scales has developed in several ways. Many linkages have been performed, some sound and some unsound. In addition, there have been several technical advances in linking methodology (Dorans, 1999; Dorans & Holland, 2000; Pommerich & Dorans, 2004; Dorans, Pommerich, & Holland, 2007).

Presenter: Neil Dorans, Educational Testing Service (ETS)