Comparison and Properties of Correlational and Agreement Methods for Determining Whether or Not to Report Subtest Scores

Authors

  • Oksana Babenko
  • W. Todd Rogers

Keywords:

subscore reporting; accuracy; precision; large-scale assessment

Abstract

Large-scale testing agencies often report subtest scores in addition to the
total test score. But is there evidence that subtests reveal differences in student
performance? Three methods for determining whether subscore reporting is warranted
were examined and evaluated using large-scale data as well as samples of various sizes
for Reading and Mathematics assessments. Results revealed that the subtests did not differ
among themselves and added no value over the total test. The statistics used by the
methods were found to be accurate and precise estimators of the population parameters.
Implications for subscore reporting are discussed.

Published

2014-04-30

Section

Articles