Klasika: An Item and Test Analysis Program Using the Classical Approach

Bahrul Hayat


This article introduces Klasika, software developed to run item and test analysis using the Classical Test Theory approach. Classical Test Theory is one of the specialized competencies and skills that undergraduate psychology students must possess, and it is a mandatory course in all schools and departments of psychology in Indonesia. The article also provides a theoretical foundation for the essential concepts and statistical methods of Classical Test Theory, specifically those related to item and test statistics. The item analysis and test reliability procedures in Klasika, from data preparation through interpretation of the results, are explained with an empirical illustration. Finally, the results from Klasika are compared with those from the Quest software to check the accuracy of the estimates.
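The item and test statistics named in the abstract can be illustrated with a minimal Python sketch. It is not Klasika's implementation and makes no claim about how Klasika or Quest compute their output; it only applies the textbook Classical Test Theory formulas for item difficulty (proportion correct), item discrimination (corrected point-biserial correlation), and Cronbach's coefficient alpha. The function names and the small 0/1 response matrix are hypothetical, invented for illustration.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical scored data: rows are examinees, columns are items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]


def item_difficulty(data):
    """Proportion of examinees answering each item correctly (the p-value)."""
    n_items = len(data[0])
    return [mean(row[j] for row in data) for j in range(n_items)]


def corrected_point_biserial(data, j):
    """Correlation between item j and the total score on the remaining items."""
    item = [row[j] for row in data]
    rest = [sum(row) - row[j] for row in data]
    sd_item, sd_rest = pstdev(item), pstdev(rest)
    if sd_item == 0 or sd_rest == 0:
        return float("nan")  # undefined when everyone has the same score
    m_item, m_rest = mean(item), mean(rest)
    cov = mean((x - m_item) * (y - m_rest) for x, y in zip(item, rest))
    return cov / (sd_item * sd_rest)


def cronbach_alpha(data):
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(data[0])
    item_vars = [pvariance([row[j] for row in data]) for j in range(k)]
    total_var = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


difficulty = item_difficulty(responses)
discrimination = [corrected_point_biserial(responses, j) for j in range(4)]
alpha = cronbach_alpha(responses)

print("difficulty:", [round(p, 3) for p in difficulty])
print("discrimination:", [round(r, 3) for r in discrimination])
print("alpha:", round(alpha, 3))
```

The sketch uses population variances throughout; because alpha depends only on a ratio of variances, using sample variances consistently would give the same coefficient.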


classical test theory; item analysis; test reliability; Klasika; scoring; psychometrics software



DOI: 10.15408/jp3i.v10i1.20551


Copyright (c) 2021 Bahrul Hayat

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.