Development of irtawsi: A User-Friendly R Package for IRT Analysis
DOI:
https://doi.org/10.15408/jp3i.v14i1.32091
Keywords:
IRT, irtawsi, Calibration, instrument, User Friendly
Abstract
IRT analysis is too complex to carry out by hand, so it depends on easy-to-use software. Many programs for IRT analysis exist, but the cost of commercial software puts it out of reach for many students and lecturers in Indonesia. The mirt package offers a complete, free alternative, but it requires proficiency in the R programming language. This study aims to develop an R package for IRT analysis, built on the mirt package and equipped with a user-friendly interface, that beginners in IRT analysis can use easily. Development follows the System Development Life Cycle (SDLC) model, which includes five stages: Planning, Analysis, Design, Implementation, and System. The resulting package, named irtawsi, offers functionality comparable to paid software. It can calibrate both test and non-test instruments using various IRT models, such as the Rasch, 2PL, 3PL, 4PL, GRM, PCM, and GPCM models. Its features include: (1) an easy-to-use user interface, (2) automatic interpretation of analysis results, (3) a guide for IRT analysis, (4) recommendations when assumptions are not met, (5) an HTML report format for analysis results, (6) support for two languages (Indonesian and English), (7) free availability, and (8) installation on Windows, macOS, and Linux operating systems. The results of this development make the calibration process easier for practitioners and researchers calibrating the instruments they develop, especially beginners who are learning IRT.
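The dichotomous models named in the abstract (Rasch, 2PL, 3PL, 4PL) form a nested family, which a short numerical sketch may make concrete. The code below is only an illustration of the standard logistic item response functions, written in Python for self-containment; it is not the irtawsi or mirt implementation (both are R packages), and the function names and the example parameter values are assumptions chosen for illustration.

```python
import math


def irf_4pl(theta, a=1.0, b=0.0, c=0.0, d=1.0):
    """Probability of a correct response under the 4PL model.

    theta: person ability; a: discrimination; b: difficulty;
    c: lower asymptote (guessing); d: upper asymptote (slipping).
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


def irf_3pl(theta, a, b, c):
    """3PL: the 4PL with the upper asymptote fixed at 1."""
    return irf_4pl(theta, a=a, b=b, c=c, d=1.0)


def irf_2pl(theta, a, b):
    """2PL: the 3PL with the lower asymptote fixed at 0."""
    return irf_3pl(theta, a=a, b=b, c=0.0)


def irf_rasch(theta, b):
    """Rasch: the 2PL with discrimination fixed at 1."""
    return irf_2pl(theta, a=1.0, b=b)


# Hypothetical item: the same difficulty evaluated under each model.
for theta in (-2.0, 0.0, 2.0):
    print(
        theta,
        round(irf_rasch(theta, b=0.0), 3),
        round(irf_2pl(theta, a=1.5, b=0.0), 3),
        round(irf_3pl(theta, a=1.5, b=0.0, c=0.2), 3),
        round(irf_4pl(theta, a=1.5, b=0.0, c=0.2, d=0.95), 3),
    )
```

Each simpler model is obtained by fixing one more parameter of the 4PL, which is why a single calibration tool can offer all four. The polytomous models in the list (GRM, PCM, GPCM) use different response functions and are not covered by this sketch.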
License
Copyright (c) 2025 Hari Purnomo Susanto

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.