Development of irtawsi: A User-Friendly R Package for IRT Analysis

Hari Purnomo Susanto; Agus Maman Abadi; Haryanto; Heri Retnawati; Raden Muhammad Ali; Hasan Djidu

Abstract

IRT analysis is too complex to perform manually, so it requires easy-to-use software. Many software options exist for IRT analysis, but the high cost of paid software puts it out of reach for many students and lecturers in Indonesia. The mirt package provides a complete, free option, but using it requires proficiency in the R programming language. This study aims to develop an R package for IRT analysis, built on the mirt package and equipped with a user-friendly interface, that is easy for beginners in IRT analysis to use. Development follows the System Development Life Cycle (SDLC) model, which includes five stages: Planning, Analysis, Design, Implementation, and System. The resulting package, named irtawsi, offers functionality comparable to paid software. It can calibrate both test and non-test instruments using various IRT models, such as the Rasch, 2PL, 3PL, 4PL, GRM, PCM, and GPCM models. Its features include (1) an easy-to-use user interface, (2) automatic interpretation of analysis results, (3) a guide for IRT analysis, (4) recommendations when assumptions are not met, (5) an HTML report format for analysis results, (6) support for two languages (Indonesian and English), (7) free availability, and (8) installation on Windows, macOS, and Linux operating systems.
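
For orientation, the dichotomous models named in the abstract (Rasch, 2PL, 3PL, 4PL) can be viewed as special cases of one four-parameter logistic item response function. The sketch below is illustrative only and is not code from irtawsi or mirt; the function name `irf_4pl` and its parameter names are assumptions chosen for this example, and a plain logistic form without a scaling constant is assumed.

```python
import math


def irf_4pl(theta, a=1.0, b=0.0, c=0.0, d=1.0):
    """Probability of a correct response under a 4PL model (illustrative).

    theta: person ability
    a: discrimination, b: difficulty
    c: lower asymptote (guessing), d: upper asymptote
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


# Special cases obtained by fixing parameters (hypothetical item values):
p_rasch = irf_4pl(0.5, b=0.5)               # Rasch: a = 1, c = 0, d = 1
p_2pl = irf_4pl(0.5, a=1.7, b=0.5)          # 2PL: c = 0, d = 1
p_3pl = irf_4pl(0.5, a=1.7, b=0.5, c=0.2)   # 3PL: d = 1
```

When ability equals difficulty, the probability sits midway between the lower and upper asymptotes, which is why the Rasch and 2PL calls above give 0.5 and the 3PL call gives a value above 0.5.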

Keywords

IRT, irtawsi, calibration, instrument, user-friendly

DOI: 10.15408/jp3i.v14i1.32091