Optimal Scale Points for Reliable Measurements: Exploring the Impact of Scale Point Variation

Raoda Ismail; Heri Retnawati; Sugiman Sugiman; Farida Agus Setiawati; Okky Riswandha Imawan; Purwoko Haryadi Santoso

Abstract

Reliable measurement is crucial for minimising error in assessment. The assessment community commonly uses reliability coefficients to estimate the dependability of test scores. Despite the importance of these coefficients, little research has examined how the estimated reliability coefficient relates to the number of scale points used. This study investigates the number of scale points that yields the most accurate estimate of the reliability coefficient, with the aim of offering practical guidance to practitioners. Using simulated data, it examines scales with 2 to 11 points. The results show that the number of scale points has a substantial effect on reliability estimation, and that the most accurate estimate is obtained with 8-point scales. These findings clarify the optimal number of scale points for reliable measurement and can guide future improvements in assessment.
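The kind of simulation the abstract describes can be illustrated with a minimal sketch. It is not the authors' design: the abstract does not state which reliability coefficient, data-generating model, sample size, or thresholds were used. The sketch assumes Cronbach's alpha as the coefficient, a single normally distributed latent trait, ten items with a common loading of 0.7, and equal-width cut points between -2.5 and 2.5. The function names `cronbach_alpha` and `simulate_alpha` and all parameter values are hypothetical.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    sum_item_var = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)        # variance of total scores
    return n_items / (n_items - 1) * (1.0 - sum_item_var / total_var)


def simulate_alpha(n_points, n_respondents=2000, n_items=10, loading=0.7, seed=0):
    """Return (alpha on continuous responses, alpha after cutting to n_points).

    Assumptions (not taken from the study): one normal latent trait,
    equal loadings, and equal-width thresholds on [-2.5, 2.5].
    """
    rng = np.random.default_rng(seed)
    trait = rng.standard_normal((n_respondents, 1))
    noise = rng.standard_normal((n_respondents, n_items))
    latent = loading * trait + np.sqrt(1.0 - loading**2) * noise

    # n_points categories need n_points - 1 interior thresholds.
    cuts = np.linspace(-2.5, 2.5, n_points + 1)[1:-1]
    observed = np.digitize(latent, cuts) + 1          # categories 1..n_points

    return cronbach_alpha(latent), cronbach_alpha(observed)


# Compare the continuous benchmark with each discretised scale, 2 to 11 points.
for k in range(2, 12):
    continuous, discretised = simulate_alpha(k)
    print(k, round(continuous, 3), round(discretised, 3))
```

Comparing each discretised value with the continuous benchmark shows how much information the coarser scale loses under these assumptions. A non-normal latent distribution, which the keywords suggest the study considered, would need a different generator for `latent`.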

Keywords

number of scale points, non-normal, reliability coefficient.

DOI: 10.15408/jp3i.v13i1.34173