Religious Tolerance Measurement: Validity Test in Indonesia

This study aims to determine the construct validity of the items in the religious tolerance scale constructed by Witenberg (2007). The scale covers three aspects or dimensions of religious tolerance: fairness, empathy, and reasonableness. This study examines the religious tolerance scale in Indonesia from a psychological perspective. The sample consists of 360 students of Syarif Hidayatullah State Islamic University of Jakarta, selected through non-probability sampling with a purposive sampling technique. The data were analyzed with Confirmatory Factor Analysis (CFA) using LISREL 8.70 software. The results show that the religious tolerance model fits the data and that the scale is unidimensional. Of the 30 items, only item 12 of the fairness dimension is not valid.


Introduction
Religious tolerance and intolerance have been hot topics in Indonesia, discussed by many people, both academics and non-academics. Many media, such as social media, newspapers, and scholarly journals, have shared the same concerns regarding these issues. The issue of religious tolerance, in particular, is related to the proliferation of intergroup relations problems. For example, many people discuss religious matters on social media, such as hatred against Jews and Christians, rejection of differences among Islamic groups, terrorism, and other issues (Fahmi, 2018; Idris, 2015).
The growth of religious intolerance as a problem widely discussed by the community is closely related to destructive actions by people who hold religiously intolerant views. For example, there have been several cases of the destruction of places of worship by certain groups in Minahasa, Mojokerto, Aceh, and Surabaya (Amindoni, 2019; BBC, 2019; Puspita, 2020). Another factor that has promoted the issue of religious tolerance is the rapid development of the internet. Everyone is free to discuss this phenomenon and share an opinion about it on social media. When freedom of discussion is wide open on social media, the most severe consequences are the spread of hatred towards others and the disruption of social harmony, as has happened in Indonesia this decade (Verkuyten & Yogeeswaran, 2016).
Based on this issue, the Center for the Study of Islam and Society (PPIM) conducted a study of opinions on religious tolerance, especially among Generation Z, which consists largely of today's students. The results show that they tend to hold radical views of differences and religion, especially hatred towards Jews (PPIM UIN Jakarta, 2018a). Their hatred of Jews is based on dogma and on the portrayal of Jews as a cunning group and an enemy of Islam. In a similar study, the Wahid Foundation found that religious intolerance towards non-Muslims in Indonesia had a high score of 38.4% (Wahid Foundation, 2016). Earlier research by the Ministry of Education and Culture of the Republic of Indonesia (KEMDIKBUD RI) found that tolerance among religious people, especially in terms of accepting activities carried out by other religions, is extremely low, with an intolerance score of 57.6% (PDSPK Kemdikbud RI, 2017).
On the other hand, data on religious tolerance within Islam itself, such as differences between groups in how Islam is practiced, show remarkably high intolerance scores (PPIM UIN Jakarta, 2018b). However, the religious intolerance opinion data in the same survey regarding non-Muslims, especially among religion teachers and lecturers, showed a score above 20%. Given that the sample in that study consisted of religion teachers, they would be expected to hold sound religious opinions and to be role models for students, but instead they showed a high level of intolerant opinion.
In such a situation, establishing religious tolerance is pivotal. To achieve ideal conditions, such as people holding moderate opinions about differences, particularly religious ones, formal studies should be conducted, for example research that uses a psychological approach to describe people. In reality, there are many measurements used in religious tolerance studies in Indonesia. Still, many of these measurements do not use psychological concepts and theories, in particular those of social psychology or political psychology. For example, a study by Mansur (2017) examined religious tolerance from the perspective of religious studies and sociology and did not address how to measure tolerance directly. Another article by Putra (2017) did not discuss the measurement of religious tolerance and offered no description of tolerance. In this context, it is essential to establish a religious tolerance instrument based on psychological concepts and theories to enrich the knowledge of measurement science in behavioral studies.
In psychology, measurements of religious tolerance in Indonesia are hard to find even after extensive searching. Some studies that set out to establish a measurement tool for religious tolerance were not valid, which means they did not measure what they aimed to measure, in this case religious tolerance. Mahar Dika used Allport's theory as the basis of his measurement, but it did not measure religious tolerance in its true meaning (Dhika, 2015). Another measurement was developed by the Ministry of Religious Affairs (Kementerian Agama RI, 2010). The study showed that the ministry did not establish its measurement based on statistical analysis to test its validity.
The study by the Ministry of Religious Affairs of the Republic of Indonesia used only expert judgment and basic statistical analysis, such as the classical technique of correlating item scores with the total score (Kementerian Agama RI, 2010). Another article discussing a tolerance scale was written by Supriyanto and published in Psikoislamika Journal (Supriyanto, 2017). A tolerance scale based on social psychological theories is regarded as the appropriate kind, but Supriyanto constructed a religious tolerance scale without explaining its validity. He only conducted reliability testing with Cronbach's alpha coefficient (see Supriyanto, 2017). Therefore, further development of a religious tolerance scale based on psychological theories is currently needed in the Indonesian context.

Religious Tolerance
Tolerance is defined as the kindness and warmth individuals show by accepting others regardless of skin color, race, religion, and so on (Allport, 1954; O'Connor, 2017). Mummendey and Wenzel (1999) described tolerance as acceptance and positive appraisal of differences, together with mutual understanding and respect among groups, better known as part of inclusive life. Even when individuals disagree with diversity, they must not reject it; instead they must accommodate it and interact with one another (Verkuyten & Yogeeswaran, 2016). In the context of religious tolerance, agreeing with another person's religion is not essential; what matters most is being accommodating and interactive in daily life. Thus, acceptance of diversity is crucial to religious tolerance.
Relatedly, Verkuyten (2010) explained that tolerance means valuing and respecting diversity and perceiving others positively. It also involves freedom from prejudice and acceptance of different ideas. More importantly, tolerance is not about imposing our understandings and beliefs on other individuals or groups. Another expert, Chong (1994), wrote that tolerant individuals must adjust to other groups with their own uniqueness and different situations, whether these stem from religious or social factors.
Based on awareness of the effects of conflict in different spheres, the concept of tolerance exists to lessen the potential negative impact that arises when individuals and groups behave intolerantly towards diversity (Sullivan & Transue, 1999). Therefore, despite feeling discomfort or dislike in such conditions, tolerant people have to show warmth and accept variety in order to achieve an orderly and harmonious life (Allport, 1954; Allport & Ross, 1967). Thus, tolerance involves conscious acknowledgment, positive appraisal and belief, appropriate behavior, empathy, and respect towards others based on equality despite many differences (Witenberg, 2004, 2007, 2019). In conclusion, Witenberg views tolerance in terms of a cognitive aspect (reasonableness), a behavioral aspect (fairness), and an affective aspect (empathy) in measuring an individual's tolerance level.
The three dimensions that measure the religious tolerance construct are as follows: 1. Fairness is treating others equally and fairly and feeling a sense of similarity as part of a diverse system of life. 2. Empathy relates to an individual's attitude towards the feelings, viewpoints, and suffering of others. 3. Reasonableness is appraising different people on the basis of logical and rational assumptions.
Construct validity, as one of the psychometric properties, is a critical aspect of research on measuring instruments. Through the construct validity test, the parameters of this scale determine whether the measurement model is adequate to measure religious tolerance (Umar & Nisa, 2020). In this study, the writers modified and tested the religious tolerance construct from Witenberg's theory. The writers hope that this scale model fits and can be used to measure religious tolerance, especially in Indonesia.

Description of Instrument
The measurement of religious tolerance is based on Witenberg's theory. Witenberg stated that there are three dimensions of religious tolerance, namely fairness, empathy, and reasonableness. The instrument uses a Likert scale, in which each statement has several response options. Our research used four response options: strongly disagree, disagree, agree, and strongly agree. We also used both favorable and unfavorable items to check the consistency of respondents. Below is the table of the Likert Scale that we used in our study:
The measurement tool is adapted from Witenberg's Tolerance to Human Diversity (2007), which used open-ended questions based on stories told to children and adolescents. The scale, translated and adapted for Indonesian Muslim culture, went through a series of processes and mechanisms such as reviews and approval by experts in the field:
1. The researchers examined Witenberg's theoretical construct, to be translated into the Indonesian language.
2. Measuring instruments, or items, were constructed along with a blueprint based on the constructs of the theory above.
3. Experts in social psychology, politics, religious studies, and cognitive psychology were asked to assess the quality and appropriateness of these measuring instruments.
4. The final set was fixed at 30 items to be tried out on respondents.
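The four-point response format and the favorable/unfavorable keying described above can be scored along the following lines. This is only a sketch: the Likert table itself is not reproduced in this text, so the 1 to 4 values and the reverse-keying rule are assumptions, and the names `RESPONSE_SCORES`, `score_item`, and `score_scale` are hypothetical.

```python
# Assumed scoring: the 1-4 values and reverse keying are illustrative,
# not taken from the paper's Likert table.
RESPONSE_SCORES = {
    "strongly disagree": 1,
    "disagree": 2,
    "agree": 3,
    "strongly agree": 4,
}


def score_item(response: str, favorable: bool) -> int:
    """Score one response; unfavorable items are reverse-keyed (1<->4, 2<->3)."""
    raw = RESPONSE_SCORES[response.strip().lower()]
    return raw if favorable else 5 - raw


def score_scale(responses, favorable_flags) -> int:
    """Sum the item scores for one respondent."""
    if len(responses) != len(favorable_flags):
        raise ValueError("each response needs a favorable/unfavorable flag")
    return sum(score_item(r, f) for r, f in zip(responses, favorable_flags))
```

Under these assumptions, a respondent who strongly agrees with a favorable item and strongly disagrees with an unfavorable one receives the maximum score of 4 on both.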
The blueprint of the religious tolerance scale instrument consists of 30 items as follows:

Methods
The religious tolerance scale construct was tested with confirmatory factor analysis (CFA) on data from 360 students of Syarif Hidayatullah State Islamic University, Jakarta. Construct validity testing was conducted using LISREL 8.70 software. CFA is often used as a confirmatory method for testing the validity of a scale's construct or measurement model. A construct with good validity will yield factor values that are consistent with the values the instrument produces in the field. In other words, CFA shows how well the model reproduces the covariance structure of the measured variables. Although the goal of CFA is to confirm a model, modifications that revise or change the model structure may be needed to find a valid model.
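As a point of reference, the unidimensional model tested for each dimension can be written in standard LISREL-style notation. The symbols x, λ, ξ, δ, and φ are assumed notation introduced here for illustration; this paper itself names only Σ, S, and the theta-delta matrix.

```latex
% One-factor (unidimensional) measurement model for p items.
% x: p x 1 vector of item scores, \xi: the single latent factor,
% \lambda: p x 1 vector of factor loadings, \delta: p x 1 vector of measurement errors.
\[
  x = \lambda\,\xi + \delta
\]
% Covariance structure implied by the model,
% with \phi = \mathrm{Var}(\xi) and \Theta_{\delta} = \mathrm{Cov}(\delta).
\[
  \Sigma = \phi\,\lambda\lambda^{\top} + \Theta_{\delta}
\]
```

In this sketch, Θ_δ is diagonal when measurement errors are uncorrelated; allowing item errors to correlate corresponds to freeing off-diagonal elements of Θ_δ.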
The following are the steps and logic used in CFA (Umar, 2011):
1. Establish an operational definition of the construct in order to write statements (items) that fit the scale used. The resulting constructs are factors, and the factors are measured by analyzing the responses to every item.
2. Test the hypothesis that the items form a unidimensional model by testing model fit. This test examines whether the items measure only one factor (unidimensional) or not. It is done by comparing the correlation matrix implied by the model (Σ) with the empirical data matrix (S). If the items are unidimensional, there is no difference between Σ and S, as shown in the notation Σ - S = 0.
3. Examine the goodness-of-fit statistics in the output to test the null hypothesis. The parametric test used is chi-square: when chi-square is not significant (p > 0.05), the model can be assumed to fit, that is, the null hypothesis Σ - S = 0 is not rejected. However, chi-square is extremely sensitive to sample size, so when the sample is large the chi-square statistic tends to be significant, suggesting that the model does not fit (Browne & Cudeck, 1992; Clogg & Bollen, 1991; Jöreskog & Sörbom, 1993).

4. As another option for evaluating model adequacy, the Root Mean Square Error of Approximation (RMSEA) is a better choice because it is less sensitive to sample size. A model is considered to fit if RMSEA < 0.05 (close fit) or RMSEA < 0.08 (reasonable fit) (Browne & Cudeck, 1992). Other fit indices such as GFI, CFI, and NFI, with a criterion of values above 0.90, are also good alternatives (Jöreskog & Sörbom, 1993).
5. Test the significance of each item in measuring the factor by examining its t-value; this can only be done after the model is deemed to fit. This study used a 95% confidence level (α = 0.05), so an item is significant if its t-value is greater than 1.96 (t > 1.96).
6. After testing item significance, delete items with negative loadings or with t-values below 1.96.
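Steps 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration, not the LISREL computation itself. The function names `rmsea`, `describe_fit`, and `keep_item` are hypothetical; the RMSEA point estimate uses the common form sqrt(max(chi-square - df, 0) / (df (N - 1))), although some programs divide by N instead; and the 0.05 and 0.08 cut-offs follow the conventional reading of Browne and Cudeck (1992).

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """RMSEA point estimate (N - 1 variant, an assumption of this sketch)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def describe_fit(value: float) -> str:
    """Label an RMSEA value using the conventional 0.05 / 0.08 cut-offs."""
    if value <= 0.05:
        return "close fit"
    if value <= 0.08:
        return "reasonable fit"
    return "poor fit"


def keep_item(loading: float, t_value: float, critical: float = 1.96) -> bool:
    """Retain an item only if its loading is positive and t exceeds the critical value."""
    return loading > 0 and t_value > critical
```

With the values this paper reports for the initial fairness model (chi-square 391.26, df 65, N 360), `rmsea` returns approximately 0.118, which `describe_fit` labels a poor fit; the modified fairness model (chi-square 58.41, df 43) gives approximately 0.032.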

Fairness Dimension Validity Test
The validity of the fairness dimension, which consists of 13 items, was tested using first-order (unidimensional) CFA. The initial one-factor model did not fit, with Chi-Square=391.26, df=65, P-value=0.000, and RMSEA=0.118. The model therefore had to be modified by allowing item errors in the theta-delta matrix to correlate. After 22 modifications, a fitting model was obtained, with Chi-Square=58.41, df=43, P-value=0.0586, and RMSEA=0.032.
After finding a fitting model, the next step is to test the significance of the items to see which should be dropped and which should be retained. The test looks at the t-value and the factor loading. If t>1.96, the item is significant and should not be dropped; items with a negative factor loading should be dropped. Table 3 shows that item 12 should be dropped because it is not significant (t<1.96). Figure 1 shows the estimates diagram for the fairness dimension.
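The degrees of freedom reported above are consistent with simple parameter counting. The sketch below assumes a one-factor model in which each of the p items has one free loading and one free error variance, the factor variance is fixed for identification, and each modification frees exactly one error covariance. These assumptions are not stated in the source, and the function name `one_factor_df` is hypothetical.

```python
def one_factor_df(n_items: int, freed_error_covariances: int = 0) -> int:
    """Degrees of freedom of a one-factor CFA under the stated assumptions."""
    max_covariances = n_items * (n_items - 1) // 2
    if not 0 <= freed_error_covariances <= max_covariances:
        raise ValueError("number of freed error covariances is out of range")
    # Unique variances and covariances among the observed items.
    moments = n_items * (n_items + 1) // 2
    # One loading and one error variance per item, plus any freed error covariances.
    free_parameters = 2 * n_items + freed_error_covariances
    return moments - free_parameters


print(one_factor_df(13))      # 65, the initial fairness model
print(one_factor_df(13, 22))  # 43, after 22 freed error covariances
```

Each freed error covariance costs one degree of freedom, which is why extensive modification steadily reduces df and makes a non-significant chi-square easier to reach.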

Empathy Dimension Validity Test
The validity of the empathy dimension, which consists of 9 items, was tested using first-order (unidimensional) CFA. The initial one-factor model did not fit, with chi-square=169.24, df=27, p-value=0.000, and RMSEA=0.121. The model therefore had to be modified by allowing item errors in the theta-delta matrix to correlate. After 8 modifications, a fitting model was obtained, with chi-square=29.34, df=19, p-value=0.061, and RMSEA=0.039. After finding a fitting model, the next step is to test the significance of the items to see which should be dropped and which should be retained. This test checks whether the t-value and factor loading of each item are significant. If the t-value is greater than 1.96 (t>1.96), the item is significant and is not removed. Figure 2 shows the estimates diagram for the empathy dimension.

Reasonableness Dimension Validity Test
The validity of the reasonableness dimension, which consists of 8 items, was tested using first-order (unidimensional) CFA. The initial one-factor model did not fit, with Chi-Square=87.15, df=20, P-value=0.000, and RMSEA=0.097. The model therefore had to be modified by allowing item errors in the theta-delta matrix to correlate. After 7 modifications, a fitting model was obtained, with Chi-Square=17.07, df=13, P-value=0.196, and RMSEA=0.030.
After finding a fitting model, the next step is to test the significance of the items to see which items must be deleted. The test looks at the t-value and the factor loading. If t>1.96, the item is considered significant and does not need to be deleted. Table 5 shows that all items measuring the reasonableness dimension are significant (t>1.96); therefore, no item should be eliminated. Figure 3 shows the estimates diagram for the reasonableness dimension.

Discussion
This study aims to test a scale of religious tolerance in the context of interactions with both non-Muslims and fellow Muslims who practice their religion differently. Indonesian culture is considered friendly by other nations; in contrast, religious intolerance is a widespread and fluctuating issue, ranging from intolerant attitudes to destructive actions with religious overtones, especially among people in youth and early adulthood. Based on these data, it is essential to have instruments that measure religious tolerance. However, current research focuses more on relationships between religious communities, more specifically between Muslims and non-Muslims, which leads to a scarcity of data on religious tolerance within a single religion.
The result of this construct validity analysis showed that the religious tolerance scale, tested on 360 subjects, was constructed from three factors, namely Fairness, Empathy, and Reasonableness. All items were significant when tested with CFA using parametric and non-parametric tests, except for one item in the fairness dimension. Although the scale has good psychometric properties, the sample of this study only involved students of UIN Jakarta. Further testing of this measuring tool is therefore needed on a more diverse sample, for example by including people who live in multicultural and other types of environments. Nevertheless, this measuring tool can serve as a guide for further research on measuring religious tolerance in Indonesia.
This study found that the religious tolerance scale produced results similar to those of other studies. Witenberg carried out measurements and research on tolerance with these three dimensions and obtained a strong inter-reliability score (Witenberg, 2019). This is a step in the right direction towards a sound measuring tool of religious tolerance from a psychological perspective, since that research was conducted in the cultural setting of non-Muslim Australians, whereas this study measures the religious tolerance of Muslim participants towards adherents of other religions in Indonesia.
However, this study found one item in the fairness dimension to be invalid. This may be due to factors such as respondents failing to understand the statement, there being too many items, which introduced error, or other causes. This is not problematic, because other items carry the same meaning, even though the fairness dimension is essential. This dimension measures people's opinions regarding equality and justice issues in daily life (Witenberg, 2007).
The empathy dimension has a vital role in increasing religious tolerance. In several studies, empathy is a positive predictor of tolerance (Gawali & Khattar, 2016; L. D. Korol, 2017; L. D. Korol & Cabral, 2016). Therefore, empathy towards other people is a necessary factor in measuring tolerance, particularly religious tolerance. The reasonableness dimension shares indicators and constructs with open-mindedness, such as judging people who are different without prejudice, making the necessary attributions, and being open to change, and it is a good predictor in measuring religious tolerance (Korol, 2018; K. Van Der Zee et al., 2013; K. I. Van Der Zee & Oudenhoven, 2001).
Another study in India showed that a religious tolerance scale administered to people aged 18-28 correlates significantly and positively with another religious tolerance scale (Batool & Akram, 2020). That study referred to another religious tolerance model by Van der Walt (see Walt, 2014), which focused on education, particularly among teachers and students, and found a high correlation. The model of religious tolerance in that research has similar factors and indicators, such as inclusivity, respect for others, and recognition of the freedom of others. Therefore, this construct of religious tolerance should be studied, especially in the indigenous culture of Indonesia.
Finally, one limitation of this study is the error inherent in the CFA method. For future research, using advanced analysis methods such as SEM (Structural Equation Modelling) is highly advised. Another limitation is the large number of correlated measurement errors (theta-delta), which means that the scale is not perfect and not free from error. The last limitation is the lack of sample diversity, as the participants were only students from Syarif Hidayatullah State Islamic University Jakarta; future research is advised to involve people from Indonesia's various cultural backgrounds.

Conclusion
This study seeks to find a scale or measuring instrument with valid psychometric properties to measure religious tolerance in Indonesia, especially from a psychological perspective. The religious tolerance instrument was defined on the basis of Witenberg's theory, which consists of three dimensions, namely fairness, empathy, and reasonableness. Using CFA, the results show that among the 30 items intended to measure religious tolerance, only one item was invalid and should be dropped. Based on 360 Muslim student respondents from various faculties of Syarif Hidayatullah State Islamic University Jakarta, this measuring tool is statistically valid. This is supported by similar research involving non-Muslim respondents in Australia. Thus, this scale can be applied in Indonesia, as it has good reliability and validity for measuring religious tolerance. However, further testing by various experts is needed. In the future, the measurement of religious tolerance can then use adequate and sophisticated tools, particularly within a psychological approach.