RASCH MODEL ANALYSIS: DEVELOPMENT OF HOTS-BASED MATHEMATICAL ABSTRACTION ABILITY INSTRUMENT ACCORDING TO RIAU ISLANDS CULTURE

The HOTS-based mathematical abstraction ability test instrument according to the culture of the Riau Islands is a type of question still rarely found in school learning. The goal of this study is to analyze the HOTS-based mathematical abstraction ability instrument according to the culture of the Riau Islands for 10th grade high school students. The respondents were 402 students from four high schools in the city of Tanjungpinang. Research data were collected using multiple-choice questions. The test instrument consisted of 19 items. Because of the high difficulty level of the HOTS questions and the students' limited time for working on them, the test instrument was divided into two packages: package A consists of 10 items, and package B consists of 9 items. The data were analyzed with Rasch modeling using Winsteps software. Both the package A and package B instruments showed reliability, with Cronbach's Alpha values of 0.73 and 0.75 respectively, which means both instruments can be relied upon to measure students' mathematical abstraction abilities. The separation values of packages A and B are 7.35 and 6.15 respectively, which means the items have an excellent distribution of responses. The Rasch modeling results demonstrate that every test instrument item satisfies the item fit order criteria. Based on the research findings, all 19 test instrument items were classified as extremely suitable for use. The development of a HOTS-based test instrument for mathematical abstraction ability that takes into account the cultural norms of the Riau Islands helps students by promoting a more engaging and inclusive mathematics education.


INTRODUCTION
Students today must possess critical, analytical, logical, systematic, and creative thinking abilities as a result of the rapid advancement of modern life. These skills are needed to come up with answers to a wide range of present and future societal issues (Kristanto & Setiawan, 2020; Lestari, 2019; Nisa, Widyastuti, & Hamid, 2018). Critical, analytical, logical, systematic, and creative thinking abilities are among the qualities of higher-order thinking skills, or HOTS. HOTS are necessary to meet upcoming challenges. In order to compete with students from other countries, HOTS are becoming ever more crucial for Indonesian high school students in the current era of globalization and the industrial revolution 4.0.
Research by Ocy et al. (2021), which examined students' HOTS abilities in several schools in the Riau Islands Province, found that 79.5% of students were unable to abstract a problem so that it could be solved mathematically. This shows that the HOTS-based mathematical abstraction ability of high school students in Tanjungpinang is low.
An effort to increase students' HOTS abilities in the area of mathematical abstraction will succeed if it is supported by appropriate resources, but 53.6% of teachers find it difficult to create engaging and contextual stimuli, and 51.8% of teachers find it difficult to create questions that adhere to HOTS criteria (Ocy, Rahayu, & Makmuri, 2021). The urgency of this research lies in the need for an instrument that can help teachers motivate students and gauge the level of their HOTS abilities in mathematical abstraction. Moreover, the majority of implementations of the abstraction process have been qualitative (Hutagalung, Mulyana, & Pangaribuan, 2020; Komala, 2018; Murtianto et al., 2019; Nurrahmah, Zaenuri, & Wardono, 2021; Pratidiana, Rifa'i, & Priyani, 2021; Simon, 2020; Ulia, Waluya, & Walid, 2022), and very few experimental studies have been undertaken, so the exploratory nature of the current study adds to its significance. This means that this research will clearly contribute to the literature. In addition, a test instrument developed based on HOTS-based mathematical abstraction ability indicators was used in this research. This kind of study is valuable because it provides a wealth of data about the efficiency and credibility of the test instrument.

In the age of modernization, many students of the multicultural nation of Indonesia exhibit indifference to traditional values. To address this, creating culturally-based HOTS instruments is one way to develop youths' appreciation of their culture (Kamid, Saputri, & Hariyadi, 2021; Khoriyah & Oktiningrum, 2021; Yuliani, Alfarisa, & Tiurlina, 2022). By incorporating aspects of Riau Islands culture into the instrument's creation, students are exposed to new stimuli. This fits the criterion for HOTS problems, because the problems are ones that students have never encountered before (Alfiatin & Oktiningrum, 2019; Kamid et al., 2021; Khoriyah & Oktiningrum, 2021).
In contrast to previous instrument development research that employs Classical Test Theory (CTT) (Arifin & Retnawati, 2015, 2017; Kurniasi & Arsisari, 2020; Lestari, 2019; Masitoh & Aedi, 2020), this study produces items with high qualifications through data analysis using the Rasch model. A Rasch modeling analysis, supported by Winsteps software, is used to understand how well students perform on the tasks and how difficult the tasks are, and to evaluate the HOTS-based mathematical abstraction ability instrument constructed in accordance with the Riau Islands' culture. Identifying the viability of the developed instrument is the aim of this study. Each class was analyzed before the package A and package B questions were handed out; the students involved had to have completed HOTS coursework on trigonometric comparison material. The Slovin formula was used to calculate the number of students who would become respondents in each school, and question packages A and B were tested across ten classes using a proportional sampling technique. This resulted in 198 students testing package A and 204 students testing package B. Table 3 displays the full set of data.
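As an illustration, the Slovin calculation used to size the sample can be sketched in a few lines of Python; the formula is n = N / (1 + N·e²). The population size and margin of error used below are hypothetical, since neither is stated in this excerpt.

```python
import math

def slovin(population: int, margin_of_error: float = 0.05) -> int:
    """Slovin's formula: n = N / (1 + N * e^2), rounded up to a whole respondent."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))

# Hypothetical population of 1500 students with a 5% margin of error;
# the study itself reports only the resulting samples (198 + 204 = 402).
print(slovin(1500))  # -> 316
```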

Item Fit Order
In Rasch model analysis, the fit of items is assessed to determine how well they contribute to the measurement of the underlying construct. Item fit is typically evaluated using fit statistics such as the infit and outfit mean square (MNSQ) statistics (J M Linacre, 2019). The order in which items are fitted in Rasch model analysis is determined by examining the fit statistics for each item (Bambang Sumintono, 2018). The Rasch model uses the following criteria to determine an item's quality (Chan, Looi, & Sumintono, 2021; Maryati, Prasetyo, Wilujeng, & Sumintono, 2019; Perera, Sumintono, & Jiang, 2018; Bambang Sumintono, 2018):
1. Outfit Mean Square (MNSQ) value: 0.5 < MNSQ < 1.5
2. Outfit Z-Standard (ZSTD) value: -2.0 < ZSTD < +2.0

3. Point Measure Correlation value: 0.4 < Pt Measure Corr < 0.85
Instrument items that do not meet the three aforementioned conditions are deemed "misfit" and must be replaced; however, items that meet at least two of the requirements are still deemed "fit", or in decent condition (Bambang Sumintono, 2018). Table 4 lists the requirements for item fit as determined by the instrument quality rating system (Boone, Staver, & Yale, 2014; John M. Linacre, 2010).

Table 4. Outfit MNSQ criteria
Outfit MNSQ range          Criteria
MNSQ > 2.0                 Poor (distorts the measurement system)
1.5 < MNSQ <= 2.0          Marginal (unproductive, but not degrading)
0.5 <= MNSQ <= 1.5         Good (productive for measurement)
MNSQ < 0.5                 Marginal (less useful, but not damaging)
Table 4 explains that MNSQ > 2.0 indicates disruption to the measurement system: the item is not functioning well within the measurement model and may need further investigation or revision. Values of 1.5 < MNSQ < 2.0 indicate some degree of misfit with little value for measurement, although such items may still be acceptable depending on the research context or purpose of measurement. Values of 0.5 < MNSQ < 1.5 suggest that the item functions well within the measurement model and contributes meaningfully to the overall measurement. MNSQ < 0.5 indicates the item is not useful for measurement, even though it does not damage the measurement system.
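The Table 4 ranges can be expressed as a small helper function; this is a sketch of the classification logic only, not part of the Winsteps output.

```python
def classify_outfit_mnsq(mnsq: float) -> str:
    """Map an Outfit MNSQ value to the Table 4 quality bands."""
    if mnsq > 2.0:
        return "poor: distorts the measurement system"
    if mnsq > 1.5:
        return "marginal: unproductive, but not degrading"
    if mnsq >= 0.5:
        return "good: productive for measurement"
    return "marginal: less useful, but not damaging"

print(classify_outfit_mnsq(1.02))  # -> good: productive for measurement
```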

Item Difficulty Level
According to Table 5, Rasch modeling divides item difficulty into four groups based on the Measure (logit) value and the Standard Deviation (SD) of the item logits (Boone et al., 2014; Maryati et al., 2019; Perera et al., 2018; Bambang Sumintono, 2018). In addition, questions with negative correlation values must be examined to determine whether the items need to be revised, should be removed from the test, or whether the answer keys are invalid.
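The four difficulty groups can be sketched as follows, assuming the common Rasch convention of splitting item measures at 0 logit and at plus or minus one SD of the item logits; the SD value in the example is the one reported later for package A.

```python
def difficulty_category(measure: float, sd: float) -> str:
    """Assign one of four difficulty groups from the item measure (logit)
    and the standard deviation of the item logits."""
    if measure > sd:
        return "very difficult"
    if measure >= 0.0:
        return "difficult"
    if measure >= -sd:
        return "easy"
    return "very easy"

# SD of item logits for package A, as reported in the results, is 1.95.
print(difficulty_category(2.10, 1.95))  # -> very difficult
```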

Unidimensionality
The results of the unidimensionality test are used to evaluate the Rasch model's validity. The Rasch model requires that the variables be unidimensional. The unidimensionality requirement is shown by the dimensionality map, specifically the raw variance data produced by the Winsteps software, i.e., the "raw variance explained by measures" value. In the Rasch model, this value must be at least 40%, and preferably more. The dimensionality map also shows the independence requirement; the "unexplained variance in 1st-5th contrasts" is a good indicator of independence. Table 8 displays the criteria for unexplained variance (Boone et al., 2014; Perera et al., 2018).
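The 40% threshold can be checked mechanically once the "raw variance explained by measures" value has been read from the Winsteps output; the percentages below are the ones reported later for packages A and B.

```python
def meets_unidimensionality(raw_variance_explained_pct: float,
                            minimum_pct: float = 40.0) -> bool:
    """True when the raw variance explained by measures reaches the
    unidimensionality threshold used in this study."""
    return raw_variance_explained_pct >= minimum_pct

# Raw variance explained for packages A and B, from the field test summary.
print(meets_unidimensionality(50.0), meets_unidimensionality(45.9))  # -> True True
```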

Monotonicity
Subsequently, the monotonicity of the items is investigated. Monotonicity means that, across response groups ordered from low to high ability, the observed item values increase consistently rather than curving back. In Winsteps, this property is revealed by the observed average column, where the values must tend to increase gradually and consistently from small to large (J M Linacre, 2019). A monotonic instrument means that there are no confusing items.

Reliability
Reliability is another factor that influences the quality of the item. Rasch reliability values range from zero to one, with values closer to one indicating a more internally consistent instrument.
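For reference, Cronbach's Alpha (the coefficient reported for packages A and B) can be computed directly from a respondents-by-items score matrix; the tiny dichotomous matrix below is invented purely for illustration.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's Alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(scores[0])
    items = list(zip(*scores))                      # columns = items
    item_var = sum(variance(col) for col in items)  # sample variance (n - 1)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)

# Invented 4-respondent x 3-item dichotomous matrix for illustration only.
x = [[1, 1, 1],
     [1, 1, 0],
     [0, 1, 0],
     [0, 0, 0]]
print(round(cronbach_alpha(x), 2))  # -> 0.75
```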

Person and Item Separation Index
The person and item separation index estimates how well the instrument can distinguish between students' abilities. The wider the spread of items from easy to difficult, and the higher the person and item separation index, the more accurately the instrument separates the distribution of responses (John M. Linacre, 2010). Separation index values range from 0 to infinity; the higher the separation, the better. Table 10 displays the criteria for the person and item separation index (Boone et al., 2014; Chan et al., 2021; J M Linacre, 2019).
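The number of statistically distinct strata implied by a separation index is given by the standard Rasch formula H = (4G + 1) / 3. Applying it to the item separation values quoted in the abstract reproduces the strata counts reported later; the small difference (8.53 here versus the reported 8.546) arises because Winsteps works from the unrounded separation value.

```python
def strata(separation: float) -> float:
    """Number of statistically distinct strata: H = (4G + 1) / 3,
    where G is the separation index."""
    return (4 * separation + 1) / 3

# Item separation indices for packages A and B, from the abstract.
print(round(strata(7.35), 2), round(strata(6.15), 2))  # -> 10.13 8.53
```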

Precision of Measurement
The precision of measurement is heavily dependent on the instrument and describes how accurately it reports results. Accurate measurements are crucial for assessing the adaptability and dependability of an instrument. A standard error below 0.5 is the acceptable threshold for an instrument. In the Rasch model, the estimated standard error of the items can be found in the "Model S.E." column (Perera et al., 2018).

Discriminating Power
The discriminating power of questions separates students who are able to answer them from those who are not. In Rasch modeling, discriminating power is assessed at the level of the individual. Groups of respondents can be further identified using the respondent separation index. The higher the separation value, the better the instrument, since it can differentiate between groups of items and groups of respondents (B Sumintono, 2016).

STAGE 2: DESIGN
The first step in the design stage is to clearly define the learning objectives that the test aims to measure, including identifying the specific higher-order thinking skills being targeted. A table of specifications was produced in order to compile tests based on the hierarchical ability levels of HOTS. At this point the instrument design process starts, with three mathematics experts acting as the instrument's validators. An example of the table of specifications can be seen in Table 12. Twenty-five questions were developed and then vetted by experts and panelists; of these, only 19 were deemed valid. The 19 questions that passed the validation stage were then tested on a small scale to ensure that they contained no writing errors, language that is difficult to understand, or calculation errors. Based on the results of the small-scale test, the items were revised with respect to sentence formulation, completeness of stimulus information, inappropriate indicators, and insufficient answer choices (distractors) for the multiple-choice questions.

STAGE 4: DISSEMINATION

Person Fit
Finding the respondents who fit the Rasch model was the first stage of this study's data processing. According to the person fit analysis, which involved 402 respondents, 134 out of 198 respondents to the package A instrument and 128 out of 204 respondents to the package B instrument were found to be fit. Table 13 lists the identification numbers of the misfit respondents. In Rasch model analysis, a misfit person refers to someone whose responses deviate significantly from what would be expected based on their ability level. This deviation can manifest as consistently endorsing items that are too easy or too difficult for their ability level, or as inconsistent response patterns across items.
Identifying misfit persons is important because their responses can introduce noise and bias into the measurement process. Misfit persons may show different response patterns for various reasons, such as guessing, misunderstanding of items, or lack of motivation. Their inclusion in the analysis can compromise the validity and reliability of the measurement instrument.
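The screening step can be sketched as a simple filter over exported person-fit statistics; the 0.5-1.5 window below is the MNSQ acceptance range stated earlier for items, applied here to persons as is conventional in Winsteps analyses, and the fit values are invented for illustration.

```python
def flag_misfit_persons(outfit_mnsq, lower=0.5, upper=1.5):
    """Indices of respondents whose Outfit MNSQ falls outside the accepted range."""
    return [i for i, m in enumerate(outfit_mnsq) if m < lower or m > upper]

# Invented person-fit statistics, as would be exported from Winsteps.
fit = [0.9, 1.7, 1.1, 0.4, 1.3]
print(flag_misfit_persons(fit))  # -> [1, 3]
```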

Item Fit
After the outliers (misfit persons) are removed from the data, the data are examined again. Item quality, or item fit order, is valuable information obtainable from Rasch modeling. Table 14 shows that every item, specifically the 19 items found in packages A and B of the instruments, satisfies the very good fit criteria. The Point Measure Correlation values with very good criteria show that test participants are not confused when selecting answers to the instrument's questions: participants with low abilities choose the wrong answers, while participants with high abilities choose the right answers.

Item Difficulty Level
Each item's difficulty is analyzed to determine which category, from very easy to extremely difficult, it falls into. According to the Rasch model's computations, the SD logit values for the package A and B items are 1.95 and 1.65, respectively. The difficulty level of each item can be determined from the measure logit and SD logit values. The findings of the item difficulty level analysis are summarized in Table 15.

Monotonicity
The observed average column is examined to ascertain the nature of the monotonization, and the answer categories are found to behave as expected. The analysis reveals that in both package A and package B, the values in the observed average column increase from negative to positive: the observed averages rise from -2.08 to 1.59 for package A and from -1.86 to 1.39 for package B. The Root Mean Square Residual (RMSR) values for packages A and B grew as well, rising from 0.3294 to 0.3773 and from 0.3473 to 0.3877, respectively. This demonstrates that the items satisfy the monotonicity condition.

Reliability
Rasch reliability can be used to examine the stability of the persons and items of the instrument, which is another useful insight provided by Rasch modeling. Any reliability value near one is considered internally consistent. Figures 3 and 4 show the person and item reliability values for packages A and B. The strata equation's (H) value for the items in packages A and B turns out to be 10.133, which is rounded to 10, and 8.546, which is rounded to 9. For the respondents in packages A and B, the results obtained are 2.42 and 2.386 respectively, both rounded to 2. As a result, it is possible to identify 10 groups of package A items and 9 groups of package B items. For respondents, this means that package A and package B respondents can each be sorted into two groups representing the skills of the students.
Based on the results of the psychometric validity analysis and the fit statistics test, the field test data analysis is summarized in Table 18. The following findings can be obtained from Table 18: (1) measurement packages A and B were able to explain 50% and 45.9% of the raw variance, respectively, showing that they met the conditions for unidimensionality; (2) with observed averages for package A ranging from -2.08 to 1.59 and for package B from -1.86 to 1.39, the monotonization of packages A and B tended to rise, meeting the monotonicity criterion; (3) the A and B instruments' respective Cronbach's Alpha reliability scores of 0.73 and 0.75 show that the respondents' consistency in responding to the instrument is acceptable holistically; (4) the person reliability scores for packages A and B were 0.70 and 0.71, respectively, indicating that the subjects' consistency in giving answers proved quite good; (5) the item reliability scores for packages A and B were 0.97 and 0.98, respectively, placing the package A and B instruments in the special (excellent) category. All items examined empirically met the valid and reliable criteria according to the findings of the statistical fit test analysis and psychometric validity.

Students' HOTS in the Aspect of Mathematical Abstraction Ability
The Rasch model analysis reveals students' capacity for mathematical abstraction, specifically by examining the mean value of the logit score in the person measure output. For respondents in both package A and package B, the mean person measure in logits falls below the mean item difficulty. This demonstrates that students' mathematical abstraction skills fall below the level required by the items' difficulty, which means that HOTS performance in the area of students' mathematical abstraction abilities is poor.

DISCUSSION
The research findings revealed that Tanjungpinang high school students have low HOTS abilities in the area of mathematical abstraction. This demonstrates the necessity of improving students' HOTS skills, particularly in mathematical abstraction, and there are several strategies that a mathematics teacher can use to improve students' mathematical abstraction abilities. This research contributes to students' stronger HOTS abilities in the area of mathematical abstraction as follows: (1) students can use the final product as training material to develop higher-order thinking skills in the area of mathematical abstraction; (2) high school mathematics teachers can use the final product to assess their students' understanding of mathematical concepts, aptitude for mathematical abstraction, and capacity for higher-order thinking.
To give students fresh insights, a cultural component, specifically the culture of the Riau Islands, was incorporated into the instrument's construction. The problem environment that this instrument develops presents new challenges that students have never encountered before; that is one of the study's advantages. A further advantage of this research is that, in contrast to CTT, which focuses on observed scores and assumes equal item discrimination across different ability levels, the Rasch model operates on a more fundamental level by estimating item difficulty and person ability parameters independently. This distinction gives rise to several advantages that position the Rasch model as a powerful tool for measurement and assessment.

CONCLUSION AND SUGGESTION
Based on the results of the research and discussion, the following conclusions were obtained: (1) the final product of this research is a HOTS instrument on the aspect of mathematical abstraction abilities according to the culture of the Riau Islands. The test instrument is a set of 19 multiple-choice questions. The validity of the instrument is proven by the results of expert assessment and Rasch model analysis, and the instrument also meets the criteria for reliability. (2) The multiple-choice questions are characterized by item difficulty levels that tend to be difficult or very difficult, good unidimensionality, fulfillment of the monotonicity characteristic, and very good precision of measurement. A limitation of this research is the instrument's incorporation of cultural components from the Riau Islands, which makes it challenging to use for students from other regions, who are unfamiliar with the cultural background of the questions. This can be overcome by developing HOTS instruments on aspects of mathematical abstraction abilities that involve other cultural contexts familiar and close to those students.

Figure 1 .
Figure 1. Monotonicity of package A

Figure 3. Person and item reliability for package A

Table 1 .
Field test subjects

Table 5 .
The criteria for the level of difficulty of the items

Table 6 .
Criteria for discriminant power of items

Table 9 .
Person and item reliability

Table 10 .
Person and item strata separation

Table 11 .
Mathematical abstraction ability indicator

Table 12 .
Example of the table of specifications for perceptual abstraction level

Table 15 .
Item categories based on difficulty level

Table 17 .
Unexplained variance in 1 st -5 th of the instrument

Table 18 .
Summary of field test data analysis