Validity and reliability estimation of assessment ability instrument for data literacy on high school physics material

An assessment instrument for data literacy in physics that does not match the characteristics of a good assessment can leave students with a poor understanding of thermodynamics as a physics topic. Therefore, an assessment instrument was developed following the stages proposed by Sumadi. It consists of multiple-choice tests that measure data literacy ability, and this study examines its content validity, empirical validity, and reliability. Content validity was analyzed with Aiken's V, the empirical trial was analyzed through the INFIT MNSQ value, and reliability was estimated with the Cronbach's alpha coefficient and, under Item Response Theory (IRT), through the total information function and the Standard Error of Measurement (SEM). The results indicate that the instrument for assessing data literacy ability is valid, with an Aiken's V value > 0.8 and an INFIT MNSQ of 1.00 ± 0.06. The estimated Cronbach's alpha coefficient is 0.93. Based on the total information function curve, the instrument is reliable for measuring students' ability in the range of -3 to +3. Based on the value of the information function, the ability of students is still low.


Introduction
Indonesia is a developing country that has entered the 21st century with a variety of demands, especially in education, which must be accessible to students, be internationally competitive, and give 21st-century learners the best learning experience so that they can achieve the learning objectives effectively [1]. Graduation competency refers to students' mastery in physics class, which allows them to proceed to the next level.
One strategy for helping students achieve mastery of physics material is to provide an assessment that shows their level of understanding. Assessment in education is part of collecting and processing information related to student achievement during learning [2]. One of the basic objectives of assessment is conformity with the learning objectives to be implemented, as well as getting students to work in teams to solve real problems [3]. Assessment not only measures the success of learning but also shows how effectively students succeed [4]. An assessment should be arranged to meet the characteristics of a good test, including (1) validity, (2) reliability, (3) objectivity, and (4) practicality [5], [6]. Assessments are arranged in accordance with indicators that measure one of the students' potentials, such as data literacy in physics material, and can be used properly only if they meet the criteria of a good assessment. Continuity between learning and assessment must be maintained so that the objectives of the learning activities can be realized. A suitable strategy for applying assessment to learning helps educators know the potential possessed by students [7]. Data literacy is one of the variables needed in the era of the Industrial Revolution 4.0, because data literacy activities in physics learning train students to be more consistent pedagogically and make the learning process easier to understand. In addition, PISA data show that from 2000 to 2015 Indonesia ranked low in literacy skills [8].
This is in accordance with the concept of physics learning, which emphasizes systematic, creative, and objective activities and drawing conclusions about surrounding events. Data literacy includes the ability to formulate and answer data-based questions, use data as needed, represent data, and draw inferences from the information obtained [9]. Before teaching the concept of literacy, an educator should understand the indicators of data literacy; data literacy used in learning can help turn information into knowledge and practice through collecting, analyzing, and interpreting data [10]. Table 1 shows the synthesis of several articles related to data literacy.

Table 1. Synthesis of several articles related to data literacy.
Reviewing results: Students can plan the interpretation of objects obtained based on the conclusions on the data presented.
Thermodynamics is one of the physics topics that requires data literacy ability. Thermodynamics consists of abstract, highly complex concepts, so it is poorly understood and hard to visualize [11]. Students have a low level of concept mastery in thermodynamics [12]. Other errors in thermodynamics include misinterpreting graphs, using the terms heat and energy inconsistently, and incorrectly identifying work and heat in the P-V diagram, even though each thermodynamic process is often interpreted through a graph [13]. These errors in applying thermodynamics material in learning can be reduced by applying appropriate measurement variables.
Based on this background, the problem can be formulated as follows: how valid and reliable are the assessment instruments used to measure high school students' data literacy ability on the physics topic of thermodynamics? The researchers hope that answering this question will give practitioners a foothold for compiling assessment instruments that measure high school students' data literacy ability in thermodynamics.

The 5th International Seminar on Science Education, Journal of Physics: Conference Series 1440 (2020) 012020, IOP Publishing, doi:10.1088/1742-6596/1440/1/012020

Research method
This study follows the stages of developing learning outcome tests according to Sumadi. The stages are as follows.

Development of assessment instrument specifications
At this stage, the researchers collected information about the assessment instruments used at MAN 3 Sleman, the research location. The specifications of the assessment instrument must be based on the distribution of material in the 2013 curriculum (K13) and on the need for questions on thermodynamics at MAN 3 Sleman.

Writing assessment instruments
At this stage, the assessment instrument was compiled as thermodynamics-related questions adjusted to the data literacy indicators. It takes the form of 12 multiple-choice questions with 5 answer choices each, scored "1" if correct and "0" if incorrect.

Validity of assessment instruments
At the validation stage, the assessment instrument was submitted to 5 expert judges. The validation sheet consists of 17 assessment indicators related to aspects of material, construction, and language. The validation activity is used to determine the validity and feasibility of each item before it is used in the empirical test. The ratings given by the 5 experts were then analyzed using the Aiken V formula, which reveals how well the items in the test match the indicators [14], as follows:

$$V = \frac{\sum s}{n(c-1)}, \qquad s = r - l_0$$

where $V$ is the Aiken validity index, $l_0$ is the lowest rating category, $r$ is the score given by a rater, $c$ is the number of rating categories, and $n$ is the number of raters. If the Aiken V index is less than 0.4, the validity is said to be low; between 0.4 and 0.8, the validity is said to be moderate; and above 0.8, the validity is said to be high. After the scores were obtained from the experts, the quantitative data were converted into qualitative data in order to identify the product quality [15].
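The Aiken V index and the validity bands described above can be illustrated with a short Python sketch. The 1 to 4 rating scale and the example ratings are assumptions made for illustration only; they are not the study's data, and the function names are hypothetical.

```python
from typing import Sequence


def aiken_v(ratings: Sequence[int], lowest: int, categories: int) -> float:
    """Aiken's V for one item: sum(r - lowest) / (n * (categories - 1))."""
    n = len(ratings)
    if n == 0 or categories < 2:
        raise ValueError("need at least one rating and two rating categories")
    s_total = sum(r - lowest for r in ratings)
    return s_total / (n * (categories - 1))


def classify_v(v: float) -> str:
    """Map V to the low / moderate / high bands given in the text."""
    if v < 0.4:
        return "low"
    if v <= 0.8:
        return "moderate"
    return "high"


# Hypothetical ratings from 5 experts on an assumed 1-4 scale.
example_ratings = [4, 4, 3, 4, 4]
example_v = aiken_v(example_ratings, lowest=1, categories=4)
print(round(example_v, 3), classify_v(example_v))
```

With five raters, a single rating one step below the maximum still leaves V above 0.8 on this assumed scale, which shows how sensitive the index is to the number of raters and categories.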

Testing empirical assessment instrument items
At this stage, the questions were tested empirically with 12th-grade students at MAN 3 Sleman, drawn from 2 classes of 30 students each. The empirical test was analyzed using both classical and modern (IRT) approaches to examine the validity of each item, the item difficulty level, and the item information function.

Evidence of empirical validity.
The model used in this study is the Rasch model (1PL). Items are declared valid if the INFIT MNSQ value of each item is in the range of 0.77 to 1.30 [16]. Under the Rasch model, an item is also accepted or rejected according to its INFIT t value, which should satisfy INFIT t ≤ 2, and the difficulty level of a good item ranges from -2 to +2. Difficult and easy questions must be balanced; therefore, researchers must choose item difficulty levels according to these provisions.
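As a rough illustration of the fit criterion above, the following Python sketch computes the Rasch response probability and the information-weighted mean square (INFIT MNSQ) for one item. It assumes that person abilities and the item difficulty have already been estimated elsewhere (for example by dedicated software); the function names are hypothetical, and the internal computation of any particular program may differ in detail.

```python
import math
from typing import Sequence


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the 1PL (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def infit_mnsq(responses: Sequence[int], thetas: Sequence[float], b: float) -> float:
    """Information-weighted mean square for one item.

    INFIT MNSQ = sum((x - P)^2) / sum(P * (1 - P)), summed over persons.
    """
    if len(responses) != len(thetas) or not responses:
        raise ValueError("responses and thetas must be non-empty and equal in length")
    squared_residuals = 0.0
    model_variance = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        squared_residuals += (x - p) ** 2
        model_variance += p * (1.0 - p)
    return squared_residuals / model_variance


def item_fits(mnsq: float, lower: float = 0.77, upper: float = 1.30) -> bool:
    """Apply the 0.77 to 1.30 acceptance range quoted in the text."""
    return lower <= mnsq <= upper


# Hypothetical abilities and 0/1 responses for one item (not the study's data).
thetas = [-1.5, -0.5, 0.0, 0.8, 1.6]
responses = [0, 0, 1, 1, 1]
value = infit_mnsq(responses, thetas, b=0.2)
print(round(value, 3), item_fits(value))
```

Values near 1.00 indicate that the observed responses vary about as much as the model predicts; values well above the range suggest noisy responses, and values well below it suggest responses that are more predictable than the model expects.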

Reliability estimation.
The reliability of the data literacy assessment items is estimated with the Cronbach's alpha coefficient. The items are said to be reliable if the reliability value is more than 0.85 [17]. The higher the value of item reliability the more convincing that the items or

Information function and standard error of measurement.
In Item Response Theory (IRT), the item information function is one of the factors that determines the quality of an instrument. The information function of an instrument is high if the instrument is composed of items with high information functions. The accuracy of the information function is inseparable from measurement error, which in IRT is referred to as the Standard Error of Measurement (SEM). In IRT, the SEM is estimated at each ability level θ rather than as a single value for a particular group of respondents, so the instrument information function shown is more accurate [18]. The SEM is inversely proportional to the square root of the test information function, as the following equation shows:

$$SEM(\theta) = \frac{1}{\sqrt{I(\theta)}}$$

where $SEM(\theta)$ is the standard error of measurement and $I(\theta)$ is the information function.

Administration of final form assessment instruments
The final stage of developing the assessment instrument is selecting the questions that are suitable for use and meet the requirements of a valid and reliable instrument, so that the instrument can be used in field testing (implementation). In addition, the final data from this study can serve as a needs assessment for further research.

Results of validation of assessment instruments
At this stage, the assessment instrument was tested for content validity through the Aiken's V value of each item. The results of the validators' assessment serve as a reference for selecting items that are valid and can be used. In accordance with the provisions of the Aiken's V index, all 12 items rated by the 5 experts are declared valid in terms of content, as shown in table 2.

Results of empirical validity
The empirical validity of the assessment instrument can be seen through the item goodness of fit, indicated by the INFIT MNSQ value computed with the Quest software. The mean and standard deviation of INFIT MNSQ are 1.00 and 0.06. The mean is valid because it lies in the range of 0.77 to 1.30. Figure 1 shows the distribution of INFIT MNSQ values for each item in the fit model.

Figure 2 shows the distribution of the questions by difficulty, from easy to difficult. Items 1, 2, 4, 5, and 10 have difficulty levels approaching +2, which means that these questions are difficult, while items 3, 6, 7, 8, 9, 11, and 12 have difficulty levels approaching -2, which means that these questions are easy.

Results about item reliability
The reliability of the data literacy items is indicated by a Cronbach's alpha coefficient of 0.93. The instrument is therefore reliable, because the reliability value is more than 0.85.
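For readers who want to reproduce this kind of estimate, the following Python sketch computes Cronbach's alpha from a persons × items matrix of 0/1 scores using α = k/(k − 1) · (1 − Σ item variances / total-score variance). The score matrix is hypothetical and the function name is illustrative; it is not the study's data or software.

```python
from statistics import pvariance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[int]]) -> float:
    """Cronbach's alpha, where scores[p][i] is the score of person p on item i."""
    n_persons = len(scores)
    n_items = len(scores[0]) if n_persons else 0
    if n_persons < 2 or n_items < 2:
        raise ValueError("need at least two persons and two items")
    # Variance of each item across persons, and variance of the total scores.
    item_variances = [pvariance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical 0/1 scores for 5 students on 4 items.
example_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3), "reliable" if alpha > 0.85 else "below the 0.85 criterion")
```

The same variance definition is used in the numerator and denominator, so choosing the population or the sample variance gives the same coefficient.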

Results of information function and standard error of measurement
The reliability analysis above is still based on classical test theory; therefore, an analysis based on item response theory (IRT) was also performed. The results are shown as a graph in figure 3.

Figure 3. Total information function and standard error of measurement (SEM).

Figure 3 shows that the test has a total information function of 0.016, with a Standard Error of Measurement (SEM) of 6.55. The instrument for assessing data literacy ability is therefore reliable for measuring the ability of high school students over the interval -3 to +3, as can be seen from the Parscale program output in figure 4. The distribution of student ability in figure 4 shows that the number of students with low ability (near -3) is greater than the number of students with moderate ability or with ability above 0 (close to +3).

The results of the analysis show that the instrument for evaluating data literacy ability is suitable for determining students' ability. Instruments used to measure students' ability must meet the requirements of validity and reliability [18]. Previous research has shown the benefit of developing tested instruments, namely assessment instruments that are valid in content and empirical validation, one of which is able to measure physics problem-solving skills [19]. This is in line with previous studies on literacy in which a valid and reliable assessment instrument could be used in construct testing [20]. Adequate data literacy ability helps students understand problems in physics material [21]. Based on the results of previous studies, developing valid and reliable assessment instruments makes it possible to determine students' data literacy abilities.