Data literacy assessment instrument for preparing 21Cs literacy: preliminary study

This study aims to develop an instrument to measure data literacy. Data literacy is one of the 21st century literacies (21Cs) needed to face the Industrial Revolution 4.0 (4IR). Interpretations of students’ data literacy ability are used as additional data. The study uses a modified instrument development procedure consisting of 6 stages. The sample, selected by purposive random sampling, consisted of 94 junior high school students. The content validity test shows that all questions are valid in content, with a minimum Aiken’s V value of 0.76. The empirical validity and reliability test using Parscale shows that the questions are valid and reliable for measuring students with abilities of -2.8 to 2.1 on the logit scale. It is concluded that the data literacy instrument can be used to test students’ initial data literacy ability.


Introduction
The Industrial Revolution 4.0 (4IR) requires the world to produce educated workers who can manage and utilize Industry 4.0 [1]. Education plays an important role in preparing workers with the set of abilities required by 4IR [2]; this set of abilities is often referred to as 21st century skills (21Cs). There are three categories of 21Cs: learning skills (critical thinking, creativity, collaboration, communication) [3], life skills (flexibility, leadership, initiative, productivity, social skills), and literacy skills. Literacy skills are constructive, integrative and critical abilities needed in daily life [4]. They are new abilities that have so far been little introduced to students.
Literacy is an ongoing process; it is a set of individual abilities and skills for solving problems in everyday life [5] [4]. Several literacies are needed in 21Cs, such as digital literacy [6] [7], scientific literacy [8], and information literacy. Each literacy is related to others and cannot stand alone. For example, technological literacy is related to digital literacy, while statistical literacy and information literacy are related to data literacy [9] [10]. In general, the more general a literacy is, the wider its scope, and developing it requires first developing more specific basic literacies.
One of the basic literacies that supports 21Cs is data literacy. Data literacy is a new literacy that must be developed in Indonesia [11]. With good data literacy skills, a person can make appropriate decisions [12], respond to situations wisely [13], and, most importantly, prepare for 4IR. For this reason, data literacy is one of the most important skills to teach students. Data literacy includes the ability to collect data [14], understand data [15], explain data [16], identify, interpret and implement data [17], communicate and evaluate data [18], use data [19], analyze data [20], and manage data [21]. These abilities can be taught to students through formal or non-formal systems.
Data literacy can be promoted to students by including it in a particular curriculum, seminar or training [22]. This study aims to develop an instrument that can measure students' data literacy abilities. It also gives a rough description of students' initial data literacy abilities.

Research method
This research focuses on the test preparation process. The process follows the techniques for preparing test and non-test instruments [23]. In this study, the instrument preparation steps were modified by eliminating the revision and trial stages.

Results and Discussion
The content validity of items can be analyzed with Aiken's V using between 2 and 7 rating categories [24]. The results of the Aiken's V analysis can be seen in the table below. With 7 raters, 4 rating categories and a 5% error tolerance, the minimum Aiken's V value is 0.76. Validity is interpreted by comparing the calculated Aiken's V with this minimum value. For example, items 1 and 3 have an Aiken's V of 0.76; this calculated value equals the minimum, so the items are still declared valid in content. On this basis, all items are valid in content.
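As a minimal sketch of the computation behind these values, Aiken's V for one item can be written as V = S / (n(c - 1)), where S is the sum over raters of each rating minus the lowest possible rating, n is the number of raters, and c is the number of rating categories. The function name and the example ratings are hypothetical and are not taken from the study's data; the ratings are chosen only so that the result lands on the 0.76 minimum for 7 raters and 4 categories.

```python
def aikens_v(ratings, lowest=1, highest=4):
    """Aiken's V for one item, given one integer rating per rater.

    Assumes the rating categories are the consecutive integers
    lowest..highest, so that (highest - lowest) equals c - 1.
    """
    n = len(ratings)
    if n == 0:
        raise ValueError("at least one rating is required")
    if any(r < lowest or r > highest for r in ratings):
        raise ValueError("rating outside the allowed range")
    # S: total distance of the ratings from the lowest category
    s = sum(r - lowest for r in ratings)
    return s / (n * (highest - lowest))


# Hypothetical ratings from 7 raters on a 1-4 scale (not the study's data)
example_ratings = [4, 4, 3, 3, 3, 3, 3]
MIN_V = 0.76  # minimum V for 7 raters, 4 categories, 5% error tolerance

v = aikens_v(example_ratings)
# Compare at two decimals, matching the precision of the reported values
is_valid = round(v, 2) >= MIN_V
print(round(v, 2), is_valid)
```

With these ratings S = 16 and n(c - 1) = 21, so V rounds to 0.76 and the item would be declared valid in content, as for items 1 and 3.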
The results of the empirical trial were analyzed to check the fit of the items to the model. This analysis uses Quest. Figure 2 shows the difficulty level analysis for items 1, 2, 5, 6 and 10 as representatives. For item 1, only 21 students answered correctly, and the item has a difficulty level of 1.16. For item 10, 74 students answered correctly, and the item has a difficulty level of -1.18. Item 1 is more difficult than items 2 and 10, while item 10 is the easiest item. The difficulty level is related to the number of students who answered the item correctly. In Figure 2 the difficulty level is shown in the THRSH column; a higher THRSH value indicates a more difficult item.
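The link between the number of correct answers and the difficulty level can be illustrated with a rough approximation: the log-odds of an incorrect answer, ln((1 - p) / p), where p is the proportion of students who answered correctly. This is only a sketch under that assumption. It is not the estimation procedure used by Quest, which fits the model iteratively and centres the scale, so the values differ somewhat from the reported THRSH values. The function name is hypothetical; the counts are those reported for items 1 and 10.

```python
import math


def rough_difficulty(n_correct, n_students):
    """Uncentred log-odds of an incorrect answer, a crude difficulty proxy."""
    if not 0 < n_correct < n_students:
        raise ValueError("log-odds undefined when all or no students are correct")
    p = n_correct / n_students
    return math.log((1 - p) / p)


N_STUDENTS = 94  # sample size reported in the study

item_1 = rough_difficulty(21, N_STUDENTS)   # 21 students answered correctly
item_10 = rough_difficulty(74, N_STUDENTS)  # 74 students answered correctly
print(f"item 1: {item_1:.2f}, item 10: {item_10:.2f}")
```

The approximation gives about 1.25 for item 1 and about -1.31 for item 10, which has the same sign and ordering as the reported thresholds of 1.16 and -1.18: fewer correct answers correspond to a higher difficulty.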
The results of the validity and reliability analysis of the 10 empirically tested questions, obtained with the IRT software Parscale, are presented as figures showing the total information, the item characteristic curve, and the item information curve.
Figure 3. Parscale information total curve.
Based on Figure 3, the information value is 1.5 and the standard error is 0.56. In IRT analysis, the information value is interpreted as reliability, while the standard error is also called the measurement error. Reliability is inversely related to the standard error, so the greater the reliability, the lower the standard error. On this basis, it can be inferred that the instrument is reliable.
In IRT analysis, the scale score axis shows the logit scale. The point of intersection between the blue line, which is the information value, and the red line, which is the standard error, shows the reliability of the questions in measuring students' data literacy. Based on the figures, the questions have a reliability level of 1.5 and can be used to measure the data literacy of students with abilities of -2.8 to 2.1 on the logit scale. The greatest reliability, shown by the green line, is 6.6 with a standard error of 0.21. This shows that the questions are best suited to students with an ability of 0.4 on the logit scale. Figure 5 shows that item 1 has an information value of 1.51 for students with an ability of 1.1 on the logit scale. Figure 3 shows the reliability of the questions in measuring students' data literacy, while Figures 4 and 5 show the validity of each item in measuring students' ability.
Table 3 shows a rough calculation of the level of achievement of the data literacy indicators. Based on Table 3, students' data literacy is in general still low, and the lowest ability is in determining data appropriate to the circumstances. This can be seen from the level of achievement per indicator, which does not reach 75%.
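A rough per-indicator achievement level of this kind can be sketched as the percentage of correct responses among the items mapped to each indicator, compared against the 75% level. How Table 3 was actually computed is not described, so this definition is an assumption, and the indicator names, item mapping and responses below are hypothetical.

```python
ACHIEVEMENT_TARGET = 75.0  # percent, the level referred to in the text


def achievement_by_indicator(responses, item_indicator):
    """Percent of correct responses per indicator.

    responses: list of dicts mapping item number -> 1 (correct) or 0 (wrong)
    item_indicator: dict mapping item number -> indicator name
    """
    totals = {}
    for student in responses:
        for item, score in student.items():
            indicator = item_indicator[item]
            correct, count = totals.get(indicator, (0, 0))
            totals[indicator] = (correct + score, count + 1)
    return {ind: 100.0 * c / n for ind, (c, n) in totals.items()}


# Hypothetical mapping and responses (not the study's data)
item_indicator = {1: "collect data", 2: "collect data", 3: "interpret data"}
responses = [
    {1: 1, 2: 0, 3: 1},
    {1: 1, 2: 1, 3: 0},
    {1: 0, 2: 0, 3: 0},
    {1: 1, 2: 0, 3: 0},
]

achievement = achievement_by_indicator(responses, item_indicator)
below_target = [ind for ind, pct in achievement.items() if pct < ACHIEVEMENT_TARGET]
print(achievement, below_target)
```

In this made-up example both indicators fall below 75%, which is the kind of pattern that would be read as low initial data literacy.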