Students Performance Prediction by Mining Behavioural Pattern

Analysing and monitoring students’ progress and performance is an active research area in educational data mining. Some research uses direct constructs related to academic achievement, such as GPA and SAT scores, to predict performance, while other work uses psychometric constructs to measure skill, knowledge, ability and educational achievement from questionnaire data. In this paper, we propose the use of psychometric constructs as a method of extracting non-cognitive features from campus check-in data to predict the academic performance of college students. A framework, P-FRAME, was developed, and non-cognitive attributes extracted with the help of psychological theories were used as the predictor variables. The extracted variables were applied to various penalized regression algorithms and the results were compared. Our experiments showed a high correlation between the actual and predicted scores, with predictions falling within 2.0 of the reported score.


Introduction
In the educational system, smart ID cards have proved very useful for administrative tasks such as staff, student and visitor management, time and attendance tracking, and access control. Mundane tasks such as registration, book purchases and meal programs, which once required an army of administrators to manage piles of paperwork, are made simple through the applications and databases integrated with the smart cards and sensor network. Many higher institutions have adopted smart ID cards for both students and staff because of their cost-effectiveness and simplicity.
It is important to analyze the data these cards generate to gain insights that can be useful to school management. The ability to uncover hidden patterns in large databases will enable higher institutions to develop models that predict different desirable outcomes. By acting on these predictive models, educational institutions can effectively address issues ranging from performance, retention and transfer to marketing and intervention. Many factors influence the academic performance of college students. It is a general belief that cognitive attributes are the most important in determining academic performance, but ongoing research has found that non-cognitive attributes may play an important role in reversing or limiting delays in cognitive development and academic achievement, and that they may complement direct efforts to improve academic learning [1]. The information from a student's smart card reflects the activities and patterns undertaken each day, including the locations visited and the times at which the card was swiped to make a purchase, to gain access to a building or to check in for attendance. Using the location clusters and the time information, it is possible to analyze an individual's activities to assess persistence, motivation and behavioral patterns. For example, we can infer interest in or enjoyment of a task or goal from a student engaging in reading when school is not in session, or from engaging in a task with a reward attached to it. A cash reward for performing well in a sports competition, or marks awarded to a top sports performer, can be classified as a form of motivation.
This paper aims to investigate the underlying patterns within campus big data and to use psychological theories to extract variables that can be used to predict academic performance. The P-FRAME model was developed to predict academic performance from the extracted attributes, which include motivation, behavioral pattern and attitude. Penalized regression algorithms (lasso, ridge and forward stepwise regression) were applied to the data set to discover important features and to propose a model for predicting the academic performance of college students.

Related Works
In the early stages of educational research, the transition from high school to college or university received much attention [1,2]. There are three important indicators for predicting future success during this transition period: high school GPA, intelligence quotient (IQ) and self-efficacy. The analysis in [3] measured correlations between high school GPA and undergraduate grades in the range of 0.26 to 0.53. In addition, [2] emphasizes that high school grades are the best known predictors of a student's readiness for undergraduate studies, regardless of the quality and type of high school attended, while a standardized admission test provides useful supplementary information.
The indicative value of several types of explanatory variables has also been assessed for predicting academic success during the transition from undergraduate to graduate studies. Many studies have shown that indicators of undergraduate achievement predict future graduate-level performance [4]. Applying factor analysis to school grades, [5] finds two components that determine achievement: ability and adaptation to the school system. The second dimension is also identified as student non-cognitive behavior [6], academic ethic [7], or common grade dimension [8]. It is related to non-cognitive constructs such as motivation, effort, self-efficacy, perseverance, and locus of control. Nevertheless, disagreement has arisen about the quantification of those constructs [9,10]. Here, we refer to that second dimension as adaptation to the academic culture.

Model Framework
This section gives a detailed description of the model developed. The P-FRAME (Fig. 1) was developed, and psychological factors and theories were used to extract attributes for predicting academic performance. The first level of the model is the data input, where the first stage of preprocessing is applied to the raw data and to any additional data that will be useful for the model. The raw data, meta data and group data are at the base of the model. At this stage the data are cleaned, incomplete records are removed, and the overall data set is made uniform and brought into line with the model's requirements.
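The cleaning step described above can be pictured with a minimal sketch. It is not the authors' implementation: the record layout (the keys "student_id", "timestamp" and "location"), the timestamp format and the function name clean_checkins are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical record layout (not from the paper): each raw check-in is a dict
# with "student_id", "timestamp" and "location" keys.
REQUIRED_FIELDS = ("student_id", "timestamp", "location")

def clean_checkins(raw_records):
    """Drop incomplete records and bring the rest into one uniform shape."""
    cleaned = []
    for record in raw_records:
        # Remove records with a missing or empty required field.
        if any(not record.get(field) for field in REQUIRED_FIELDS):
            continue
        try:
            # Assumed timestamp format; real logs may differ.
            when = datetime.strptime(record["timestamp"], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # unparseable timestamps are treated as incomplete
        cleaned.append({
            "student_id": str(record["student_id"]).strip(),
            "timestamp": when,
            "location": record["location"].strip().lower(),
        })
    return cleaned

# Made-up rows: one valid, one with an empty field, one with a bad timestamp.
raw = [
    {"student_id": "s001", "timestamp": "2015-03-02 07:45:10", "location": " Cafeteria "},
    {"student_id": "s002", "timestamp": "", "location": "library"},
    {"student_id": "s003", "timestamp": "not a date", "location": "dormitory"},
]
records = clean_checkins(raw)
```

Only the first made-up row survives, with its location normalized to lower case and its timestamp parsed, so later stages can rely on a single format.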

Meta Data
Meta data contains all the domain information about the data, including its size and its time and date of creation. Granularity is very important here, because it helps yield more in-depth and structured information.

Activity and Group Data
Activity data and group data are additional data containing information about extracurricular activities, dormitory information, domain information, etc. This is the most important stage of the model: it is the level at which reference is made to the psychological factors and theories that help us extract the attributes needed for the prediction model. Here, the psychological factors can be predefined, or existing theories can be combined with the data set to help select attributes.
A clear understanding of these theories is very important because it offers various dimensions for extracting and labeling attributes. Firstly, these p-factors are used for data filtering before attribute selection. Secondly, they can be used to discover patterns and to label variables so that the features are better understood. The last level of the P-FRAME model is where the appropriate algorithm is applied to obtain a predictive model. After feature engineering in the two previous levels, a penalized regression algorithm is applied, chosen because of the nature of the target variable. Section III-B explains the algorithms that are applied to the variables extracted at the previous level. The forward stepwise, ridge, lasso and LARS algorithms use the features extracted at level 3 of the P-FRAME model to train the predictive models. The flow between the prediction model and model testing is iterative, and the number of iterations depends on the error threshold required to obtain a good predictive model.
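To make the idea of a penalized regression concrete, the following is a minimal sketch of the lasso fitted by coordinate descent. It is an illustration under stated assumptions rather than the procedure used in the experiments: the function names, the toy data, the penalty value alpha and the omission of an intercept are all choices made here for brevity.

```python
def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/(2n)) * sum((y_i - x_i . w)^2) + alpha * sum(|w_j|).

    X is a list of rows, y a list of targets; no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared length of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (norm / n) if norm > 0 else 0.0
    return w

# Toy data (illustrative only): the target depends on the first feature alone.
X = [[1.0, 0.5], [2.0, -0.5], [3.0, 0.5], [4.0, -0.5]]
y = [2.0, 4.0, 6.0, 8.0]
weights = lasso_coordinate_descent(X, y, alpha=0.1)
```

On this toy data the weight of the irrelevant second feature is driven exactly to zero, while the first weight is shrunk slightly below its unpenalized value of 2. This zeroing behaviour is what makes the lasso usable for attribute selection; ridge regression shrinks weights but does not in general set them exactly to zero.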

Experiment
The data consist of the check-in data of students from Shenyang Jianzhu University over a period of 4 years, a total of 8 semesters. Four classes were used for this experiment, each with at least 30 students. The daily usage of students' ID cards is continuously logged in the database whenever a student activates the card, whether by making a purchase, checking in, accessing a building, subscribing to internet service or visiting the medical center. The majority of the data clusters are at the eating locations across the campus, because students tend to activate their ID cards more than once while buying food at the cafeteria.
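As an illustration of how a behavioural attribute might be derived from such a log, the sketch below counts the days on which each student swiped a card at the cafeteria in the morning. The row layout, the location label "cafeteria", the 9:00 cut-off and the function name breakfast_days are hypothetical; the actual features and thresholds are not specified in this section.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical check-in rows: (student_id, timestamp, location).
checkins = [
    ("s001", datetime(2015, 3, 2, 7, 40), "cafeteria"),
    ("s001", datetime(2015, 3, 2, 7, 41), "cafeteria"),  # second swipe, same meal
    ("s001", datetime(2015, 3, 3, 8, 10), "cafeteria"),
    ("s001", datetime(2015, 3, 3, 19, 30), "library"),
    ("s002", datetime(2015, 3, 2, 12, 15), "cafeteria"),
]

BREAKFAST_END_HOUR = 9  # assumed cut-off, not taken from the paper

def breakfast_days(rows):
    """Count, per student, the distinct days with a cafeteria swipe before the cut-off."""
    days = defaultdict(set)
    for student_id, when, location in rows:
        if location == "cafeteria" and when.hour < BREAKFAST_END_HOUR:
            # A set of dates collapses repeated swipes during one meal.
            days[student_id].add(when.date())
    return {student: len(dates) for student, dates in days.items()}

features = breakfast_days(checkins)
```

Counting distinct dates rather than raw swipes is one way to keep repeated card activations during a single meal from inflating the attribute.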
A total of 36 different features were extracted from the data set. The experiment was performed on the attributes both before and after normalization. Using the forward stepwise algorithm with 12 attributes, the prediction model was able to predict scores with an error in the range of ±0.9 RMS. The iterative loop ran 666 times. Table 1 lists the best attributes, which give the minimum error. For example, if we use only the top 5 attributes, the error is ±1.37: if we predict a score of 50, the actual score is expected to fall within the range 48.63 to 51.37. Table 2 shows the actual and predicted scores, with a variance of 0.92. Various penalized regression algorithms (forward stepwise, LARS, lasso and ridge) were used in the experimental analysis to determine the best attributes. From Table 3, the lasso and LARS algorithms gave better results, while the ridge regression algorithm performed poorly on the test data. The important variables agree with the ground truth from [69], which indicates that breakfast has a positive influence on academic performance. Also, extrinsic motivation influences future outcomes more than intrinsic motivation. The mean of the errors on the test data was used as the evaluation metric, and the lowest error, obtained by the best algorithm (lasso), is 0.54. The results of this experiment indicate that non-cognitive factors contribute to the academic performance of college students.
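The error figures quoted above can be read with the help of a short sketch. The helper names rmse and error_band and the toy score lists are assumptions for illustration; only the worked example of a score of 50 with an error of 1.37 comes from the text.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equal-length score lists."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def error_band(predicted_score, error):
    """Interval predicted_score +/- error, rounded to two decimals."""
    return (round(predicted_score - error, 2), round(predicted_score + error, 2))

# The example from the text: a predicted score of 50 with an error of 1.37.
band = error_band(50, 1.37)

# Made-up scores, only to show how the metric is computed.
toy_rmse = rmse([60, 70, 80], [61, 69, 82])
```

Here band evaluates to the interval 48.63 to 51.37, matching the range given in the text, and the toy scores have squared errors 1, 1 and 4, giving a root-mean-square error equal to the square root of 2.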

Conclusion
This paper investigated the underlying patterns within the students' check-in data set. These patterns were extracted with the help of psychological theories, and the P-FRAME model was developed to integrate the check-in data into suitable algorithms for analysis. Various penalized regression algorithms (forward stepwise, LARS, lasso and ridge) were used in the experimental analysis to determine the best attributes. The comparison shows that the lasso and LARS algorithms gave better results, while the ridge regression algorithm performed poorly on the test data.