Human Activities and Postural Transitions Classification using Support Vector Machine and K-Nearest Neighbor Methods

Nowadays, gyroscope and accelerometer sensors are available on almost all smartphone devices. One use of these sensors is to determine the body's position. Body position is related to human activity, which in turn reflects a person's fitness level, and monitoring it is one of the main focuses of a smart city system. Using classification methods, we can determine the body's position. The experiment was conducted using k-nearest neighbor (K-NN) with n-neighbors of 3, 5, 7, and 9, and support vector machine (SVM) with polynomial, radial basis function (RBF), and sigmoid kernels. The K-NN method achieved 85.3%–85.7% accuracy for all n-neighbor variations under 10-fold cross-validation. For SVM, only the RBF kernel performed well, reaching 86.0% under 10-fold cross-validation. It can therefore be concluded that K-NN and SVM with the RBF kernel both give good results.


Introduction
Nowadays, gyroscope and accelerometer sensors are available on almost all smartphone devices. One use of these sensors is to determine the body's position, which is related to human activity. Physical activity is important to ensure health [1]. The government is one of the actors in a smart city system: it can monitor the city, and the result of that monitoring is evidence-based [2]. The government can also monitor the fitness of its people, and one of the relevant parameters is their activities. This has become important enough that a number of studies have been developed, among them the human activity recognition work of Reyes-Ortiz [3] and Anguita et al. [4]. Both recognize activity using machine learning, an approach that focuses on developing systems capable of learning by themselves without having to be repeatedly programmed by humans [5].
Other classification methods such as SVM or K-NN can make it easier to determine someone's body position. SVM is a classification method that is often used in classification tasks [3], and K-NN is likewise commonly used [6]. Other methods have also been applied, such as the hidden Markov model [7] [8] and the decision tree [9]. The SVM method provides several varieties of kernel function; the kernel serves as one of the parameters in the classification process. Some well-known kernel functions are linear, polynomial, RBF, and sigmoid [10]. In the K-NN method, one parameter that can be varied in the classification process is the number of neighbors (the K value) used [11]. The aim of this paper is to contribute knowledge of methods that can be used in the classification of body position, so that in the future it can accelerate the development of applications of the gyroscope and accelerometer sensors to human activities and postural transitions. More specifically, this paper examines the use of SVM and K-NN as the classification methods for this case.

Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set
The dataset was the Smartphone-Based Recognition of Human Activities and Postural Transitions dataset. It contains the results of an experiment with 30 volunteers aged between 19 and 48 years. The volunteers performed six basic activities: three static postures (standing, sitting, lying) and three dynamic activities (walking, walking downstairs, walking upstairs) [12]. The experiment also included the postural transitions between the static postures: stand-to-sit, sit-to-stand, sit-to-lie, lie-to-sit, stand-to-lie, and lie-to-stand. The volunteers wore a smartphone (Samsung Galaxy S II) to capture 3-axial angular velocity and acceleration, produced at a constant rate of 50 Hz by the gyroscope and accelerometer sensors embedded in the smartphone. The dataset contains 561 features and 10929 rows for the 30 volunteers; we use the data of 21 volunteers for the classification process, so the data we classify comprise 561 features and 7667 rows. The 561 features were obtained by calculating variables in the time and frequency domains. There are 12 class labels used in the classification process, based on the position during the recording process (six basic activities, six movement transitions).

Support Vector Machine: C-Support Vector Classification
SVM is a machine learning algorithm that has been developed in different formulations and is applicable as an effective classification tool (Novakovic and Veljovic 2011). In the following, we use one type of SVM, C-SVC, which can work with different basic kernels. Given training vectors $x_i \in \mathbb{R}^n$, $i = 1, \dots, l$, in the two-class case, and the corresponding label vector $y \in \{1, -1\}^l$, the C-SVC optimization problem for classification is [13] [14]:

$$\min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i \qquad (1)$$

with constraints:

$$y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, l \qquad (2)$$

The dual problem definition is:

$$\min_{\alpha} \; \frac{1}{2} \alpha^T Q \alpha - e^T \alpha \quad \text{subject to} \quad y^T \alpha = 0, \; 0 \le \alpha_i \le C, \; i = 1, \dots, l \qquad (3)$$

where $e$ is the vector of all ones, $C > 0$ is the upper bound, $Q$ is an $l \times l$ positive semidefinite matrix with $Q_{ij} = y_i y_j K(x_i, x_j)$, and $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$ is the kernel. The function $\phi$ transforms a training vector into a higher (possibly infinite) dimensional space. The decision function is:

$$\operatorname{sgn}\left(\sum_{i=1}^{l} y_i \alpha_i K(x_i, x) + b\right) \qquad (4)$$

The decision of which kernel to use is often a difficult task. If the data are known to be approximately not linearly separable, we would expect non-linear kernels in C-SVC to perform better than the linear kernel. Choosing different kernels leads to different performance levels. Polynomial, radial basis function (RBF), and sigmoid kernels are used in this paper to find out what results they give.
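As an illustration, a minimal sketch of C-SVC with the three kernels, using scikit-learn's SVC class; the synthetic data from make_classification stands in for the HAPT features here, and C=1.0 is simply scikit-learn's default upper bound, not a value tuned for this dataset.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic two-class data standing in for the HAPT features.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Fit C-SVC with each of the three kernels considered in the paper.
for kernel in ("poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel, C=1.0)  # C is the upper bound on the dual variables
    clf.fit(X, y)
    print(kernel, clf.score(X, y))
```

The same SVC object handles all three kernels; only the `kernel` parameter changes between runs.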

a. Polynomial kernel
We can define the polynomial kernel as:

$$K(x_i, x_j) = (\gamma x_i^T x_j + r)^d, \quad \gamma > 0 \qquad (5)$$

where $d$ is the degree of the polynomial. Vectors that are linearly dependent in the input space can be transformed into linearly independent vectors in the degree-$d$ polynomial feature space. In that higher-dimensional space the data become linearly separable, and the linear C-SVC case handles the classification problem.
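A quick numerical check of the polynomial kernel in (5) against scikit-learn's implementation, assuming scikit-learn's parameter names (gamma for $\gamma$, coef0 for $r$, degree for $d$); the vectors and parameter values are arbitrary.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

x = np.array([[1.0, 2.0]])
y = np.array([[3.0, 1.0]])
gamma, r, d = 1.0, 1.0, 2

# Direct evaluation of (gamma * x.y + r)^d = (5 + 1)^2 = 36
manual = (gamma * float(x @ y.T) + r) ** d
sklearn_val = polynomial_kernel(x, y, degree=d, gamma=gamma, coef0=r)[0, 0]
print(manual, sklearn_val)  # 36.0 36.0
```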

b. Radial basis function kernel
We can define the RBF kernel as:

$$K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^2), \quad \gamma > 0 \qquad (6)$$

Data whose class-conditional probability distribution function approaches the Gaussian distribution are well suited to the RBF kernel. Such data are mapped by the RBF kernel into a different space where they become linearly separable. With the RBF kernel we expect better performance than with the linear or polynomial kernel [15].
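The RBF kernel in (6) can be verified numerically in the same way; the example vectors and gamma below are arbitrary.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

x = np.array([[1.0, 2.0]])
y = np.array([[2.0, 0.0]])
gamma = 0.5

# Squared distance is (1-2)^2 + (2-0)^2 = 5, so K = exp(-0.5 * 5) = exp(-2.5)
manual = np.exp(-gamma * np.sum((x - y) ** 2))
sklearn_val = rbf_kernel(x, y, gamma=gamma)[0, 0]
print(manual, sklearn_val)
```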

c. Sigmoid kernel
We can define the sigmoid kernel as:

$$K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r) \qquad (7)$$

The sigmoid kernel is not as efficient as the other kernels [15]. It comes from the neural network field, where the bipolar sigmoid is usually used as the activation function of an artificial neuron.

K Nearest Neighbor
K-Nearest Neighbor (K-NN) is a classification method based on the learning data nearest to the object [16]. The learning data are represented in a multi-dimensional space, with each data feature represented by one dimension. This space is then divided into parts based on the classes of the learning data. In the training phase, K-NN only stores the feature vectors and class labels of the training set. When classification runs, the same features are computed for the testing set, and the distance between each testing sample and the training samples is calculated with the Euclidean distance formula:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (8)$$

where $d(x, y)$ is the distance between the vectors $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$.
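The Euclidean distance in (8) and the K-NN voting step can be sketched with NumPy and scikit-learn; the toy points below are illustrative only and do not come from the HAPT dataset.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Euclidean distance: d(x, y) = sqrt(sum_i (x_i - y_i)^2)
x = np.array([3.0, 4.0])
y = np.array([0.0, 0.0])
dist = np.sqrt(np.sum((x - y) ** 2))
print(dist)  # 5.0

# K-NN classification: the query point's label is the majority
# vote among its k nearest training points.
X_train = np.array([[0, 0], [0, 1], [5, 5], [5, 6]])
y_train = np.array([0, 0, 1, 1])
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)
print(clf.predict([[4, 5]]))  # two of the three nearest points are class 1
```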

SVM and K-NN
The method used in this classification consists of two methods: the first is SVM and the second is K-NN. The steps of the classification process are illustrated in Figure 1. To keep the results measurable, the following limitation applies:
• The dataset used is the Smartphone-Based Recognition of Human Activities and Postural Transitions dataset.

Preprocessing
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems [17]. In scikit-learn, data in the form of 2-dimensional arrays of size samples x features are the input to all objects and algorithms. Scikit-learn exposes a uniform interface across objects depending on their purposes: estimators build a model from data, predictors predict new data based on the model that has been built, and transformers convert the data from one representation to another (Abraham et al. 2014). Cross-validation is a technique to check the reliability of an estimator on a given dataset; it splits the data into a number of folds and iterates over them [18]. In scikit-learn there is a function that performs cross-validation and returns the reliability value for each fold, "cross_val_score()", which is based on stratified k-folds. Stratified k-folds ensure that each fold has approximately the same proportion of samples from each class label in the dataset [19]. The steps of the classification process described in Figure 1 are:
• Load the dataset into a pandas DataFrame; pandas provides a class named DataFrame that references the DataFrame in the R programming language [20].
• Split that DataFrame into two DataFrames: a features DataFrame and a label DataFrame.
• For the SVC method, set the kernel parameter to each kernel in turn (polynomial, RBF, and sigmoid).
• For the K-NN method, set the number of neighbors (the K value) to each of the values 3, 5, 7, and 9.
• Check the reliability of each method with cross-validation, with fold = 10.
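The steps above can be sketched as follows; synthetic random data stands in for the HAPT features and labels here, whereas the real pipeline would load the dataset files into the two DataFrames instead.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-ins for the features and label DataFrames.
rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(100, 8)))
labels = pd.Series(rng.integers(0, 2, size=100))

# cross_val_score uses stratified k-folds for classifiers, so each
# fold keeps roughly the same class proportions as the whole dataset.
for name, clf in [("SVC-rbf", SVC(kernel="rbf")),
                  ("KNN-3", KNeighborsClassifier(n_neighbors=3))]:
    scores = cross_val_score(clf, features, labels, cv=10)
    print(name, scores.mean())
```

In the actual experiment this loop runs over all three SVC kernels and all four K values rather than the two representatives shown.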

Results
Using cross-validation with a fold value of 10, the results are shown in Tables 1 and 2. The results show that, after cross-validation with a fold value of 10, the K-NN method gives stable accuracy across the variation of K values. Because the weight parameter was set to uniform, all points in each neighborhood are weighted equally. This result might differ if we did not use the default parameters (scikit-learn). In future research we may introduce more parameter variation to obtain more accurate results.
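The effect of the uniform weight parameter mentioned above can be illustrated on a toy one-dimensional dataset (not taken from the experiment): with weights="uniform" the three nearest neighbors vote equally, while with weights="distance" each vote is weighted by inverse distance, which can flip the prediction.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[0.0], [1.0], [2.0], [10.0]])
y = np.array([0, 0, 1, 1])

preds = {}
for w in ("uniform", "distance"):
    clf = KNeighborsClassifier(n_neighbors=3, weights=w).fit(X, y)
    # Query 1.9: neighbors are 2.0 (class 1), 1.0 (class 0), 0.0 (class 0).
    # Uniform voting picks class 0 (two votes); inverse-distance voting
    # picks class 1, since the class-1 neighbor is by far the closest.
    preds[w] = int(clf.predict([[1.9]])[0])
print(preds)  # {'uniform': 0, 'distance': 1}
```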
The experiment was conducted using k-nearest neighbor (K-NN) with n-neighbors of 3, 5, 7, and 9, and support vector machine (SVM) with polynomial, radial basis function (RBF), and sigmoid kernels. The results of the K-NN method for all n-neighbor variations were 85.3%–85.7% for 10-fold cross-validation, while for SVM only the RBF kernel had a good result, with 86.0% for 10-fold cross-validation. It can therefore be concluded that K-NN and SVM with the RBF kernel both give good results.