Intelligent passive infrared sensor based on learning vector quantization for human detection

Passive Infrared (PIR) sensors are widely used for indoor human detection because of their low cost and detection range. However, traditional PIR sensors are prone to false detections, especially when the person is in a static pose. To overcome this limitation, this work proposes a Machine Learning (ML)-based PIR sensor that improves detection accuracy. The Learning Vector Quantization (LVQ) approach is chosen because its low computational complexity makes it easy to implement on an embedded device and allows a real-time response. The experiments that produce the datasets are conducted in two distinct locations for training and testing purposes. In each location, participants performed a series of different activities and also left the room unoccupied. Data are collected via a PIR sensor and then transmitted wirelessly to a computer for training and testing. On the test set, the proposed LVQ algorithm predicts the presence of humans with an accuracy of 89.25%. Finally, the LVQ is implemented on an embedded device based on an Xtensa dual-core 32-bit LX6 CPU to form an intelligent PIR (iPIR) sensor. This novel iPIR sensor is then evaluated and tested, and achieves remarkable results.


Introduction
Human detection is necessary for most smart home, smart building, smart factory, and smart city applications. Improving the accuracy of human detection is therefore essential for the performance of these systems. Several sensors have been researched and developed for this task and compared in terms of accuracy, robustness, and energy consumption. In [1-3], a human is detected using a camera, but the image processing algorithms cannot be implemented on an embedded device due to its limited resources. Short-range Doppler radar sensors are used for detecting humans by the authors of [4-6], while the authors of [7] measure CO2 emissions in ambient air to determine the presence or absence of people in rooms. However, these sensors are expensive and require corresponding algorithms. PIR sensors were used in [8-10] for human detection. Among these sensors, PIR sensors are widely applied thanks to their low cost and low energy consumption [11, 12].
Conventional PIR sensors use fixed threshold values to signal human presence and may detect incorrectly in many circumstances, especially when the human is in a static pose. Several works therefore focus on overcoming this limitation. An optical shutter that periodically chops the field of view of the PIR sensor was proposed in [9, 13, 14] to enhance stationary human detection. In [12], a motor is added to the PIR sensor to avoid detection failures in a static pose. These methods require special hardware, so they are hard to apply in a wide range of scenarios. In recent years, Machine Learning (ML) has been applied to improve the human detection accuracy of the traditional PIR sensor [10, 11, 14-17], showing its potential as a solution. In [15], a Deep Learning method is applied to chest motion data recorded by a PIR sensor. A Convolutional Neural Network model was proposed by the authors of [10] to distinguish the signal features of human motion from those of animal motion. The authors of [16] compared Machine Learning and Deep Learning models with several algorithms. The results showed that while the Deep Learning methods gained a higher accuracy, the Machine Learning algorithms performed better in real-time detection. Deep Learning methods require a high computational cost, making them difficult to implement on devices with constrained resources. The work in [14] applied six different algorithms, namely Random Forest (RF), Multi-layer Perceptron (MLP), Naïve Bayes Classifier (NB), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Decision Tree (DT), for performance comparison; all six algorithms achieved high accuracy. Nonetheless, that work used a lab-made SLEEPIR sensor instead of a traditional PIR sensor. A Bayesian Machine Learning algorithm and a Support Vector Machine were proposed in [11] and [17], respectively. However, all these ML algorithms are computationally intensive, making their implementation on resource-constrained devices challenging.
Nowadays, with the advancement of semiconductor and Internet of Things technologies, resource-limited microcontrollers have become popular and are used in most commercial off-the-shelf (COTS) hardware products. Besides, in smart home applications, a PIR sensor is typically connected to a microcontroller through wired or wireless communication in order to transfer data between them and/or to a server. Motivated by these facts, this work implements an ML algorithm directly on COTS hardware to increase the accuracy of the PIR sensor. This intelligent PIR sensor, named iPIR, improves the performance of the PIR sensor without additional hardware costs. Our main contributions include the following:
(i) Propose a Learning Vector Quantization (LVQ)-based system that can significantly improve the performance of the embedded PIR sensor.
(ii) Train and test our iPIR in two distinct real-world scenarios. Results indicate the effectiveness of our methodology on PIR-based human detection tasks.
The structure of this paper is as follows: section 2 introduces the principle of Passive Infrared sensors. Section 3 presents our proposed iPIR using LVQ algorithms. Our experimental scenario and the performance of iPIR are then described in section 4. Finally, section 5 concludes this work with the future directions of our research.

Passive infrared sensor
In the field of surveillance and automatic lighting control, PIR sensors are popular and effective tools for detecting presence due to their simplicity and power. In particular, PIR sensors avoid the privacy concerns that may arise from the use of camera-based surveillance systems [18]. A PIR sensor belongs to the class of thermal IR detectors: the incident infrared radiation is converted into an electrical signal, specifically a voltage change, via thermal-to-electrical conversion. The output of the PIR sensor can be affected by several factors, including the distance between the occupant and the PIR sensor, the presence of multiple humans, and the direction and speed of the occupant's movement.
PIR sensors usually have Fresnel lenses to expand their range of detection; hence, they can detect effectively at a range of 10 meters with a field of view of about 150 degrees, as shown in figure 1(a). With a large change in human pose, the signal of the PIR sensor changes considerably. In contrast, when there is only a tiny change in human motion, the signal fluctuates slightly within a small margin, as shown in figure 1(b).
Current industrial PIR modules use a threshold value and a Schmitt trigger for occupancy detection. If the sensor value is higher than the threshold value, the system decides that the room is occupied; the Schmitt trigger then holds this decision for a time delay of some seconds. If the values from the PIR sensor stay lower than the threshold value for a certain period, the system decides that the room is unoccupied.
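The threshold-and-hold behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuit used in commercial modules: the function name `threshold_occupancy` and the parameters `threshold` and `hold_samples` are hypothetical, and the hold time is expressed as a number of samples.

```python
def threshold_occupancy(samples, threshold, hold_samples):
    """Return one 0/1 occupancy decision per sample.

    Assumed behaviour: a sample above `threshold` marks the room as
    occupied and restarts a hold timer; the output falls back to 0 once
    `hold_samples` consecutive samples have stayed at or below the threshold.
    """
    decisions = []
    remaining = 0  # samples left before the output falls back to 0
    for value in samples:
        if value > threshold:
            remaining = hold_samples  # (re)start the hold timer
        elif remaining > 0:
            remaining -= 1
        decisions.append(1 if remaining > 0 else 0)
    return decisions


if __name__ == "__main__":
    # A single spike keeps the output high for the hold time only.
    print(threshold_occupancy([0, 0, 5, 0, 0, 0, 0], threshold=1, hold_samples=3))
```

With this model, a person who stops moving for longer than the hold time is reported as absent, which is the failure mode discussed in the next paragraph.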
This system has a weakness: when it is used for lighting in a smart home, not everyone knows how to adjust the threshold value, which often leads to wrong predictions, and the light turns off while the room is occupied. In addition, there are other scenarios in which industrial PIR sensors may detect wrongly. When the human body does not move much, the PIR sensor value stays lower than the threshold value and the system decides that the room is unoccupied. Conversely, when noise makes the value suddenly rise above the threshold value, the system turns on the light although the room is unoccupied.

Proposed intelligent PIR system using learning vector quantization algorithm
Motivated by the advancement of ML algorithms, this study targets improving the performance of traditional PIR sensors by leveraging a prototype-based algorithm, the so-called Learning Vector Quantization. The overall pipeline of our approach is illustrated in figure 2.
In the proposed device, a PIR sensor with an analog output signal is connected to a microcontroller that predicts the presence of humans. The most appropriate codebooks are determined in the training phase. First, an experiment was conducted to create a database for our LVQ model, which is described in detail in section 4.2. After splitting and labeling, the data was fed into the Learning Vector Quantization model to update the codebooks. The data from the test set was then used to evaluate the model performance. If the performance was satisfactory, these codebooks were subsequently loaded onto the microcontroller for inference. When a person enters the PIR sensor's field of view, the microcontroller examines the analog signal received from the PIR sensor and determines whether the room is occupied or not.
Machine Learning has been used widely in human detection, and various algorithms are applied in practice. Prominent examples are K-Nearest Neighbours (KNN), Support Vector Machine (SVM), Random Forest (RF), and Multi-layer Perceptron (MLP). KNN is a suitable choice as its idea is relatively straightforward and it does not require a complex training phase. The main idea of KNN is to find the k nearest neighbors of a query point in the training set and to decide the output based on these neighbors. However, for a large training set, finding the k nearest neighbors is time-consuming and requires a large amount of computation. In addition, KNN is easily affected by noise, which can lead to wrong predictions. RF is a popular ensemble method that operates by constructing randomized decision trees. Despite its high prediction accuracy, RF is difficult to control and may overfit on small or low-dimensional data, as the randomness is significantly reduced. SVM is one of the best regression or classification methods available today. Like RF, SVM is considered a black box model; it uses support vectors to construct separating hyperplanes in a high- or infinite-dimensional space. MLP is an artificial neural network constructed from multiple hidden layers that map the input to the output. Each hidden layer contains a linear fully connected function followed by a nonlinear activation function. MLP may use multiple layers, with multiple neurons in each layer, to transform the input data into deeper hidden representations, which also leads to a black box model. These limitations of the machine learning models used in previous works on intelligent PIR sensors motivate us to develop an iPIR sensor system that is easy to interpret and has a low computational cost. In this paper, we investigate the LVQ algorithm in order to provide an intuitive classification method for human detection by PIR sensors.
As described in [19] and [20], LVQ is a supervised classification algorithm that is widely used for classification problems because it is easy to implement and understand. The purpose of LVQ is to learn codebooks (or prototypes) that assign an arbitrary input vector to a target class label, using training data composed of input vectors x and corresponding labels y. At least one codebook is prepared for each label. An LVQ system is represented by a fixed number of codebooks $W = \{w_i \mid i = 1, \dots, M\}$, in which $w_i$ is a codebook in the feature space of the observed data and M is the number of codebooks. Each codebook carries a class label: $w_i \in \mathbb{R}^N$ lies in the space of the N-dimensional input features, and $y_i \in \{1, \dots, C\}$ is the label of the corresponding feature.
The classification scheme is based on the winner-takes-all strategy, which needs to find the best matching unit (BMU) in W. The BMU is usually decided based on the smallest (squared) Euclidean distance $d(x, y) = \lVert x - y \rVert^2$ between an arbitrary input vector x and the codebooks:

$$w_j = \arg\min_{w_i \in W} d(x, w_i)$$

By applying the winner-takes-all strategy, $w_j$ is the BMU and its label is assigned to x: $y_j = y(w_j(x))$. The learning algorithm aims at pushing the codebooks closer to the feature vectors that have the same label and moving them away from the others. The original LVQ algorithms proposed by Kohonen et al [21, 22] are heuristics-based methodologies, which lack a thorough mathematical analysis. This issue can cause training instability, slow convergence, and initialization sensitivity due to unexpected behaviors of the algorithms [19]. To cope with these problems, Sato & Yamada [23] introduced the generalized learning vector quantization (GLVQ) as a margin maximization approach, which aims at minimizing the cost function:

$$J_{GLVQ} = \sum_{i} \Theta\big(\mu(x_i)\big)$$

where $\Theta(\cdot)$ is a monotonically increasing function, e.g. the identity function $\Theta(x) = x$ or the logistic sigmoid function. A previous study found that the identity function generalizes well for most tasks [24]. $\mu$ is the relative distance difference, defined as:

$$\mu(x_i) = \frac{d^{+}(x_i) - d^{-}(x_i)}{d^{+}(x_i) + d^{-}(x_i)}$$

where $d^{+}(x_i) = d(w^{+}, x_i)$ is the distance to the nearest codebook having the same label as $x_i$, and $d^{-}(x_i) = d(w^{-}, x_i)$ is the distance to the nearest codebook having a different label from $x_i$. It can be observed that $\mu(x_i)$ is negative if and only if the feature vector $x_i$ is classified correctly. The term $d^{+}(x_i) + d^{-}(x_i)$ is the scaling factor bounding $\mu(x_i)$ between -1 and 1. The learning rule of GLVQ is visualized in figure 3.
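To make the winner-takes-all rule and the relative distance difference μ concrete, the following Python sketch evaluates both for a toy codebook set. The helper names `sq_dist`, `classify`, and `relative_distance` are hypothetical, and the squared Euclidean distance is assumed as the distance measure.

```python
def sq_dist(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, w))


def classify(x, codebooks, labels):
    """Winner-takes-all: return the label of the nearest codebook (the BMU)."""
    j = min(range(len(codebooks)), key=lambda i: sq_dist(x, codebooks[i]))
    return labels[j]


def relative_distance(x, y, codebooks, labels):
    """mu(x) = (d+ - d-) / (d+ + d-); negative when x is classified correctly.

    Assumes both a same-label and a different-label codebook exist and
    that d+ + d- is non-zero.
    """
    d_plus = min(sq_dist(x, w) for w, l in zip(codebooks, labels) if l == y)
    d_minus = min(sq_dist(x, w) for w, l in zip(codebooks, labels) if l != y)
    return (d_plus - d_minus) / (d_plus + d_minus)


if __name__ == "__main__":
    codebooks = [[0.0, 0.0], [2.0, 0.0]]   # one toy codebook per label
    labels = [0, 1]                        # 0: unoccupied, 1: occupied
    x = [0.5, 0.0]
    print(classify(x, codebooks, labels))              # nearest codebook has label 0
    print(relative_distance(x, 0, codebooks, labels))  # negative: correct side
    print(relative_distance(x, 1, codebooks, labels))  # positive: wrong side
```

In the toy example the sample lies closer to the label-0 codebook, so μ is negative when the true label is 0 and positive when it is 1, matching the sign property stated above.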
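The error rate can be improved by minimizing the cost function $J_{GLVQ}$ based on the steepest descent method as follows:

$$w \leftarrow w + \Delta w, \qquad \Delta w = -\alpha \frac{\partial J_{GLVQ}}{\partial w}$$

Here $\Delta w$ constitutes the update rule given by the partial derivative of $J_{GLVQ}$ with respect to the parameter w. For a given feature vector x and a learning rate $\alpha > 0$, we can take the following derivatives:

$$\frac{\partial J_{GLVQ}}{\partial w^{+}} = \Theta'(\mu)\,\frac{\partial \mu}{\partial d^{+}}\,\frac{\partial d^{+}}{\partial w^{+}} = -\Theta'(\mu)\,\frac{4\,d^{-}}{(d^{+} + d^{-})^{2}}\,(x - w^{+})$$

$$\frac{\partial J_{GLVQ}}{\partial w^{-}} = \Theta'(\mu)\,\frac{\partial \mu}{\partial d^{-}}\,\frac{\partial d^{-}}{\partial w^{-}} = \Theta'(\mu)\,\frac{4\,d^{+}}{(d^{+} + d^{-})^{2}}\,(x - w^{-})$$

If the identity function is chosen for $\Theta$, the update rule is derived as:

$$\Delta w^{+} = \alpha\,\frac{4\,d^{-}}{(d^{+} + d^{-})^{2}}\,(x - w^{+}), \qquad \Delta w^{-} = -\alpha\,\frac{4\,d^{+}}{(d^{+} + d^{-})^{2}}\,(x - w^{-})$$

Although the Euclidean distance-based learning rule is easy to understand, not all data types can be classified easily with it in practice [24]. For the human detection task, the PIR signals are sensitive to environmental noise, which can make the unoccupied and occupied signals nearly indistinguishable. The same problem occurs when a person is stationary. Under these challenging circumstances, the Euclidean distance measure may not generalize well, which motivates the generalized matrix learning vector quantization (GMLVQ) [24]. The relative distance difference $\mu$ of GMLVQ is derived as:

$$\mu^{\Lambda}(x) = \frac{d^{\Lambda}(x, w^{+}) - d^{\Lambda}(x, w^{-})}{d^{\Lambda}(x, w^{+}) + d^{\Lambda}(x, w^{-})}$$

Here $d^{\Lambda}(x, w) = (x - w)^{T} \Lambda (x - w)$ is the generalized distance measure. By this definition, the matrix $\Lambda$ must be symmetric and positive (semi-)definite, which can be guaranteed by setting $\Lambda = \Omega^{T}\Omega$, where $\Omega \in \mathbb{R}^{N \times N}$. The learning rules are formulated by taking derivatives of the GMLVQ cost function with respect to the parameters w and $\Omega$. Similar to the GLVQ learning rules, with $d^{\pm} = d^{\Lambda}(x, w^{\pm})$, we have:

$$\Delta w^{+} = \alpha\,\Theta'(\mu^{\Lambda})\,\frac{4\,d^{-}}{(d^{+} + d^{-})^{2}}\,\Lambda\,(x - w^{+}), \qquad \Delta w^{-} = -\alpha\,\Theta'(\mu^{\Lambda})\,\frac{4\,d^{+}}{(d^{+} + d^{-})^{2}}\,\Lambda\,(x - w^{-})$$

The learning rule for the parameter $\Omega$ is derived for each matrix element $\Omega_{hk}$, where $h, k = 1, \dots, N$. The partial derivative $\Psi_{hk}(x, w)$ of the distance function $d^{\Lambda}(x, w)$ with respect to $\Omega_{hk}$ is given by:

$$\Psi_{hk}(x, w) = \frac{\partial d^{\Lambda}(x, w)}{\partial \Omega_{hk}} = 2\,(x_k - w_k)\,[\Omega (x - w)]_h$$

Consequently, we can formulate the update rule for the matrix element $\Omega_{hk}$ as follows:

$$\Delta \Omega_{hk} = -\alpha\,\Theta'(\mu^{\Lambda})\,\frac{2}{(d^{+} + d^{-})^{2}}\,\Big(d^{-}\,\Psi_{hk}(x, w^{+}) - d^{+}\,\Psi_{hk}(x, w^{-})\Big)$$

Algorithm 1. GMLVQ algorithm for iPIR system.

The feature vectors obtained from the training data are used to find the optimal codebooks with the GLVQ and GMLVQ learning rules. Only these codebooks are required to predict the presence of humans in each time frame. In summary, the proposed system using the GMLVQ algorithm is presented in algorithm 1. A similar process for the GLVQ algorithm can be obtained by omitting $\Omega$ and using the equations of GLVQ.

In this subsection, we briefly describe the hardware of our iPIR as an edge device. The diagram of the device is illustrated in figure 4.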
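Returning to the GMLVQ learning rules of section 3, a single stochastic update step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm 1: the identity function is used for Θ, one learning rate `alpha` is shared by the codebooks and Ω, Ω is square, and the names `matvec`, `gmlvq_distance`, and `gmlvq_step` are hypothetical. Keeping Ω fixed at the identity matrix reduces the step to the GLVQ update.

```python
def matvec(omega, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(o * c for o, c in zip(row, v)) for row in omega]


def gmlvq_distance(x, w, omega):
    """d(x, w) = (x - w)^T Omega^T Omega (x - w) = ||Omega (x - w)||^2."""
    diff = [a - b for a, b in zip(x, w)]
    return sum(p * p for p in matvec(omega, diff))


def gmlvq_step(x, y, codebooks, labels, omega, alpha):
    """One in-place GMLVQ update for sample (x, y), with Theta = identity."""
    def nearest(same_label):
        idx = [i for i, l in enumerate(labels) if (l == y) == same_label]
        return min(idx, key=lambda i: gmlvq_distance(x, codebooks[i], omega))

    ip, im = nearest(True), nearest(False)        # indices of w+ and w-
    diff_p = [a - b for a, b in zip(x, codebooks[ip])]
    diff_m = [a - b for a, b in zip(x, codebooks[im])]
    proj_p, proj_m = matvec(omega, diff_p), matvec(omega, diff_m)
    dp = sum(p * p for p in proj_p)               # d+
    dm = sum(p * p for p in proj_m)               # d-
    scale = 4.0 * alpha / (dp + dm) ** 2          # assumes d+ + d- > 0
    n = len(x)
    # Lambda (x - w) = Omega^T (Omega (x - w)), using the current Omega
    lam_p = [sum(omega[h][k] * proj_p[h] for h in range(n)) for k in range(n)]
    lam_m = [sum(omega[h][k] * proj_m[h] for h in range(n)) for k in range(n)]
    # Attract the correct codebook, repel the wrong one.
    codebooks[ip] = [w + scale * dm * g for w, g in zip(codebooks[ip], lam_p)]
    codebooks[im] = [w - scale * dp * g for w, g in zip(codebooks[im], lam_m)]
    # Element-wise update of Omega.
    for h in range(n):
        for k in range(n):
            omega[h][k] -= scale * (dm * proj_p[h] * diff_p[k]
                                    - dp * proj_m[h] * diff_m[k])


if __name__ == "__main__":
    codebooks = [[0.0, 0.0], [2.0, 0.0]]
    labels = [0, 1]
    omega = [[1.0, 0.0], [0.0, 1.0]]
    gmlvq_step([0.5, 0.2], 0, codebooks, labels, omega, alpha=0.05)
    print(codebooks, omega)
```

All quantities on the right-hand side are computed before any parameter is modified, so the codebook and matrix updates use the same gradient evaluation.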

Experiments
The PIR sensor detects infrared waves from the surrounding objects and creates a voltage signal. This voltage signal is usually small and affected by noise. Therefore, an amplifying and filtering module is used to remove the noise. An ADC then converts the filtered signal into a digital signal. The MCU reads the digital signal and conducts LVQ inference with the trained codebooks to predict whether the room is occupied.
In our iPIR device, the main controller is the ESP32 Devkit V1, which includes the ADC, the processor cores, and the Wi-Fi transceiver. An LED is connected to the ESP32 to show the prediction result. The amplifier and filter module is a conventional analog front end for the PIR sensor, comprising several operational amplifier (op amp) stages and bandpass filters. Our iPIR sensor is shown in figure 5(a).

Data collection and pre-processing
This subsection describes the experiment that creates the dataset for training and testing. The data were recorded in two different places: our laboratory at Hanoi University of Science and Technology, and a restroom. The laboratory is quite large and noisy, while the restroom is small and has little noise, which ensures different environmental conditions. The PIR sensor is connected to a computer for data transmission. In addition, a camera is used to observe all activities occurring in the room. Our setup is illustrated in figure 5(b).
After the setup was finished, a series of everyday human activities was conducted in both rooms. These actions included: going into the room, going out of the room, resting (room with nobody), cleaning the floor, working out or sitting in the lab, and washing hands or washing the face in the restroom, etc. All these actions are recorded by a single PIR sensor and observed by a computer camera. The PIR sensor sample time is 10 ms, which means data is transmitted to the computer for storage every 10 milliseconds. The dataset was collected following the pipeline shown in figure 6. In our experiment, the personal computer played the role of a local server, which received data from a PIR sensor module and a webcam simultaneously. While the webcam was connected directly to the computer for recording purposes, the PIR module transmitted digital data to the computer through a local wireless network. The PIR signal is manually labeled segment by segment, based on the presence of humans in the recorded videos. Subsequently, a dataset was formed with the segment-by-segment PIR signals as input and the presence of humans as output.
PIR sensing is a promising unobtrusive technology that has been widely used to monitor activities of daily living [25, 26]. In our experiment, the camera was only used for dataset collection purposes. All the recordings were stored locally on our server, thus there are no privacy concerns about this dataset. When the iPIR sensor is in operation, only the thermal signals are recorded by the PIR sensor, and the controller receives a two-second segment of the amplified analog signal to make a prediction. The data recorded by the cameras are only used for labeling the dataset and are not involved in the operation of the iPIR. As the system does not employ video-based or audio-based sensors, privacy protection is guaranteed.
After the videos and PIR sensor data were recorded, the data were split into periods by the sliding window method; each period has a length of 2 s, with a hop size of 0.5 s. With a sampling period of 10 milliseconds, each two-second period yields a vector of 200 values. Each vector is then labeled based on the video: 1 indicates an occupied room, while 0 indicates an unoccupied room. The total number of vectors of each data type is given in table 1.
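The sliding-window segmentation described above can be sketched as follows. The constants come from the text (10 ms sampling period, 2 s window, 0.5 s hop); the function name `sliding_windows` is hypothetical, and dropping an incomplete trailing window is an assumption.

```python
SAMPLE_PERIOD_S = 0.01   # 10 ms between PIR samples
WINDOW_S = 2.0           # window length in seconds
HOP_S = 0.5              # hop size in seconds

WINDOW_LEN = round(WINDOW_S / SAMPLE_PERIOD_S)  # 200 samples per window
HOP_LEN = round(HOP_S / SAMPLE_PERIOD_S)        # 50 samples between window starts


def sliding_windows(signal, window_len=WINDOW_LEN, hop_len=HOP_LEN):
    """Yield successive full-length windows; an incomplete tail is dropped."""
    for start in range(0, len(signal) - window_len + 1, hop_len):
        yield signal[start:start + window_len]


if __name__ == "__main__":
    ten_seconds = list(range(1000))              # stand-in for 10 s of samples
    windows = list(sliding_windows(ten_seconds))
    print(len(windows), len(windows[0]))         # number of windows, window size
```

Because consecutive windows overlap by 150 samples, neighbouring vectors share most of their content, which is worth keeping in mind when splitting the data into training and test sets.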
From each vector, the five most important features for predicting human presence are chosen based on recursive feature elimination with cross-validation: the minimum value, the maximum value, the average value, the standard deviation, and the slope of the data. The slope is chosen because the signal changes at different speeds when the room is occupied and when it is unoccupied. The slope of a vector is calculated by:

$$\text{slope} = \frac{\max(\text{vector}) - \min(\text{vector})}{\operatorname{argmax}(\text{vector}) - \operatorname{argmin}(\text{vector})}$$

As shown in table 1, the total number of training vectors is 2532; each vector has six values, the first five of which are the aforementioned features and the last of which is the vector's label. Similarly, there are 633 vectors for testing, each containing five values, and one separate vector that stores the label of each feature vector in order to verify the accuracy.
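A possible implementation of the five-feature extraction is sketched below. Several details are assumptions rather than facts from the paper: the slope is taken as the difference between the maximum and minimum values divided by the difference of their indices, the population standard deviation is used, and a constant window is given a slope of zero. The function name `extract_features` is hypothetical.

```python
from statistics import fmean, pstdev


def extract_features(vector):
    """Return [min, max, mean, std, slope] for one window of PIR samples."""
    v_min, v_max = min(vector), max(vector)
    i_min, i_max = vector.index(v_min), vector.index(v_max)  # first occurrences
    # Assumed slope definition: rise between the extremes over their index gap.
    # A constant window has i_max == i_min, so its slope is defined as 0.
    slope = 0.0 if i_max == i_min else (v_max - v_min) / (i_max - i_min)
    return [v_min, v_max, fmean(vector), pstdev(vector), slope]


if __name__ == "__main__":
    print(extract_features([1.0, 3.0, 2.0, 0.0]))
```

In this toy window the maximum occurs before the minimum, so the slope is negative; a rising signal would give a positive value.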
Figure 7 depicts a pairwise scatter plot that illustrates the relationship between the presence of humans and the features extracted from the training data. One can observe that the distributions of occupied and unoccupied conditions can be easily distinguished, which means all five extracted features carry valuable information for the classification process. The distributions of occupied conditions are considerably wider than those of unoccupied conditions, owing to the more stable PIR signals obtained in unoccupied conditions.

Performance evaluation
First, the GLVQ algorithm was used to evaluate performance. The number of epochs is set to 1000. The number of codebooks is a crucial parameter that depends on the number of modes of the underlying label distribution. Too many codebooks can harm generalization and lead to overfitting, as well as increase the computational cost, while too few codebooks may fail to represent the underlying data distribution. In our study, we conduct empirical experiments to determine a reasonable number of codebooks for PIR data. The accuracy and visualization corresponding to varying numbers of codebooks are shown in table 2 and figure 8, respectively. From table 2, we can observe that the proposed strategy based on GLVQ achieves high accuracy with only two codebooks per label. This can be expected given the generalization capability of GLVQ. The performance peaks at 86.10% and then decreases, as a larger number of codebooks may lead to overfitting. Figure 8 visualizes the final codebooks of the GLVQ algorithm with different numbers of codebooks.
Table 3 shows further improvements in our iPIR approach using the GMLVQ algorithm. When the number of codebooks for each label is [5, 5], the accuracy reaches 89.25%. The learned matrix Ω for the generalized distance measure of GMLVQ is: [matrix values missing in the source text]
Other algorithms are implemented on our dataset in order to compare their accuracy with that of our proposed algorithm. We also simulated the industry-standard threshold method for PIR sensors on the computer and evaluated its accuracy on our dataset. Various threshold values were tested, and the best one reached an accuracy of 67.71%. Table 5 compares the performance of the proposed algorithm with that of the other algorithms mentioned in section 3. For the K-Nearest Neighbors (KNN) model, we use a k-value of 5 and the uniform weight function. The Support Vector Machine (SVM) model uses the radial basis function kernel with the regularization parameter C set to 1. To ensure a smooth decision boundary and avoid overfitting, the gamma parameter is scaled by the number of features and the variance of the training data. For the Random Forest (RF) model, we set the number of trees to 10 and the minimum number of samples required to split an internal node to 2, while the minimum number of samples in a leaf node is set to 1. We also employ the bootstrap aggregation technique to train the individual decision trees. The Multilayer Perceptron (MLP) model consists of 2 hidden layers, with 20 neurons in the first layer and 10 neurons in the second layer. The activation function is ReLU. Finally, our GMLVQ model has 10 codebooks in total and the learned matrix Ω as stated above. The Adam optimizer [27] with a learning rate of 0.0005 is used.
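The threshold baseline mentioned above can be simulated with a simple sweep over candidate values. The paper does not state how the simulation was performed, so the sketch below is only one plausible setup: each window is summarized by a single amplitude value (here called `peaks`), a window is predicted as occupied when this value exceeds the threshold, and the candidate with the highest accuracy is kept. The names `accuracy` and `best_threshold` are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


def best_threshold(peaks, labels, candidates):
    """Return (threshold, accuracy) for the most accurate candidate.

    Ties are resolved in favour of the earliest candidate in the list.
    """
    best = None
    for thr in candidates:
        preds = [1 if p > thr else 0 for p in peaks]
        acc = accuracy(preds, labels)
        if best is None or acc > best[1]:
            best = (thr, acc)
    return best


if __name__ == "__main__":
    peaks = [0.1, 0.2, 0.9, 1.2]    # toy per-window amplitudes
    labels = [0, 0, 1, 1]           # 1: occupied, 0: unoccupied
    print(best_threshold(peaks, labels, [0.05, 0.5, 1.0]))
```

On real PIR data the two classes overlap when the occupant is stationary, so no single threshold separates them, which is consistent with the comparatively low accuracy reported for this baseline.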
As shown in table 5, the threshold method has the poorest performance among the different methods for PIR sensors. Despite its low computational cost, our proposed LVQ-based algorithm achieves competitive performance and outperforms the other advanced algorithms by a small margin.

Conclusion
In this paper, an intelligent Passive Infrared sensor for human detection based on Learning Vector Quantization is proposed. Its performance is evaluated in two distinct environments by collecting the dataset in different locations. Both GLVQ and GMLVQ are implemented in order to compare their accuracy with that of other Machine Learning algorithms. The performance is also compared with the threshold-based method currently used in the industry. The accuracy is 86.10% with the GLVQ algorithm and 89.25% with the GMLVQ algorithm, which is competitive with many advanced Machine Learning algorithms despite the low computational cost. Furthermore, LVQ-based algorithms provide an intuitive explanation for human detection based on PIR sensors.
Future work will focus on developing multimodal iPIR sensors. The current system only predicts whether the room is occupied or not. Our future goal is to develop a PIR sensor array that predicts which actions the occupants perform and tracks the occupants' trajectories. Another direction is to use this embedded PIR device in a Smart Home system. Thanks to its low energy consumption, it can serve as a wake-up device that tells other devices in the system when to work, which reduces the energy wasted when those devices have to run all the time.

Figure 1. PIR sensor field of view and its signal (a) PIR sensor's detecting area (b) PIR sensor's signal.

Figure 2. The structure of an intelligent PIR sensor.

4.1. Hardware description
Off-the-shelf PIR modules mostly use specific ICs. The module's input is an infrared wave, and the output is a logical value specifying whether humans are present based on fixed threshold values. For our implementation, the output of the PIR module needs to be an analog signal representing the amplified PIR signal. Thus, instead of using an off-the-shelf PIR module, a PIR module is reimplemented without analog-to-digital components such as Schmitt trigger circuits or logic gates.

Figure 4. Diagram of the hardware description.

Figure 5. Setup for recording dataset and our intelligent PIR sensor (a) Intelligent PIR sensor (b) Setup for recording dataset.

Figure 6. PIR data collection and handling pipeline.
argmax(vector) and argmin(vector) represent the indices of the maximum and minimum values of the vector, respectively.

Figure 7. Pairwise scatter plot for the training data.

Figure 8. Visualization of GLVQ algorithm with different numbers of codebooks.

Table 1. The number of vectors of each type.

Table 2. Test accuracy with different numbers of codebooks of the GLVQ algorithm.

Table 3. Test accuracy with different numbers of codebooks of GMLVQ.