Paper The following article is Open access

Behavior anomaly detection based on big data analysis of Internet of Things

Published under licence by IOP Publishing Ltd
Citation: Jinliang Yang et al 2021 J. Phys.: Conf. Ser. 2004 012011. DOI: 10.1088/1742-6596/2004/1/012011

Abstract

The technical requirements for behavior anomaly detection keep rising. Combining Internet of Things (IoT) technology with a variety of big data analysis algorithms makes accurate behavior anomaly detection largely achievable by classifying behavior data sets. In this paper, the PLA-PRF (parallel random forest) algorithm is used to build an IoT behavior anomaly detection model that integrates big data analysis. For behavior detection, the PRF and DRF algorithms are compared under different numbers of decision trees. The results show that, compared with the DRF algorithm, the PLA-PRF, Spark MLRF (Spark Machine Learning Random Forests) and PRF algorithms perform better on the four datasets, with kappa values higher by about 3.13%, 2.56% and 1.98%, respectively. PLA-PRF achieves higher accuracy when the sample size is small. As the sample size increases, the accuracy of behavior anomaly detection gradually decreases: during subspace construction the algorithm discards some high-pheromone features, which leaves the new feature space with insufficient information, so the decision tree training process does not learn the inherent patterns of the discarded data. Compared with Spark MLRF and DRF, PLA-PRF executes faster on large data sets, and the advantage becomes more pronounced as the data volume grows. This is because PLA-PRF uses a data reuse strategy (DRS) during parallelization, which reduces the data communication overhead in a distributed environment and improves the parallelization efficiency of the algorithm.
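The abstract reports its comparison in terms of kappa values. As a reminder of what that statistic measures, the following is a minimal sketch of Cohen's kappa for a single classifier's predictions, assuming that this is the kappa the authors refer to. The function name `cohen_kappa` and the toy labels are illustrative assumptions and do not come from the paper.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two label sequences, corrected for chance.

    Illustrative helper (hypothetical name), not the paper's implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)

    # Observed agreement: fraction of samples where prediction equals truth.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement expected from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Degenerate case: both sequences contain one identical class only.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy labels (assumed for illustration): three of four predictions are correct.
y_true = ["normal", "normal", "anomaly", "anomaly"]
y_pred = ["normal", "normal", "anomaly", "normal"]
kappa = cohen_kappa(y_true, y_pred)
print(kappa)  # 0.5
```

In this toy case the raw accuracy is 0.75, but half of that agreement would be expected by chance from the label frequencies, so kappa is 0.5. This is why kappa is often preferred over plain accuracy when normal and anomalous samples are imbalanced.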

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
