Abstract
The internet of things (IoT) is a network of interrelated devices and sensors connected through the internet to transfer and share data. The data gathered from these devices may contain anomalies or other errors for various reasons, such as malicious activity or sensor failure. Anomaly detection is applied in several domains, such as fault detection and health monitoring systems. In this paper, we review and analyze the relevant literature on existing anomaly detection techniques that apply different machine learning approaches in the IoT. In addition, we examine the anomaly detection datasets used in IoT, highlight the most pressing issues these datasets raise for different approaches, and list several future research directions. We believe this survey will serve as a starting point for researchers seeking to understand how machine learning approaches are employed to detect anomalies in the IoT, as well as the datasets associated with anomaly detection in IoT and the issues and future directions from a dataset perspective.
Data Availability
Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.
Abbreviations
- IoT: Internet of things
- IWC: Inverse weight clustering
- DBSCAN: Density-based spatial clustering of applications with noise
- RF: Random forest
- LDA: Linear discriminant analysis
- PCA: Principal component analysis
- NB: Naive Bayes
- FF: Farthest first
- EM: Expectation maximization
- SVDD: Support vector data description
- DNNs: Deep neural networks
- DT: Decision tree
- DOS: Denial of service
- MD: Mahalanobis distance
- LOF: Local outlier factor
- KNNs: K-nearest neighbours
- SVMs: Support vector machines
- NNs: Neural networks
- OC-SVM: One-class SVM
- RBF: Radial basis function
- FPR: False positive rate
- TPR: True positive rate
Author information
Contributions
NA carried out the conceptualization, formal analysis, methodology, implementation, and visualization, and wrote the original draft of the manuscript. RA carried out the conceptualization, formal analysis, and methodology, and revised and edited the paper. SB revised and edited the paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Alghanmi, N., Alotaibi, R. & Buhari, S.M. Machine Learning Approaches for Anomaly Detection in IoT: An Overview and Future Research Directions. Wireless Pers Commun 122, 2309–2324 (2022). https://doi.org/10.1007/s11277-021-08994-z
DOI: https://doi.org/10.1007/s11277-021-08994-z