Abstract
The development of structural health monitoring (SHM) for civil infrastructure has produced enormous amounts of acquired data, along with growing pressure on data processing and data mining. Abnormalities in the data can lead to serious analytical errors in later assessment. Such anomalous patterns generally account for a relatively small portion of the overall dataset and are therefore easily misclassified as normal data by regular classifiers. This paper proposes a novel and robust data anomaly detection framework. Its core novelty is the combination of learning active learning (LAL) with the AdaBoost algorithm, which aims to reduce the costly manual work of labeling and to improve the classification of anomaly patterns. LAL also resolves the problem of biased classification caused by imbalanced datasets. The wavelet packet transform is used to extract features from the acceleration data. The methodologies are introduced first, followed by two case studies that verify the feasibility of the proposed framework for data anomaly detection. The first case uses a dataset in which five kinds of anomalies were synthetically added to acceleration time histories measured in dynamic tests of a grid structure. Both balanced and imbalanced datasets were analyzed, and LAL-AdaBoost was compared with uncertainty sampling-based AdaBoost on the same training and testing sets. The results showed that LAL-AdaBoost outperformed the uncertainty sampling-based variant in both scenarios, with higher accuracy and faster convergence. A further study was then carried out using acceleration data collected from a long-span bridge. By querying only a limited portion of the training set, the proposed framework accurately detected and classified 97.95% of the anomaly patterns in the testing set, showing great potential for broader application in SHM data processing.
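To make two of the ingredients named above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Haar wavelet, a three-level decomposition, and relative sub-band energy as the feature; the abstract specifies none of these. It also shows the least-confident query rule that an uncertainty sampling baseline might use. LAL itself, which learns a regressor to predict the benefit of each query, is not reproduced here. The function names `haar_split`, `wavelet_packet_energy`, and `least_confident_index` are hypothetical.

```python
import math


def haar_split(samples):
    """One orthonormal Haar step: return (approximation, detail) half-bands."""
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(samples) - 1, 2)
    approx = [(samples[i] + samples[i + 1]) * scale for i in pairs]
    detail = [(samples[i] - samples[i + 1]) * scale for i in pairs]
    return approx, detail


def wavelet_packet_energy(signal, level=3):
    """Relative energy of each terminal wavelet packet node (assumed feature).

    Unlike the plain wavelet transform, both the approximation and the detail
    band are split again at every level, giving 2**level terminal nodes.
    Nodes are returned in natural (filter-path) order, not frequency order.
    """
    if level < 0 or len(signal) == 0 or len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be a positive multiple of 2**level")
    nodes = [list(signal)]
    for _ in range(level):
        # Split every node into its two child bands.
        nodes = [band for node in nodes for band in haar_split(node)]
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    return [e / total if total > 0 else 0.0 for e in energies]


def least_confident_index(class_probabilities):
    """Uncertainty sampling: pick the sample whose top class probability is lowest."""
    return min(range(len(class_probabilities)),
               key=lambda i: max(class_probabilities[i]))


if __name__ == "__main__":
    # Synthetic stand-in for one acceleration segment (assumption, not paper data).
    segment = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
    print(wavelet_packet_energy(segment, level=3))
    # Hypothetical classifier outputs for three unlabeled segments.
    print(least_confident_index([[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]]))
```

In an active learning loop, the feature vectors would be fed to the classifier, and the segment returned by the query rule would be sent to a human for labeling before retraining.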
Acknowledgements
The authors would like to thank the organizers of the International Project Competition for SHM (IPC-SHM 2020): ANCRiSST, Harbin Institute of Technology (China), and the University of Illinois at Urbana-Champaign (USA), for generously providing invaluable data from actual structures. The authors would also like to thank the chairs of IPC-SHM 2020, Prof. Hui Li and Prof. Billie F. Spencer Jr., for their leadership of the competition.
Funding
The authors gratefully acknowledge the financial support of the Joint Funds of the National Natural Science Foundation of China (U1939208), the National Natural Science Foundation of China (no. 51525803), and the 111 Project (B20039).
Ethics declarations
Conflict of interests
The authors declare no conflict of interests or competing interests.
Cite this article
Xu, J., Dang, D., Ma, Q. et al. A novel and robust data anomaly detection framework using LAL-AdaBoost for structural health monitoring. J Civil Struct Health Monit 12, 305–321 (2022). https://doi.org/10.1007/s13349-021-00544-2
DOI: https://doi.org/10.1007/s13349-021-00544-2