A Minimum Rare-Itemset-Based Anomaly Detection Method and Its Application on Sensor Data Stream

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1042)


In recent years, the scale of data streams in real-world applications has grown considerably. Collected data streams often contain anomalous data, and such anomalies are a main cause of reduced accuracy in data-based operations. Anomalous data have two main characteristics: they appear rarely, and they deviate markedly from most data elements. An anomaly detection method should therefore take both attributes into account. Because a data stream is generated continuously and flows constantly, earlier static anomaly detection methods are not suitable for processing data streams. In addition, the large volume of a data stream makes the time consumption and memory usage of the rare itemset mining phase very high. To address these problems, this paper first proposes an efficient method, MRI-Mine, for mining minimum rare itemsets, and then proposes an accurate anomaly detection method, MRI-AD, which uses an anomaly index to identify implicit anomalous data. Experiments indicate that MRI-Mine mines the minimum rare itemsets with less time and memory, and that the detection accuracy of MRI-AD is competitive.


Anomaly detection, Minimum rare itemsets, Anomaly index, Sensor data stream, Data mining



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. College of Information and Electrical Engineering, China Agricultural University, Beijing, China
  2. Scientific Research Base for Integrated Technologies of Precision Agriculture (Animal Husbandry), The Ministry of Agriculture, Beijing, China
