Unsupervised Fraud Detection in Environmental Time Series Data

New Developments in Unsupervised Outlier Detection

Abstract

Time series often contain outliers and structural changes. These unexpected events are of the utmost importance in fraud detection because they may pinpoint suspicious activities. Such unusual activities can easily mislead conventional time series analysis and yield erroneous conclusions. Traditionally, in data mining, time series data are first divided into small chunks, and kNN-based outlier detection approaches are then applied to monitor behavior over time. However, time series data are typically very large and cannot be scanned multiple times, and because they are produced continuously, new data keep arriving. To cope with the speed at which the data arrive, in this chapter we propose a simple statistical parameter-based anomaly detection method for fraud detection in environmental time series data. The results of the experiments performed show that the proposed algorithm is effective and efficient.
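To make the single-pass constraint concrete, the following is a minimal sketch of one way a statistical parameter-based detector could process a stream without rescanning it. It is not the chapter's algorithm, which is not described in this preview. It assumes a plain running mean and standard deviation maintained with Welford's update, and the class name, the `threshold` and `warmup` parameters, and the sample readings are all hypothetical.

```python
import math


class RunningZScoreDetector:
    """Illustrative single-pass detector (an assumption, not the chapter's method)."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # hypothetical cutoff, in standard deviations
        self.warmup = warmup        # hypothetical number of points absorbed before flagging
        self.n = 0                  # number of points absorbed so far
        self.mean = 0.0             # running mean of absorbed points
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def update(self, x):
        """Score x against earlier points; absorb it only if it is not flagged."""
        flagged = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) > self.threshold * std:
                flagged = True
        if not flagged:
            # Welford's update: constant memory, one pass over the data.
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)
        return flagged


# Synthetic readings with one injected spike at index 30 (made-up data).
readings = [20.0 + 0.1 * (i % 5) for i in range(50)]
readings[30] = 35.0

detector = RunningZScoreDetector()
flags = [i for i, x in enumerate(readings) if detector.update(x)]
print(flags)
```

Each reading is examined once and only three numbers are stored, which is the property the abstract asks for. Flagged points are left out of the statistics so that a spike does not inflate the variance. The cost of that choice is that a genuine level shift would keep being flagged until the detector is reset.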

Author information

Corresponding author

Correspondence to Xiaochun Wang.

Copyright information

© 2021 Xi'an Jiaotong University Press

About this chapter

Cite this chapter

Wang, X., Wang, X., Wilkes, M. (2021). Unsupervised Fraud Detection in Environmental Time Series Data. In: New Developments in Unsupervised Outlier Detection. Springer, Singapore. https://doi.org/10.1007/978-981-15-9519-6_10
