Multi-source Manifold Outlier Detection

  • Lei Zhang
  • Shupeng Wang
  • Ge Fu
  • Zhenyu Wang
  • Lei Cui
  • Junteng Hou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11537)

Abstract

Outlier detection is an important task in data mining, with practical applications ranging from fraud detection to public health. However, as multi-source data become increasingly common in real-world scenarios, outlier detection becomes more challenging, because traditional mono-source outlier detection techniques are no longer suitable for multi-source heterogeneous data. In this paper, a general framework based on consistent representations is proposed to identify multi-source heterogeneous outliers. Exploiting the information compatibility among different sources, the proposed method incorporates manifold learning to obtain a shared representation space in which information-correlated representations are close along the manifold while semantic-complementary instances are close in Euclidean distance. Furthermore, multi-source outliers can be effectively identified in an affine subspace that is learned through affine combinations of the shared representations from different sources in the feature-homogeneous space. Comprehensive empirical investigations are presented that confirm the promise of the proposed framework.
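The abstract does not give the algorithm itself, so the following is only a minimal single-source sketch of the general idea of scoring points by affine combination, not the paper's multi-source framework. It swaps in locally-linear-embedding-style reconstruction weights (constrained to sum to one) over each point's nearest neighbours and uses the reconstruction residual as the outlier score. The function name `affine_reconstruction_scores`, the parameters `k` and `reg`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def affine_reconstruction_scores(X, k=3, reg=1e-3):
    """Score each row of X by how poorly an affine combination of its
    k nearest neighbours reconstructs it (larger = more outlying).
    Hypothetical helper, not taken from the paper."""
    n = X.shape[0]
    scores = np.empty(n)
    # Pairwise Euclidean distances between all rows.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    for i in range(n):
        order = np.argsort(dists[i])
        nbrs = order[order != i][:k]          # k nearest neighbours, excluding i
        Z = X[nbrs] - X[i]                    # neighbours centred on the point
        C = Z @ Z.T                           # local Gram matrix (k x k)
        # Tikhonov regularisation keeps C invertible when neighbours are collinear.
        C += (reg * np.trace(C) + 1e-12) * np.eye(k)
        w = np.linalg.solve(C, np.ones(k))
        w /= w.sum()                          # affine constraint: weights sum to 1
        # Because the weights sum to 1, the residual x_i - sum_j w_j x_j is -(w @ Z).
        scores[i] = np.linalg.norm(w @ Z)
    return scores

# Toy data (assumed): 20 points on a line plus one point far off the line.
line = np.column_stack([np.arange(20.0), np.zeros(20)])
X = np.vstack([line, [[10.0, 5.0]]])
scores = affine_reconstruction_scores(X, k=3)
print(int(np.argmax(scores)))  # index of the highest-scoring point
```

In this toy setting the off-line point cannot be reproduced by any affine combination of its on-line neighbours, so it receives the largest residual. Note that the score is only informative when `k` is small relative to the ambient dimension plus one, since otherwise the neighbours' affine hull can cover the whole space.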

Keywords

Multi-source, Manifold learning, Heterogeneous, Outlier detection

Notes

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61601458 and 61602465).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Institute of Information Engineering, CAS, Beijing, China
  2. CNCERT/CC, Beijing, China
