Multi-source Manifold Outlier Detection
Outlier detection is an important task in data mining, with practical applications ranging from fraud detection to public health. However, as multi-source data become increasingly common in real-world scenarios, outlier detection becomes more challenging, because traditional mono-source techniques are not suited to multi-source heterogeneous data. In this paper, a general framework based on consistent representations is proposed to identify multi-source heterogeneous outliers. According to the information compatibility among different sources, manifold learning is incorporated into the proposed method to obtain a shared representation space, in which information-correlated representations are close along the manifold while semantic-complementary instances are close in Euclidean distance. Furthermore, multi-source outliers can be effectively identified in an affine subspace, which is learned through an affine combination of the shared representations from different sources in the feature-homogeneous space. Comprehensive empirical investigations confirm the promise of the proposed framework.
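To make the idea of an affine combination of per-source representations concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes that each source has already been mapped into a common feature-homogeneous space (here simulated with synthetic data), that the affine weights are fixed by hand rather than learned, and that a simple k-nearest-neighbour distance score stands in for the paper's detection step. The names `affine_combine` and `knn_outlier_scores` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def affine_combine(reps, weights):
    """Affine combination of per-source representations (weights must sum to 1)."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("affine weights must sum to 1")
    return sum(w * r for w, r in zip(weights, reps))

def knn_outlier_scores(points, k=3):
    """Score each point by its mean Euclidean distance to its k nearest neighbours."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # exclude each point's distance to itself
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)

# Synthetic stand-in for two sources already embedded in a shared space
# (an assumption; the paper learns this space with manifold learning).
rng = np.random.default_rng(0)
n, d = 40, 2
base = rng.normal(size=(n, d))
source_a = base + 0.05 * rng.normal(size=(n, d))
source_b = base + 0.05 * rng.normal(size=(n, d))

# Plant one outlier (instance 0) far from the rest in both sources.
source_a[0] += 10.0
source_b[0] += 10.0

# Hand-picked affine weights (assumed, not learned).
combined = affine_combine([source_a, source_b], [0.5, 0.5])
scores = knn_outlier_scores(combined, k=3)
top = int(np.argmax(scores))
print("highest-scoring instance:", top)
```

The weights are constrained to sum to one so that the combined point stays in the affine hull of the per-source representations; in the paper these representations and the subspace are learned rather than fixed as they are here.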
Keywords: Multi-source, Manifold learning, Heterogeneous, Outlier detection
This work was supported by the National Natural Science Foundation of China (Nos. 61601458, 61602465).