Meta-path-based outlier detection in heterogeneous information network
Mining outliers in heterogeneous networks is crucial to many applications, but it remains challenging. In this paper, we focus on identifying meta-path-based outliers in heterogeneous information networks (HINs) and on calculating the similarity between objects of different types. We propose a meta-path-based outlier detection method (MPOutliers) for heterogeneous information networks that addresses these problems together within a unified framework. MPOutliers calculates the heterogeneous reachable probability by combining different types of objects and their relationships. It thereby captures the semantic information among nodes in heterogeneous networks rather than considering the network structure alone. It also computes the closeness degree between nodes of the same type, which extends the whole heterogeneous network. Moreover, each node is assigned a reliable weight that measures its authority degree. Extensive experiments on two real datasets (AMiner and Movies) show that the proposed method is effective and efficient for outlier detection.
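To make the idea of a reachable probability along a meta-path more concrete, the following is a minimal sketch, not the MPOutliers method itself. It assumes a toy author-paper-venue network, a hypothetical meta-path author-paper-venue-paper-author, and a generic path-constrained random walk in which each relation matrix is row-normalized and the matrices are multiplied along the path. The scoring rule at the end (probability mass that reaches other authors) is likewise an illustrative assumption, and all function and variable names are invented for this example.

```python
def row_normalize(matrix):
    """Turn a count matrix into row-stochastic transition probabilities."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def transpose(matrix):
    """Reverse the direction of a relation (e.g. author-paper to paper-author)."""
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def reachable_probability(relations):
    """Multiply row-normalized relation matrices along a meta-path."""
    result = row_normalize(relations[0])
    for rel in relations[1:]:
        result = matmul(result, row_normalize(rel))
    return result


# Toy data (assumed for illustration): 3 authors, 4 papers, 2 venues.
author_paper = [
    [1, 1, 0, 0],  # a0 wrote p0 and p1
    [0, 1, 1, 0],  # a1 wrote p1 and p2
    [0, 0, 0, 1],  # a2 wrote p3
]
paper_venue = [
    [1, 0],  # p0 appeared in v0
    [1, 0],  # p1 appeared in v0
    [1, 0],  # p2 appeared in v0
    [0, 1],  # p3 appeared in v1
]

# Hypothetical meta-path: author -> paper -> venue -> paper -> author.
meta_path = [
    author_paper,
    paper_venue,
    transpose(paper_venue),
    transpose(author_paper),
]
prob = reachable_probability(meta_path)

# Illustrative score (an assumption, not the paper's formula): the probability
# mass that reaches *other* authors. A low value means the author is weakly
# connected to peers under this meta-path.
scores = [
    sum(p for j, p in enumerate(row) if j != i) for i, row in enumerate(prob)
]
outlier_candidate = min(range(len(scores)), key=scores.__getitem__)
print(scores, outlier_candidate)
```

In this toy network, authors a0 and a1 share a venue and reach each other along the meta-path, whereas a2 only reaches itself, so a2 receives the lowest score. A real system would combine several meta-paths and the node weights described above; this sketch covers only the path-constrained walk.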
Keywords: data mining, heterogeneous information network, outlier detection, short text similarity
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61872163 and 61806084), China Postdoctoral Science Foundation project (2018M631872), and Jilin Provincial Education Department project (JJKH20190160KJ).