Abstract
Discovering relationships between vertices in a secured information network is an important task in information network analysis. In a heterogeneous information network (HIN), a meta-path is a sequence of vertex types and edge types connecting two vertices. A path instance of a meta-path is a path in the HIN that satisfies the meta-path. The length of a meta-path is the number of relations (edges) in it. A meaningful meta-path is a meta-path with at least one path instance, and a shortest meaningful meta-path is a meaningful meta-path of minimum length. Recent work on meta-path discovery focuses mainly on in-memory algorithms that run on a single computer. In this chapter, we propose distributed algorithms that discover all shortest meaningful meta-paths between two vertices of a large HIN using Apache Spark. We employ a scalable implementation of the Distributed Breadth-First Search (D-BFS) algorithm as a baseline approach. Because finding all possible shortest paths in a large HIN can be time consuming, we propose a novel algorithm called shortest meaningful meta-path based search (S-MPS). S-MPS first searches for all shortest meta-path candidates between vertices in the graph of the network schema of the HIN. We conduct experiments on the DBLP data set to demonstrate the efficiency of the proposed S-MPS algorithm relative to D-BFS.
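The definitions in the abstract can be made concrete with a small single-machine sketch. It is not the chapter's distributed Spark implementation: the toy schema, the toy network, every function name, and the `max_length` cutoff are hypothetical and chosen only for illustration. The abstract states only that S-MPS searches the schema graph for candidates first; checking each candidate for a path instance afterwards is an assumption drawn from the definition of a meaningful meta-path.

```python
# Hypothetical toy schema: (source vertex type, relation, target vertex type).
SCHEMA = {
    ("Author", "writes", "Paper"),
    ("Paper", "written_by", "Author"),
    ("Paper", "published_in", "Venue"),
    ("Venue", "publishes", "Paper"),
}

# Hypothetical toy HIN: vertex types and typed edges (source, relation, target).
VERTEX_TYPE = {"a1": "Author", "a2": "Author",
               "p1": "Paper", "p2": "Paper", "v1": "Venue"}
EDGES = {
    ("a1", "writes", "p1"), ("p1", "written_by", "a1"),
    ("a2", "writes", "p2"), ("p2", "written_by", "a2"),
    ("p1", "published_in", "v1"), ("v1", "publishes", "p1"),
    ("p2", "published_in", "v1"), ("v1", "publishes", "p2"),
}


def meta_paths_of_length(schema, src_type, dst_type, length):
    """Enumerate meta-paths with exactly `length` relations on the schema graph.

    A meta-path is a tuple alternating vertex types and relations,
    e.g. ("Author", "writes", "Paper").
    """
    paths = [(src_type,)]
    for _ in range(length):
        paths = [p + (rel, t)
                 for p in paths
                 for (s, rel, t) in schema if s == p[-1]]
    return sorted(p for p in paths if p[-1] == dst_type)


def has_path_instance(edges, vertex_type, meta_path, source, target):
    """Return True if some path from source to target satisfies meta_path."""
    if vertex_type[source] != meta_path[0]:
        return False
    frontier = {source}
    # Walk the meta-path one (relation, vertex type) step at a time.
    for i in range(1, len(meta_path), 2):
        rel, t = meta_path[i], meta_path[i + 1]
        frontier = {v for (u, r, v) in edges
                    if u in frontier and r == rel and vertex_type[v] == t}
    return target in frontier


def shortest_meaningful_meta_paths(schema, edges, vertex_type,
                                   source, target, max_length=4):
    """Return all meaningful meta-paths of minimum length (up to max_length)."""
    src_type, dst_type = vertex_type[source], vertex_type[target]
    for length in range(1, max_length + 1):
        # Step 1: candidates from the (small) schema graph only.
        candidates = meta_paths_of_length(schema, src_type, dst_type, length)
        # Step 2 (assumed): keep candidates with at least one path instance.
        meaningful = [p for p in candidates
                      if has_path_instance(edges, vertex_type, p, source, target)]
        if meaningful:
            return meaningful
    return []


if __name__ == "__main__":
    for path in shortest_meaningful_meta_paths(SCHEMA, EDGES, VERTEX_TYPE, "a1", "a2"):
        print(" -> ".join(path))
```

In this toy network the length-2 candidate Author, writes, Paper, written_by, Author exists in the schema but has no path instance between `a1` and `a2`, so it is not meaningful for that pair. The search therefore continues to length 4, where the meta-path through the shared venue has an instance. Enumerating candidates on the schema graph keeps the first step independent of the size of the HIN itself.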
Acknowledgements
This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCMC) under the grant number DS2020-26-01.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Do, P. (2022). Finding All Shortest Meaningful Meta-Paths Between Two Vertices of a Secured Large Heterogeneous Information Network Using Distributed Algorithm. In: Nedjah, N., Abd El-Latif, A.A., Gupta, B.B., Mourelle, L.M. (eds) Robotics and AI for Cybersecurity and Critical Infrastructure in Smart Cities. Studies in Computational Intelligence, vol 1030. Springer, Cham. https://doi.org/10.1007/978-3-030-96737-6_10
DOI: https://doi.org/10.1007/978-3-030-96737-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96736-9
Online ISBN: 978-3-030-96737-6
eBook Packages: Intelligent Technologies and Robotics (R0)