PathSimExt: Revisiting PathSim in Heterogeneous Information Networks
Similarity queries in graph databases have been studied over the past few decades. Typically, similarity queries are applied to homogeneous networks, where random-walk-based approaches (e.g., Personalized PageRank and SimRank) are the representative methods. However, these approaches are not well suited to heterogeneous networks, which consist of multi-typed, interconnected objects, such as bibliographic information, social media networks, and crowdsourcing data. Intuitively, two objects in a heterogeneous network are similar if they are strongly connected across the heterogeneous relationships. PathSim was the first work to address this problem; it captures the similarity of two objects based on their connectivity along a semantic path. However, PathSim considers only the information in the semantic path and omits other supporting information (e.g., the number of citations in bibliographic data). We therefore revisit the definition of PathSim and introduce external support to enrich its results.
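To make the path-based measure concrete, the following is a minimal sketch of the original PathSim score for a symmetric meta path, as defined by Sun et al.: twice the number of path instances between x and y, divided by the sum of the path instances from x to itself and from y to itself. The author-venue meta path, the toy counts, and the helper names `commuting_matrix` and `pathsim` are assumptions made for illustration. The external-support extension is not specified in this abstract, so it is not shown.

```python
def commuting_matrix(adjacency):
    """Return M = W * W^T, where W[i][k] counts links from object i to
    intermediate object k (here: papers by author i in venue k).
    M[i][j] is then the number of path instances i -> k -> j."""
    n = len(adjacency)
    return [
        [sum(a * b for a, b in zip(adjacency[i], adjacency[j])) for j in range(n)]
        for i in range(n)
    ]


def pathsim(m, x, y):
    """PathSim for a symmetric meta path: 2 * M[x][y] / (M[x][x] + M[y][y])."""
    denom = m[x][x] + m[y][y]
    return 2 * m[x][y] / denom if denom else 0.0


# Toy author-by-venue paper counts (made-up data, two venues).
authors = ["A", "B", "C"]
W = [
    [2, 1],    # A: few papers
    [50, 20],  # B: many papers in the same venues
    [2, 0],    # C: few papers, similar profile to A
]

M = commuting_matrix(W)
for i in range(len(authors)):
    for j in range(i + 1, len(authors)):
        print(authors[i], authors[j], round(pathsim(M, i, j), 4))
```

In this toy data, A scores higher with C than with B even though A and B share more path instances, because the normalization favors objects with comparable visibility. The score uses only counts along the meta path, which is the limitation the abstract points out.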
- 2. Jeh, G., Widom, J.: Scaling personalized web search. In: WWW, pp. 271–279 (2003)
- 3. Jeh, G., Widom, J.: SimRank: a measure of structural-context similarity. In: KDD, pp. 538–543 (2002)
- 4. Sun, Y., Han, J., Yan, X., Yu, P.S., Wu, T.: PathSim: Meta path-based top-k similarity search in heterogeneous information networks. In: PVLDB, vol. 4(11), pp. 992–1003 (2011)