Schema-Rich Heterogeneous Network Mining

Part of the book series: Data Analytics (DAANA)

Abstract

Traditional heterogeneous information networks usually have a simple network schema: there are only a few types of nodes and links, and meta paths are easily enumerated. In many real applications, however, heterogeneous information networks have a huge number of node and link types, and their network schema is hard to describe. We call such networks schema-rich heterogeneous information networks. For example, a knowledge graph, constructed from \(<object, relation, object>\) tuples, can be considered a schema-rich heterogeneous network, usually with tens of thousands of types of nodes and links. In this chapter, we introduce two data mining tasks on schema-rich heterogeneous networks: link prediction and entity set expansion. Through these two tasks, we illustrate the challenges of, and potential solutions for, mining this more complex and popular kind of heterogeneous network.
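
To make the tuple representation concrete, the following minimal Python sketch stores a handful of \(<object, relation, object>\) tuples as a typed network and lists the relation sequences that connect two entities. It is an illustration only, not the method presented in the chapter: the toy triples, the `^-1` naming for inverse links, and the helper names `build_network`, `link_types`, and `relation_paths` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy triples (not taken from the chapter).
TRIPLES = [
    ("Alice", "authorOf", "Paper1"),
    ("Bob", "authorOf", "Paper1"),
    ("Paper1", "publishedIn", "KDD"),
    ("Alice", "worksAt", "UniX"),
    ("Bob", "worksAt", "UniX"),
]


def build_network(triples):
    """Map each entity to its (relation, neighbour) pairs, adding inverse links."""
    adj = defaultdict(list)
    for head, rel, tail in triples:
        adj[head].append((rel, tail))
        adj[tail].append((rel + "^-1", head))  # assumed naming for inverse links
    return adj


def link_types(triples):
    """Return the set of distinct relation (link) types in the triples."""
    return {rel for _, rel, _ in triples}


def relation_paths(adj, source, target, max_len):
    """Enumerate distinct relation sequences of length <= max_len from source to target."""
    found = set()
    stack = [(source, ())]
    while stack:
        node, path = stack.pop()
        if node == target and path:
            found.add(path)
        if len(path) < max_len:
            for rel, nxt in adj[node]:
                stack.append((nxt, path + (rel,)))
    return found


network = build_network(TRIPLES)
types_found = link_types(TRIPLES)
paths_found = relation_paths(network, "Alice", "Bob", max_len=2)
```

Even in this toy network, the number of candidate relation sequences grows with both the number of link types and the path length, which hints at why enumerating meta paths by hand becomes impractical when there are tens of thousands of types.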

Author information

Corresponding author

Correspondence to Chuan Shi.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Shi, C., Yu, P.S. (2017). Schema-Rich Heterogeneous Network Mining. In: Heterogeneous Information Network Analysis and Applications. Data Analytics. Springer, Cham. https://doi.org/10.1007/978-3-319-56212-4_7

  • DOI: https://doi.org/10.1007/978-3-319-56212-4_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56211-7

  • Online ISBN: 978-3-319-56212-4

  • eBook Packages: Computer Science, Computer Science (R0)
