Advanced Topic: Knowledge Graph Completion

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

With the possible exception of good data collection and ontology design, information extraction and entity resolution are the two most important data-driven steps in a domain-specific knowledge graph (KG) construction pipeline. Yet the story very rarely ends there. Once constructed, the KG is often noisy enough that additional knowledge graph completion steps have to be applied to refine it further. These steps include techniques such as knowledge graph embeddings, which tend to rely on neural methods, as well as graphical models such as probabilistic soft logic. After completion, the KG also has to be stored and indexed so that it can be queried in an application framework. The Semantic Web community has produced a great deal of research in this realm, alongside NoSQL methodologies that have emerged from the mainstream database and knowledge discovery communities. In this chapter, we briefly survey some of these topics. While covering any one of them in depth is out of scope, we provide pointers to additional material in each of these areas for the interested reader.
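
As a rough illustration of how an embedding-based completion step might propose missing facts, the sketch below scores candidate triples with a translation-style function in the spirit of TransE, where a triple (head, relation, tail) is considered plausible when head + relation lies close to tail. This is a minimal sketch and not the chapter's own implementation: the entity names, the two-dimensional vectors, and the function names `transe_score` and `rank_tails` are all hypothetical, and the vectors are hand-picked rather than learned.

```python
import math

# Hypothetical 2-d embeddings, hand-picked for illustration (not learned).
ENTITY = {
    "paris": (1.0, 0.0),
    "france": (1.0, 1.0),
    "berlin": (3.0, 0.0),
    "germany": (3.0, 1.0),
}
RELATION = {"capital_of": (0.0, 1.0)}


def transe_score(head, relation, tail):
    """Negative Euclidean distance between (head + relation) and tail.

    Higher (closer to zero) means the triple is considered more plausible.
    """
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Rank every other entity as a candidate tail for (head, relation, ?)."""
    candidates = [e for e in ENTITY if e != head]
    return sorted(
        candidates,
        key=lambda e: transe_score(head, relation, e),
        reverse=True,
    )


if __name__ == "__main__":
    # Candidate completions for the incomplete triple (berlin, capital_of, ?).
    print(rank_tails("berlin", "capital_of"))
```

In a real pipeline the vectors would be learned from the triples already in the KG, and the top-ranked candidates would be treated as suggestions to validate, not as facts to add blindly.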

Copyright information

© 2019 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Kejriwal, M. (2019). Advanced Topic: Knowledge Graph Completion. In: Domain-Specific Knowledge Graph Construction. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-12375-8_4

  • DOI: https://doi.org/10.1007/978-3-030-12375-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-12374-1

  • Online ISBN: 978-3-030-12375-8

  • eBook Packages: Computer Science, Computer Science (R0)
