Advanced Topic: Knowledge Graph Completion

Kejriwal, Mayank

doi:10.1007/978-3-030-12375-8_4

Chapter
First Online: 05 March 2019

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

With the possible exception of good data collection and ontology design, information extraction and entity resolution are the two most important data-driven steps in a domain-specific knowledge graph (KG) construction pipeline. Yet the story rarely ends there. Once constructed, the KG is so noisy that additional knowledge graph completion steps often have to be applied to refine it further. These steps include knowledge graph embeddings, which tend to rely on neural techniques, as well as graphical models such as probabilistic soft logic. After completion, the KG also has to be stored and indexed so that it can be queried in an application framework. The Semantic Web community has produced a great deal of research in this area, alongside NoSQL methodologies that have emerged from the mainstream database and knowledge discovery communities. In this chapter, we briefly survey some of these topics. While covering any one of them in depth is out of scope, we provide pointers to additional material in each area for the interested reader.
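
To make the idea of an embedding-based completion step more concrete, the following is a minimal sketch of translation-style triple scoring, in which a triple (head, relation, tail) is considered plausible when head + relation lands close to tail in vector space. It is an illustration only and not taken from the chapter: the entity and relation names, the hand-picked 2-d vectors, and the helper functions `score` and `rank_tails` are all assumptions, and a real system would learn the vectors from the KG rather than hard-code them.

```python
import math

# Toy 2-d embeddings, hand-picked for illustration (assumed, not learned
# and not from the chapter). A real model would learn these from triples.
ENTITY = {
    "paris": (1.0, 1.0),
    "france": (2.0, 3.0),
    "berlin": (4.0, 1.0),
    "germany": (5.0, 3.0),
}
RELATION = {"capital_of": (1.0, 2.0)}


def score(head, relation, tail):
    """Translation-style distance ||h + r - t||; lower means more plausible."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Rank all other entities as candidate tails for (head, relation, ?)."""
    candidates = [e for e in ENTITY if e != head]
    return sorted(candidates, key=lambda e: score(head, relation, e))


# Completing the missing tail of (berlin, capital_of, ?):
print(rank_tails("berlin", "capital_of"))
```

In this sketch, completion amounts to ranking candidate tails by distance and proposing the closest ones as missing facts; in practice the ranked candidates would still need thresholding or review before being added to the KG.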

Author information

Authors and Affiliations

Information Sciences Institute, University of Southern California, Marina del Rey, CA, USA
Mayank Kejriwal

About this chapter

Cite this chapter

Kejriwal, M. (2019). Advanced Topic: Knowledge Graph Completion. In: Domain-Specific Knowledge Graph Construction. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-12375-8_4

DOI: https://doi.org/10.1007/978-3-030-12375-8_4
Published: 05 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12374-1
Online ISBN: 978-3-030-12375-8
eBook Packages: Computer Science; Computer Science (R0)
