
How to Build a Knowledge Graph

Chapter in: Knowledge Graphs

Abstract

This chapter outlines the state of the art of Knowledge Graph technologies by introducing the process of building a Knowledge Graph. We define the following major steps of an overall process model: (1) knowledge creation, (2) knowledge hosting, (3) knowledge curation, and (4) knowledge deployment. We demonstrate the methodology for the knowledge creation process, which creates, extracts, and structures the fact base for a Knowledge Graph. We describe the process of knowledge collection, storage, and retrieval, which implements the established knowledge in a graph-based storage system. We analyze existing methods and tools for improving the quality of a large Knowledge Graph. For the knowledge curation process, we establish sub-steps such as knowledge assessment, cleaning, and enrichment. For each sub-step, we determine the categories and dimensions that have been developed and described in the literature and identify the tasks that can be applied (e.g., Knowledge Graph completion and correctness, error detection and correction, and identifying and resolving duplicates). Finally, we describe the deployment process of a Knowledge Graph based on the following principles: findability, accessibility, interoperability, and reusability.
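As a rough illustration of the process steps named in the abstract, the following minimal Python sketch runs a toy fact base through creation, hosting, and two curation tasks (a completeness assessment and duplicate detection). It is not the chapter's method: the function names, the `ex:` identifiers, the hotel data, and the matching rule (same normalized name and city) are all assumptions made up for this example.

```python
from collections import defaultdict


def create():
    """Knowledge creation: structure facts as (subject, predicate, object) triples.

    The facts are hard-coded, hypothetical examples.
    """
    return [
        ("ex:hotel1", "name", "Hotel Alpenblick"),
        ("ex:hotel1", "city", "Innsbruck"),
        ("ex:hotel2", "name", "hotel alpenblick "),
        ("ex:hotel2", "city", "Innsbruck"),
        ("ex:hotel3", "name", "Gasthof Post"),
    ]


def host(triples):
    """Knowledge hosting: index triples by subject so they can be retrieved."""
    graph = defaultdict(dict)
    for subject, predicate, obj in triples:
        graph[subject][predicate] = obj
    return dict(graph)


def assess_completeness(graph, required=("name", "city")):
    """Knowledge assessment: share of required properties that are present."""
    total = len(graph) * len(required)
    present = sum(p in props for props in graph.values() for p in required)
    return present / total


def find_duplicates(graph):
    """Duplicate identification: group subjects with the same normalized name and city."""
    groups = defaultdict(list)
    for subject, props in graph.items():
        key = (props.get("name", "").strip().lower(), props.get("city"))
        groups[key].append(subject)
    return [sorted(group) for group in groups.values() if len(group) > 1]


graph = host(create())
completeness = assess_completeness(graph)
duplicates = find_duplicates(graph)
```

Here `completeness` is 5/6, because `ex:hotel3` has no `city`, and `duplicates` contains the single group `["ex:hotel1", "ex:hotel2"]`. Resolving the duplicates (for example, by merging them or linking them as identical) and deploying the graph are left out of the sketch.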



    Google Scholar 

  • A. Schultz, A. Matteini, R. Isele, P.N. Mendes, C. Bizer, C. Becker, LDIF—a framework for large-scale linked data integration, in Proceedings of the 21st International World Wide Web Conference (WWW2012): Developers Track, Lyon, France, 18–20 April 2012

    Google Scholar 

  • S. Shehata, F. Karray, M.S. Kamel, An efficient concept-based mining model for enhancing text clustering. IEEE Trans. Knowl. Data Eng. 22(10), 1360–1371 (2010)

    Article  Google Scholar 

  • U. Şimşek, D. Fensel, Now we are talking! Flexible and open goal-oriented dialogue systems for accessing touristic services, in Proceedings of the Conference on Information and Communication Technologies in Tourism (ENTER2018): Research Notes, Jönköping, Sweden, 24–26 January 2018b

    Google Scholar 

  • U. Şimşek, E. Kärle, O. Holzknecht, D. Fensel, Domain specific semantic validation of schema.org annotations, in Proceedings of the 11th International A. P. Ershov Informatics Conference (PSI 2017), Moscow, Russia, 27–29 June 2017. Springer LNCS, vol. 10742 (2018a)

  • U. Şimşek, E. Kärle, D. Fensel, Machine readable web APIs with Schema.org action annotations, in Proceedings of the 14th International Conference on Semantic Systems (SEMANTICS 2018), 10–13 September 2018b (Elsevier, Vienna)

  • U. Şimşek, E. Kärle, D. Fensel, RocketRML—a NodeJS implementation of a use-case specific RML mapper, in Proceedings of 1st Knowledge Graph Building Workshop Co-Located with the 16th Extended Semantic Web Conference (ESWC2019), CEUR Workshop Proceedings, Portoroz, Slovenia, 3 June 2019a

    Google Scholar 

  • J. Sleeman, T. Finin, Type prediction for efficient coreference resolution in heterogeneous semantic graphs, in Proceedings of the 7th International Conference on Semantic Computing (ICSC2013), IEEE Computer Society, Irvine, 16–18 September 2013

    Google Scholar 

  • J. Sleeman, T. Finin, A. Joshi, Topic modeling for RDF graphs, in Proceedings of the 3rd International Workshop on Linked Data for Information Extraction (LD4IE2015) Co-Located with the 14th International Semantic Web Conference (ISWC2015), CEUR Workshop Proceedings, vol. 1467, Bethlehem, 12 October 2015

    Google Scholar 

  • M. Sporny, D. Longley, G. Kellogg, M. Lanthaler, N. Lindström (eds.), JSON-LD 1.0. W3C recommendation, 16 January 2014. https://www.w3.org/TR/json-ld/

  • S. Staab, R. Studer, Ontology Handbook (Springer, Berlin, 2010)

    Google Scholar 

  • F. Stegmaier, U. Gröbner, M. Döller, H. Kosch, G. Baese, Evaluation of current RDF database solutions, in Proceedings of the 10th International Workshop on Semantic Multimedia Database Technologies (SeMuDaTe2009) in Conjunction with the 4th International Conference on Semantics and Digital Media Technologies (SAMT2009), CEUR Workshop Proceedings, vol. 539, Graz, Austria, 2 December 2009

    Google Scholar 

  • G. Stegmayer, M.L. Caliusco, O. Chiotti, M.R. Galli, ANN-agent for distributed knowledge source discovery, in Proceedings of the on the Move to Meaningful Internet Systems (OTM2007): Confederated International Workshops and Posters, AWeSOMe, CAMS, OTM Academy Doctoral Consortium, MONET, OnToContent, ORM, PerSys, PPN, RDDS, SSWS, and SWWS 2007, Vilamoura, Portugal, 25–30 November 2007. Springer LNCS, vol. 4805

    Google Scholar 

  • A. Stolz, M. Hepp, Integrating product classification standards into Schema.org: eCl@ss and UNSPSC on the web of data, in Proceedings of on the Move to Meaningful Internet Systems. OTM 2017 Workshops, Rhodes, Greece, 23–28 October 2017 (2018). Springer LNCS, vol. 10697

  • R. Studer, V.R. Benjamins, D. Fensel, Knowledge engineering: principles and methods. Data Knowl. Eng. 25(1–2), 161–197 (1998)

    Article  MATH  Google Scholar 

  • V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera, E. Motta, F. Ciravegna, Semantic annotation for knowledge management: requirements and a survey of the state of the art. Web Semant. Sci. Serv. Agents World Wide Web Arch. 4(1), 14–28 (2006)

    Article  Google Scholar 

  • D. Van Deursen, C. Poppe, G. Martens, E. Mannens, R. Van de Walle, XML to RDF conversion: a generic approach, in Proceedings of the 4th International Conference on Automated solutions for Cross Media Content and Multi-Channel Distribution (AXMEDIS2008), 17–19 November 2008 (IEEE, Florence)

    Google Scholar 

  • M.Y. Vardi, How the hippies destroyed the Internet. Commun. ACM 61(7), 9 (2018)

    Article  Google Scholar 

  • S. Vijayarani, M.J. Ilamathi, M. Nithya, Preprocessing techniques for text mining-an overview. Int. J. Comput. Sci. Commun. Netw. 5(1), 7–16 (2015)

    Google Scholar 

  • B. Villazón-Terrazas, N. García-Santa, Y. Ren, A. Faraotti, H. Wu, Y. Zhao, G. Vetere, J.Z. Pan, Knowledge graph foundations, in Exploiting Linked Data and Knowledge Graphs in Large Organisations, ed. by J. Z. Pan, G. Vetere, J. M. Gómez-Pérez, H. Wu, (Springer, Cham, 2017)

    Google Scholar 

  • J. Volz, C. Bizer, M. Gaedke, G. Kobilarov, Discovering and maintaining links on the web of data, in Proceedings of the 8th International Semantic Web Conference (ISWC2009), Chantilly, 25–29 October 2009. Springer LNCS, vol. 5823

    Google Scholar 

  • R.Y. Wang, A product perspective on total data quality management. Commun. ACM 41(2), 58–65 (1998)

    Article  Google Scholar 

  • R.Y. Wang, D.M. Strong, Beyond accuracy: what data quality means to data consumers. J. Manag. Inf. Syst. 12(4), 5–33 (1996)

    Article  Google Scholar 

  • R.Y. Wang, M. Ziad, Y.W. Lee, Data Quality (Kluwer Academic Publisher, Norwell, MA, 2001)

    MATH  Google Scholar 

  • R. West, E. Gabrilovich, K. Murphy, S. Sun, R. Gupta, D. Lin, Knowledge base completion via search-based question answering, in Proceedings of the 23rd International World Wide Web Conference (WWW2014), 07–11 April 2014 (ACM, Seoul)

    Google Scholar 

  • D. Wienand, H. Paulheim, Detecting incorrect numerical data in DBpedia, in Proceedings of the 11th International European Semantic Web Conference (ESWC2014), Anissaras, Greece, 25–29 May 2014. Springer LNCS, vol. 8465

    Google Scholar 

  • M.D. Wilkinson, M. Dumontier, I.J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, L.B. da Silva Santos, P.E. Bourne, J. Bouwman, A.J. Brookes, T. Clark, M. Crosas, I. Dillo, O. Dumon, S. Edmunds, C.T. Evelo, R. Finkers, A. Gonzalez-Beltran, A.J. Gray, P. Groth, C. Goble, J.S. Grethe, J. Heringa, P.A. ‘t Hoen, R. Hooft, T. Kuhn, R. Kok, J. Kok, S.J. Lusher, M.E. Martone, A. Mons, A.L. Packer, B. Persson, P. Rocca-Serra, M. Roos, R. van Schaik, S.-A. Sansone, E. Schultes, T. Sen-gstag, T. Slater, G. Strawn, M.A. Swertz, M. Thompson, J. van der Lei, E. van Mulligen, J. Velterop, A. Waagmeester, P. Wittenburg, K. Wolsten-croft, J. Zhao, B. Mons, The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016)

    Article  Google Scholar 

  • W.E. Winkler, Overview of Record Linkage and Current Research Directions. Research report series: Statistics #2006-2, Bureau of the Census (2006). https://www.census.gov/srd/papers/pdf/rrs2006-02.pdf

  • M. Wu, A. Marian, Corroborating answers from multiple web sources, in Proceedings of the 10th International Workshop on the Web and Databases (WebDB2007), Beijing, China, 15 June 2007

    Google Scholar 

  • A. Zaveri, D. Kontokostas, M.A. Sherif, L. Bühmann, M. Morsey, S. Auer, J. Lehmann, User-driven quality evaluation of DBpedia, in Proceedings of the 9th International Conference on Semantic Systems (I-SEMANTICS2013), 4–6 September 2013 (ACM, Graz)

    Google Scholar 

  • A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, S. Auer, Quality assessment for linked data: a survey. Semant. Web J. 7(1), 63–93 (2016)

    Google Scholar 

Download references

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Fensel, D. et al. (2020). How to Build a Knowledge Graph. In: Knowledge Graphs. Springer, Cham. https://doi.org/10.1007/978-3-030-37439-6_2

  • DOI: https://doi.org/10.1007/978-3-030-37439-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37438-9

  • Online ISBN: 978-3-030-37439-6

  • eBook Packages: Computer Science, Computer Science (R0)
