How to Build a Knowledge Graph


Abstract

This chapter outlines the state of the art of Knowledge Graph technologies by introducing the process of building a Knowledge Graph. We define an overall process model with four major steps: (1) knowledge creation, (2) knowledge hosting, (3) knowledge curation, and (4) knowledge deployment. We demonstrate the methodology for the knowledge creation process, which creates, extracts, and structures the fact base for a Knowledge Graph. We describe the process of knowledge collection, storage, and retrieval that implements established knowledge in a graph-based storage system. We analyze existing methods and tools for improving the quality of a large Knowledge Graph. For the knowledge curation process, we establish sub-steps such as knowledge assessment, cleaning, and enrichment. For each sub-step, we determine the categories and dimensions that have been developed and described in the literature and identify the tasks that can be applied (e.g., Knowledge Graph completion and correctness, error detection and correction, identifying and resolving duplicates). Finally, we describe the deployment process of a Knowledge Graph based on the following principles: findability, accessibility, interoperability, and reusability.


References

  1. M. Achichi, Z. Bellahsene, K. Todorov, Legato results for OAEI 2017, in Proceedings of the 12th International Workshop on Ontology Matching (OM2017) Co-Located with the 16th International Semantic Web Conference (ISWC2017), CEUR Workshop Proceedings, vol. 2032, Vienna, Austria, 21 October 2017Google Scholar
  2. M. Acosta, A. Zaveri, E. Simperl, D. Kontokostas, S. Auer, J. Lehmann, Crowdsourcing linked data quality assessment, in Proceedings of the 12th International Semantic Web Conference (ISWC2013), Sydney, Australia, 21–25 October 2013. Springer LNCS, vol. 8219Google Scholar
  3. R. Angles, C. Gutiérrez, Querying RDF data from a graph database perspective, in Proceedings of the 2nd European Semantic Web Conference (ESWC2005), Heraklion, Greece, 29 May–1 June 2005. Springer LNCS, vol. 3532Google Scholar
  4. R. Angles, C. Gutiérrez, Survey of graph database models. ACM Comput. Surv. 40(1), 1–39 (2008)CrossRefGoogle Scholar
  5. A.P. Aprosio, C. Giuliano, A. Lavelli, Automatic expansion of DBpedia exploiting Wikipedia cross-language information, in Proceedings of the 10th International Extended Semantic Web Conference (ESWC2013) on the Semantic Web: Semantics and Big Data, Montpellier, France, 26–30 May 2013. Springer LNCS, vol. 7882Google Scholar
  6. S. Araújo, J. Hidders, D. Schwabe, A.P. de Vries, SERIMI—resource description similarity, RDF instance matching and interlinking, in Proceedings of the 6th International Workshop on Ontology Matching (OM2011), CEUR Workshop Proceedings, vol. 814, Bonn, Germany, 24 October 2011Google Scholar
  7. S. Athanasiou, G. Giannopoulos, D. Graux, N. Karagiannakis, J. Lehmann, A.N. Ngomo, K. Patroumpas, M.A. Sherif, D. Skoutas, Big POI data integration with linked data technologies, in Proceedings of the 22nd International Conference on Extending Database Technology (EDBT2019), Lisbon, Portugal, 26–29 March 2019a. OpenProceedings.org
  8. S. Athanasiou, M. Alexakis, G. Giannopoulos, N. Karagiannakis, Y. Kouvaras, P. Mitropoulos, K. Patroumpas, D. Skoutas, SLIPO: large-scale data integration for points of interest, in Proceedings of the 22nd International Conference on Extending Database Technology (EDBT), Lisbon, Portugal, 26–29 March 2019b, pp. 574–577Google Scholar
  9. C. Batini, M. Scannapieco, Data Quality: Concepts, Methodologies and Techniques. Data-Centric Systems and Applications (Springer, New York, 2006)zbMATHGoogle Scholar
  10. C. Batini, M. Lenzerini, S.B. Navathe, A comparative analysis of methodologies for database schema integration. ACM Comput. Surv. 18(4), 323–364 (1986)CrossRefGoogle Scholar
  11. C. Batini, C. Cappiello, C. Francalanci, A. Maurino, Methodologies for data quality assessment and improvement. ACM Comput. Surv. 41(3), 1–52 (2009)CrossRefGoogle Scholar
  12. W. Beek, L. Rietveld, H.R. Bazoobandi, J. Wielemaker, S. Schlobach, LOD laundromat: a uniform way of publishing other people’s dirty data, in Proceedings of the 13th International Semantic Web Conference (ISWC2014), Riva del Garda, Italy, 19–23 October 2014. Springer LNCS, vol. 8796Google Scholar
  13. O. Benjelloun, H. Garcia-Molina, D. Menestrina, Q. Su, S.E. Whang, J. Widom, Swoosh: a generic approach to entity resolution. Int. J. Very Large Data Bases 18(1), 255–276 (2009)CrossRefGoogle Scholar
  14. I. Bhattacharya, L. Getoor, Collective entity resolution in relational data. ACM Trans. Knowl. Discov. Data 1(1), 5 (2007)CrossRefGoogle Scholar
  15. A. Bilke, J. Bleiholder, C. Böhm, K. Draba, F. Naumann, M. Weis, Automatic data fusion with HumMer, in Proceedings of the 31st International Conference on Very Large Data Bases (VLDB2005), VLDB Endowment, Trondheim, Norway, 30 August–2 September 2005Google Scholar
  16. C. Bizer, R. Cygania, Quality-driven information filtering using the WIQA policy framework. J. Web Semant. 7(1), 1–10 (2009)CrossRefGoogle Scholar
  17. C. Bizer, T. Heath, K. Idehen, T. Berners-Lee, Linked data on the web (LDOW2008), in Proceedings of the 17th International Conference on World Wide Web (WWW2008): Workshop, 21–25 April 2008 (ACM, Beijing)Google Scholar
  18. C. Bizer, T. Heath, T. Berners-Lee, Linked data—the story so far. Int. J. Semant. Web Inf. Syst. 5(3), 1–22 (2009)CrossRefGoogle Scholar
  19. J. Bleiholder, F. Naumann, Data fusion. ACM Comput. Surv. 41(1), 1–41 (2009)CrossRefGoogle Scholar
  20. J. Bleiholder, K. Draba, F. Naumann, FuSem—exploring different semantics of data fusion, in Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB2007), VLDB Endowment, Vienna, Austria, 23–27 September 2007Google Scholar
  21. A. Borodin, G.O. Roberts, J.S. Rosenthal, P. Tsaparas, Link analysis ranking: algorithms, theory, and experiments. ACM Trans. Internet Technol. 5(1), 231–297 (2005)CrossRefGoogle Scholar
  22. W.M. Campbell, L. Li, C.K. Dagli, J. Acevedo-Aviles, K. Geyer, J.P. Campbell, C. Priebe, Cross-Domain Entity Resolution in Social Media, Technical Report, arXiv preprint, 1608.01386 (2016). https://arxiv.org/abs/1608.01386
  23. C. Chang, M. Kayed, M.R. Girgis, K.F. Shaalan, A survey of web information extraction systems. IEEE Trans. Knowl. Data Eng. 18(10), 1411–1428 (2006)CrossRefGoogle Scholar
  24. V. Christophides, V. Efthymiou, K. Stefanidis, Entity Resolution in the Web of Data (Morgan & Claypool, San Rafael, 2015)CrossRefGoogle Scholar
  25. X. Chu, M. Ouzzani, J. Morcos, I.F. Ilyas, P. Papotti, N. Tang, Y. Ye, KATARA: reliable data cleaning with knowledge bases and crowdsourcing, in Proceedings of the 41st International Conference on Very Large Data Bases (PVLDB2015), Hawaii, 31 August–4 September 2015, VLDB Endowment, 8(12), 1952–1955 (2015)Google Scholar
  26. P. Cimiano, S. Handschuh, S. Staab, Towards the self-annotating web, in Proceedings of the 13th International Conference on World Wide Web (WWW2004), 17–20 May 2004 (ACM, New York)Google Scholar
  27. J. De Bruijn, R. Lara, A. Polleres, D. Fensel, OWL DL vs. OWL flight: conceptual modeling and reasoning for the Semantic Web, in Proceedings of the 14th International World Wide Web Conference (ISWC2005), 10–14 May 2005 (ACM, Chiba, Japan)Google Scholar
  28. G. De Melo, Not quite the same: identity constraints for the web of linked data, in Proceedings of the 27th Conference on Artificial Intelligence (AAAI2013), 14–18 July 2013 (AAAI Press, Bellevue, USA)Google Scholar
  29. J. Debattista, S. Auer, C. Lange, Luzzu—a methodology and framework for linked data quality assessment. J. Data Inf. Qual. 8(1), 1–32 (2016a)CrossRefGoogle Scholar
  30. J. Debattista, C. Lange, S. Auer, A preliminary investigation towards improving linked data quality using distance-based outlier detection, in Proceedings of the 6th Joint International Semantic Technology Conference (JIST2016): Revised Selected Papers, Singapore, 2–4 November 2016b. Springer LNCS, vol. 10055Google Scholar
  31. S. Decker, S. Melnik, F. van Harmelen, D. Fensel, M.C.A. Klein, J. Broekstra, M. Erdmann, I. Horrocks, The Semantic Web: the roles of XML and RDF. IEEE Internet Comput. 4(5), 63–74 (2000)CrossRefGoogle Scholar
  32. M. Dezani-Ciancaglini, R. Horne, V. Sassone, Tracing where and who provenance in linked data: a calculus. Theor. Comput. Sci. 464, 113–129 (2012)MathSciNetzbMATHCrossRefGoogle Scholar
  33. D. Dietrich, J. Gray, T. McNamara, A. Poikola, P. Pollock, J. Tait, T. Zijlstra, Open data handbook (Open Knowledge International, Cambridge, 2009)Google Scholar
  34. A. Dimou, M.V. Sande, P. Colpaert, R. Verborgh, E. Mannens, R.V. de Walle, RML: a generic language for integrated RDF mappings of heterogeneous data, in Proceedings of the Workshop on Linked Data on the Web (LDOW2014) Co-Located with the 23rd International World Wide Web Conference (WWW2014), CEUR Workshop Proceedings, vol. 1184, Seoul, Korea, 8 April 2014Google Scholar
  35. L. Ding, P. Kolari, Z. Ding, S. Avancha, Using ontologies in the Semantic Web: a survey. Ontol. Integr. Ser. Inf. Syst. 14, 79–113 (2007)Google Scholar
  36. X.L. Dong, F. Naumann, Data fusion—resolving data conflicts for integration. Proc. Very Large Data Bases Endow. 2(2), 1654–1655 (2009)Google Scholar
  37. X.L. Dong, D. Srivastava, Knowledge curation and knowledge fusion: challenges, models and applications, in Proceedings of the 2015 ACM International Conference on Management of Data (SIGMOD2015), 31 May–4 June 2015 (ACM, Melbourne)Google Scholar
  38. X.L. Dong, L. Berti-Équille, D. Srivastava, Integrating conflicting data: the role of source dependence. Proc. Very Large Data Bases Endow. 2(1), 550–561 (2009a)Google Scholar
  39. X.L. Dong, E. Gabrilovich, G. Heitz, W. Horn, K. Murphy, S. Sun, W. Zhang, From data fusion to knowledge fusion. Proc. Very Large Data Bases Endow. 7(10), 881–892 (2014b)Google Scholar
  40. H.L. Dunn, Record linkage. Am. J. Public Health Nations Health 36(12), 1412–1416 (1946)CrossRefGoogle Scholar
  41. D. Esteves, A. Rula, A.J. Reddy, J. Lehmann, Toward veracity assessment in RDF knowledge bases: an exploratory analysis. ACM J. Data Inf. Qual. 9(3), 1–26 (2018)CrossRefGoogle Scholar
  42. M. Färber, F. Bartscherer, C. Menne, A. Rettinger, Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. Semant. Web J. 9(1), 77–129 (2018)CrossRefGoogle Scholar
  43. D.C. Faye, O. Curé, G. Blin, A survey of RDF storage approaches. Rev. Afr. Rech. Inf. Math. Appl. 15, 11–35 (2012)Google Scholar
  44. J.D. Fernández, W. Beek, M.A. Martínez-Prieto, M. Arias. LOD-a-lot: a queryable dump of the LOD cloud, in Proceedings of the 16th International Semantic Web Conference (ISWC2017), Vienna, Austria, 21–25 October 2017. Springer LNCS, vol. 10588Google Scholar
  45. D. Fleischhacker, H. Paulheim, V. Bryl, J. Völker, C. Bizer, Detecting errors in numerical linked data using cross-checked outlier detection, in Proceedings of the 13th International Conference on Management of Data (ISWC2014), Riva del Garda, Italy, 19–23 October 2014. Springer LNCS, vol. 8796Google Scholar
  46. A. Flemming, Qualitätsmerkmale von Linked Data-veröffentlichenden Daten-quellen, Diploma thesis, Humboldt-Universität zu Berlin, 2011Google Scholar
  47. C. Fürber, M. Hepp, Using SPARQL and SPIN for data quality management on the Semantic Web, in Proceedings of the 13th International Conference on Business Information Systems (BIS2010), Berlin, Germany, 3–5 May 2010a. Springer LNBI, vol. 47Google Scholar
  48. C. Fürber, M. Hepp, Using Semantic Web resources for data quality management, in Proceedings of the 17th International Conference on Knowledge Engineering and Management by the Masses (EKAW2010), Lisbon, Portugal, 11–15 October 2010b. Springer LNCS, vol. 6317Google Scholar
  49. C. Fürber, M. Hepp, SWIQA—a Semantic Web information quality assessment framework, in Proceedings of the 19th European Conference on Information Systems (ECIS2011), Association for Information Systems (AIS eLibrary), Helsinki, Finland, 9–11 June 2011Google Scholar
  50. A. Fuxman, E. Fazli, R.J. Miller, ConQuer: efficient management of inconsistent databases, in Proceedings of the International Conference on Management of Data (SIGMOD2005), 14–16 June 2005 (ACM, Baltimore)Google Scholar
  51. E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley Longman, Boston, MA, 1995)zbMATHGoogle Scholar
  52. A. Gangemi, A.G. Nuzzolese, V. Presutti, F. Draicchio, A. Musetti, P. Ciancarini, Automatic typing of DBpedia entities, in Proceedings of the 11th International Semantic Web Conference (ISWC2012), Boston, 11–15 November 2012. Springer LNCS, vol. 7649Google Scholar
  53. H. Garcia-Molina, J.D. Ullman, J. Widom, Database Systems: The Complete Book, Chapter 7, 2nd edn. (Pearson International Editing, 2009)Google Scholar
  54. L.M. Garshol, A. Borge, Hafslund Sesam—an archive on semantics, in Proceedings of the 10th Extending Semantic Web Conference (ESWC2013): Semantics and Big Data, Montpellier, France, 26–30 May 2013. Springer LNCS, vol. 7882Google Scholar
  55. G. Gawriljuk, A. Harth, C.A. Knoblock, P.A. Szekely, A scalable approach to incrementally building knowledge graphs, in Proceedings of the 20th International Conference on Theory and Practice of Digital Libraries (TPDL2016), Hannover, Germany, 5–9 September 2016. Springer LNCS, vol. 9819Google Scholar
  56. L. Getoor, A. Machanavajjhala, Entity resolution: theory, practice & open challenges, in Proceedings of the 38th International Conference on Very Large Data Bases (VLDB2012), 5(12), 2018–2019 (2012)Google Scholar
  57. L. Getoor, A. Machanavajjhala, Entity resolution for big data, in Proceedings of the 19th International Conference on Knowledge Discovery and Data Mining (KDD2013): Tutorial, 11–14 August 2013 (ACM, Chicago)Google Scholar
  58. G. Giannopoulos, D. Skoutas, T. Maroulis, N. Karagiannakis, S. Athanasiou, FAGI: a framework for fusing geospatial RDF data, in Proceedings of the Confederated International Conferences on the Move to Meaningful Internet Systems (OTM2014), Amantea, Italy, 27–31 October 2014. Springer LNCS, vol. 8841Google Scholar
  59. H. Glaser, I. Millard, W. Sung, S. Lee, P. Kim, B. You, Research on linked data and co-reference resolution, in Proceedings of the International Conference on Dublin Core and Metadata Applications (DCMI2019), Dublin Core Metadata Initiative, Seoul, Korea, 12–16 October 2009Google Scholar
  60. A. Gómez-Pérez, M. Fernandez-Lopez, O. Corcho, Ontological Engineering: With Examples from the Areas of Knowledge Management, e-Commerce and the Semantic Web (Springer, Berlin, 2010)Google Scholar
  61. J.M. Gómez-Pérez, J.Z. Pan, G. Vetere, H. Wu, Enterprise knowledge graph: an introduction, in Exploiting Linked Data and Knowledge Graphs in Large Organisations, ed. by J. Z. Pan, G. Vetere, J. M. Gómez-Pérez, H. Wu, (Springer, Cham, 2017)Google Scholar
  62. C. Guéret, P.T. Groth, C. Stadler, J. Lehmann, Assessing linked data mappings using network measures, in Proceedings of the 9th Extended Semantic Web Conference (ESWC2012), Heraklion, Greece, 27–31 May 2012. Springer LNCS, vol. 7295Google Scholar
  63. R.V. Guha, Introducing schema.org: Search engines come together for a richer web, Google Official Blog (2011)
  64. R.V. Guha, D. Brickley, S. Macbeth, Schema.org: evolution of structured data on the web. Commun. ACM 59(2), 44–51 (2016)
  65. K. Gunaratna, S. Lalithsena, A.P. Sheth, Alignment and dataset identification of linked data in Semantic Web. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 4(2), 139–151 (2014)CrossRefGoogle Scholar
  66. S. Gupta, G.E. Kaiser, D. Neistadt, P. Grimm, DOM-based content extraction of HTML documents, in Proceedings of the 12th International World Wide Web Conference (WWW2003), 20–24 May 2003 (ACM, Budapest)Google Scholar
  67. S. Gupta, P.A. Szekely, C.A. Knoblock, A. Goel, M. Taheriyan, M. Muslea, Karma: a system for mapping structured sources into the Semantic Web, in Proceedings of the 9th Extended Semantic Web Conference (ESWC2012): Revised Selected Papers, Crete, Greece, 27–31 May 2012. Springer LNCS, vol. 7540Google Scholar
  68. H. Halpin, P.J. Hayes, J.P. McCusker, D.L. McGuinness, H.S. Thompson, When owl:sameAs isn’t the same: an analysis of identity in linked data, in Proceedings of the 9th International Semantic Web Conference (ISWC2010), 7–11 November 2010 (Springer, Shanghai)Google Scholar
  69. J.B. Hansen, A. Beveridge, R. Farmer, L. Gehrmann, A.J.G. Gray, S. Khutan, T. Robertson, J. Val, Validata: an online tool for testing RDF data conformance, in Proceedings of the 8th International Conference on Semantic Web Applications and Tools for Life Sciences (SWAT4LS2015), CEUR Workshop Proceedings, vol. 1546, Cambridge, UK, 7–10 December 2015Google Scholar
  70. S. Harris, A. Seaborne, E. Prud’hommeaux (eds.), SPARQL 1.1 Query Language, W3C Recommendation, 21 March 2013. https://www.w3.org/TR/sparql11-query/
  71. P. Hayes, The Logic of Frames, Readings in Artificial Intelligence (Morgan Kaufmann, Los Altos, CA, 1981)Google Scholar
  72. P. Hayes (ed.), RDF semantics, W3C recommendation, 10 February 2004. https://www.w3.org/TR/sparql11-query/
  73. J. Hipp, U. Güntzer, G. Nakhaeizadeh, Algorithms for association rule mining—a general survey and comparison. ACM SIGKDD Explor. Newsl. 2(1), 58–64 (2000)CrossRefGoogle Scholar
  74. A. Hogan, A. Harth, S. Decker, Performing object consolidation on the Semantic Web data graph, in Proceedings of the 16th International World Wide Web Conference (WWW2007): Workshop I3: Identity, Identifiers, Identification, Entity-Centric Approaches to Information and Knowledge Management on the Web, CEUR Workshop Proceedings, vol. 249, Banff, Canada, 8 May 2007Google Scholar
  75. S.M. Inzalkar, J. Sharma, A survey on text mining-techniques and application. Int. J. Res. Sci. Eng. 14, 1–14 (2015)Google Scholar
  76. K. Janowicz, P. Hitzler, B. Adams, D. Kolas, C. Vardeman, Five stars of linked data vocabulary use. Semant. Web J. 5(3), 173–176 (2014)CrossRefGoogle Scholar
  77. E. Kärle, U. Şimşek, D. Fensel, semantify.it, a platform for creation, publication and distribution of semantic annotations, in Proceedings of the 11th International Conference on Advances in Semantic Processing (SEMAPRO2017), IARIA, Barcelona, Spain, 12–16 November 2017Google Scholar
  78. E. Kärle, U. Şimşek, O. Panasiuk, D. Fensel, Building an ecosystem for the tyrolean tourism knowledge graph, in Proceedings of the International Conference on Trends in Web Engineering (ICWE2018), International Workshops, MATWEP, EnWot, KD-Web, WEOD, TourismKG: Revised Selected Papers, Caceres, Spain, 5 June 2018. Springer LNCS, vol. 11153Google Scholar
  79. L. Karoui, M.-A. Aufaure, N. Bennacer, Ontology discovery from web pages: application to tourism, in Proceedings of the Workshop on Knowledge Discovery and Ontologies (ECML/PKDD2004), Pisa, Italy, 20–24 September 2004Google Scholar
  80. M. Kejriwal, C. Knoblock, P. Szekely, Constructing domain-specific knowledge graphs, in Proceedings of the 16th International Semantic Web Conference (ISWC2017): Tutorial, Vienna, Austria, 21–25 October 2017. https://usc-isi-i2.github.io/ISWC17/
  81. M. Kifer, G. Lausen, J. Wu, Logical foundations of object-oriented and frame-based languages. J. ACM 42(4), 741–843 (May 1995)MathSciNetzbMATHCrossRefGoogle Scholar
  82. J.M. Kleinberg, Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999)MathSciNetzbMATHCrossRefGoogle Scholar
  83. T. Knap, J. Michelfeit, M. Necaský, Linked open data aggregation: conflict resolution and aggregate quality, in Proceedings of the 36th Annual IEEE Computer Software and Applications Conference Workshops (COMP-SAC2012), IEEE Computer Society, Izmir, Turkey, 16–20 July 2012Google Scholar
  84. D. Kontokostas, P. Westphal, S. Auer, S. Hellmann, J. Lehmann, R. Cornelissen, A. Zaveri, Test-driven evaluation of linked data quality, in Proceedings of the 23rd International Conference on World Wide Web (WWW2014), 07–11 April 2014 (ACM, Seoul)Google Scholar
  85. N. Korula, S. Lattanzi, An efficient reconciliation algorithm for social networks. Proc. Very Large Data Bases Endow. 7(5), 377–388 (2014)Google Scholar
  86. S. Lalithsena, P. Hitzler, A.P. Sheth, P. Jain, Automatic domain identification for linked open data, in Proceedings of the International Joint Conference on Web Intelligence (WI2013) and Intelligent Agent Technologies (IAT2013), IEEE Computer Society, Atlanta, 17–20 November 2013Google Scholar
  87. D. Lange, C. Böhm, F. Naumann, Extracting structured information from Wikipedia articles to populate infoboxes, in Proceedings of the 19th Conference on Information and Knowledge Management (CIKM2010), 26–30 October 2010 (ACM, Toronto)Google Scholar
  88. A. Langegger, W. Wöß, Langegger: XLWrap–querying and integrating arbitrary spreadsheets with SPARQL, in Proceedings of the 8th International Semantic Web Conference (ISWC 2009), 25–29 October 2009 (Springer, Chantilly, VA)Google Scholar
  89. P. Lertvittayakumjorn, N. Kertkeidkachorn, R. Ichise, Resolving range violations in DBpedia, in Proceedings of the 7th Joint International Semantic Technology Conference (JIST2017), Gold Coast, Australia, 10–12 November 2017. Springer LNCS, vol. 10675Google Scholar
  90. W. Li, C. Clifton, SEMINT: a tool for identifying attribute correspondences in heterogeneous databases using neural networks. Data Knowl. Eng. 33(1), 49–84 (2000)zbMATHCrossRefGoogle Scholar
  91. Y. Li, J. Gao, C. Meng, Q. Li, L. Su, B. Zhao, W. Fan, J. Han, A survey on truth discovery. ACM SIGKDD Explor. Newsl. 17(2), 1–16 (2016)CrossRefGoogle Scholar
  92. J. Liang, Y. Xiao, Y. Zhang, S. Hwang, H. Wang, Graph-based wrong IsA relation detection in a large-scale lexical taxonomy, in Proceedings of the 31st Conference on Artificial Intelligence (AAAI2017), 4–9 February 2017 (AAAI Press, San Francisco)Google Scholar
  93. L. Ma, Z. Su, Y. Pan, L. Zhang, T. Liu, RStar: an RDF storage and query system for enterprise resource management, in Proceedings of the 13th International Conference on Information and knowledge Management (CIKM2004), 8–13 November 2004 (ACM, Washington)Google Scholar
  94. Y. Ma, H. Gao, T. Wu, G. Qi, Learning disjointness axioms with association rule mining and its application to inconsistency detection of linked data, in Proceedings of the 8th Chinese Semantic Web and Web Science Conference (CSWS2014): Revised Selected Papers, Wuhan, China, 8–12 August 2014. Springer CCIS 480Google Scholar
  95. R. Mahanti, Data Quality: Dimensions, Measurement, Strategy, Management, and Governance (ASQ Quality Press, Milwaukee, 2019)Google Scholar
  96. A. Melo, H. Paulheim, Detection of relation assertion errors in knowledge graphs, in Proceedings of the 9th International Conference on Knowledge Capture (K-CAP2017), 4–6 December 2017 (ACM, Austin)Google Scholar
  97. P.N. Mendes, H. Mühleisen, C. Bizer, Sieve: linked data quality assessment and fusion, in Proceedings of the 2nd International Workshop on Linked Web Data Management (LWDM 2012), in Conjunction with the 15th International Conference on Extending Database Technology (EDBT2012): Workshops, 30 March 2012 (ACM, Berlin)Google Scholar
  98. D. Menestrina, S. Whang, H. Garcia-Molina, Evaluating entity resolution results. Proc. Very Large Data Bases Endow. 3(1–2), 208–219 (2010)Google Scholar
  99. P. Mika, On Schema.org and why it matters for the web. IEEE Internet Comput. 19(4), 52–55 (2015)
  100. B. Mohit, Named entity recognition, in Natural Language Processing of Semitic Languages, ed. by I. Zitouni, (Springer, Berlin, 2014), pp. 221–245CrossRefGoogle Scholar
  101. A. Moschitti, K. Tymoshenko, P. Alexopoulos, A.D. Walker, M. Nicosia, G. Vetere, A. Faraotti, M. Monti, J.Z. Pan, H. Wu, Y. Zhao, Question answering and knowledge graphs, in Exploiting Linked Data and Knowledge Graphs in Large Organisations, ed. by J. Z. Pan, G. Vetere, J. M. Gómez-Pérez, H. Wu, (Springer, Cham, 2017)Google Scholar
  102. E. Muñoz, A. Hogan, A. Mileo, Triplifying Wikipedia’s tables, in Proceedings of the 1st International Workshop on Linked Data for Information Extraction (LD4IE2013) Co-Located with the 12th International Semantic Web Conference (ISWC2013), CEUR Workshop Proceeding, vol. 1057, Sydney, Australia, 21 October 2013Google Scholar
  103. A.N. Ngomo, S. Auer, LIMES—a time-efficient approach for large-scale link discovery on the web of data, in Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJ-CAI2011), 16–22 July 2011 (AAAI Press, Barcelona)Google Scholar
  104. A. Nikolov, V.S. Uren, E. Motta, A.N.D. Roeck, Integration of semantically annotated data by the KnoFuss architecture, in Proceedings of the 16th International Conference on Knowledge Engineering and Knowledge Management (EKAW2008): Practice and Patterns, Acitrezza, Italy, 29 September–2 October 2008. Springer LNCS, vol. 5268Google Scholar
  105. N. Noy, Y. Gao, A. Jain, A. Narayanan, A. Patterson, J. Taylor, Industry-scale knowledge graphs: lessons and challenges. ACM Queue 17(2), 48–75 (2019)Google Scholar
  106. A.G. Nuzzolese, A. Gangemi, V. Presutti, P. Ciancarini, Type inference through the analysis of Wikipedia links, in Proceedings of the 21st International Conference on World Wide Web (WWW2012): Workshop on Linked Data on the Web (LDOW2012), CEUR Workshop Proceedings, vol. 937, Lyon, France, 16 April 2012Google Scholar
  107. M.J. O’Connor, C. Halaschek-Wiener, M.A. Musen, Mapping master: a flexible approach for mapping spreadsheets to OWL, in Proceedings of the 9th International Semantic Web Conference (ISWC2010): Revised Selected Papers, Shanghai, China, 7–11 November 2010. Springer LNCS, vol. 6497Google Scholar
  108. J. Z. Pan, G. Vetere, J. M. Gómez-Pérez, H. Wu (eds.), Exploiting Linked Data and Knowledge Graphs in Large Organisations (Springer, Cham, 2017b)Google Scholar
  109. O. Panasiuk, E. Kärle, U. Şimşek, D. Fensel, Defining tourism domains for semantic annotation of web content, in Proceedings of the Conference on Information and Communication Technologies in Tourism (ENTER2018): Research Notes, Jönköping, Sweden, 24–26 January 2018aGoogle Scholar
  110. O. Panasiuk, Z. Akbar, T. Gerrier, D. Fensel, Representing GeoData for tourism with Schema.org, in Proceedings of the 4th International Conference on Geographical Information Systems Theory, Applications and Management (GISTAM2018), 17–19 March 2018b (SciTePress, Funchal, Portugal)
  111. O. Panasiuk, Z. Akbar, U. Şimşek, D. Fensel, Enabling conversational tourism assistants through Schema.org mapping, in Proceedings of the European Semantic Web Conference (ESWC2018): Satellite Event, Revised Selected Papers, Hersonissos, Greece, 3–7 June 2018c. Springer LNCS, vol. 11155
  112. O. Panasiuk, O. Holzknecht, U. Şimşek, E. Kärle, D. Fensel, Verification and validation of semantic annotations, in Proceedings of the 12th A.P. Ershov Informatics Conference (PSI 2019), Novosibirsk, Russia, 2–5 July 2019 (Springer). Preprint. https://arxiv.org/abs/1904.01353
  113. L. Papaleo, N. Pernelle, F. Saïs, C. Dumont, Logical detection of invalid SameAs statements in RDF data, in Proceedings of the 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW2014), Linköping, Sweden, 24–28 November 2014. Springer LNCS, vol. 8876Google Scholar
  114. P. Paritosh, The missing science of knowledge curation: improving incentives for large-scale knowledge curation, in Proceedings of the International World Wide Web Conference (WWW2018), 23–27 April 2018 (ACM, Lyon)Google Scholar
  115. P.F. Patel-Schneider, Analyzing Schema.org, in Proceedings of the 13th International Semantic Web Conference (ISWC2014), Riva del Garda, Italy, 19–23 October 2014. Springer LNCS, vol. 8796
  116. P.F. Patel-Schneider, I. Horrocks, Position paper: a comparison of two modelling paradigms in the Semantic Web, in Proceedings of the 15th International World Wide Web Conference (WWW2006), 23–26 May 2006 (ACM, Edinburgh)Google Scholar
  117. H. Paulheim, Identifying wrong links between datasets by multi-dimensional outlier detection, in Proceedings of the 3rd International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM2014) Co-Located with the 11th Extended Semantic Web Conference (ESWC2014), CEUR Workshop Proceedings, vol. 1162, Hersonissou, Greece, 26 May 2014Google Scholar
  118. H. Paulheim, Knowledge graph refinement: a survey of approaches and evaluation methods. Semant. Web J. 8(3), 489–508 (2017)CrossRefGoogle Scholar
  119. H. Paulheim, Machine learning with and for Semantic Web knowledge graphs, ed. by C. d’Amato, M. Theobald, in Proceedings of the 14th International Summer School 2018: Reasoning Web. Learning, Uncertainty, Streaming, and Scalability: Tutorial Lectures, Esch-sur-Alzette, Luxembourg, 22–26 September 2018a. Springer LNCS, vol. 11078Google Scholar
  120. H. Paulheim, How much is a triple? Estimating the cost of knowledge graph creation, in Proceedings of the 17th International Semantic Web Conference (ISWC2018): Posters & Demonstrations, Industry and Blue Sky Ideas Tracks, CEUR Workshop Proceedings, vol. 2180, Monterey, 8–12 October 2018bGoogle Scholar
  121. H. Paulheim, C. Bizer, Type inference on noisy RDF data, in Proceedings of the 12th International Semantic Web Conference (ISWC2013), Sydney, Australia, 21–25 October 2013. Springer LNCS, vol. 8218Google Scholar
  122. H. Paulheim, C. Bizer, Improving the quality of linked data using statistical distributions. Int. J. Semant. Web Inf. Syst. 10(2), 63–86 (2014)CrossRefGoogle Scholar
  123. H. Paulheim, M. Sabou, M. Cochez, W. Beek, Evaluation of knowledge graphs, ed. by P.A. Bonatti, S. Decker, A. Polleres, V. Presutti, in Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web (Dagstuhl Seminar 18371), Dagstuhl Rep. 8(9), 29–111 (2019)Google Scholar
  124. N. Pernelle, J. Raad, F. Saıs, Detection of invalid identity links statements in RDF knowledge graphs. Presented in the 21st International Conference on Knowledge Engineering and Knowledge Management (EKAW2018): Workshops: Symbolic methods for data-interlinking, Nancy, France, 12–16 November 2018. https://project.inria.fr/ekaw2018/workshops/
  125. L. Pipino, Y.W. Lee, R.Y. Wang, Data quality assessment. Commun. ACM 45(4), 211–218 (2002)CrossRefGoogle Scholar
  126. J. Plu, G. Rizzo, R. Troncy, ADEL: ADaptable Entity Linking: a hybrid approach to link entities with linked data for information extraction. Semant. Web J. (Special Issue on Linked Data for Information Extraction) 1, 1–5 (2017)Google Scholar
  127. J. Raad, N. Pernelle, F. Saïs, Detection of contextual identity links in a knowledge base, in Proceedings of the Knowledge Capture Conference (K-CAP2017), 4–6 December 2017 (ACM, Austin)Google Scholar
  128. J. Raad, W. Beek, F. van Harmelen, N. Pernelle, F. Saïs, Detecting erroneous identity links on the web using network metrics, in Proceeding of the 17th International Semantic Web Conference (ISWC2018), Monterrey, 8–12 October 2018. Springer LNCS, vol. 111Google Scholar
  129. Y. Raimond, C. Sutton, M.B. Sandler, Automatic interlinking of music datasets on the Semantic Web, in Proceedings of the 17th International World Wide Web Conference (WWW2008): Workshop on Linked Data on the Web (LDOW2008), CEUR Workshop Proceedings, vol. 369, Beijing, China, 22 April 2008
  130. T. Rekatsinas, X. Chu, I.F. Ilyas, C. Ré, HoloClean: holistic data repairs with probabilistic inference. Proc. Very Large Data Bases Endow. 10(11), 1190–1201 (2017)
  131. M. Rubiolo, M.L. Caliusco, G. Stegmayer, M. Gareli, M. Coronel, Knowledge source discovery: an experience using ontologies, WordNet and artificial neural networks, in Proceedings of the 13th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES2009), Santiago, Chile, 28–30 September 2009. Springer LNCS, vol. 5712
  132. A. Rula, M. Palmonari, S. Rubinacci, A.N. Ngomo, J. Lehmann, A. Maurino, D. Esteves, TISCO: temporal scoping of facts. J. Web Semant. 54, 72–86 (2019)
  133. A.T. Schreiber, G. Schreiber, H. Akkermans, A. Anjewierden, N. Shadbolt, R. de Hoog, W. Van de Velde, N.R. Shadbolt, B. Wielinga, Knowledge Engineering and Management: The CommonKADS Methodology (MIT Press, Cambridge, MA, 2000)
  134. A. Schultz, A. Matteini, R. Isele, P.N. Mendes, C. Bizer, C. Becker, LDIF—a framework for large-scale linked data integration, in Proceedings of the 21st International World Wide Web Conference (WWW2012): Developers Track, Lyon, France, 18–20 April 2012
  135. S. Shehata, F. Karray, M.S. Kamel, An efficient concept-based mining model for enhancing text clustering. IEEE Trans. Knowl. Data Eng. 22(10), 1360–1371 (2010)
  136. U. Şimşek, D. Fensel, Now we are talking! Flexible and open goal-oriented dialogue systems for accessing touristic services, in Proceedings of the Conference on Information and Communication Technologies in Tourism (ENTER2018): Research Notes, Jönköping, Sweden, 24–26 January 2018b
  137. U. Şimşek, E. Kärle, O. Holzknecht, D. Fensel, Domain specific semantic validation of schema.org annotations, in Proceedings of the 11th International A. P. Ershov Informatics Conference (PSI 2017), Moscow, Russia, 27–29 June 2017. Springer LNCS, vol. 10742 (2018a)
  138. U. Şimşek, E. Kärle, D. Fensel, Machine readable web APIs with Schema.org action annotations, in Proceedings of the 14th International Conference on Semantic Systems (SEMANTICS 2018), 10–13 September 2018b (Elsevier, Vienna)
  139. U. Şimşek, E. Kärle, D. Fensel, RocketRML—a NodeJS implementation of a use-case specific RML mapper, in Proceedings of 1st Knowledge Graph Building Workshop Co-Located with the 16th Extended Semantic Web Conference (ESWC2019), CEUR Workshop Proceedings, Portoroz, Slovenia, 3 June 2019a
  140. J. Sleeman, T. Finin, Type prediction for efficient coreference resolution in heterogeneous semantic graphs, in Proceedings of the 7th International Conference on Semantic Computing (ICSC2013), IEEE Computer Society, Irvine, 16–18 September 2013
  141. J. Sleeman, T. Finin, A. Joshi, Topic modeling for RDF graphs, in Proceedings of the 3rd International Workshop on Linked Data for Information Extraction (LD4IE2015) Co-Located with the 14th International Semantic Web Conference (ISWC2015), CEUR Workshop Proceedings, vol. 1467, Bethlehem, 12 October 2015
  142. M. Sporny, D. Longley, G. Kellogg, M. Lanthaler, N. Lindström (eds.), JSON-LD 1.0. W3C recommendation, 16 January 2014. https://www.w3.org/TR/json-ld/
  143. S. Staab, R. Studer, Ontology Handbook (Springer, Berlin, 2010)
  144. F. Stegmaier, U. Gröbner, M. Döller, H. Kosch, G. Baese, Evaluation of current RDF database solutions, in Proceedings of the 10th International Workshop on Semantic Multimedia Database Technologies (SeMuDaTe2009) in Conjunction with the 4th International Conference on Semantics and Digital Media Technologies (SAMT2009), CEUR Workshop Proceedings, vol. 539, Graz, Austria, 2 December 2009
  145. G. Stegmayer, M.L. Caliusco, O. Chiotti, M.R. Galli, ANN-agent for distributed knowledge source discovery, in Proceedings of On the Move to Meaningful Internet Systems (OTM2007): Confederated International Workshops and Posters, AWeSOMe, CAMS, OTM Academy Doctoral Consortium, MONET, OnToContent, ORM, PerSys, PPN, RDDS, SSWS, and SWWS 2007, Vilamoura, Portugal, 25–30 November 2007. Springer LNCS, vol. 4805
  146. A. Stolz, M. Hepp, Integrating product classification standards into Schema.org: eCl@ss and UNSPSC on the web of data, in Proceedings of On the Move to Meaningful Internet Systems. OTM 2017 Workshops, Rhodes, Greece, 23–28 October 2017 (2018). Springer LNCS, vol. 10697
  147. R. Studer, V.R. Benjamins, D. Fensel, Knowledge engineering: principles and methods. Data Knowl. Eng. 25(1–2), 161–197 (1998)
  148. V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera, E. Motta, F. Ciravegna, Semantic annotation for knowledge management: requirements and a survey of the state of the art. Web Semant. Sci. Serv. Agents World Wide Web Arch. 4(1), 14–28 (2006)
  149. D. Van Deursen, C. Poppe, G. Martens, E. Mannens, R. Van de Walle, XML to RDF conversion: a generic approach, in Proceedings of the 4th International Conference on Automated solutions for Cross Media Content and Multi-Channel Distribution (AXMEDIS2008), 17–19 November 2008 (IEEE, Florence)
  150. M.Y. Vardi, How the hippies destroyed the Internet. Commun. ACM 61(7), 9 (2018)
  151. S. Vijayarani, M.J. Ilamathi, M. Nithya, Preprocessing techniques for text mining-an overview. Int. J. Comput. Sci. Commun. Netw. 5(1), 7–16 (2015)
  152. B. Villazón-Terrazas, N. García-Santa, Y. Ren, A. Faraotti, H. Wu, Y. Zhao, G. Vetere, J.Z. Pan, Knowledge graph foundations, in Exploiting Linked Data and Knowledge Graphs in Large Organisations, ed. by J.Z. Pan, G. Vetere, J.M. Gómez-Pérez, H. Wu (Springer, Cham, 2017)
  153. J. Volz, C. Bizer, M. Gaedke, G. Kobilarov, Discovering and maintaining links on the web of data, in Proceedings of the 8th International Semantic Web Conference (ISWC2009), Chantilly, 25–29 October 2009. Springer LNCS, vol. 5823
  154. R.Y. Wang, A product perspective on total data quality management. Commun. ACM 41(2), 58–65 (1998)
  155. R.Y. Wang, D.M. Strong, Beyond accuracy: what data quality means to data consumers. J. Manag. Inf. Syst. 12(4), 5–33 (1996)
  156. R.Y. Wang, M. Ziad, Y.W. Lee, Data Quality (Kluwer Academic Publisher, Norwell, MA, 2001)
  157. R. West, E. Gabrilovich, K. Murphy, S. Sun, R. Gupta, D. Lin, Knowledge base completion via search-based question answering, in Proceedings of the 23rd International World Wide Web Conference (WWW2014), 07–11 April 2014 (ACM, Seoul)
  158. D. Wienand, H. Paulheim, Detecting incorrect numerical data in DBpedia, in Proceedings of the 11th International European Semantic Web Conference (ESWC2014), Anissaras, Greece, 25–29 May 2014. Springer LNCS, vol. 8465
  159. M.D. Wilkinson, M. Dumontier, I.J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, L.B. da Silva Santos, P.E. Bourne, J. Bouwman, A.J. Brookes, T. Clark, M. Crosas, I. Dillo, O. Dumon, S. Edmunds, C.T. Evelo, R. Finkers, A. Gonzalez-Beltran, A.J. Gray, P. Groth, C. Goble, J.S. Grethe, J. Heringa, P.A. ‘t Hoen, R. Hooft, T. Kuhn, R. Kok, J. Kok, S.J. Lusher, M.E. Martone, A. Mons, A.L. Packer, B. Persson, P. Rocca-Serra, M. Roos, R. van Schaik, S.-A. Sansone, E. Schultes, T. Sengstag, T. Slater, G. Strawn, M.A. Swertz, M. Thompson, J. van der Lei, E. van Mulligen, J. Velterop, A. Waagmeester, P. Wittenburg, K. Wolstencroft, J. Zhao, B. Mons, The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016)
  160. W.E. Winkler, Overview of Record Linkage and Current Research Directions. Research report series: Statistics #2006-2, Bureau of the Census (2006). https://www.census.gov/srd/papers/pdf/rrs2006-02.pdf
  161. M. Wu, A. Marian, Corroborating answers from multiple web sources, in Proceedings of the 10th International Workshop on the Web and Databases (WebDB2007), Beijing, China, 15 June 2007
  162. A. Zaveri, D. Kontokostas, M.A. Sherif, L. Bühmann, M. Morsey, S. Auer, J. Lehmann, User-driven quality evaluation of DBpedia, in Proceedings of the 9th International Conference on Semantic Systems (I-SEMANTICS2013), 4–6 September 2013 (ACM, Graz)
  163. A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, S. Auer, Quality assessment for linked data: a survey. Semant. Web J. 7(1), 63–93 (2016)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Semantic Technology Institute Innsbruck, Department of Computer Science, University of Innsbruck, Innsbruck, Austria
  2. Onlim GmbH, Telfs, Austria
