Abstract

Semantic Web technologies, including knowledge graphs, are widely used in the development of modern intelligent systems, information retrieval, and question answering. Knowledge graph engineering still needs to be automated and improved, in particular by drawing on various sources of information (e.g., databases, documents, conceptual models). Tables are one of the most accessible and common ways of storing and presenting information, and a valuable source of structured domain knowledge. In this paper, we propose a new approach that automates the extraction of specific entities (facts) from tabular data for populating and augmenting a target knowledge graph. The key feature of the approach is the semantic interpretation (annotation) of individual table elements. We present the main stages of the approach and describe its implementation. We also report a case study in which the approach was used to design a domain knowledge graph for the industrial safety inspection of petrochemical equipment and technological complexes. The results indicate that the approach and its software are suitable for this task.

Author information

Corresponding author

Correspondence to Nikita O. Dorodnykh.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Dorodnykh, N.O., Yurin, A.Y. (2023). Knowledge Graph Augmentation Based on Tabular Data: A Case Study for Industrial Safety Inspection. In: Kovalev, S., Sukhanov, A., Akperov, I., Ozdemir, S. (eds) Proceedings of the Sixth International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’22). IITI 2022. Lecture Notes in Networks and Systems, vol 566. Springer, Cham. https://doi.org/10.1007/978-3-031-19620-1_30
