Abstract
Semantic Web technologies, including knowledge graphs, are widely used in the development of modern intelligent systems and in information retrieval and question answering. Knowledge graph engineering still needs to be automated and improved, in particular by drawing on various sources of information (e.g., databases, documents, conceptual models). Tables are one of the most accessible and common ways of storing and presenting information, and a valuable source of structured domain knowledge. In this paper, we propose an approach that automates the extraction of specific entities (facts) from tabular data for the subsequent population and augmentation of a target knowledge graph. The key feature of our approach is the semantic interpretation (annotation) of individual table elements. We present the main stages of the approach and describe its implementation. We also report a case study in which the approach was used to design a domain knowledge graph in the field of industrial safety inspection of petrochemical equipment and technological complexes. The results indicate that our approach and software are suitable for this task.
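The general idea of populating a graph from annotated table columns can be illustrated with a minimal sketch. It is not the implementation described in the paper: the function `augment_graph`, the `ex:` prefix, the column-to-class mapping, and the sample rows are all hypothetical, and the column annotations are assumed to be given rather than inferred.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def slug(text: str) -> str:
    """Turn a cell value into a simple identifier fragment."""
    return "_".join(text.strip().split())


def augment_graph(
    graph: Set[Triple],
    header: List[str],
    rows: List[List[str]],
    column_classes: Dict[str, str],
) -> Set[Triple]:
    """Add a typed, labelled entity for each non-empty cell of an annotated column.

    Returns only the triples that were not already present in the graph.
    """
    new_triples: Set[Triple] = set()
    for row in rows:
        for column, cell in zip(header, row):
            cls = column_classes.get(column)
            if cls is None or not cell.strip():
                continue  # skip columns without an annotation and empty cells
            entity = f"ex:{slug(cell)}"
            candidates = (
                (entity, "rdf:type", cls),
                (entity, "rdfs:label", cell.strip()),
            )
            for triple in candidates:
                if triple not in graph:
                    new_triples.add(triple)
    graph |= new_triples
    return new_triples


# Hypothetical table and annotations (assumed, not taken from the paper).
header = ["Equipment", "Defect", "Year"]
rows = [
    ["Heat exchanger", "Crack", "2019"],
    ["Pressure vessel", "Crack", "2020"],
]
column_classes = {"Equipment": "ex:Equipment", "Defect": "ex:Defect"}

# The target graph already knows one entity.
graph: Set[Triple] = {("ex:Crack", "rdf:type", "ex:Defect")}
added = augment_graph(graph, header, rows, column_classes)
```

In this sketch the "Year" column is ignored because it has no annotation, and the existing type triple for `ex:Crack` is not duplicated. A real pipeline would also need to annotate cells and relations between columns, which is omitted here.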
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dorodnykh, N.O., Yurin, A.Y. (2023). Knowledge Graph Augmentation Based on Tabular Data: A Case Study for Industrial Safety Inspection. In: Kovalev, S., Sukhanov, A., Akperov, I., Ozdemir, S. (eds) Proceedings of the Sixth International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’22). IITI 2022. Lecture Notes in Networks and Systems, vol 566. Springer, Cham. https://doi.org/10.1007/978-3-031-19620-1_30
DOI: https://doi.org/10.1007/978-3-031-19620-1_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19619-5
Online ISBN: 978-3-031-19620-1
eBook Packages: Intelligent Technologies and Robotics (R0)