Abstract
Many companies and institutions use spreadsheets to manage, publish and share their data. Though effective, spreadsheets are mainly designed to be interpreted by humans, and automatically extracting and interpreting their content is a complex task. The task becomes even harder when tables contain different kinds of errors and have a complex layout. In this paper, we outline the approach that we plan to develop during the PhD to answer the research question “how to semi-automatically extract coherent semantic information from heterogeneous and complex spreadsheets?”.
PhD Advisor: Prof. Marco Mesiti.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Bonfitto, S. (2021). Semantic Integration of Heterogeneous and Complex Spreadsheet Tables. In: Jensen, C.S., et al. Database Systems for Advanced Applications. DASFAA 2021. Lecture Notes in Computer Science, vol 12683. Springer, Cham. https://doi.org/10.1007/978-3-030-73200-4_52
DOI: https://doi.org/10.1007/978-3-030-73200-4_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73199-1
Online ISBN: 978-3-030-73200-4
eBook Packages: Computer Science (R0)