Subgraph-Induced Extraction Technique for Information (SETI) from Administrative Documents

Sharma Kafle, Dipendra; Thomas, Eliott; Coustaty, Mickael; Joseph, Aurélie; Doucet, Antoine; Poulain d’Andecy, Vincent

doi:10.1007/978-3-031-41501-2_8

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14194)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

Information Extraction plays a key role in automating the auditing of administrative documents. However, handling the variety of layouts and languages in these documents remains challenging, and large public training datasets for administrative documents such as invoices are rare. In this work, we use a Graph Attention Network model for information extraction. Compared to classical neural networks, this type of model is easier to interpret, because the links between entities in the graph can be visualized. Moreover, it makes full use of layout and structure, which is a crucial advantage for administrative documents. From the same graph, our model learns at different graph levels to capture dynamic and richer knowledge in each batch, thus improving generalization on smaller datasets. We present how the model learns at each graph level and compare the results with baselines on private as well as public datasets. Our model improves recall and precision scores for some classes in our private dataset and produces comparable results on public datasets designed for Form Understanding and Information Extraction.
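
As a rough illustration of the two ideas named in the abstract, the following minimal sketch computes single-head graph attention coefficients over a toy document graph and extracts an induced subgraph from it. It is not the authors' implementation: the function names (`gat_layer`, `induced_subgraph`), the toy words, the two-dimensional features, and the weights `W` and `a` are all assumptions made for illustration, and the attention follows the standard graph attention formulation (a LeakyReLU score on concatenated projected features, normalized by a softmax over each node's neighborhood).

```python
import math

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(features, neighbors, W, a):
    """Single-head graph attention layer (output activation omitted).

    features:  dict node -> input feature vector
    neighbors: dict node -> list of adjacent nodes
    W:         projection matrix, out_dim rows by in_dim columns
    a:         attention vector of length 2 * out_dim
    Returns (new_features, attention), where attention[i][j] is the
    weight node i assigns to node j (each node also attends to itself).
    """
    z = {v: matvec(W, h) for v, h in features.items()}
    new_features, attention = {}, {}
    for i in features:
        nbrs = [i] + [j for j in neighbors.get(i, []) if j != i and j in features]
        # Unnormalized score: LeakyReLU(a . [z_i || z_j])
        scores = [leaky_relu(sum(ak * x for ak, x in zip(a, z[i] + z[j])))
                  for j in nbrs]
        # Numerically stable softmax over the neighborhood of i
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(nbrs, alphas))
        new_features[i] = [sum(al * z[j][k] for al, j in zip(alphas, nbrs))
                           for k in range(len(z[i]))]
    return new_features, attention

def induced_subgraph(nodes, features, neighbors):
    """Keep only the given nodes and the edges whose endpoints are both kept."""
    keep = set(nodes)
    sub_features = {v: features[v] for v in features if v in keep}
    sub_neighbors = {v: [u for u in neighbors.get(v, []) if u in keep]
                     for v in sub_features}
    return sub_features, sub_neighbors

# Toy document graph: nodes are words, edges link spatial neighbours (hypothetical).
features = {"Invoice": [1.0, 0.0], "No": [0.5, 0.5],
            "12345": [0.0, 1.0], "Total": [1.0, 1.0]}
neighbors = {"Invoice": ["No"], "No": ["Invoice", "12345"],
             "12345": ["No", "Total"], "Total": ["12345"]}
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection, for readability
a = [0.5, -0.5, 1.0, 0.25]     # arbitrary attention vector

full_out, full_att = gat_layer(features, neighbors, W, a)
sub_feats, sub_nbrs = induced_subgraph(["Invoice", "No"], features, neighbors)
sub_out, sub_att = gat_layer(sub_feats, sub_nbrs, W, a)
```

Because the attention weights in `full_att` and `sub_att` are explicit per-edge values, they can be inspected or plotted directly, which is the kind of link visualization the abstract refers to. Running the same layer on the full graph and on an induced subgraph gives a simple picture of learning from one graph at more than one level, although how the paper selects and combines those levels is not specified here.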

Acknowledgement

This work was supported by the French government in the framework of the France Relance program and by the YOOZ company. We would also like to thank Jérôme Lacour, Jonathan Ouellet and Mohamed Saadi from YOOZ for their support.

Author information

Authors and Affiliations

Université de La Rochelle, L3i Avenue Michel Crépeau, 17042, La Rochelle, France
Dipendra Sharma Kafle, Eliott Thomas, Mickael Coustaty & Antoine Doucet
Yooz, Aimargues, France
Aurélie Joseph & Vincent Poulain d’Andecy

Corresponding author

Correspondence to Dipendra Sharma Kafle.

Editor information

Editors and Affiliations

University of La Rochelle, La Rochelle, France
Mickael Coustaty
Autonomous University of Barcelona, Bellaterra, Spain
Alicia Fornés

About this paper

Cite this paper

Sharma Kafle, D., Thomas, E., Coustaty, M., Joseph, A., Doucet, A., Poulain d’Andecy, V. (2023). Subgraph-Induced Extraction Technique for Information (SETI) from Administrative Documents. In: Coustaty, M., Fornés, A. (eds) Document Analysis and Recognition – ICDAR 2023 Workshops. ICDAR 2023. Lecture Notes in Computer Science, vol 14194. Springer, Cham. https://doi.org/10.1007/978-3-031-41501-2_8

DOI: https://doi.org/10.1007/978-3-031-41501-2_8
Published: 15 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41500-5
Online ISBN: 978-3-031-41501-2
eBook Packages: Computer Science, Computer Science (R0)
