Subgraph-Induced Extraction Technique for Information (SETI) from Administrative Documents

  • Conference paper
Document Analysis and Recognition – ICDAR 2023 Workshops (ICDAR 2023)

Abstract

Information extraction plays a key role in automating the auditing of administrative documents. However, variety in layout and language remains a challenge, and large public training datasets for administrative documents such as invoices are rare. In this work, we use a Graph Attention Network model for information extraction. Compared with classical neural networks, this type of model is easier to interpret because the links between entities in the graph can be visualized. Moreover, it maximizes the retrieval of layout and structure, which is a crucial advantage for administrative documents. From the same graph, our model learns at different graph levels to encapsulate dynamic and richer knowledge in each batch, thus maximizing generalization on smaller datasets. We present how the model learns at each graph level and compare the results with baselines on private as well as public datasets. Our model improves recall and precision scores for some classes in our private dataset and produces comparable results on public datasets designed for Form Understanding and Information Extraction.
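
To make the attention mechanism mentioned in the abstract concrete, the following is a minimal sketch of a standard single-head graph attention computation in pure Python. It is not the SETI model and does not cover its subgraph-level training; the function name `gat_layer`, the toy features, the adjacency and the weights `W` and `a` are all hypothetical values chosen for illustration. The per-edge coefficients it returns are the kind of quantity that can be drawn on a document graph to show which neighbouring entities a node attends to.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    # Multiply matrix W (list of rows) by vector h.
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbours, W, a):
    """Single-head graph attention (illustrative sketch).

    Returns (alphas, updated) where alphas[i][j] is the attention
    node i pays to neighbour j, and updated[i] is the attention-weighted
    sum of the projected neighbour features.
    """
    projected = [matvec(W, h) for h in features]
    alphas, updated = {}, []
    for i in range(len(features)):
        # Unnormalised score: e_ij = LeakyReLU(a . [W h_i || W h_j])
        scores = {
            j: leaky_relu(sum(x * y for x, y in zip(a, projected[i] + projected[j])))
            for j in neighbours[i]
        }
        # Softmax over the neighbourhood of node i (shifted for stability).
        m = max(scores.values())
        exps = {j: math.exp(s - m) for j, s in scores.items()}
        total = sum(exps.values())
        alphas[i] = {j: e / total for j, e in exps.items()}
        updated.append([
            sum(alphas[i][j] * projected[j][k] for j in alphas[i])
            for k in range(len(projected[i]))
        ])
    return alphas, updated


# Hypothetical toy graph: three text boxes, each linked to itself and
# to its spatial neighbours.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbours = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
W = [[0.5, -0.2], [0.1, 0.3]]   # assumed projection weights
a = [0.4, -0.1, 0.2, 0.3]       # assumed attention vector (length 2 * 2)

alphas, updated = gat_layer(features, neighbours, W, a)
for i, row in alphas.items():
    print(i, {j: round(v, 3) for j, v in row.items()})
```

Each row of `alphas` sums to one over the node's neighbours, so the values can be read directly as relative edge importance.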

Acknowledgement

This work was supported by the French government in the framework of the France Relance program and by the YOOZ company. We would also like to thank Jérôme Lacour, Jonathan Ouellet and Mohamed Saadi from YOOZ for their support.

Author information

Corresponding author

Correspondence to Dipendra Sharma Kafle.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sharma Kafle, D., Thomas, E., Coustaty, M., Joseph, A., Doucet, A., Poulain d’Andecy, V. (2023). Subgraph-Induced Extraction Technique for Information (SETI) from Administrative Documents. In: Coustaty, M., Fornés, A. (eds) Document Analysis and Recognition – ICDAR 2023 Workshops. ICDAR 2023. Lecture Notes in Computer Science, vol 14194. Springer, Cham. https://doi.org/10.1007/978-3-031-41501-2_8

  • DOI: https://doi.org/10.1007/978-3-031-41501-2_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41500-5

  • Online ISBN: 978-3-031-41501-2

  • eBook Packages: Computer Science, Computer Science (R0)
