Advertisement

Analysis of the Logical Layout of Documents

  • Andreas DengelEmail author
  • Faisal Shafait
Reference work entry

Abstract

Automatic document understanding is one of the most important tasks when dealing with printed documents since all post-ordered systems require the captured but process-relevant data. Analysis of the logical layout of documents not only enables an automatic conversion into a semantically marked-up electronic representation but also reveals options for developing higher-level functionality like advanced search (e.g., limiting search to titles only), automatic routing of business letters, automatic processing of invoices, and developing link structures to facilitate navigation through books. Over the last three decades, a number of techniques have been proposed to address the challenges arising in logical layout analysis of documents originating from many different domains. This chapter provides a comprehensive review of the state of the art in the field of automated document understanding, highlights key methods developed for different target applications, and provides practical recommendations for designing a document understanding system for the problem at hand.

Keywords

Bibliographic meta-data extraction Document structure extraction Documentunderstanding Information extraction Invoice processing Logical labeling Logicallayout analysis 

References

  1. 1.
    Aiello M, Monz C, Todoran L, Worring M (2002) Document understanding for a broad class of documents. Int J Doc Anal Recognit 5(1):1–16zbMATHCrossRefGoogle Scholar
  2. 2.
    Altamura O, Esposito F, Malerba D (2001) Transforming paper documents into XML format with WISDOM++. Int J Doc Anal Recognit 4(1):2–17CrossRefGoogle Scholar
  3. 3.
    Bayer T, Franke J, Kressel U, Mandler E, Oberländer M, Schürmann J (1992) Towards the understanding of printed documents. In: Baird H, Bunke H, Yamamoto K (eds) Structured document image analysis. Springer, Berlin/New York, pp 3–35CrossRefGoogle Scholar
  4. 4.
    Belaïd A (2001) Recognition of table of contents for electronic library consulting. Int J Doc Anal Recognit 4(1):35–45MathSciNetCrossRefGoogle Scholar
  5. 5.
    Cattoni R, Coianiz T, Messelodi S, Modena CM (1998) Geometric layout analysis techniques for document image understanding: a review. Technical report 9703-09, IRST, Trento, ItalyGoogle Scholar
  6. 6.
    Cesarini F, Gori M, Marinai S, Soda G (1998) INFORMys: a flexible invoice-like form-reader system. IEEE Trans Pattern Anal Mach Intell 20(7):730–746CrossRefGoogle Scholar
  7. 7.
    Cesarini F, Francesconi E, Gori M, Soda G (2003) Analysis and understanding of multi-class invoices. Int J Doc Anal Recognit 6(2):102–114CrossRefGoogle Scholar
  8. 8.
    Déjean H, Meunier J (2009) On tables of contents and how to recognize them. Int J Doc Anal Recognit 12(1):1–20CrossRefGoogle Scholar
  9. 9.
    Dengel A (1992) ANASTASIL: a system for low-level and high-level geometric analysis of printed documents. In: Baird H, Bunke H, Yamamoto K (eds) Structured document image analysis. Springer, Berlin/New York, pp 70–98CrossRefGoogle Scholar
  10. 10.
    Dengel A, Barth G (1988) High level document analysis guided by geometric aspects. Int J Pattern Recognit Artif Intell 2(4):641–655CrossRefGoogle Scholar
  11. 11.
    Dengel A, Dubiel F (1996) Computer understanding of document structure. Int J Imaging Syst Technol 7:271–278CrossRefGoogle Scholar
  12. 12.
    Dengel A, Bleisinger R, Hoch R, Fein F, Hönes F (1992) From paper to office document standard representation. IEEE Comput 25(7):63–67CrossRefGoogle Scholar
  13. 13.
    Doucet A, Kazai G, Dresevic B, Uzelac A, Radakovic B, Todic N (2011) Setting up a competition framework for the evaluation of structure extraction from OCR-ed books. Int J Doc Anal Recognit 14(1):45–52CrossRefGoogle Scholar
  14. 14.
    Duygulu P, Atalay V (2002) A hierarchical representation of form documents for identification and retrieval. Int J Doc Anal Recognit 5(1):17–27zbMATHCrossRefGoogle Scholar
  15. 15.
    Eglin V, Bres S (2004) Analysis and interpretation of visual saliency for document functional labeling. Int J Doc Anal Recognit 7(1):28–43CrossRefGoogle Scholar
  16. 16.
    e Silva AC, Jorge AM, Torgo L (2006) Design of an end-to-end method to extract information from tables. Int J Doc Anal Recognit 8(2–3):144–171Google Scholar
  17. 17.
    Esposito F, Malerba D, Lisi F (2000) Machine learning for intelligent processing of printed documents. J Intell Inf Syst 14(2–3):175–198CrossRefGoogle Scholar
  18. 18.
    Fan H, Zhu L, Tang Y (2010) Skew detection in document images based on rectangular active contour. Int J Doc Anal Recognit 13(4):261–269CrossRefGoogle Scholar
  19. 19.
    Kazai G, Doucet A (2008) Overview of the INEX 2007 book search track (BookSearch’07). SIGIR Forum 42(1):2–15CrossRefGoogle Scholar
  20. 20.
    Klein B, Dengel A (2003) Problem-adaptable document analysis and understanding for high-volume applications. Int J Doc Anal Recognit 6(3):167–180CrossRefGoogle Scholar
  21. 21.
    Klein B, Agne S, Dengel A (2006) On benchmarking of invoice analysis systems. In: Proceedings of the international workshop on document analysis systems, Nelson, pp 312–323Google Scholar
  22. 22.
    Klink S, Kieninger T (2001) Rule-based document structure understanding with a fuzzy combination of layout and textual features. Int J Doc Anal Recognit 4(1):18–26CrossRefGoogle Scholar
  23. 23.
    Krishnamoorthy M, Nagy G, Seth S, Viswanathan M (1993) Syntactic segmentation and labeling of digitized pages from technical journals. IEEE Trans Pattern Anal Mach Intell 15(7):737–747CrossRefGoogle Scholar
  24. 24.
    Lemaitre A, Camillerapp J, Coüasnon B (2008) Multiresolution cooperation makes easier document structure recognition. Int J Doc Anal Recognit 11(2):97–109CrossRefGoogle Scholar
  25. 25.
    Lin X, Xiong Y (2006) Detection and analysis of table of contents based on content association. Int J Doc Anal Recognit 8(2–3):132–143CrossRefGoogle Scholar
  26. 26.
    Medvet E, Bartoli A, Davanzo G (2011) A probabilistic approach to printed document understanding. Int J Doc Anal Recognit 14(4):335–347CrossRefGoogle Scholar
  27. 27.
    Nagy G, Seth S, Viswanathan M (1992) A prototype document image analysis system for technical journals. IEEE Comput 7(25):10–22CrossRefGoogle Scholar
  28. 28.
    Rangoni Y, Belaïd A, Vajda S (2012) Labelling logical structures of document images using a dynamic perceptive neural network. Int J Doc Anal Recognit 15(2):45–55CrossRefGoogle Scholar
  29. 29.
    Schürmann J, Bartneck N, Bayer T, Franke J, Mandler E, Oberländer M (1992) Document analysis – from pixels to contents. Proc IEEE 80(7):1101–1119CrossRefGoogle Scholar
  30. 30.
    Shafait F, Breuel TM (2011) The effect of border noise on the performance of projection based page segmentation methods. IEEE Trans Pattern Anal Mach Intell 33(4):846–851CrossRefGoogle Scholar
  31. 31.
    Shafait F, van Beusekom J, Keysers D, Breuel TM (2008) Document cleanup using page frame detection. Int J Doc Anal Recognit 11(2):81–96CrossRefGoogle Scholar
  32. 32.
    Staelin C, Elad M, Greig D, Shmueli O, Vans M (2007) Biblio: automatic meta-data extraction. Int J Doc Anal Recognit 10(2):113–126CrossRefGoogle Scholar
  33. 33.
    Story GA, O’Gorman L, Fox D, Schaper LL, Jagadish HV (1992) The rightpages image-based electronic library for alerting and browsing. IEEE Comput 25:17–26CrossRefGoogle Scholar
  34. 34.
    Tan CL, Liu QH (2004) Extraction of newspaper headlines from microfilm for automatic indexing. Int J Doc Anal Recognit 6(3):201–210CrossRefGoogle Scholar
  35. 35.
    Tsujimoto S, Asada H (1992) Major components of a complete text reading system. Proc IEEE 80(7):1133–1149CrossRefGoogle Scholar
  36. 36.
    van Beusekom J, Keysers D, Shafait F, Breuel TM (2007) Example-based logical labeling of document title page images. In: Proceedings of the international conference on document analysis and recognition, Curitiba, pp 919–923Google Scholar
  37. 37.
    van Beusekom J, Shafait F, Breuel TM (2010) Combined orientation and skew detection using geometric text-line modeling. Int J Doc Anal Recognit 13(2):79–92CrossRefGoogle Scholar
  38. 38.
    Wang S, Cao Y, Cai S (2001) Using citing information to understand the logical structure of document images. Int J Doc Anal Recognit 4(1):27–34CrossRefGoogle Scholar
  39. 39.
    Wang Y, Phillips I, Haralick R (2004) Table structure understanding and its performance evaluation. Pattern Recognit 37(7):1479–1497CrossRefGoogle Scholar
  40. 40.
    Wong KY, Casey RG, Wahl FM (1982) Document analysis system. IBM J Res Dev 26(6):647–656CrossRefGoogle Scholar
  41. 41.
    Xiao Y, Yan H (2004) Location of title and author regions in document images based on the Delaunay triangulation. Image Vis Comput 22(4):319–329CrossRefGoogle Scholar
  42. 42.
    Yu B, Jain AK (1996) A generic system for form dropout. IEEE Trans Pattern Anal Mach Intell 18(11):1127–1134CrossRefGoogle Scholar
  43. 43.
    Zou J, Le D, Thoma G (2010) Locating and parsing bibliographic references in HTML medical articles. Int J Doc Anal Recognit 13(2):107–119CrossRefGoogle Scholar

Further Reading

  1. Aiello M, Monz C, Todoran L, Worring M (2002) Document understanding for a broad class of documents. Int J Doc Anal Recognit 5(1):1–16zbMATHCrossRefGoogle Scholar
  2. Medvet E, Bartoli A, Davanzo G (2011) A probabilistic approach to printed document understanding. Int J Doc Anal Recognit 14(4):335–347CrossRefGoogle Scholar
  3. Rangoni Y, Belaid A, Vajda S (2012) Labelling logical structures of document images using a dynamic perceptive neural network. Int J Doc Anal Recognit 15(2):45–55CrossRefGoogle Scholar

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  1. 1.German Research Center for Artificial Intelligence (DFKI GmbH)KaiserslauternGermany
  2. 2.German Research Center for Artificial Intelligence (DFKI GmbH)KaiserslauternGermany
  3. 3.School of Computer Science and Software EngineeringThe University of Western AustraliaPerthAustralia

Personalised recommendations