Abstract
Automatic document understanding is one of the most important tasks when dealing with printed documents, since all downstream systems depend on the process-relevant data captured from them. Analysis of the logical layout of documents not only enables automatic conversion into a semantically marked-up electronic representation but also opens up options for developing higher-level functionality such as advanced search (e.g., limiting search to titles only), automatic routing of business letters, automatic processing of invoices, and link structures that facilitate navigation through books. Over the last three decades, a number of techniques have been proposed to address the challenges arising in logical layout analysis of documents originating from many different domains. This chapter provides a comprehensive review of the state of the art in the field of automated document understanding, highlights key methods developed for different target applications, and provides practical recommendations for designing a document understanding system for the problem at hand.
Further Reading
Aiello M, Monz C, Todoran L, Worring M (2002) Document understanding for a broad class of documents. Int J Doc Anal Recognit 5(1):1–16
Medvet E, Bartoli A, Davanzo G (2011) A probabilistic approach to printed document understanding. Int J Doc Anal Recognit 14(4):335–347
Rangoni Y, Belaïd A, Vajda S (2012) Labelling logical structures of document images using a dynamic perceptive neural network. Int J Doc Anal Recognit 15(2):45–55
Copyright information
© 2014 Springer-Verlag London
About this entry
Cite this entry
Dengel, A., Shafait, F. (2014). Analysis of the Logical Layout of Documents. In: Doermann, D., Tombre, K. (eds) Handbook of Document Image Processing and Recognition. Springer, London. https://doi.org/10.1007/978-0-85729-859-1_6
DOI: https://doi.org/10.1007/978-0-85729-859-1_6
Publisher Name: Springer, London
Print ISBN: 978-0-85729-858-4
Online ISBN: 978-0-85729-859-1
eBook Packages: Computer Science, Reference Module Computer Science and Engineering