Multiresolution cooperation makes easier document structure recognition

Lemaitre, Aurélie; Camillerapp, Jean; Coüasnon, Bertrand

doi:10.1007/s10032-008-0072-6

Original Paper
Published: 23 September 2008

Volume 11, pages 97–109, (2008)

International Journal of Document Analysis and Recognition (IJDAR)

Abstract

This paper shows the value of imitating perceptive vision to improve structure recognition in ancient, noisy and weakly structured documents. Perceptive vision, as used by the human eye, consists in focusing attention on elements of interest after detecting their presence through a global vision process. We propose a generic method so that this concept can be applied to various problems and kinds of documents. To that end, we introduce cooperation between multiresolution visions into a generic method. The originality of this work is that the cooperation between resolutions is driven entirely by the knowledge specific to each kind of document. We present the method on three kinds of documents: handwritten, weakly structured mail documents; naturalization decree registers, which are noisy archive documents from the 19th century; and Bangla script, which requires a precise vision. The work is validated on 86,291 documents.
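
The coarse-then-focused idea described in the abstract can be illustrated with a small sketch. This is not the authors' method, which drives the cooperation between resolutions from document-specific knowledge; it is only a minimal example, assuming a binary image stored as nested lists, of detecting candidate text bands in a low-resolution view and then refining their limits at full resolution. All function names, the `factor` parameter and the `min_fill` threshold are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 0 = background, 1 = ink


def downsample(img: Image, factor: int) -> Image:
    """Coarse view: an output pixel is 1 if its factor x factor block holds any ink."""
    h, w = len(img), len(img[0])
    coarse = []
    for r in range(0, h, factor):
        row = []
        for c in range(0, w, factor):
            block = [img[y][x]
                     for y in range(r, min(r + factor, h))
                     for x in range(c, min(c + factor, w))]
            row.append(1 if any(block) else 0)
        coarse.append(row)
    return coarse


def coarse_row_bands(coarse: Image, min_fill: float = 0.5) -> List[Tuple[int, int]]:
    """Global step: runs of coarse rows whose ink ratio reaches min_fill."""
    bands, start = [], None
    for i, row in enumerate(coarse):
        dense = sum(row) / len(row) >= min_fill
        if dense and start is None:
            start = i
        elif not dense and start is not None:
            bands.append((start, i))
            start = None
    if start is not None:
        bands.append((start, len(coarse)))
    return bands


def refine_band(img: Image, band: Tuple[int, int], factor: int) -> Tuple[int, int]:
    """Focused step: inside the full-resolution rows of a band, trim empty rows."""
    top = band[0] * factor
    bottom = min(band[1] * factor, len(img))
    inked = [y for y in range(top, bottom) if any(img[y])]
    return (inked[0], inked[-1] + 1)  # half-open row interval


def locate_lines(img: Image, factor: int = 4) -> List[Tuple[int, int]]:
    """Detect bands at low resolution, then refine each one at full resolution."""
    coarse = downsample(img, factor)
    return [refine_band(img, band, factor) for band in coarse_row_bands(coarse)]
```

In this sketch an isolated speck of noise yields a coarse row with a low ink ratio and is ignored in the global step, so only rows that look like text at low resolution are examined in detail.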

Author information

Authors and Affiliations

IRISA/INSA, Campus de Beaulieu, 35043, Rennes Cedex, France
Aurélie Lemaitre, Jean Camillerapp & Bertrand Coüasnon

Corresponding author

Correspondence to Aurélie Lemaitre.

About this article

Cite this article

Lemaitre, A., Camillerapp, J. & Coüasnon, B. Multiresolution cooperation makes easier document structure recognition. IJDAR 11, 97–109 (2008). https://doi.org/10.1007/s10032-008-0072-6

Received: 25 February 2008
Revised: 24 July 2008
Accepted: 29 August 2008
Published: 23 September 2008
Issue Date: November 2008
DOI: https://doi.org/10.1007/s10032-008-0072-6
