
Multiresolution cooperation makes easier document structure recognition

  • Original Paper
International Journal of Document Analysis and Recognition (IJDAR)

Abstract

This paper shows the benefit of imitating perceptive vision to improve structure recognition in ancient, noisy and weakly structured documents. Perceptive vision, as used by the human eye, consists in focusing attention on interesting elements after having detected their presence in a global vision process. We propose a generic method so that this concept can be applied to various problems and kinds of documents. To that end, we introduce cooperation between multiresolution visions into a generic method. The originality of this work is that the cooperation between resolutions is entirely driven by the knowledge specific to each kind of document. We present the method on three kinds of documents: handwritten, weakly structured mail documents; naturalization decree registers, which are noisy archive documents from the 19th century; and Bangla script, which requires precise vision. This work is validated on 86,291 documents.
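
The idea of detecting elements in a global view and then focusing on them at full resolution can be illustrated with a minimal sketch. This is not the authors' method: the function names (`downsample`, `coarse_row_bands`, `refine_band`, `locate_lines`), the binary list-of-lists image format and the thresholds are all assumptions chosen for illustration. The sketch finds candidate text-line bands in a low-resolution view, then trims each band against the original image.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed format: binary image, 1 = ink, 0 = background


def downsample(img: Image, factor: int, block_fill: float = 0.25) -> Image:
    """Low-resolution view: a coarse pixel is ink if enough of its block is ink."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [img[yy][xx]
                     for yy in range(y, min(y + factor, h))
                     for xx in range(x, min(x + factor, w))]
            row.append(1 if sum(block) / len(block) >= block_fill else 0)
        out.append(row)
    return out


def coarse_row_bands(low: Image, min_fill: float) -> List[Tuple[int, int]]:
    """Global step: runs of consecutive coarse rows whose ink ratio >= min_fill."""
    bands, start = [], None
    for y, row in enumerate(low):
        dense = sum(row) / len(row) >= min_fill
        if dense and start is None:
            start = y
        elif not dense and start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(low) - 1))
    return bands


def refine_band(img: Image, band: Tuple[int, int], factor: int) -> Tuple[int, int]:
    """Focus step: map a coarse band to full resolution and trim empty rows."""
    top = band[0] * factor
    bottom = min((band[1] + 1) * factor, len(img)) - 1
    while top < bottom and sum(img[top]) == 0:
        top += 1
    while bottom > top and sum(img[bottom]) == 0:
        bottom -= 1
    return top, bottom


def locate_lines(img: Image, factor: int = 4, min_fill: float = 0.5) -> List[Tuple[int, int]]:
    """Detect bands at low resolution, then refine each one at full resolution."""
    low = downsample(img, factor)
    return [refine_band(img, band, factor) for band in coarse_row_bands(low, min_fill)]
```

In this sketch an isolated noise pixel disappears in the low-resolution view, so only dense regions are examined at full resolution. The knowledge-driven cooperation described in the abstract would replace the fixed thresholds used here.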

Author information

Corresponding author

Correspondence to Aurélie Lemaitre.

About this article

Cite this article

Lemaitre, A., Camillerapp, J. & Coüasnon, B. Multiresolution cooperation makes easier document structure recognition. IJDAR 11, 97–109 (2008). https://doi.org/10.1007/s10032-008-0072-6

  • DOI: https://doi.org/10.1007/s10032-008-0072-6
