Multiresolution cooperation makes easier document structure recognition

  • Aurélie Lemaitre
  • Jean Camillerapp
  • Bertrand Coüasnon
Original Paper


This paper shows the value of imitating perceptive vision to improve structure recognition in ancient, noisy and weakly structured documents. Perceptive vision, as used by the human eye, consists of focusing attention on elements of interest after their presence has been detected in a global vision process. We propose a generic method that applies this concept to various problems and kinds of documents by introducing cooperation between visions at multiple resolutions. The originality of this work is that the cooperation between resolutions is driven entirely by the knowledge specific to each kind of document. We present the method on three kinds of documents: weakly structured handwritten mail, naturalization decree registers, which are noisy archive documents from the 19th century, and Bangla script, which requires a precise vision. This work is validated on 86,291 documents.
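The coarse-to-fine idea behind perceptive vision can be illustrated with a minimal sketch. It is not the authors' method: in the paper the cooperation between resolutions is driven by knowledge specific to each kind of document, whereas this sketch substitutes a simple ink-density threshold. All function names, the `factor` and `min_density` parameters, and the binary list-of-lists image format are assumptions made for illustration only.

```python
def downsample(image, factor):
    """Build a low-resolution view: mean ink density of each factor x factor block.

    `image` is a list of rows of 0/1 pixels (1 = ink). Rows and columns that do
    not fill a whole block are ignored.
    """
    rows = len(image) // factor
    cols = len(image[0]) // factor
    low = []
    for r in range(rows):
        low_row = []
        for c in range(cols):
            total = sum(
                image[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            )
            low_row.append(total / (factor * factor))
        low.append(low_row)
    return low


def coarse_text_bands(low_res, min_density):
    """Global vision: find runs of low-resolution rows that are dense enough in ink."""
    bands = []
    start = None
    for r, row in enumerate(low_res):
        dense = sum(row) / len(row) >= min_density
        if dense and start is None:
            start = r
        elif not dense and start is not None:
            bands.append((start, r))
            start = None
    if start is not None:
        bands.append((start, len(low_res)))
    return bands


def refine_band(image, band, factor):
    """Focused vision: inside one coarse band, keep only full-resolution rows with ink."""
    top, bottom = band[0] * factor, band[1] * factor
    inked = [r for r in range(top, bottom) if any(image[r])]
    if not inked:
        return None
    return (inked[0], inked[-1] + 1)


def locate_text_lines(image, factor=4, min_density=0.2):
    """Detect candidate bands at low resolution, then refine each at full resolution."""
    low = downsample(image, factor)
    lines = []
    for band in coarse_text_bands(low, min_density):
        refined = refine_band(image, band, factor)
        if refined is not None:
            lines.append(refined)
    return lines
```

In this sketch an isolated noise pixel is too sparse to register at low resolution, so it never reaches the focused stage, while a genuine line of ink is first detected coarsely and then bounded precisely at full resolution.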


Keywords: Structure recognition · Multiresolution · Perceptive vision · Grammar





Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  • Aurélie Lemaitre (1)
  • Jean Camillerapp (1)
  • Bertrand Coüasnon (1)
  1. IRISA/INSA, Rennes Cedex, France
