Abstract
This paper shows the benefit of imitating perceptive vision to improve structure recognition in ancient, noisy and weakly structured documents. Perceptive vision, as used by the human eye, consists in focusing attention on elements of interest after detecting their presence in a global view. We propose a generic method for applying this concept to various problems and kinds of documents. To that end, we introduce cooperation between visions at multiple resolutions into a generic method. The originality of this work is that the cooperation between resolutions is driven entirely by the knowledge specific to each kind of document. We present the method on three kinds of documents: weakly structured handwritten mail, naturalization decree registers, which are noisy archive documents from the 19th century, and Bangla script, which requires a precise vision. This work is validated on 86,291 documents.
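The coarse-to-fine idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method, which is driven by document-specific knowledge; it only assumes a binary image given as a list of rows (1 = ink) and uses hypothetical helper names (`downsample`, `coarse_candidates`, `refine`, `locate_lines`) and an arbitrary density threshold. A low-resolution pass detects where text is likely present, and a full-resolution pass then looks only inside those bands.

```python
def downsample(image, factor):
    """Build a low-resolution view: mean ink density per factor x factor block."""
    rows, cols = len(image), len(image[0])
    low = []
    for r in range(0, rows, factor):
        low_row = []
        for c in range(0, cols, factor):
            block = [image[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            low_row.append(sum(block) / len(block))
        low.append(low_row)
    return low


def coarse_candidates(low, min_density):
    """Global vision: indices of low-resolution rows with enough ink."""
    return [r for r, row in enumerate(low)
            if sum(row) / len(row) >= min_density]


def refine(image, low_row, factor):
    """Focused vision: first and last inked full-resolution rows in one band."""
    top = low_row * factor
    bottom = min(top + factor, len(image))
    inked = [r for r in range(top, bottom) if any(image[r])]
    return (inked[0], inked[-1]) if inked else None


def locate_lines(image, factor=4, min_density=0.2):
    """Detect bands at low resolution, then refine each at full resolution."""
    low = downsample(image, factor)
    spans = []
    for low_row in coarse_candidates(low, min_density):
        span = refine(image, low_row, factor)
        if span is not None:
            spans.append(span)
    return spans
```

In this sketch an isolated noise pixel barely changes the low-resolution density, so it never triggers the full-resolution pass. That is the intuition behind detecting presence globally before looking precisely.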
Cite this article
Lemaitre, A., Camillerapp, J. & Coüasnon, B. Multiresolution cooperation makes easier document structure recognition. IJDAR 11, 97–109 (2008). https://doi.org/10.1007/s10032-008-0072-6
DOI: https://doi.org/10.1007/s10032-008-0072-6