Building digital libraries from paper documents, using ART based neuro-fuzzy systems

  • R. Sanz Guadarrama
  • J. M. Cano Izquierdo
  • Y. A. Dimitriadis
  • G. I. Sainz Palmero
  • J. Lopez Coronado
Formal Tools and Computational Models of Neurons and Neural Net Architectures
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1240)


In this paper, a new neuro-fuzzy system is proposed for the two tasks of document analysis and Optical Character Recognition. FasART (Fuzzy adaptive system ART based) inherits the stability, flexibility and modularity of supervised ART models, but adds a formal description as a Fuzzy Logic System and increased functionality. In addition, Recursive FasART makes it possible to exploit context information, a crucial aspect of document understanding. Satisfactory experimental results are presented for the overall application of building a digital library of scientific papers, with special emphasis on the creation of links between the items in a table of contents and the first pages of the corresponding papers.


Neuro-fuzzy systems, Adaptive Resonance Theory, Digital Library, Document Understanding, OCR, HTML
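
The linking step mentioned in the abstract, connecting table-of-contents items to paper first pages, can be illustrated with a minimal sketch. This is not the FasART method: it swaps in plain approximate string matching from the Python standard library to tolerate OCR errors, and the function names, the 0.8 threshold and the `paperN.html` file naming are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from html import escape


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two case-folded strings."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def link_toc_to_pages(toc_titles, first_page_titles, threshold=0.8):
    """Map each table-of-contents title to the index of its best-matching first page.

    Titles whose best score falls below `threshold` (an assumed value)
    are mapped to None, so that doubtful links are not created.
    """
    links = {}
    for title in toc_titles:
        scores = [similarity(title, candidate) for candidate in first_page_titles]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        links[title] = best if best is not None and scores[best] >= threshold else None
    return links


def render_toc_html(links):
    """Render the mapping as an HTML list; unmatched entries stay plain text."""
    items = []
    for title, page in links.items():
        text = escape(title)
        if page is None:
            items.append(f"<li>{text}</li>")
        else:
            # Hypothetical file naming: one HTML file per recognised paper.
            items.append(f'<li><a href="paper{page}.html">{text}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

A threshold below 1.0 lets a title such as "Docurnent irnage analysis", a typical OCR confusion of "m" with "rn", still match its table-of-contents entry, while entries without a sufficiently close first page are left unlinked.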





Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • R. Sanz Guadarrama
    • 1
  • J. M. Cano Izquierdo
    • 2
  • Y. A. Dimitriadis
    • 1
  • G. I. Sainz Palmero
    • 2
  • J. Lopez Coronado
    • 2
  1. Department of Signal Theory, Communications and Telematics Engineering, School of Telecommunications Engineering, University of Valladolid, Valladolid, Spain
  2. Department of Systems Engineering and Control, School of Industrial Engineering, University of Valladolid, Valladolid, Spain
