Combining Fuzzy Clustering and Morphological Methods for Old Documents Recovery

  • João R. Caldas Pinto
  • Lourenço Bandeira
  • João M. C. Sousa
  • Pedro Pina
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3523)

Abstract

In this paper we tackle the specific problem of recovering old documents. Spots, print-through, underlines and other ageing features are undesirable not only because they harm the visual appearance of the document, but also because they degrade subsequent Optical Character Recognition (OCR). This paper proposes a new method that integrates fuzzy clustering of the color properties of the original images with mathematical morphology. We show that this technique yields higher-quality recovered images and, at the same time, delivers clean binary text for OCR applications. The proposed method was applied to 19th-century books, which were cleaned very effectively.
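The abstract does not give the algorithmic details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes two color clusters (ink and paper), the standard fuzzy c-means update with fuzzifier m = 2, and a 3x3 morphological opening to remove small spots. The function names, the cluster count, the structuring element and the tiny synthetic image are all illustrative assumptions.

```python
def fuzzy_c_means(pixels, c=2, m=2.0, iters=30):
    """Standard fuzzy c-means on RGB tuples; returns (centers, memberships)."""
    # Deterministic start (an assumption): spread centers over pixels sorted by brightness.
    ordered = sorted(pixels, key=sum)
    centers = [ordered[(len(ordered) - 1) * k // (c - 1)] for k in range(c)]
    u = []
    for _ in range(iters):
        u = []
        for p in pixels:
            # Squared Euclidean distance to each center, clamped to avoid division by zero.
            d = [max(sum((a - b) ** 2 for a, b in zip(p, v)), 1e-12) for v in centers]
            w = [dk ** (-1.0 / (m - 1.0)) for dk in d]
            s = sum(w)
            u.append([wk / s for wk in w])
        centers = []
        for k in range(c):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            centers.append(tuple(
                sum(wt * p[ch] for wt, p in zip(weights, pixels)) / total
                for ch in range(3)))
    return centers, u


def ink_mask(image, c=2):
    """Binary mask (1 = ink) from the darkest fuzzy cluster (hypothetical helper)."""
    w = len(image[0])
    pixels = [px for row in image for px in row]
    centers, u = fuzzy_c_means(pixels, c=c)
    ink = min(range(c), key=lambda k: sum(centers[k]))
    labels = [max(range(c), key=lambda k: row[k]) for row in u]
    return [[int(labels[y * w + x] == ink) for x in range(w)]
            for y in range(len(image))]


def _neighbours(mask, y, x):
    """Values in the 3x3 window around (y, x); out-of-image cells count as 0."""
    h, w = len(mask), len(mask[0])
    return [mask[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else 0
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def erode(mask):
    return [[int(all(_neighbours(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    return [[int(any(_neighbours(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


# Synthetic 7x7 page: a 3x3 ink block (stand-in for text) plus one isolated spot.
PAPER, INK = (230, 220, 200), (30, 25, 20)
image = [[PAPER for _ in range(7)] for _ in range(7)]
for y in (1, 2, 3):
    for x in (1, 2, 3):
        image[y][x] = INK
image[5][5] = INK  # isolated spot

raw = ink_mask(image)
cleaned = dilate(erode(raw))  # morphological opening with a 3x3 square
```

In this toy case the opening keeps the 3x3 block and discards the single-pixel spot. A 3x3 square would also erase strokes thinner than three pixels, so a real pipeline would need structuring elements chosen for the scanned resolution and the defects being removed.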

Keywords

Fuzzy Cluster; Optical Character Recognition; Mathematical Morphology; Fuzzy Partition; Recovered Image
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • João R. Caldas Pinto (1)
  • Lourenço Bandeira (1)
  • João M. C. Sousa (1)
  • Pedro Pina (2)

  1. IDMEC, Instituto Superior Técnico, Lisboa, Portugal
  2. CVRM / Geo-Systems Centre, Instituto Superior Técnico, Lisboa, Portugal