Combining Fuzzy Clustering and Morphological Methods for Old Documents Recovery
In this paper we tackle the specific problem of recovering old documents. Spots, print-through, underlines and other ageing features are undesirable not only because they harm the visual appearance of the document, but also because they degrade subsequent Optical Character Recognition (OCR). This paper proposes a new method that integrates fuzzy clustering of the color properties of the original images with mathematical morphology. We show that this technique yields higher-quality recovered images and, at the same time, delivers cleaned binary text for OCR applications. The proposed method was applied to 19th-century books, which were cleaned very effectively.
Keywords: Fuzzy Cluster, Optical Character Recognition, Mathematical Morphology, Fuzzy Partition, Recovered Image
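The abstract names the two ingredients but not the algorithm itself, so the following is only a minimal sketch of how fuzzy clustering of pixel colors might be combined with a morphological cleanup to obtain a binary text mask. Everything in it is an assumption made for illustration: the use of standard fuzzy c-means with two clusters, the rule that the darkest cluster is ink, the 3x3 structuring element, the choice of a morphological opening, and all function names (`fuzzy_c_means`, `opening`, `extract_text_mask`). It should not be read as the authors' method.

```python
import numpy as np

def fuzzy_c_means(pixels, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Standard fuzzy c-means on an (N, 3) array of colors.

    Returns (centers, memberships); memberships has shape (N, n_clusters).
    """
    rng = np.random.default_rng(seed)
    u = rng.random((pixels.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)          # each row sums to 1
    for _ in range(n_iter):
        w = u ** m                              # fuzzified weights
        centers = (w.T @ pixels) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)          # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u

def erode(mask):
    """Binary erosion with a 3x3 square structuring element."""
    h, w = mask.shape
    padded = np.pad(mask, 1, constant_values=False)
    out = np.ones((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def dilate(mask):
    """Binary dilation with a 3x3 square structuring element."""
    h, w = mask.shape
    padded = np.pad(mask, 1, constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def opening(mask):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(mask))

def extract_text_mask(image):
    """Cluster colors, take the darkest cluster as ink (assumption), clean it."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(float)
    centers, u = fuzzy_c_means(pixels, n_clusters=2)
    ink = np.argmin(centers.sum(axis=1))        # darkest cluster center
    mask = (u.argmax(axis=1) == ink).reshape(h, w)
    return opening(mask)

# Synthetic page: yellowish paper, a 7x7 "stroke" of ink, and a 1-pixel speck.
image = np.full((20, 20, 3), (230, 220, 200), dtype=np.uint8)
image[5:12, 5:12] = (30, 30, 30)
image[16, 16] = (30, 30, 30)
mask = extract_text_mask(image)
```

On this synthetic page the opening removes the isolated speck while the larger ink region is kept. A 3x3 opening would also erase strokes thinner than three pixels, so on real scans the structuring element would have to be chosen relative to the stroke width, and a library routine such as a tested morphology implementation would normally replace the hand-written loops.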