An Old Greek handwritten OCR system based on an efficient segmentation-free approach

  • K. Ntzios
  • B. Gatos
  • I. Pratikakis
  • T. Konidaris
  • S. J. Perantonis
ORIGINAL PAPER

Abstract

Recognition of Old Greek Early Christian manuscripts is essential for efficient content exploitation of the valuable Old Greek Early Christian historical collections. In this paper, we focus on the problem of recognizing Old Greek manuscripts and propose a novel recognition technique that has been tested on a large number of important historical manuscript collections, which are written in lowercase letters and originate from St. Catherine’s Mount Sinai Monastery. Based on an open and closed cavity character representation, we propose a novel, segmentation-free, fast and efficient technique for the detection and recognition of characters and character ligatures. First, we detect the open and closed cavities that exist in the skeletonized character body. Then, a specific character or character ligature is classified according to the protrusible segments that appear in the topological description of the character skeletons. Experimental results demonstrate the efficiency of the proposed approach.
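
To make the notion of a closed cavity concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes a closed cavity can be approximated as a background region of a binary character image that does not reach the image border, and it finds such regions with a plain flood fill. The function name `count_closed_cavities` and the list-of-lists image format are illustrative assumptions. Open cavities and protrusible segments are not covered, since the abstract does not define them.

```python
from collections import deque


def count_closed_cavities(image):
    """Count background regions fully enclosed by ink.

    image: list of equal-length lists, 1 = ink, 0 = background.
    Background is traversed with 4-connectivity (an assumption of
    this sketch, not a detail taken from the paper).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(r0, c0):
        # Mark the background region containing (r0, c0) and report
        # whether any of its pixels lies on the image border.
        touches_border = False
        queue = deque([(r0, c0)])
        seen[r0][c0] = True
        while queue:
            r, c = queue.popleft()
            if r in (0, rows - 1) or c in (0, cols - 1):
                touches_border = True
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and image[nr][nc] == 0):
                    seen[nr][nc] = True
                    queue.append((nr, nc))
        return touches_border

    cavities = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 and not seen[r][c]:
                # A region that never reaches the border is enclosed.
                if not flood(r, c):
                    cavities += 1
    return cavities


# A ring-shaped glyph (like an omicron) encloses one region.
ring = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(count_closed_cavities(ring))
```

Under this simplified definition, a glyph whose loop is open on one side yields no closed cavity, because its interior background connects to the border.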

Keywords

  • Historical document recognition
  • Handwriting character recognition
  • Segmentation-free OCR

Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  • K. Ntzios (1, 2)
  • B. Gatos (1)
  • I. Pratikakis (1)
  • T. Konidaris (1)
  • S. J. Perantonis (1)

  1. Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Research Center “Demokritos”, Athens, Greece
  2. Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, Greece