Processing Handwritten Words by Intelligent Use of OCR Results

  • Benjamin Mund
  • Karl-Heinz Steinke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6171)

Abstract

About 3.5 million dried plants mounted on paper sheets are held in the Botanical Museum Berlin, Germany. Many of the sheets carry handwritten annotations (see Figure 1), so a procedure was needed to process the handwriting on each sheet. The approach presented here tries to identify the writer from handwritten words and to read handwritten keywords. To this end, each word is cut out of the image, transformed into a 6-dimensional time series, and compared, for example by means of the dynamic time warping (DTW) method. A recognition rate of 98.6% is achieved with 12 different words (1200 samples). All herbarium documents contain several printed tokens that indicate further information about the plant: who found it, where it was found (the country and sometimes the town), what kind of plant it is, and so on. By using the local connections of the text, more information can be extracted from the herbarium document, for example by finding and recognizing handwritten text in a defined area.
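The DTW comparison mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which six features make up the time series, which local distance is used, or whether the warping path is constrained, so the code below makes its own assumptions: each word is a list of 6-value tuples, the local cost is the Euclidean distance, and the textbook unconstrained step pattern is used. The function names `dtw_distance` and `nearest_label` are hypothetical and are not taken from the paper.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    a, b: lists of equal-dimension tuples (assumed here: 6 values per step).
    Local cost is Euclidean distance (an assumption, not from the paper).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Standard step pattern: insertion, deletion, or match
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def nearest_label(query, references):
    """Return the label of the reference series closest to query under DTW.

    references: list of (label, series) pairs; a hypothetical stand-in for
    a set of labelled word samples.
    """
    return min(references, key=lambda ref: dtw_distance(query, ref[1]))[0]
```

Because DTW allows one sequence to be stretched or compressed along the time axis, two samples of the same word written at different widths can still align with low cost, which is presumably why it suits word-level comparison.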

Keywords

local connections, token, handwriting recognition, writer recognition, DTW

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Benjamin Mund (1)
  • Karl-Heinz Steinke (1)
  1. University of Applied Sciences and Arts, Hanover, Hanover, Germany