A Novel Word Segmentation Method Based on Object Detection and Deep Learning

  • Conference paper
Advances in Visual Computing (ISVC 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9474)

Abstract

The segmentation of individual words is a crucial step in several data mining methods for historical handwritten documents. Examples of applications include visual searching for query words (word spotting) and character-by-character text recognition. In this paper, we present a novel method for word segmentation that is adapted from recent advances in computer vision, deep learning and generic object detection. Our method has unique capabilities and has found practical use in our current research project. It can easily be trained for different kinds of historical documents, uses full grayscale information, and requires neither binarization as pre-processing nor prior segmentation of individual text lines. We evaluate its performance using established error metrics, previously used in competitions for word segmentation, and demonstrate its usefulness on a 15th-century handwritten document.
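
The abstract does not spell out the error metrics. As a rough illustration only, the sketch below follows the general scheme used in word segmentation competitions such as the ICDAR 2013 handwriting segmentation contest: a detected word counts as a one-to-one match when its overlap score with a ground-truth word reaches a threshold, and detection rate (DR), recognition accuracy (RA) and their harmonic mean (FM) are derived from the match count. Using bounding boxes rather than pixel sets, the greedy matching loop, and the names `match_score` and `evaluate` are assumptions made for this example, not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x0, y0, x1, y1), half-open pixels.
Box = Tuple[int, int, int, int]


def area(b: Box) -> int:
    """Area of a box; degenerate or inverted boxes count as empty."""
    return max(0, b[2] - b[0]) * max(0, b[3] - b[1])


def match_score(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (a box-based stand-in for a
    pixel-based overlap score; this simplification is an assumption)."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0


def evaluate(detected: List[Box], truth: List[Box],
             threshold: float = 0.9) -> Tuple[float, float, float]:
    """Return (DR, RA, FM) from greedy one-to-one matching."""
    matched_truth = set()
    one_to_one = 0
    for d in detected:
        for j, t in enumerate(truth):
            if j not in matched_truth and match_score(d, t) >= threshold:
                matched_truth.add(j)
                one_to_one += 1
                break
    dr = one_to_one / len(truth) if truth else 0.0        # detection rate
    ra = one_to_one / len(detected) if detected else 0.0  # recognition accuracy
    fm = 2 * dr * ra / (dr + ra) if dr + ra else 0.0      # harmonic mean
    return dr, ra, fm


if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (20, 0, 30, 10)]
    # The second ground-truth word is over-segmented into two halves.
    detected = [(0, 0, 10, 10), (20, 0, 25, 10), (25, 0, 30, 10)]
    print(evaluate(detected, truth))
```

In this toy case only the first word is matched, so over-segmenting the second word lowers both DR (one of two ground-truth words found) and RA (one of three detections correct).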

Notes

  1. http://users.iit.demokritos.gr/~nstam/ICDAR2013HandSegmCont/.

Acknowledgment

This project is a part of q2b, From quill to bytes, a framework program sponsored by the Swedish Research Council (Dnr 2012-5743) and Uppsala University. The work is done in part as a collaboration with the Swedish Museum of Natural History (Naturhistoriska riksmuseet).

Author information

Corresponding author

Correspondence to Tomas Wilkinson.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wilkinson, T., Brun, A. (2015). A Novel Word Segmentation Method Based on Object Detection and Deep Learning. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2015. Lecture Notes in Computer Science, vol 9474. Springer, Cham. https://doi.org/10.1007/978-3-319-27857-5_21

  • DOI: https://doi.org/10.1007/978-3-319-27857-5_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27856-8

  • Online ISBN: 978-3-319-27857-5

  • eBook Packages: Computer Science, Computer Science (R0)
