NEOCR: A Configurable Dataset for Natural Image Text Recognition

  • Robert Nagy
  • Anders Dicker
  • Klaus Meyer-Wegener
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7139)

Abstract

Recently, growing attention has been paid to recognizing text in natural images. OCR on natural images is far more complex than OCR on scanned documents: text in real-world environments appears in arbitrary colors, font sizes and font types, and is often affected by perspective distortion, lighting effects, textures or occlusion. Currently, no publicly available dataset covers all aspects of natural image OCR. We propose NEOCR, a comprehensive, well-annotated, configurable dataset for optical character recognition in natural images, intended for evaluating and comparing approaches to natural image text OCR. The rich annotations of the NEOCR dataset make new and more precise evaluations possible, giving more detailed information on where improvements are most needed in natural image text OCR.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Robert Nagy (1)
  • Anders Dicker (1)
  • Klaus Meyer-Wegener (1)
  1. Computer Science 6 (Data Management), University of Erlangen-Nürnberg, Erlangen, Germany
