A Dataset for Quality Assessment of Camera Captured Document Images

  • Conference paper
Camera-Based Document Analysis and Recognition (CBDAR 2013)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8357)

Abstract

With the proliferation of cameras on mobile devices, there is an increased desire to image document pages as an alternative to scanning. However, the quality of captured document images is often lower than that of their scanned equivalents due to hardware limitations and stability issues. In this context, automatic assessment of the quality of captured images is useful for many applications. Although there has been a lot of work on developing computational methods and creating standard datasets for natural scene image quality assessment, until recently quality estimation of camera-captured document images has not been given much attention. One traditional quality indicator for document images is Optical Character Recognition (OCR) accuracy. In this work, we present a dataset of camera-captured document images containing varying levels of focal blur introduced manually during capture. For each image we obtained the character-level OCR accuracy. Our dataset can be used to evaluate methods for predicting the OCR quality of captured documents as well as enhancements. In order to make the dataset publicly and freely available, originals were selected from two existing datasets: the University of Washington dataset and the Tobacco Database. We present a case study with three recent methods for predicting the OCR quality of images on our dataset.
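As a rough illustration of how such a dataset might be used, the sketch below computes a character-level accuracy for each image and then checks how well a predicted quality score ranks images relative to that accuracy. It is not the paper's evaluation code: the accuracy definition (one minus edit distance over ground-truth length, clamped at zero), the choice of Spearman's rank correlation as the agreement measure, and all function names are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def char_accuracy(ground_truth: str, ocr_text: str) -> float:
    """Assumed definition: 1 - errors / len(ground truth), clamped to [0, 1]."""
    if not ground_truth:
        return 0.0
    errors = edit_distance(ground_truth, ocr_text)
    return max(0.0, 1.0 - errors / len(ground_truth))


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one side is constant
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    # Hypothetical toy data: ground truth, OCR output, predicted quality score.
    samples = [
        ("document image", "document image", 0.9),
        ("document image", "docunent imaqe", 0.6),
        ("document image", "d0cum nt im", 0.2),
    ]
    accuracies = [char_accuracy(gt, ocr) for gt, ocr, _ in samples]
    predictions = [score for _, _, score in samples]
    print(accuracies, spearman(predictions, accuracies))
```

A rank-based measure is used here because a quality predictor only needs to order images consistently with OCR accuracy; its scores need not be on the same scale.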

Notes

  1. The OCR results in this paper should in no way be used to compare OCR systems; they serve only to judge the relative performance of each system on the collection.

  2. Motorola DroidX with Android.

Acknowledgments

We would like to thank Steven Dang for running Tesseract on our images. We would also like to thank Francine Chen and anonymous reviewers for their comments on improving the quality of this work. The partial support of this research by DARPA through BBN/DARPA Award HR0011-08-C-0004 under subcontract 9500009235, and the US Government through NSF Award IIS-0812111 is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Jayant Kumar.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kumar, J., Ye, P., Doermann, D. (2014). A Dataset for Quality Assessment of Camera Captured Document Images. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2013. Lecture Notes in Computer Science, vol 8357. Springer, Cham. https://doi.org/10.1007/978-3-319-05167-3_9

  • DOI: https://doi.org/10.1007/978-3-319-05167-3_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05166-6

  • Online ISBN: 978-3-319-05167-3

  • eBook Packages: Computer Science (R0)
