A Dataset for Quality Assessment of Camera Captured Document Images

Kumar, Jayant; Ye, Peng; Doermann, David

doi:10.1007/978-3-319-05167-3_9

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8357)

Included in the following conference series:

International Workshop on Camera-Based Document Analysis and Recognition

Abstract

With the proliferation of cameras on mobile devices, there is an increased desire to image document pages as an alternative to scanning. However, the quality of captured document images is often lower than that of their scanned equivalents due to hardware limitations and stability issues. In this context, automatic assessment of the quality of captured images is useful for many applications. Although there has been a lot of work on developing computational methods and creating standard datasets for natural scene image quality assessment, quality estimation of camera-captured document images received little attention until recently. One traditional quality indicator for document images is Optical Character Recognition (OCR) accuracy. In this work, we present a dataset of camera-captured document images containing varying levels of focal blur introduced manually during capture. For each image we obtained the character-level OCR accuracy. Our dataset can be used to evaluate methods for predicting the OCR quality of captured documents, as well as enhancement methods. To make the dataset publicly and freely available, originals were selected from two existing datasets: the University of Washington dataset and the Tobacco Database. We present a case study with three recent methods for predicting the OCR quality of images on our dataset.
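As an illustration of what a character-level OCR accuracy score can look like, the following is a minimal sketch and not the evaluation procedure used for the dataset. It assumes accuracy is defined as one minus the edit distance between the ground-truth text and the OCR output, divided by the ground-truth length, clipped at zero. The function names `edit_distance` and `char_accuracy` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(ground_truth: str, ocr_output: str) -> float:
    """Assumed definition: 1 - errors / len(ground_truth), clipped to [0, 1]."""
    if not ground_truth:
        # Convention chosen for this sketch only: an empty reference is
        # perfectly matched by an empty output and not matched otherwise.
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(ground_truth, ocr_output)
    return max(0.0, 1.0 - errors / len(ground_truth))


if __name__ == "__main__":
    # One substituted character out of four ground-truth characters.
    print(char_accuracy("abcd", "abxd"))
```

A blurrier capture would typically produce more recognition errors and therefore a lower score under this definition, which is what makes such a score usable as a prediction target.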

Notes

1. The OCR results in this paper should in no way be used to compare OCR systems; they should be used only to judge the relative performance of each system on the collection.
2. Motorola DroidX with Android.

Acknowledgments

We would like to thank Steven Dang for running Tesseract on our images. We would also like to thank Francine Chen and anonymous reviewers for their comments on improving the quality of this work. The partial support of this research by DARPA through BBN/DARPA Award HR0011-08-C-0004 under subcontract 9500009235, and the US Government through NSF Award IIS-0812111 is gratefully acknowledged.

Author information

Authors and Affiliations

Institute of Advanced Computer Studies, University of Maryland, College Park, USA
Jayant Kumar, Peng Ye & David Doermann

Corresponding author

Correspondence to Jayant Kumar.

Editor information

Editors and Affiliations

Graduate School of Engineering, Osaka Prefecture University, Osaka, Japan
Masakazu Iwamura
The University of Western Australia, Crawley, West Australia, Australia
Faisal Shafait

About this paper

Cite this paper

Kumar, J., Ye, P., Doermann, D. (2014). A Dataset for Quality Assessment of Camera Captured Document Images. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2013. Lecture Notes in Computer Science, vol 8357. Springer, Cham. https://doi.org/10.1007/978-3-319-05167-3_9

DOI: https://doi.org/10.1007/978-3-319-05167-3_9
Published: 19 March 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05166-6
Online ISBN: 978-3-319-05167-3
eBook Packages: Computer Science, Computer Science (R0)