Abstract
With the proliferation of cameras on mobile devices, there is an increased desire to image document pages as an alternative to scanning. However, the quality of captured document images is often lower than that of their scanned equivalents because of hardware limitations and stability issues. In this context, automatic assessment of the quality of captured images is useful for many applications. Although much work has gone into developing computational methods and creating standard datasets for natural scene image quality assessment, quality estimation of camera-captured document images received little attention until recently. One traditional quality indicator for document images is Optical Character Recognition (OCR) accuracy. In this work, we present a dataset of camera-captured document images containing varying levels of focal blur introduced manually during capture. For each image we obtained the character-level OCR accuracy. Our dataset can be used to evaluate methods for predicting the OCR quality of captured documents, as well as enhancement methods. To make the dataset publicly and freely available, originals were selected from two existing datasets, the University of Washington dataset and the Tobacco Database. We present a case study with three recent methods for predicting the OCR quality of images on our dataset.
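As an illustration of the evaluation the abstract describes, the following is a minimal sketch, not the authors' pipeline. It assumes a common definition of character-level accuracy, (n - errors) / n with errors counted as edit distance against the ground-truth text, and it assumes Spearman's rank correlation as the measure of agreement between a predicted quality score and the measured OCR accuracy. All function names and the sample values are hypothetical.

```python
from typing import List, Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete from ref
                            curr[j - 1] + 1,      # insert into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(ground_truth: str, ocr_text: str) -> float:
    """Assumed definition: (n - errors) / n, clipped below at 0."""
    n = len(ground_truth)
    if n == 0:
        return 1.0 if not ocr_text else 0.0
    return max(0.0, (n - edit_distance(ground_truth, ocr_text)) / n)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one input is constant
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical values: predicted quality scores and measured accuracies.
    predicted = [0.91, 0.40, 0.75, 0.10]
    measured = [char_accuracy("focal blur", t)
                for t in ("focal blur", "f0ca1 b1ur", "focal b1ur", "tqcz1 hhn")]
    print(measured, spearman(predicted, measured))
```

A rank-based measure is used in this sketch because a quality predictor only needs to order images consistently with their OCR accuracy; it does not need to reproduce the accuracy values themselves.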
Notes
1. The OCR results in this paper should not be used to compare OCR systems; they serve only to judge the relative performance of each system on this collection.
2. Motorola DroidX with Android.
Acknowledgments
We would like to thank Steven Dang for running Tesseract on our images. We would also like to thank Francine Chen and anonymous reviewers for their comments on improving the quality of this work. The partial support of this research by DARPA through BBN/DARPA Award HR0011-08-C-0004 under subcontract 9500009235, and the US Government through NSF Award IIS-0812111 is gratefully acknowledged.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kumar, J., Ye, P., Doermann, D. (2014). A Dataset for Quality Assessment of Camera Captured Document Images. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2013. Lecture Notes in Computer Science, vol 8357. Springer, Cham. https://doi.org/10.1007/978-3-319-05167-3_9
DOI: https://doi.org/10.1007/978-3-319-05167-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05166-6
Online ISBN: 978-3-319-05167-3
eBook Packages: Computer Science, Computer Science (R0)