Abstract
Optical character recognition (OCR) is a process that converts characters within images into text documents. In paperless applications, OCR systems must deliver both high accuracy and high speed. One of the most important steps in OCR is binarization. In this context, we recently proposed the hybrid binarization based on Kmeans (HBK) method (Soua et al. in International Symposium on Communications, Control, and Signal Processing, 2014). HBK offers a satisfying recognition rate, scoring 91 % accuracy. On the other hand, running on an Intel Core i3 CPU, HBK requires at least 1.9 s to process one A4 300 dpi document, whereas the binarization step should not exceed 460 ms in our real-time OCR system. To meet this constraint, we propose in this paper a parallel implementation of the HBK method on the NVIDIA GTX 660 graphics processing unit (GPU). Our implementation combines fine-grained and coarse-grained parallelism strategies to make the best use of the GPU. In addition, the costly CPU–GPU communication overhead is avoided and efficient memory management is ensured. The effectiveness of our implementation is validated through extensive experiments, which show that the proposed HBK parallelization binarizes one document in just 425 ms. Consequently, the implemented design meets the requirements of the targeted real-time OCR system in paperless applications.
Notes
Copyright(c) 2012. EPITA and Development Laboratory (LRDE) with permission from Le Nouvel Observateur. LRDE-DBD is available online on the web site: http://www.lrde.epita.fr/cgi-bin/twiki/view/Olena/DatasetDBD.
Le Nouvel Observateur. Issue 2402, November 18–24, 2010 and available on the website: http://tempsreel.nouvelobs.com.
References
Xiu, P., Baird, H.S.: Whole-book recognition. IEEE Trans. Pattern Anal. Mach. Intell. 34(12), 2467–2480 (2012)
Kae, A., Huang, G.B., Doersch, C., Learned-Miller, E.G.: Improving state-of-the-art OCR through high-precision document-specific modeling. In: CVPR, pp. 1935–1942 (2010)
Collins-Thompson, K., Schweizer, C., Dumais, S.: Improved string matching under noisy channel conditions. In: Proceedings of the Tenth International Conference on Information and Knowledge Management, pp. 357–364 (2011)
Eikvil, L.: OCR-Optical Character Recognition (1993). http://www.nr.no/~eikvil/OCR.pdf
Gaceb, D., Eglin, V., Lebourgeois, F.: A new mixed binarization method used in real time application of automatic business document and postal mail sorting. Int. Arab J. Inf. Technol. 10(2), 179–188 (2013)
Ashari, E., Hornsey, R.: FPGA implementation of real-time adaptive Image thresholding. In: Proceedings of the Photonic Applications in Astronomy, Biomedicine, Imaging, Materials Processing, and Education (2004)
Fong, W.: Document imaging: a step toward a paperless office. http://web.simmons.edu/~chen/nit/NIT%2792/133-fon.htm
Kumar, D., et al.: MAPS: midline analysis and propagation of segmentation. In: Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing. Article No. 15 (2012)
Soua, M., Kachouri, R., Akil, M.: A new hybrid binarization method based on Kmeans. In: IEEE International Symposium on Communications, Control, and Signal Processing, pp. 118–123 (2014)
Lazzara, G., Géraud, T.: Efficient multiscale Sauvola’s binarization. Int. J. Doc. Anal. Recognit. 2(14), 105–123 (2014)
Niblack, W.: An Introduction to Digital Image Processing. Strandberg Publishing Company, Denmark (1985)
Srinivasan, S., et al.: Performance characterization and acceleration of optical character recognition on handheld. In: IEEE International Symposium on Workload Characterization (IISWC), pp. 1–10 (2010)
Li, Y., Zhao, K., Chu, X., Liu, J.: Speeding up k-means algorithm by GPUs. In: Proceedings of the IEEE 10th International Conference on Computer and Information Technology (CIT), pp. 115–122 (2010)
Che, S., Boyer, M., Meng, J., Tarjan, D., Sheaffer, J.W., Skadron, K.: A performance study of general-purpose applications on graphics processors using CUDA. J. Parallel Distrib. Comput. 68(10), 1370–1380 (2008)
Fang, W., et al.: Parallel Data Mining on Graphics Processors. Technical report, Hong Kong University of Science and Technology (2008)
Hong-tao, B., Li-li, H., Dan-tong, O., Zhan-shan, L., He, L.: KMeans on commodity GPUs with CUDA. In: Proceedings of the WRI World Congress
Sirotkovic, J., Dujmic, H., Papic, V.: K-means image segmentation on massively parallel GPU architecture. In: Proceedings of the 35th International Convention MIPRO, pp. 489–494 (2012)
Pisharath, J., Liu, Y., Liao, W.-K., Choudhary, A., Memik, G., Parhi, J.: ’Nu-MineBench 2.0’. In: CUCIS Technical Report Center for Ultra-Scale Computing and Information Security, Northwestern University (2005)
Wu, R., Zhang, B., Hsu, M.: Clustering billions of data points using GPUs. In: Proceeding of the Combined Workshops on Unconventional High Performance Computing Workshop Plus Memory Access Workshop, Ischia, Italy, pp. 1–6 (2009)
Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982)
Smith, R.: An overview of the Tesseract OCR engine. In: Proceedings Ninth International Conference on Document Analysis and Recognition (ICDAR), pp. 629–633 (2007)
EPITA and Development Laboratory (LRDE). http://www.lrde.epita.fr/cgi-bin/twiki/view/Olena/DatasetDBD
Lelore, T., Bouchara, F.: Super-resolved binarization of text based on the FAIR algorithm. In: Proceedings of International Conference on Document Analysis and Recognition, vol. 13, pp. 303–314 (2010)
Chen, T.Y., et al.: On the statistical properties of the F-measure, Quality Software, 2004. In: Proceedings of the Fourth International Conference on QSIC 2004, pp. 146–153 (2004)
Fabrizio, J., Marcotegui, B., Cord, M.: Text segmentation in natural scenes using toggle-mapping. In: 16th IEEE International Conference on Image Processing (ICIP), pp. 2373–2376 (2009)
NVIDIA: CUDA C Best Practices Guide (version 4.0), Santa Clara, California, USA (2011). http://www.khronos.org/opencl/
NVIDIA: CUDA C Programming Guide. http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
Khronos: OpenCL: the open standard for parallel programming of heterogeneous systems. http://www.khronos.org/opencl/
Parallel programming and computing platform, CUDA, NVIDIA. http://www.nvidia.com/object/cuda_hom_new.html
Tompson, J., Schlachter, K.: An introduction to the programming Model. In: Distributed Computing CSCI-GA.2631-001 (2012)
NVIDIA’s Next Generation CUDA Compute Architecture: Kepler GK110. http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110-Architecture-Whitepaper.pdf
Yi, Y., Lai, C., Petrov, S.: Efficient parallel CKY parsing using GPUs. J. Log. Comput. 24(2), 375–393 (2014)
Sauvola, J., Seppanen, T., Haapakoski, S., Pietikainen, M.: Adaptive document binarization. In: 4th International Conference on Document Analysis and Recognition, Ulm, Germany, pp. 147–152 (1997)
Singh, B.M., et al.: Parallel implementation of Niblack’s binarization approach on CUDA. Int. J. Comput. Appl. 32(2), 22–27 (2011)
Khurshid, K., et al.: Comparison of Niblack inspired binarization methods for ancient documents. In: Proceedings of the 16th Document Recognition and Retrieval Conference, Part of the IS and T-SPIE Electronic Imaging Symposium, San Jose, CA, USA (2009)
Singh, B.M., et al.: Parallel implementation of Souvola’s binarization approach on GPU. Int. J. Comput. Appl. 32(2), 28–33 (2011)
Cite this article
Soua, M., Kachouri, R. & Akil, M. GPU parallel implementation of the new hybrid binarization based on Kmeans method (HBK). J Real-Time Image Proc 14, 363–377 (2018). https://doi.org/10.1007/s11554-014-0458-2
DOI: https://doi.org/10.1007/s11554-014-0458-2