Abstract
Although digit recognition has attracted extensive attention and has been approached from many directions, very few studies have addressed the recognition of digits placed on spherical surfaces. As a particular instance of this problem, this paper introduces a digit ball detection and recognition system that recognizes the digit appearing on a 3D ball. A digit ball is a ball carrying an Arabic numeral on its spherical surface. Our system detects and recognizes digit balls in a weakly controlled environment for a practical application that requires it to run continuously, in real time, and without recognition errors. The system faces two main challenges: detecting the balls accurately, and handling their arbitrary rotation. For the first, we develop a novel method that detects the balls in a single image and demonstrate its effectiveness even when the balls are densely packed. For the second, we represent the balls with spin images and polar images to obtain rotation invariance. Finally, we adopt a dictionary learning-based method for the recognition task. We evaluate the system in a series of experiments on real-world digit ball images; the results validate its effectiveness, with the system achieving 100% accuracy in these experiments.
Notes
In our system, "real-time" means that at least 5 frames (pictures) per second can be processed with 3–6 balls appearing in each frame. At this rate, processing speed does not limit the operation of the system.
A Basler industrial camera (model scA1000-30gc).
Here the polar image is reshaped to a column vector. In the rest of the paper, with a slight abuse of notation, we use "polar image" to denote this column vector as well.
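As an illustration of this note, the following minimal sketch resamples a square grayscale patch onto a polar grid and then reshapes it to a column vector. It is not the authors' implementation: the function names `polar_image` and `as_column_vector`, the grid sizes, and the nearest-neighbour sampling are all assumptions made for the example. In such a representation, an in-plane rotation of the patch corresponds approximately to a circular shift along the angle axis.

```python
import numpy as np

def polar_image(img, n_radii=16, n_angles=32):
    """Resample a 2D patch onto an (n_radii, n_angles) polar grid.

    Hypothetical helper: nearest-neighbour lookup around the patch centre.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)
    radii = np.linspace(0.0, max_r, n_radii)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    rr, aa = np.meshgrid(radii, angles, indexing="ij")
    # Convert each (radius, angle) sample to the nearest pixel coordinate.
    ys = np.clip(np.rint(cy + rr * np.sin(aa)).astype(int), 0, h - 1)
    xs = np.clip(np.rint(cx + rr * np.cos(aa)).astype(int), 0, w - 1)
    return img[ys, xs]

def as_column_vector(polar):
    """Reshape the polar image to a column vector (row-major order)."""
    return polar.reshape(-1, 1)

# Toy patch standing in for a detected ball region (assumed size 33 x 33).
patch = np.arange(33 * 33, dtype=float).reshape(33, 33)
polar = polar_image(patch)
vector = as_column_vector(polar)
```

With the default grid sizes assumed here, `vector` has 16 × 32 = 512 rows and one column.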
The nearest centroid classifier is used here.
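For readers unfamiliar with it, a nearest centroid classifier can be sketched as follows. This is a generic illustration under assumptions, not the paper's code: the function names, the Euclidean distance, and the toy two-dimensional features are all chosen for the example.

```python
import numpy as np

def fit_centroids(features, labels):
    """Compute one mean feature vector (centroid) per class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_nearest_centroid(x, classes, centroids):
    """Return the class whose centroid is closest to x (Euclidean distance)."""
    dists = np.linalg.norm(centroids - x, axis=1)
    return classes[np.argmin(dists)]

# Toy training data: two samples each for the digits 3 and 8 (made-up values).
features = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
labels = np.array([3, 3, 8, 8])
classes, centroids = fit_centroids(features, labels)
prediction = predict_nearest_centroid(np.array([0.2, 0.4]), classes, centroids)
```

Here the query point lies closest to the centroid of the samples labelled 3, so `prediction` is 3.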
Acknowledgments
The authors are grateful to the anonymous reviewers for their excellent reviews and constructive comments, which helped to improve the manuscript and our system. This work is supported by the 973 Program (No. 2010CB327904) and the Natural Science Foundation of China (No. 61071218).
About this article
Cite this article
Wang, D., Kong, S. Learning class-specific dictionaries for digit recognition from spherical surface of a 3D ball. Machine Vision and Applications 24, 1213–1227 (2013). https://doi.org/10.1007/s00138-012-0463-z
DOI: https://doi.org/10.1007/s00138-012-0463-z