
Learning class-specific dictionaries for digit recognition from spherical surface of a 3D ball

  • Original Paper
  • Machine Vision and Applications

Abstract

Few studies have addressed the recognition of digits placed on spherical surfaces, even though digit recognition in general has attracted extensive attention and has been approached from many directions. As a particular instance of this problem, we introduce a detection and recognition system for digit balls, that is, balls carrying an Arabic numeral on their spherical surface. The system operates in a weakly controlled environment and targets a practical application that requires it to run continuously, in real time, and without recognition errors. It faces two main challenges: detecting the balls accurately, and handling their arbitrary rotation. For the first, we develop a novel method that detects balls in a single image and remains effective even when the balls are densely placed. For the second, we represent the balls with spin images and polar images, which provide rotation invariance. Finally, we adopt a dictionary learning-based method for the recognition task. Experiments on real-world digit ball images validate the system, which achieves 100% accuracy in these experiments.
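To make the rotation-invariance idea concrete, the following is a minimal sketch of why a polar representation helps. It is not the authors' implementation: the function names, the nearest-neighbour sampling, and the grid sizes are all assumptions chosen for illustration, and it covers only in-plane rotation about the image centre. Resampling a patch onto a (radius, angle) grid turns such a rotation into a circular shift along the angle axis, which is much easier to normalise or match than a rotation of the raw pixels.

```python
import math


def polar_image(img, n_radii, n_angles):
    """Resample a square image onto a (radius, angle) grid.

    Hypothetical helper: nearest-neighbour sampling about the image centre.
    Row r of the result holds the samples taken at radius r.
    """
    size = len(img)
    c = (size - 1) / 2.0  # centre coordinate (an integer when size is odd)
    rows = []
    for r in range(n_radii):
        row = []
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            x = int(round(c + r * math.cos(theta)))
            y = int(round(c + r * math.sin(theta)))
            row.append(img[y][x])
        rows.append(row)
    return rows


def rotate_quarter_turn(img):
    """Rotate the image by 90 degrees about its centre.

    A pixel at offset (dx, dy) from the centre moves to offset (-dy, dx).
    """
    size = len(img)
    out = [[0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            out[x][size - 1 - y] = img[y][x]
    return out


def to_column_vector(polar):
    """Flatten the polar image row by row into a single vector."""
    return [value for row in polar for value in row]
```

With a 9×9 image, 5 radii, and 8 angles, a quarter turn of the input shifts every row of the polar image by 2 angle bins (8 / 4) and leaves the values otherwise unchanged. A descriptor that ignores or aligns this shift is therefore insensitive to the rotation.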


Notes

  1. In our system, “real-time” means that at least 5 frames (pictures) per second can be processed, with 3–6 balls appearing in each frame. At this rate, processing speed does not affect the system's operation.

  2. Basler industrial camera, model scA1000-30gc.

  3. Here the polar image is reshaped into a column vector. In the rest of the paper, with a slight abuse of notation, we use “polar image” to refer to this column vector.

  4. http://www.cs.zju.edu.cn/people/wangdh/Software.html.

  5. The nearest centroid classifier is used here.
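Note 5 names the nearest centroid classifier without describing it. As a reminder of how that classifier works in general, here is a minimal sketch; the function names and the use of squared Euclidean distance are assumptions for illustration, and the excerpt does not say which features the authors feed into it. Each class is summarised by the mean of its training vectors, and a new vector is assigned to the class whose mean is closest.

```python
def fit_centroids(samples, labels):
    """Compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Return the label whose centroid is nearest to x (squared Euclidean)."""
    def squared_distance(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], x))
```

Training reduces to one pass over the data and prediction to one distance computation per class, which suits a system that must keep up with a fixed frame rate.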


Acknowledgments

The authors are grateful to the anonymous reviewers for their excellent reviews and constructive comments, which helped to improve the manuscript and our system. This work is supported by the 973 Program (No. 2010CB327904) and the Natural Science Foundation of China (No. 61071218).

Author information

Corresponding author

Correspondence to Donghui Wang.

About this article

Cite this article

Wang, D., Kong, S. Learning class-specific dictionaries for digit recognition from spherical surface of a 3D ball. Machine Vision and Applications 24, 1213–1227 (2013). https://doi.org/10.1007/s00138-012-0463-z

  • DOI: https://doi.org/10.1007/s00138-012-0463-z
