Fast Learning for Accurate Object Recognition Using a Pre-trained Deep Neural Network

  • Víctor Lobato-Ríos
  • Ana C. Tenorio-Gonzalez
  • Eduardo F. Morales
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10632)


Object recognition is a relevant task in many areas and, in particular, for service robots. Recently, object recognition has been dominated by Deep Neural Networks (DNNs); however, they require a large number of images and long training times. If a user asks a service robot to search for an unknown object, the robot has to select relevant images to learn a model, deal with polysemy, and learn the model quickly enough to be of use to the user. In this paper we describe an object recognition system that addresses these challenges by: (i) providing a user interface to reduce ambiguity between different interpretations of the object, (ii) downloading images on the fly from the Internet to train a model, and (iii) using the outputs of a trimmed pre-trained DNN as attributes for an SVM. The whole process of learning a model for an unknown object (selecting and downloading images and training the model) takes around two minutes. The proposed method was tested on 72 common objects found in a house environment and achieved precision and recall rates above 90%.
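Step (iii) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which network, layer, or SVM solver was used, so everything below is an assumption. A single fixed linear layer with ReLU (`FROZEN_WEIGHTS`, hypothetical) stands in for the trimmed pre-trained DNN, four-pixel lists stand in for downloaded images, and a plain subgradient-descent linear SVM stands in for the SVM. The point it shows is that only the small classifier is trained while the feature extractor stays frozen, which is what keeps training fast.

```python
# Hypothetical stand-in for the trimmed pre-trained DNN: one fixed (never
# updated) linear layer followed by ReLU. A real system would instead read
# the activations of an intermediate layer of a pre-trained network.
FROZEN_WEIGHTS = [
    [0.5, 0.5, 0.5, 0.5],
    [1.0, -1.0, 1.0, -1.0],
    [-0.5, -0.5, 0.5, 0.5],
]


def extract_features(image):
    """Map a flat list of pixel values to a feature vector (frozen weights)."""
    return [max(0.0, sum(w * p for w, p in zip(row, image)))
            for row in FROZEN_WEIGHTS]


def train_linear_svm(features, labels, epochs=200, lr=0.1, lam=0.01):
    """Fit (w, b) by subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:
                # Margin violated: step towards the example, plus weight decay.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Margin satisfied: only the regulariser acts on w.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 (target object) or -1 (other) for a feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (+1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy "images" (assumed data): bright ones are the target object (+1).
train_images = [[0.9, 0.8, 0.9, 0.8], [0.8, 0.9, 0.7, 0.9],
                [0.1, 0.2, 0.1, 0.2], [0.2, 0.1, 0.3, 0.1]]
train_labels = [1, 1, -1, -1]
test_images = [[0.85, 0.9, 0.8, 0.95], [0.8, 0.7, 0.9, 0.8],
               [0.15, 0.1, 0.2, 0.05], [0.2, 0.3, 0.1, 0.2]]
test_labels = [1, 1, -1, -1]

# Only the SVM is trained; the feature extractor is used as-is.
w, b = train_linear_svm([extract_features(im) for im in train_images],
                        train_labels)
predictions = [predict(w, b, extract_features(im)) for im in test_images]
precision, recall = precision_recall(test_labels, predictions)
print(predictions, precision, recall)
```

In practice the hand-written solver would normally be replaced by an existing SVM library, and the toy extractor by real network activations; the split between a frozen extractor and a cheap trainable classifier is the part that carries over.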



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Víctor Lobato-Ríos (1)
  • Ana C. Tenorio-Gonzalez (1)
  • Eduardo F. Morales (1)
  1. Instituto Nacional de Astrofísica, Óptica y Electrónica, Tonantzintla, Mexico
