A survey of deep learning methods and software tools for image classification and object detection

Druzhkov, P. N.; Kustikova, V. D.

doi:10.1134/S1054661816010065

Representation, Processing, Analysis and Understanding of Images
Published: 23 July 2016

Volume 26, pages 9–15 (2016)
Pattern Recognition and Image Analysis

Abstract

This paper surveys deep learning methods for image classification and object detection. In particular, we consider deep models such as autoencoders, restricted Boltzmann machines, and convolutional neural networks. Existing software packages for deep learning are compared.

Author information

Authors and Affiliations

Lobachevsky State University of Nizhni Novgorod, Institute of Information Technologies, Mathematics and Mechanics, Nizhni Novgorod, Russian Federation
P. N. Druzhkov & V. D. Kustikova

Corresponding author

Correspondence to P. N. Druzhkov.

Additional information

This paper uses the materials of the report submitted at the 9th Open German-Russian Workshop on Pattern Recognition and Image Understanding, held in Koblenz, December 1–5, 2014 (OGRW-9-2014).

The article is published in the original.

Pavel Nikolaevich Druzhkov. Born 1989. Graduated from Lobachevsky State University of Nizhni Novgorod in 2012. He is a junior researcher at the Lobachevsky State University of Nizhni Novgorod.

Research interests: machine learning and data mining, computer vision.

Number of publications (monographs and articles): 6.

Valentina Dmitrievna Kustikova. Born 1987. Graduated from Lobachevsky State University of Nizhni Novgorod in 2010. Year of dissertation completion (Candidate’s, Doctoral): 2015, Candidate of Engineering Sciences. Assistant at the Lobachevsky State University of Nizhni Novgorod.

Research interests: computer vision, machine learning, parallel computing.

Number of publications (monographs and articles): 8.

About this article

Cite this article

Druzhkov, P.N., Kustikova, V.D. A survey of deep learning methods and software tools for image classification and object detection. Pattern Recognit. Image Anal. 26, 9–15 (2016). https://doi.org/10.1134/S1054661816010065

Received: 06 August 2015
Published: 23 July 2016
Issue Date: January 2016
DOI: https://doi.org/10.1134/S1054661816010065
