An Overview of Deep Learning and Its Applications

  • Michael Vogt
Conference paper
Part of the Proceedings book series (PROCEE)

Abstract

Deep learning is the machine learning method that has changed the field of artificial intelligence over the last five years. From the viewpoint of industrial research, this technology is disruptive: it considerably extends the range of tasks that can be automated, changes the way applications are developed, and is available to virtually everyone.

Copyright information

© Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2019

Authors and Affiliations

  1. Smiths Heimann GmbH, Wiesbaden, Germany