Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies

  • Review
  • Open Access
  • Published: 18 August 2022

  • Yang Wu (ORCID: orcid.org/0000-0001-8010-6857)1,
  • Ding-Heng Wang2,
  • Xiao-Tong Lu3,
  • Fan Yang4,
  • Man Yao2,5,
  • Wei-Sheng Dong3,
  • Jian-Bo Shi6 &
  • Guo-Qi Li (ORCID: orcid.org/0000-0002-8994-431X)7,8

Machine Intelligence Research, volume 19, pages 366–411 (2022)

Abstract

Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and the broader field of artificial intelligence. It is of great fundamental importance and serves strong industrial needs; in particular, modern deep neural networks (DNNs) and some brain-inspired methodologies have largely boosted recognition performance on many concrete tasks, with the help of large amounts of training data and powerful new computation resources. Although recognition accuracy is usually the first concern for new developments, efficiency is also important and sometimes critical for both academic research and industrial applications. Moreover, insightful views on the opportunities and challenges of efficiency are highly valuable to the entire community. While general surveys on the efficiency issue have been conducted from various perspectives, as far as we are aware, scarcely any of them has focused on visual recognition systematically, so it remains unclear which advances are applicable to it and what else should be considered. In this survey, we review recent advances and suggest possible new directions for improving the efficiency of DNN-related and brain-inspired visual recognition approaches, including efficient network compression and dynamic brain-inspired networks. We investigate not only the model perspective but also the data perspective (which is not covered by existing surveys), focusing on four typical data types: images, video, points, and events. This survey attempts to provide a systematic summary that can serve as a valuable reference and inspire both researchers and practitioners working on visual recognition problems.


References

  1. Y. Lecun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of IEEE, vol. 86, no. 11, pp. 2278–2324, 1998. DOI: https://doi.org/10.1109/5.726791.

    Article  Google Scholar 

  2. G. E. Hinton, R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, vol. 313, no. 5786, pp. 504–507, 2006. DOI: https://doi.org/10.1126/science.1127647.

    Article  MathSciNet  MATH  Google Scholar 

  3. A. Krizhevsky, I. Sutskever, G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, pp. 1106–1114, 2012.

  4. T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C. L. Zitnick. Microsoft COCO: Common objects in context. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 740–755, 2014. DOI: https://doi.org/10.1007/978-3-319-10602-1_48.

    Google Scholar 

  5. J. K. Song, Y. Y. Guo, L. L. Gao, X. L. Li, A. Hanjalic, H. T. Shen. From deterministic: to generative: Multimodal stochastic RNNs for video captioning. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3047–3058, 2019. DOI: https://doi.org/10.1109/TNNLS.2018.2851077.

    Article  Google Scholar 

  6. L. L. Gao, X. P. Li, J. K. Song, H. T. Shen. Hierarchical LSTMs with adaptive attention for visual captioning. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 5, pp. 1112–1131, 2020. DOI: https://doi.org/10.1109/TPAMI.2019.2894139.

    Google Scholar 

  7. S. E. Wei, V. Ramakrishna, T. Kanade, Y. Sheikh. Convolutional pose machines. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 4724–4732, 2016. DOI: https://doi.org/10.1109/CVPR.2016.511.

    Google Scholar 

  8. W. Maass. Networks of spiking neurons: The third generation of neural network models. Neural Networks, vol. 10, no. 9, pp. 1659–1671, 1997. DOI: https://doi.org/10.1016/S0893-6080(97)00011-7.

    Article  Google Scholar 

  9. E. Ahmed, A. Saint, A. E. R. Shabayek, K. Cherenkova, R. Das, G. Gusev, D. Aouada, B. Ottersten. A survey on deep learning advances on different 3D data representations. [Online], Available: https://arxiv.org/abs/1808.01462, 2019.

  10. L. Liu, J. Chen, P. Fieguth, G. Y. Zhao, R. Chellappa, M. Pietikäinen. From bow to CNN: Two decades of texture representation for texture classification. International Journal of Computer Vision, vol. 127, no. 1, pp. 74–109, 2019. DOI: https://doi.org/10.1007/s11263-018-1125-z.

    Article  Google Scholar 

  11. L. Liu, W. L. Ouyang, X. G. Wang, P. Fieguth, J. Chen, X. W. Liu, M. Pietikäinen. Deep learning for generic object detection: A survey. International Journal of Computer Vision, vol. 128, no. 2, pp. 261–318, 2020. DOI: https://doi.org/10.1007/s11263-019-01247-4.

    Article  MATH  Google Scholar 

  12. G. Gallego, T. Delbruük, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis, D. Scaramuzza. Event-based vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 1, pp. 154–180, 2022. DOI: https://doi.org/10.1109/TPAMI.2020.3008413.

    Article  Google Scholar 

  13. Q. R. Zhang, M. Zhang, T. H. Chen, Z. F. Sun, Y. Z. Ma, B. Yu. Recent advances in convolutional neural network acceleration. Neurocomputing, vol. 323, pp. 37–51, 2019. DOI: https://doi.org/10.1016/j.neucom.2018.09.038.

    Article  Google Scholar 

  14. L. Deng, G. Q. Li, S. Han, L. P. Shi, Y. Xie. Model compression and hardware acceleration for neural networks: A comprehensive survey. Proceedings of IEEE, vol. 108, no. 4, pp. 485–532, 2020. DOI: https://doi.org/10.1109/JPROC.2020.2976475.

    Article  Google Scholar 

  15. Y. Cheng, D. Wang, P. Zhou, T. Zhang. Model compression and acceleration for deep neural networks: The principles, progress, and challenges. IEEE Signal Processing Magazine, vol. 35, no. 1, pp. 126–136, 2018. DOI: https://doi.org/10.1109/MSP.2017.2765695.

    Article  Google Scholar 

  16. V. Lebedev V. Lempitsky. Speeding-up convolutional neural networks: A survey. Bulletin of the Polish Academy of Sciences: Technical Sciences, vol. 66, no. 6, pp. 799–810, 2018. DOI: https://doi.org/10.24425/bpas.2018.125927.

    Google Scholar 

  17. T. Elsken, J. H. Metzen, F. Hutter. Neural architecture search: A survey. The Journal of Machine Learning Research, vol. 20, no. 1, pp. 1997–2017, 2019. DOI: https://doi.org/10.5555/3322706.3361996.

    MathSciNet  MATH  Google Scholar 

  18. Y. Z. Han, G. Huang, S. J. Song, L. Yang, H. H. Wang, Y. L. Wang. Dynamic neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, to be published. DOI: https://doi.org/10.1109/TPAMI.2021.3117837.

  19. P. Lichtsteiner, C. Posch, T. Delbruck. A 128×128 120 dB 15 µs latency asynchronous temporal contrast vision sensor. IEEE Journal of Solid-state Circuits, vol. 43, no. 2, pp. 566–576, 2008. DOI: https://doi.org/10.1109/JSSC.2007.914337.

    Article  Google Scholar 

  20. C. Posch, D. Matolin, R. Wohlgenannt. A QVGA 143 dB dynamic range frame-free PWM image sensor with lossless pixel-level video compression and time-domain CDS. IEEE Journal of Solid-state Circuits, vol. 46, no. 1, pp. 259–275, 2011. DOI: https://doi.org/10.1109/JSSC.2010.2085952.

    Article  Google Scholar 

  21. A. Krizhevsky. Learning Multiple Layers of Features from Tiny Images, Master dissertation, University of Toronto, Canada, 2009.

    Google Scholar 

  22. J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Miami, USA, pp. 248–255, 2009. DOI: https://doi.org/10.1109/CVPR.2009.5206848.

    Google Scholar 

  23. Y. Xiang, W. Kim, W. Chen, J. W. Ji, C. Choy, H. Su, R. Mottaghi, L. Guibas, S. Savarese. ObjectNet3D: A large scale database for 3D object recognition. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 160–176, 2016. DOI: https://doi.org/10.1007/978-3-319-46484-8_10.

    Google Scholar 

  24. A. R. Zamir, A. Sax, W. Shen, L. Guibas, J. Malik, S. Savarese. Taskonomy: Disentangling task transfer learning. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 3712–3722, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00391.

    Google Scholar 

  25. H. Jhuang, J. Gall, S. Zuffi, C. Schmid, M. J. Black. Towards understanding action recognition. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Sydney, Australia, pp. 3192–3199, 2013. DOI: https://doi.org/10.1109/ICCV.2013.396.

    Google Scholar 

  26. A. Shahroudy, J. Liu, T. T. Ng, G. Wang. NTU RGB+D: A large scale dataset for 3D human activity analysis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 1010–1019, 2016. DOI: https://doi.org/10.1109/CVPR.2016.115.

    Google Scholar 

  27. C. H. Liu, Y. Y. Hu, Y. H. Li, S. J. Song, J. Y. Liu. PKU-MMD: A large scale benchmark for continuous multi-modal human action understanding. [Online], Available: https://arxiv.org/abs/1703.07475, 2017.

  28. Y. S. Tang, Y. Tian, J. W. Lu, P. Y. Li, J. Zhou. Deep progressive reinforcement learning for skeleton-based action recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA., 2018, pp. 5323–5332. DOI: https://doi.org/10.1109/CVPR.2018.00558.

    Google Scholar 

  29. J. X. Hou, G. J. Wang, X. H. Chen, J. H. Xue, R. Zhu, H. Z. Yang. Spatial-temporal attention res-TCN for skeleton-based dynamic hand gesture recognition. In Proceedings of Computer Vision, Springer, Munich, Germany, pp. 273–286, 2018. DOI: https://doi.org/10.1007/978-3-030-11024-6_18.

    Google Scholar 

  30. A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. X. Huang, Z. M. Li, S. Savarese, M. Savva, S. R. Song, H. Su, J. X. Xiao, L. Yi, F. Yu. ShapeNet: An information-rich 3D model repository. [Online], Available: https://arxiv.org/abs/1512.03012, 2015.

  31. H. Rebecq, R. Ranftl, V. Koltun, D. Scaramuzza. High speed and high dynamic range video with an event camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 6, pp. 1964–1980, 2021. DOI: https://doi.org/10.1109/TPAMI.2019.2963386.

    Article  Google Scholar 

  32. W. S. Cheng, H. Luo, W. Yang, L. Yu, S. S. Chen, W. Li. Det: A high-resolution DVS dataset for lane extraction. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Long Beach, USA, pp. 1666–1675, 2019. DOI: https://doi.org/10.1109/CVPRW.2019.00210.

    Google Scholar 

  33. T. Delbruck, M. Lang. Robotic goalie with 3 ms reaction time at 4% CPU load using event-based dynamic vision sensor. Frontiers in Neuroscience, vol. 7, Article number 223, 2013. DOI: https://doi.org/10.3389/fnins.2013.00223.

  34. A. Amir, B. Taba, D. Berg, T. Melano, J. McKinstry, C. Di Nolfo, T. Nayak, A. Andreopoulos, G. Garreau, M. Mendoza, J. Kusnitz, M. Debole, S. Esser, T. Delbruck, M. Flickner, D. Modha. A low power, fully event-based gesture recognition system. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 7388–7397, 2017. DOI: https://doi.org/10.1109/CVPR.2017.781.

    Google Scholar 

  35. Z. Wu, Z. Xu, R. N. Zhang, S. M. Li. SIFT feature extraction algorithm for image in DCT domain. Applied Mechanics and Materials, vol. 347–350, pp. 2963–2967, 2013. DOI: https://doi.org/10.4028/u]www.scientific.net/AMM.347-350.2963.

    Article  Google Scholar 

  36. L. Gueguen, A. Sergeev, B. Kadlec, R. Liu, J. Yosinski. Faster neural networks straight from jpeg. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 3937–3948, 2018. DOI: https://doi.org/10.5555/3327144.3327308.

  37. A. Paul, T. Z. Khan, P. Podder, R. Ahmed, M. M. Rahman, M. H. Khan. Iris image compression using wavelets transform coding. In Proceedings of the 2nd International Conference on Signal Processing and Integrated Networks, IEEE, Noida, India, pp. 544–548, 2015. DOI: https://doi.org/10.1109/SPIN.2015.7095407.

    Google Scholar 

  38. O. Rippel, L. Bourdev. Real-time adaptive image compression. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 2922–2930, 2017. DOI: https://doi.org/10.5555/3305890.3305983.

  39. J. Ballé, D. Minnen, S. Singh, S. J. Hwang, N. Johnston. Variational image compression with a scale hyperprior. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, pp. 1–49, 2018.

  40. D. Minnen, G. Toderici, S. Singh, S. J. Hwang, M. Covell. Image-dependent local entropy models for learned image compression. In Proceedings of the 25th IEEE International Conference on Image Processing, IEEE, Athens, Greece, pp. 430–434, 2018. DOI: https://doi.org/10.1109/ICIP.2018.8451502.

    Google Scholar 

  41. D. Minnen, J. Ballé, G. D. Toderici. Joint autoregressive and hierarchical priors for learned image compression. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 10794–10803, 2018. DOI: https://doi.org/10.5555/3327546.3327736.

  42. G. J. Sullivan, J. R. Ohm, W. J. Han, T. Wiegand. Overview of the high efficiency video coding (HEVC) standard. IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649–1668, 2012. DOI: https://doi.org/10.1109/TCSVT.2012.2221191.

    Article  Google Scholar 

  43. T. Wiegand, G. J. Sullivan, G. Bjontegaard, A. Luthra. Overview of the H.264/AVC video coding standard.. IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560–576, 2003. DOI: https://doi.org/10.1109/TCSVT.2003.815165.

    Article  Google Scholar 

  44. T. Chen, H. J. Liu, Q. Shen, T. Yue, X. Cao, Z. Ma. Deepcoder: A deep neural network based video compression. In Proceedings of IEEE Visual Communications and Image Processing, IEEE, St. Petersburg, USA, pp.. 1–4, 2017. DOI: https://doi.org/10.1109/VCIP.2017.8305033.

    Google Scholar 

  45. G. Lu, W. L. Ouyang, D. Xu, X. Y. Zhang, Z. Y. Gao, M. T. Sun. Deep Kalman filtering network for video compression artifact reduction. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 591–608, 2018. DOI: https://doi.org/10.1007/978-3-030-01264-9_35.

    Google Scholar 

  46. C. Y. Wu, N. Singhal, P. Krähenbühl. Video compression through image interpolation. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 425–440, 2018. DOI: https://doi.org/10.1007/978-3-030-01237-3_26.

    Google Scholar 

  47. X. Z. Zhu, Y. W. Xiong, J. F. Dai, L. Yuan, Y. C. Wei. Deep feature flow for video recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4141–4150, 2017. DOI: https://doi.org/10.1109/CVPR.2017.441.

    Google Scholar 

  48. C. Y. Wu, M. Zaheer, H. X. Hu, R. Manmatha, A. J. Smola, P. Krähenbuühl. Compressed video action recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6026–6035, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00631.

    Google Scholar 

  49. W. Yan, Y. T. Shao, S. Liu, T. H. Li, Z. Li, G. Li. Deep AutoEncoder-based lossy geometry compression for point clouds. [Online], Available: https://arxiv.org/abs/1905.03691, 2019.

  50. J. Q. Wang, H. Zhu, Z. Ma, T. Chen, H. J. Liu, Q. Shen. Learned point cloud geometry compression. [Online], Available: https://arxiv.org/abs/1909.12037, 2019.

  51. Y. Q. Yang, C. Feng, Y. R. Shen, D. Tian. FoldingNet: Point cloud auto-encoder via deep grid deformation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 206–215, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00029.

    Google Scholar 

  52. M. Yao, H. H. Gao, G. S. Zhao, D. S. Wang, Y. H. Lin, Z. X. Yang, G. Q. Li. Temporal-wise attention spiking neural networks for event streams classification. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montréal, Canada, pp. 10201–10210, 2021. DOI: https://doi.org/10.1109/ICCV48922.2021.01006.

    Google Scholar 

  53. Y. X. Wang, B. W. Du, Y. R. Shen, K. Wu, G. R. Zhao, J. G. Sun, H. K. Wen. EV-Gait: Event-based robust gait recognition using dynamic vision sensors. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 6351–360, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00652.

    Google Scholar 

  54. Y. Sekikawa, K. Hara, H. Saito. EventNet: Asynchronous recursive event processing. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 3882–3891, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00401.

    Google Scholar 

  55. K. Chitta, J. M. Alvarez, E. Haussmann, C. Farabet. Training data subset search with ensemble active learning. [Online], Available: https://arxiv.org/abs/1905.12737, 2020.

  56. O. Sener, S. Savarese. Active learning for convolutional neural networks: A core-set approach. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  57. K. Vodrahalli, K. Li, J. Malik. Are all training examples created equal? An empirical study. [Online], Available: https://arxiv.org/abs/1811.12569, 2018.

  58. V. Birodkar, H. Mobahi, S. Bengio. Semantic redundancies in image-classification datasets: The 10% you don’t need. [Online], Available: https://arxiv.org/abs/1901.11409, 2019.

  59. J. Y. Gao, Z. H. Yang, C. Sun, K. Chen, R. Nevatia. TURN TAP: Temporal unit regression network for temporal action proposals. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 3648–3656, 2017. DOI: https://doi.org/10.1109/ICCV.2017.392.

    Google Scholar 

  60. J. Carreira, A. Zisserman. Quo vadis, action recognition? A new model and the kinetics dataset. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4724–4733, 2017. DOI: https://doi.org/10.1109/CVPR.2017.502.

    Google Scholar 

  61. S. N. Xie, C. Sun, J. Huang, Z. W. Tu, K. Murphy. Rethinking spatiotemporal feature learning: Speed-accuracy trade-offs in video classification. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 318–335, 2018. DOI: https://doi.org/10.1007/978-3-030-01267-0_19.

    Google Scholar 

  62. M. Zolfaghari, K. Singh, T. Brox. ECO: Efficient convolutional network for online video understanding. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 713–730, 2018. DOI: https://doi.org/10.1007/978-3-030-01216-8_43.

    Google Scholar 

  63. S. Yeung, O. Russakovsky, G. Mori, L. Fei-Fei. End-to-end learning of action detection from frame glimpses in videos. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2678–2687, 2016. DOI: https://doi.org/10.1109/CVPR.2016.293.

    Google Scholar 

  64. J. J. Huang, N. N. Li, T. Zhang, G. Li, T. J. Huang, W. Gao. SAP: Self-adaptive proposal model for temporal action detection based on reinforcement learning. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, New Orleans, USA, pp. 6951–6958, 2018. DOI: https://doi.org/10.1609/aaai.v32i1.12229.

  65. S. Y. Lan, R. Panda, Q. Zhu, A. K. Roy-Chowdhury. FFNet: Video fast-forwarding via reinforcement learning. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6771–6780, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00708.

    Google Scholar 

  66. H. H. Fan, Z. W. Xu, L. C. Zhu, C. G. Yan, J. J. Ge, Y. Yang. Watching a small portion could be as good as watching all: Towards efficient video classification. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 705–711, 2018. DOI: https://doi.org/10.5555/3304415.3304516.

  67. A. Kar, N. Rai, K. Sikka, G. Sharma. AdaScan: Adaptive scan pooling in deep convolutional neural networks for human action recognition in videos. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5699–5708, 2017. DOI: https://doi.org/10.1109/CVPR.2017.604.

    Google Scholar 

  68. Z. X. Wu, C. M. Xiong, C. Y. Ma, R. Socher, L. S. Davis. AdaFrame: Adaptive frame selection for fast video recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1278–1287, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00137.

    Google Scholar 

  69. J. C. Yang, Q. Zhang, B. B. Ni, L. G. Li, J. X. Liu, M. D. Zhou, Q. Tian. Modeling point clouds with self-attention and gumbel subset sampling. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 3318–3327, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00344.

    Google Scholar 

  70. A. Paigwar, O. Erkent, C. Wolf, C. Laugier. Attentional pointNet for 3D-object detection in point clouds. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Long Beach, USA, pp. 1297–1306, 2019. DOI: https://doi.org/10.1109/CVPRW.2019.00169.

    Google Scholar 

  71. C. Kingkan, J. Owoyemi, K. Hashimoto. Point attention network for gesture recognition using point cloud data. In Proceedings of the 29th British Machine Vision Conference, Newcastle, UK, pp. 1–13, 2018. [Online], Available: https://bmvc2018.org/contents/papers/0427.pdf.

  72. A. Khodamoradi, R. Kastner. O(N)o(N)-space spatiotemporal filter for reducing noise in neuromorphic vision sensors. IEEE Transactions on Emerging Topics in Computing, vol. 9, no. 1, pp. 15–23, 2021. DOI: https://doi.org/10.1109/TETC.2017.2788865.

    Google Scholar 

  73. H. J. Liu, C. Brandli, C. H. Li, S. C. Liu, T. Delbruck. Design of a spatiotemporal correlation filter for event-based sensors. In Proceedings of IEEE International Symposium on Circuits and Systems, IEEE, Lisbon, Portugal, pp. 722–725, 2015. DOI: https://doi.org/10.1109/ISCAS.2015.7168735.

    Google Scholar 

  74. V. Padala, A. Basu, G. Orchard. A noise filtering algorithm for event-based asynchronous change detection image sensors on trueNorth and its implementation on TrueNorth. Frontiers in Neuroscience, vol. 12, pp. 1–14, 2018. DOI: https://doi.org/10.3389/fnins.2018.00118.

    Article  Google Scholar 

  75. N. Tajbakhsh, J. Y. Shin, S. R. Gurudu, R. T. Hurst, C. B. Kendall, M. B. Gotway, J. M. Liang. Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1299–1312, 2016. DOI: https://doi.org/10.1109/TMI.2016.2535302.

    Article  Google Scholar 

  76. U. K. Lopes, J. F. Valiati. Pre-trained convolutional neural networks as feature extractors for tuberculosis detection. Computers in Biology and Medicine, vol. 89, pp. 135–143, 2017. DOI: https://doi.org/10.1016/j.compbiomed.2017.08.001

    Article  Google Scholar 

  77. O. J. Hénaff, A. Srinivas, J. De Fauw, A. Razavi, C. Doersch, S. M. A. Eslami, A. van den Oord. Data-efficient image recognition with contrastive predictive coding. In Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, vol. 119, pp. 4182–4192, 2020. DOI: https://doi.org/10.5555/3524938.3525329.

    Google Scholar 

  78. A. S. Razavian, H. Azizpour, J. Sullivan, S. Carlsson. CNN features off-the-shelf: An astounding baseline for recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Columbus, USA, pp. 512–519, 2014. DOI: https://doi.org/10.1109/CVPRW.2014.131.

    Google Scholar 

  79. Y. Wu, J. Qiu, J. Takamatsu, T. Ogasawara. Temporal-enhanced convolutional network for person re-identification. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, New Orleans, USA, pp. 7412–7419, 2018. DOI: https://doi.org/10.1609/aaai.v32i1.12264.

  80. H. Bilen, B. Fernando, E. Gavves, A. Vedaldi, S. Gould. Dynamic image networks for action recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3034–3042, 2016. DOI: https://doi.org/10.1109/CVPR.2016.331.

    Google Scholar 

  81. H. Bilen, B. Fernando, E. Gavves, A. Vedaldi. Action recognition with dynamic image networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 12, pp. 2799–2813, 2018. DOI: https://doi.org/10.1109/TPAMI.2017.2769085.

    Article  Google Scholar 

  82. F. Yang, Y. Wu, S. Sakti, S. Nakamura. Make skeleton-based action recognition model smaller, faster and better. In Proceedings of ACM Multimedia Asia, ACM, Beijing, China, Article number 31, 2019. DOI: https://doi.org/10.1145/3338533.3366569.

    Google Scholar 

  83. C. Li, Q. Y. Zhong, D. Xie, S. L. Pu. Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 786–792, 2018. DOI: https://doi.org/10.5555/3304415.3304527.

  84. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna. Rethinking the inception architecture for computer vision. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2818–2826, 2016. DOI: https://doi.org/10.1109/CVPR.2016.308.

    Google Scholar 

  85. P. Q. Wang, P. F. Chen, Y. Yuan, D. Liu, Z. H. Huang, X. D. Hou, G. Cottrell. Understanding convolution for semantic segmentation. In Proceedings of IEEE Winter Conference on Applications of Computer Vision. IEEE, Lake Tahoe, USA, pp. 1451–1460, 2018. DOI: https://doi.org/10.1109/WACV.2018.00163.

    Google Scholar 

  86. F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, K. Keutzer. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size. In Proceedings of the 5th International Conference on Learning Representations, [Online], Available: https://arxiv.org/abs/1602.07360, 2016.

  87. X. Y. Zhang, X. Y. Zhou, M. X. Lin, J. Sun. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6848–6856, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00716.

    Google Scholar 

  88. F. Juefei-Xu, V. N. Boddeti, M. Savvides. Perturbative neural networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 3310–3318, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00349.

    Google Scholar 

  89. F. Juefei-Xu, V. N. Boddeti, M. Savvides. Local binary convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4284–4293, 2017. DOI: https://doi.org/10.1109/CVPR.2017.456.

    Google Scholar 

  90. Z. Z. Wu, S. M. King. Investigating gated recurrent neural networks for speech synthesis. [Online], Available: https://arxiv.org/abs/1601.02539, 2016.

  91. J. van der Westhuizen, J. Lasenby. The unreasonable effectiveness of the forget gate. [Online], Available: https://arxiv.org/abs/1804.04849, 2018.

  92. H. Sak, A. W. Senior, F. Beaufays. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In Proceedings of the 15th Annual Conference of the International Speech Communication Association, Singapore, pp. 338–342, 2014.

  93. Y. H. Wu, M. Schuster, Z. F. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, J. Klingner, A. Shah, M. Johnson, X. B. Liu, L. Kaiser, S. Gouws, Y. Kato, T. Kudo, H. Kazawa, K. Stevens, G. Kurian, N. Patil, W. Wang, C. Young, J. Smith, J. Riesa, A. Rudnick, O. Vinyals, G. Corrado, M. Hughes, J. Dean. Google’s neural machine translation system: Bridging the gap between human and machine translation. [Online], Available: https://arxiv.org/abs/1609.08144, 2016.

  94. B. Zoph, Q. V. Le. Neural architecture search with reinforcement learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.

  95. E. Real, A. Aggarwal, Y. P. Huang, Q. V. Le. Regularized evolution for image classifier architecture search. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, the 31st Innovative Applications of Artificial Intelligence Conference, and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, USA, pp. 4780–4789, 2019. DOI: https://doi.org/10.1609/aaai.v33i01.33014780.

  96. K. Kandasamy, W. Neiswanger, J. Schneider, B. Póczos, E. P. Xing. Neural architecture search with Bayesian optimisation and optimal transport. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 2020–2029, 2018. DOI: https://doi.org/10.5555/3326943.3327130.

  97. H. Cai, L. G. Zhu, S. Han. Proxylessnas: Direct neural architecture search on target task and hardware. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.

  98. M. Astrid, S. I. Lee. Cp-decomposition with tensor power method for convolutional neural networks compression. In Proceedings of IEEE International Conference on Big Data and Smart Computing, IEEE, Jeju, Korea, pp. 115–118, 2017. DOI: https://doi.org/10.1109/BIGCOMP.2017.7881725.

    Google Scholar 

  99. J. T. Chien, Y. T. Bao. Tensor-factorized neural networks. IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 5, pp. 1998–2011, 2018. DOI: https://doi.org/10.1109/TNNLS.2017.2690379.

    Article  MathSciNet  Google Scholar 

  100. J. M. Ye, L. N. Wang, G. X. Li, D. Chen, S. D. Zhe, X. Q. Chu, Z. L. Xu. Learning compact recurrent neural networks with block-term tensor decomposition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 9378–9387, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00977.

    Google Scholar 

  101. A. Novikov, D. Podoprikhin, A. Osokin, D. P. Vetrov. Tensorizing neural networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Montréal Canada, pp.442–450, 2015. DOI: https://doi.org/10.5555/2969239.2969289.

  102. T. Garipov, D. Podoprikhin, A. Novikov, D. Vetrov. Ultimate tensorization: Compressing convolutional and FC layers alike. [Online], Available: https://arxiv.org/abs/1611.03214, 2016.

  103. D. H. Wang, G. S. Zhao, G. Q. Li, L. Deng, Y. Wu. Compressing 3DCNNs based on tensor train decomposition. Neural Networks, vol. 131, pp. 215–230, 2020. DOI: https://doi.org/10.1016/j.neunet.2020.07.028.

    Article  Google Scholar 

  104. A. Tjandra, S. Sakti, S. Nakamura. Compressing recurrent neural network with tensor train. In Proceedings of International Joint Conference on Neural Networks, IEEE, Anchorage, USA, pp. 4451–4458, 2017. DOI: https://doi.org/10.1109/IJCNN.2017.7966420.

    Google Scholar 

  105. Y. C. Yang, D. Krompass, V. Tresp. Tensor-train recurrent neural networks for video classification. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 3891–3900, 2017. DOI: https://doi.org/10.5555/3305890.3306083.

  106. Y. Pan, J. Xu, M. L. Wang, J. M. Ye, F. Wang, K. Bai, Z. L. Xu. Compressing recurrent neural networks with tensor ring for action recognition. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, the 31st Innovative Applications of Artificial Intelligence Conference, and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, USA, pp. 4683–4690, 2019. DOI: https://doi.org/10.1609/aaai.v33i01.33014683.

  107. B. J. Wu, D. H. Wang, G. S. Zhao, L. Deng, G. Q. Li. Hybrid tensor decomposition in neural network compression. Neural Networks, vol. 132, pp. 309–320, 2020. DOI: https://doi.org/10.1016/j.neunet.2020.09.006.

    Article  MATH  Google Scholar 

  108. M. Yin, S. Y. Liao, X. Y. Liu, X. D. Wang, B. Yuan. Compressing recurrent neural networks using hierarchical tucker tensor decomposition. [Online], Available: https://arxiv.org/abs/2005.04366, 2020.

  109. S. Wu, G. Q. Li, F. Chen, L. P. Shi. Training and inference with integers in deep neural networks. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  110. Y. K. Yang, L. Deng, S. Wu, T. Y. Yan, Y. Xie, G. Q. Li. Training high-performance and large-scale deep neural networks with full 8-bit integers. Neural Networks, vol. 125, pp. 70–82, 2020. DOI: https://doi.org/10.1016/j.neunet.2019.12.027.

    Article  Google Scholar 

  111. M. Rastegari, V. Ordonez, J. Redmon, A. Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. In Proceedings of the 14th European Conference on Computer Vision. Springer, Amsterdam, The Netherlands, pp. 525–542, 2016. DOI: https://doi.org/10.1007/978-3-319-46493-0_32.

    Google Scholar 

  112. Q. Lou, F. Guo, M. Kim, L. T.s Liu, L. Jiang. AutoQ: Automated kernel-wise neural network quantization. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.

  113. Y. Y. Lin, C. Sakr, Y. Kim, N. Shanbhag. PredictiveNet: An energy-efficient convolutional neural network via zero prediction. In Proceedings of IEEE International Symposium on Circuits and Systems, IEEE, Baltimore, USA, 2017. DOI: https://doi.org/10.1109/ISCAS.2017.8050797.

    Google Scholar 

  114. M. C. Song, J. C. Zhao, Y. Hu, J. Q. Zhang, T. Li. Prediction based execution on deep neural networks. In Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, IEEE, Los Angeles, USA, pp. 752–763, 2018. DOI: https://doi.org/10.1109/ISCA.2018.00068.

    Google Scholar 

  115. V. Akhlaghi, A. Yazdanbakhsh, K. Samadi, R. K. Gupta, H. Esmaeilzadeh. SnaPEA: Predictive early activation for reducing computation in deep convolutional neural networks. In Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, IEEE, Los Angeles, USA, pp. 662–673, 2018. DOI: https://doi.org/10.1109/ISCA.2018.00061.

    Google Scholar 

  116. W. Wen, C. P. Wu, Y. D. Wang, Y. R. Chen, H. Li. Learning structured sparsity in deep neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 2082–2090, 2016. DOI: https://doi.org/10.5555/3157096.3157329.

  117. J. H. Luo, J. X. Wu, W. Y. Lin. ThiNet: A filter level pruning method for deep neural network compression. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 5068–5076, 2017. DOI: https://doi.org/10.1109/ICCV.2017.541.

    Google Scholar 

  118. S. H. Lin, R. R. Ji, Y. C. Li, C. Deng, X. L. Li. Toward compact convnets via structure-sparsity regularized filter pruning. IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 2, pp. 574–588, 2020. DOI: https://doi.org/10.1109/TNNLS.2019.2906563.

    Article  MathSciNet  Google Scholar 

  119. B. Y. Liu, M. Wang, H. Foroosh, M. Tappen, M. Penksy. Sparse convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 806–814, 2015. DOI: https://doi.org/10.1109/CVPR.2015.7298681.

    Google Scholar 

  120. W. Wen, C. Xu, C. P. Wu, Y. D. Wang, Y. R. Chen, H. Li. Coordinating filters for faster deep neural networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 658–666, 2017. DOI: https://doi.org/10.1109/ICCV.2017.78.

    Google Scholar 

  121. S. Han, H. Z. Mao, W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. [Online], Available: https://arxiv.org/abs/1510.00149, 2015.

  122. Y. Choi, M. El-Khamy, J. Lee. Compression of deep convolutional neural networks under joint sparsity constraints. [Online], Available: https://arxiv.org/abs/1805.08303, 2018.

  123. B. W. Pan, W. W. Lin, X. L. Fang, C. Q. Huang, B. L. Zhou, C. W. Lu. Recurrent residual module for fast inference in videos. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 1536–1545, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00166.

    Google Scholar 

  124. S. Han, X. Y. Liu, H. Z. Mao, J. Pu, A. Pedram, M. A. Horowitz, W. J. Dally. EIE: Efficient inference engine on compressed deep neural network. In Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, IEEE, Seoul, Korea, pp. 243–254, 2016. DOI: https://doi.org/10.1109/ISCA.2016.30.

    Google Scholar 

  125. K. Chen, J. Q. Wang, S. Yang, X. C. Zhang, Y. J. Xiong, C. C. Loy, D. H. Lin. Optimizing video object detection via a scale-time lattice. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 7814–7823, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00815.

    Google Scholar 

  126. S. Lee, S. Chang, N. Kwak. UrnEt: User-resizable residual networks with conditional gating module. In Proceedings of the 34th AAAI Conference on Artificial Intelligence, New York, USA, pp. 4569–4576, 2020. DOI: https://doi.org/10.1609/aaai.v34i04.5886.

  127. B. Y. Fang, X. Zeng, M. Zhang. NestDNN: Resource-aware multi-tenant on-device deep learning for continuous mobile vision. In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, ACM, New Delhi, India, pp. 115–127, 2018. DOI: https://doi.org/10.1145/3241539.3241559.

    Google Scholar 

  128. N. Shazeer, K. Fatahalian, W. R. Mark, R. T. Mullapudi. Hydranets: Specialized dynamic architectures for efficient inference. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8080–8089, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00843.

    Google Scholar 

  129. G. Huang, D. L. Chen, T. H. Li, F. Wu, L. van der Maaten, K. Q. Weinberger. Multi-scale dense networks for resource efficient image classification. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  130. Q. S. Guo, Z. P. Yu, Y. C. Wu, D. Liang, H. Y. Qin, J. J. Yan. Dynamic recursive neural network. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 5142–5151, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00529.

    Google Scholar 

  131. G. Huang, S. C. Liu, L. van der Maaten, K. Q. Weinberger. CondenseNet: An efficient DenseNet using learned group convolutions. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 2752–2761, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00291.

    Google Scholar 

  132. B. Yang, G. Bender, Q. V. Le, J. Ngiam. CondConv: Conditionally parameterized convolutions for efficient inference. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, Canada, pp. 1307–1318, 2019. DOI: https://doi.org/10.5555/3454287.3454404.

  133. Y. P. Chen, X. Y. Dai, M. C. Liu, D. D. Chen, L. Yuan, Z. C. Liu. Dynamic convolution: Attention over convolution kernels. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 11027–11036, 2020. DOI: https://doi.org/10.1109/CVPR42600.2020.01104

    Google Scholar 

  134. A. W. Harley, K. G. Derpanis, I. Kokkinos. Segmentation-aware convolutional networks using local attention masks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 5048–5057, 2017. DOI: https://doi.org/10.1109/ICCV.2017.539.

    Google Scholar 

  135. H. Su, V. Jampani, D. Q. Sun, O. Gallo, E. Learned-Miller, J. Kautz. Pixel-adaptive convolutional neural networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 11158–11167, 2019. DOI: https://doi.org/10.1109/CVPR.2019.01142.

    Google Scholar 

  136. K. Roy, A. Jaiswal P. Panda. Towards spike-based machine intelligence with neuromorphic computing. Nature vol. 575, no. 7784, pp. 607–617, 2019. DOI: https://doi.org/10.1038/s41586-019-1677-2.

    Article  Google Scholar 

  137. M. Ehrlich, L. Davis. Deep residual learning in the JPEG transform domain. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 3483–3492, 2099. DOI: https://doi.org/10.1109/ICCV.2019.00358.

    Google Scholar 

  138. Z. H. Liu, T. Liu, W. J. Wen, L. Jiang, J. Xu, Y. Z. Wang, G. Quan. DeepN-JPEG: A deep neural network favorable jpeg-based image compression framework. In Proceedings of the 55th ACM/ESDA/IEEE Design Automation Conference, IEEE, San Francisco, USA, 2018, pp. 1–6, 2018. DOI: https://doi.org/10.1109/DAC.2018.8465809.

    Google Scholar 

  139. M. Javed, P. Nagabhushan, B. B. Chaudhuri. A review on document image analysis techniques directly in the compressed domain. Artificial Intelligence Review, vol. 50, no. 4, pp. 539–568, 2018. DOI: https://doi.org/10.1007/s10462-017-9551-9

    Article  Google Scholar 

  140. E. Oyallon, E. Belilovsky, S. Zagoruyko, M. Valko. Compressing the input for CNNs with the first-order scattering transform. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, 2018, pp. 305–320. DOI: https://doi.org/10.1007/978-3-030-01240-3_19.

  141. R. Torfason, F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte, L. van Gool. Towards image understanding from deep compression without decoding. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  142. T. Chang, B. Tolooshams, D. Ba. RandNet: Deep learning with compressed measurements of images. In Proceedings of the 29th IEEE International Workshop on Machine Learning for Signal Processing, IEEE, Pittsburgh, USA, pp.1–6, 2019. DOI: https://doi.org/10.1109/MLSP.2019.8918878.

    Google Scholar 

  143. L. D. Chamain, Z. Ding. Faster and accurate classification for JPEG2000 compressed images in networked applications. [Online], Available: https://arxiv.org/abs/1909.05638, 2019.

  144. C. X. Ding, D. C. Tao. Trunk-branch ensemble convolutional neural networks for video-based face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 1002–1014, 2018. DOI: https://doi.org/10.1109/TPAMI.2017.2700390.

    Article  Google Scholar 

  145. L. Pigou, A. van den Oord, S. Dieleman, M. van Herreweghe, J. Dambre. Beyond temporal pooling: Recurrence and temporal convolutions for gesture recognition in video. International Journal of Computer Vision, vol. 126, no. 2–4, pp. 430–439, 2018. DOI: https://doi.org/10.1007/s11263-016-0957-7.

    Article  MathSciNet  Google Scholar 

  146. A. Ullah, J. Ahmad, K. Muhammad, M. Sajjad, S. W. Baik. Action recognition in video sequences using deep bidirectional LSTM with CNN features. IEEE Access, vol. 6, pp. 1155–1166, 2018. DOI: https://doi.org/10.1109/ACCESS.2017.2778011.

    Article  Google Scholar 

  147. S. Tulyakov, M. Y. Liu, X. D. Yang, J. Kautz. MoCoGAN: Decomposing motion and content or video generation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 1526–1535, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00165.

    Google Scholar 

  148. S. Y. Sun, Z. H. Kuang, L. Sheng, W. L. Ouyang, W. Zhang. Optical flow guided feature: A fast and robust motion representation for video action recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 1390–1399, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00151.

    Google Scholar 

  149. G. Lu, W. L. Ouyang, D. Xu, X. Y. Zhang, C. L. Cai, Z. Y. Gao. DVC: An end-to-end deep video compression framework. Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA pp. 10998–11007, 2019. DOI: https://doi.org/10.9109/CVPR.2019.01126.

    Google Scholar 

  150. A. Habibian, T. van Rozendaal, J. Tomczak, T. Cohen. Video compression with rate-distortion autoencoders. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 7032–7041, 2020. DOI: https://doi.org/10.1109/ICCV.2019.00713.

    Google Scholar 

  151. M. Quach, G. Valenzise, F. Dufaux. Learning convolutional transforms for lossy point cloud geometry compression. In Proceedings of IEEE International Conference on Image Processing, IEEE, Taipei, China, pp. 4320–4324, 2019. DOI: https://doi.org/10.1109/ICIP.2019.8803413.

    Google Scholar 

  152. C. Moenning, N. A. Dodgson. Fast marching farthest point sampling. In Proceedings of the 24th Annual Conference of the European Association for Computer Graphics, Eurographics Association, Granada, Spain, pp. 39–42, 2003. DOI: https://doi.org/10.2312/egp.20031024.

    Google Scholar 

  153. O. Dovrat, I. Lang, S. Avidan. Learning to sample. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 2755–2764, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00287.

    Google Scholar 

  154. R. Q. Charles, H. Su, M. Kaichun, L. J. Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 77–85, 2017. DOI: https://doi.org/10.1109/CVPR.2017.16.

    Google Scholar 

  155. Y. Zhao, Y. J. Xiong, D. H. Lin. Trajectory convolution for action recognition. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 2208–2219, 2018. DOI: https://doi.org/10.5555/3327144.3327148.

  156. S. Mukherjee, L. Anvitha, T. M. Lahari. Human activity recognition in RGB-D videos by dynamic images. Multimedia Tools and Applications, vol. 79, no. 27, pp. 19797–19801, 2020. https://doi.org/10.1007/s11042-020-08747-3.

    Google Scholar 

  157. Y. Xiao, J. Chen, Y. C. Wang, Z. G. Cao, J. T. Zhou, X. Bai. Action recognition for depth video using multi-view dynamic images. Information Sciences, vol. 480, pp. 287–304, 2019. DOI: https://doi.org/10.1016/j.ins.2018.12.050.

    Article  Google Scholar 

  158. H. Liu, J. H. Tu, M. Y. Liu. Two-stream 3D convolutional neural network for skeleton-based action recognition. [Online], Available: https://arxiv.org/abs/1705.08106, 2017.

  159. D. Maturana, S. Scherer. VoxNet: A 3D convolutional neural network for real-time object recognition. In Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, Hamburg, Germany, pp. 922–928, 2015. DOI: https://doi.org/10.1109/IROS.2015.7353481.

    Google Scholar 

  160. J. Y. Chang, G. Moon, K. M. Lee. V2V-PoseNet: Voxel-to-voxel prediction network for accurate 3D hand and human pose estimation from a single depth map. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 5079–5088, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00533.

    Google Scholar 

  161. Q. Y. Wang, Y. X. Zhang, J. S. Yuan, Y. L. Lu. Space-time event clouds for gesture recognition: From RGB cameras to event cameras. In Proceedings of IEEE Winter Conference on Applications of Computer Vision, IEEE, Waikoloa, USA, pp. 1826–1835, 2019. DOI: https://doi.org/10.1109/WACV.2019.00199.

    Google Scholar 

  162. M. Denil, B. Shakibi, L. Dinh, M. Ranzato, N. de Freitas. Predicting parameters in deep learning. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, pp. 2148–2156, 2013. DOI: https://doi.org/10.5555/2999792.2999852.

  163. D. H. Wang, B. J. Wu, G. S. Zhao, M. Yao, H. N. Chen, L. Deng, T. Y. Yan, G. Q. Li. Kronecker CP decomposition with fast multiplication for compressing RNNs. IEEE Transactions on Neural Networks and Learning Systems, to be published. DOI: https://doi.org/10.1109/TNNLS.2021.3105961.

  164. L. Deng, Y. J. Wu, Y. F. Hu, L. Liang, G. Q. Li, X. Hu, Y. F. Ding, P. Li, Y. Xie. Comprehensive SNN compression using ADMM optimization and activity regularization. IEEE Transactions on Neural Networks and Learning Systems, to be published. DOI: https://doi.org/10.1109/TNNLS.2021.3109064.

  165. A. G. Howard, M. L. Zhu, B. Chen, D. Kalenichenko, W. J. Wang, T. Weyand, M. Andreetto, H. Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. [Online], Available: https://arxiv.org/abs/1704.04861, 2017.

  166. B. C. Wu, A. Wan, X. Y. Yue, P. Jin, S. C. Zhao, N. Golmant, A. Gholaminejad, J. Gonzalez, K. Keutzer. Shift: A tero FLOP, zero parameter alternative to spatial convolutions. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 9127–9135, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00951.

    Google Scholar 

  167. W. J. Luo, Y. J. Li, R. Urtasun, R. Zemel. Understanding the effective receptive field in deep convolutional neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 4905–4913, 2016. DOI: https://doi.org/10.5555/3157382.3157645.

  168. K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.

  169. A. Paszke, A. Chaurasia, S. Kim, E. Culurciello. ENet: A deep neural network architecture for real-time semantic segmentation. [Online], Available: https://arxiv.org/abs/1606.02147, 2016.

  170. M. Holschneider, R. Kronland-Martinet, J. Morlet, P. Tchamitchian. A real-time algorithm for signal analysis with the help of the wavelet transform. In Wavelets: Time-Frequency Methods and Phase Space, J. M. Combes, A. Grossmann, P. Tchamitchian, Eds. Berlin, Germany: Springer, pp. 286–297, 1989. DOI: https://doi.org/10.1007/978-3-642-97177-8_28.

    Chapter  Google Scholar 

  171. F. Yu, V. Koltun. Multi-scale context aggregation by dilated convolutions. In Proceedings of the 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2016.

  172. J. F. Dai, H. Z. Qi, Y. W. Xiong, Y. Li, G. D. Zhang, H. Hu, Y. C. Wei Deformable convolutional networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 764–773, 2017. DOI: https://doi.org/10.1109/ICCV.2017.89.

    Google Scholar 

  173. M. Lin, Q. Chen, S. C. Yan. Network in network. [Online], Available: https://arxiv.org/abs/1312.4400, 2013.

  174. C. Szegedy, Wei Liu, Y. Q. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich. Going deeper convolutions. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 1–9, 2015. DOI: https://doi.org/10.1109/CVPR.2015.7298594.

    Google Scholar 

  175. K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Deep residual learning for image recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 770–778, 2016. DOI: https://doi.org/10.1109/CVPR.2016.90.

    Google Scholar 

  176. K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Identity mappings in deep residual networks. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 630–645, 2016. DOI: https://doi.org/10.1007/978-3-319-46493-0_38.

    Google Scholar 

  177. F. Chollet. Xception: Deep learning with depthwise separable convolutions. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 1800–1807, 2017. DOI: https://doi.org/10.1109/CVPR.2017.195.

    Google Scholar 

  178. M. Sandler, A. Howard, M. L. Zhu, A. Zhmoginov, L. C. Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 4510–4520, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00474.

    Google Scholar 

  179. S. Chen, Y. Liu, X. Gao, Z. Han. Mobilefacenets: Efficient CNNs for accurate real-time face verification on mobile devices. In Proceedings of the 13th Chinese Conference on Biometric Recognition, Springer, Urumqi, China, pp. 428–438, 2018. DOI: https://doi.org/10.1007/978-3-319-97909-0_46.

    Google Scholar 

  180. S. N. Xie, R. Girshick, P. Dollár, Z. W. Tu, K. M. He. Aggregated residual transformations for deep neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5987–5995, 2017. DOI: https://doi.org/10.1109/CVPR.2017.634.

    Google Scholar 

  181. S. Hochreiter J. Schmidhuber. Long short-term memory. Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997. DOI: https://doi.org/10.1162/neco.1997.9.8.1735.

    Article  Google Scholar 

  182. F. A Gers, J. Schmidhuber, F. Cummins. Learning to forget: Continual prediction with LSTM. In Proceedings of the 19th International Conference on Artificial Neural Networks, Edinburgh, UK, pp. 850–855, 1999. DOI: https://doi.org/10.1049/cp:19991218.

  183. K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio. Learning phrase representations using rnn encoder-decoder for statistical machine translation. Proceedings of Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, pp. 1724–1734, 2014. DOI: https://doi.org/10.3115/v1/D14-1179.

  184. G. B. Zhou, J. X. Wu, C. L. Zhang, Z. H. Zhou. Minimal gated unit for recurrent neural networks. International Journal of Automation and Computing, vol. 13, no. 3, pp. 226–234, 2016. DOI: https://doi.org/10.1007/s11633-016-1006-2.

    Article  Google Scholar 

  185. A. Kusupati, M. Singh, K. Bhatia, A. Kumar, P. Jain, M. Varma. FastGRNN: A fast, accurate, stable and tiny kilobyte sized gated recurrent neural network. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 9031–9042, 2018. DOI: https://doi.org/10.5555/3327546.3327577.

  186. J. Bradbury, S. Merity, C. M. Xiong, R. Socher. Quasi-recurrent neural networks. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.

  187. S. Z. Zhang, Y. H. Wu, T. Che, Z. H. Lin, R. Memisevic, R. Salakhutdinov, Y. Bengio. Architectural complexcity measures of recurrent neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 1822–1830, 2016. DOI: https://doi.org/10.5555/3157096.3157301.

  188. N. Kalchbrenner, I. Danihelka, A. Graves. Grid long short-term memory. [Online], Available: https://arxiv.org/abs/1507.01526, 2015.

  189. M. Fraccaro, S. K. Sønderby, U. Paquet, O. Winther. Sequential neural models with stochastic layers. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 2207–2215, 2016. DOI: https://doi.org/10.5555/3157096.3157343.

  190. G. Hinton, O. Vinyals, J. Dean. Distilling the knowledge in a neural network. [Online], Available: https://arxiv.org/abs/1503.02531, 2015.

  191. K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, J. Schmidhuber. LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 10, pp. 2222–2232, 2017. DOI: https://doi.org/10.1109/TNNLS.2016.2582924.

  192. B. Zoph, V. Vasudevan, J. Shlens, Q. V. Le. Learning transferable architectures for scalable image recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8697–8710, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00907.

  193. H. X. Liu, K. Simonyan, Y. M. Yang. DARTS: Differentiable architecture search. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.

  194. A. Rawal, R. Miikkulainen. From nodes to networks: Evolving recurrent neural networks. [Online], Available: https://arxiv.org/abs/1803.04439, 2018.

  195. Z. Zhong, J. J. Yan, W. Wu, J. Shao, C. L. Liu. Practical block-wise neural network architecture generation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 2423–2432, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00257.

  196. C. X. Liu, B. Zoph, M. Neumann, J. Shlens, W. Hua, L. J. Li, L. Fei-Fei, A. Yuille, J. Huang, K. Murphy. Progressive neural architecture search. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 19–35, 2018. DOI: https://doi.org/10.1007/978-3-030-01246-5_2.

  197. H. X. Liu, K. Simonyan, O. Vinyals, C. Fernando, K. Kavukcuoglu. Hierarchical representations for efficient architecture search. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  198. B. Baker, O. Gupta, N. Naik, R. Raskar. Designing neural network architectures using reinforcement learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.

  199. Z. Zhong, J. J. Yan, W. Wu, J. Shao, C. L. Liu. Practical block-wise neural network architecture generation. [Online], Available: https://arxiv.org/abs/1708.05552, 2017.

  200. H. Cai, J. C. Yang, W. N. Zhang, S. Han, Y. Yu. Path-level network transformation for efficient architecture search. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, pp. 678–687, 2018.

  201. H. Cai, T. Y. Chen, W. N. Zhang, Y. Yu, J. Wang. Efficient architecture search by network transformation. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, USA, pp. 2787–2794, 2018. DOI: https://doi.org/10.5555/3504035.3504375.

  202. L. X. Xie, A. L. Yuille. Genetic CNN. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1388–1397, 2017. DOI: https://doi.org/10.1109/ICCV.2017.154.

  203. A. Klein, E. Christiansen, K. Murphy, F. Hutter. Towards reproducible neural architecture and hyperparameter search. In Proceedings of the 2nd Reproducibility in Machine Learning Workshop, Stockholm, Sweden, 2018.

  204. M. X. Tan, Q. V. Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA, pp. 6105–6114, 2019.

  205. T. Elsken, J. Metzen, F. Hutter. Efficient multi-objective neural architecture search via Lamarckian evolution. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.

  206. A. Klein, S. Falkner, S. Bartels, P. Hennig, F. Hutter. Fast Bayesian optimization of machine learning hyperparameters on large datasets. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, USA, pp. 528–536, 2017.

  207. H. Cai, C. Gan, T. Z. Wang, Z. K. Zhang, S. Han. Once-for-all: Train one network and specialize it for efficient deployment. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.

  208. A. Klein, S. Falkner, J. T. Springenberg, F. Hutter. Learning curve prediction with Bayesian neural networks. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.

  209. T. Wei, C. H. Wang, Y. Rui, C. W. Chen. Network morphism. In Proceedings of the 33rd International Conference on Machine Learning, New York, USA, pp. 564–572, 2016.

  210. M. Masana, J. van de Weijer, L. Herranz, A. D. Bagdanov, J. M. Álvarez. Domain-adaptive deep network compression. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 4299–4307, 2017. DOI: https://doi.org/10.1109/ICCV.2017.460.

  211. T. Kumamoto, M. Suzuki, H. Matsueda. Singular-value-decomposition analysis of associative memory in a neural network. Journal of the Physical Society of Japan, vol. 86, no. 2, Article number 24005, 2017. DOI: https://doi.org/10.7566/JPSJ.86.024005.

  212. T. Deb, A. K. Ghosh, A. Mukherjee. Singular value decomposition applied to associative memory of Hopfield neural network. Materials Today: Proceedings, vol. 5, no. 1, pp. 2222–2228, 2018. DOI: https://doi.org/10.1016/j.matpr.2017.09.222.

  213. Z. X. Zou, Z. W. Shi. Ship detection in spaceborne optical image with SVD networks. IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 10, pp. 5832–5845, 2016. DOI: https://doi.org/10.1109/TGRS.2016.2572736.

  214. X. Y. Zhang, J. H. Zou, X. Ming, K. M. He, J. Sun. Efficient and accurate approximations of nonlinear convolutional networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 1984–1992, 2015. DOI: https://doi.org/10.1109/CVPR.2015.7298809.

  215. X. Y. Zhang, J. H. Zou, K. M. He, J. Sun. Accelerating very deep convolutional networks for classification and detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 10, pp. 1943–1955, 2016. DOI: https://doi.org/10.1109/TPAMI.2015.2502579.

  216. Y. Ioannou, D. Robertson, R. Cipolla, A. Criminisi. Deep roots: Improving CNN efficiency with hierarchical filter groups. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5977–5986, 2017. DOI: https://doi.org/10.1109/CVPR.2017.633.

  217. B. Peng, W. M. Tan, Z. Y. Li, S. Zhang, D. Xie, S. L. Pu. Extreme network compression via filter group approximation. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 307–323, 2018. DOI: https://doi.org/10.1007/978-3-030-01237-3_19.

  218. G. S. Hu, Y. Hua, Y. Yuan, Z. H. Zhang, Z. Lu, S. S. Mukherjee, T. M. Hospedales, N. M. Robertson, Y. X. Yang. Attribute-enhanced face recognition with neural tensor fusion networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 3764–3773, 2017. DOI: https://doi.org/10.1109/ICCV.2017.404.

  219. J. D. Carroll, J. J. Chang. Analysis of individual differences in multidimensional scaling via an n-way generalization of “Eckart-Young” decomposition. Psychometrika, vol. 35, no. 3, pp. 283–319, 1970. DOI: https://doi.org/10.1007/BF02310791.

  220. L. De Lathauwer, B. De Moor, J. Vandewalle. A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1253–1278, 2000. DOI: https://doi.org/10.1137/S0895479896305696.

  221. L. De Lathauwer, B. De Moor, J. Vandewalle. On the best rank-1 and rank-(R1, R2, ⋯ RN) approximation of higher-order tensors. SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1324–1342, 2000. DOI: https://doi.org/10.1137/S0895479898346995.

  222. L. R. Tucker. Some mathematical notes on three-mode factor analysis. Psychometrika, vol. 31, no. 3, pp. 279–311, 1966. DOI: https://doi.org/10.1007/BF02289464.

  223. T. G. Kolda, B. W. Bader. Tensor decompositions and applications. SIAM Review, vol. 51, no. 3, pp. 455–500, 2009. DOI: https://doi.org/10.1137/07070111X.

  224. G. Huang, Z. Liu, L. van der Maaten, K. Q. Weinberger. Densely connected convolutional networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 2261–2269, 2017. DOI: https://doi.org/10.1109/CVPR.2017.243.

  225. X. C. Zhang, Z. Z. Li, C. C. Loy, D. H. Lin. PolyNet: A pursuit of structural diversity in very deep networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 3900–3908, 2017. DOI: https://doi.org/10.1109/CVPR.2017.415.

  226. Y. P. Chen, J. N. Li, H. X. Xiao, X. J. Jin, S. C. Yan, J. S. Feng. Dual path networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 4470–4478, 2017. DOI: https://doi.org/10.5555/3294996.3295200.

  227. Y. D. Kim, E. Park, S. Yoo, T. Choi, L. Yang, D. Shin. Compression of deep convolutional neural networks for fast and low power mobile applications. In Proceedings of the 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2016.

  228. J. Kossaifi, A. Khanna, Z. Lipton, T. Furlanello, A. Anandkumar. Tensor contraction layers for parsimonious deep nets. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Honolulu, USA, pp. 1940–1946, 2017. DOI: https://doi.org/10.1109/CVPRW.2017.243.

  229. J. Kossaifi, Z. C. Lipton, A. Kolbeinsson, A. Khanna, T. Furlanello, A. Anandkumar. Tensor regression networks. Journal of Machine Learning Research, vol. 21, no. 123, pp. 1–21, 2020.

  230. M. Janzamin, H. Sedghi, A. Anandkumar. Beating the perils of non-convexity: Guaranteed training of neural networks using tensor methods. [Online], Available: https://arxiv.org/abs/1506.08473, 2016.

  231. V. Lebedev, Y. Ganin, M. Rakhuba, I. V. Oseledets, V. S. Lempitsky. Speeding-up convolutional neural networks using fine-tuned CP-decomposition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.

  232. D. T. Tran, A. Iosifidis, M. Gabbouj. Improving efficiency in convolutional neural networks with multilinear filters. Neural Networks, vol. 105, pp. 328–339, 2018. DOI: https://doi.org/10.1016/j.neunet.2018.05.017.

  233. K. T. Schütt, F. Arbabzadah, S. Chmiela, K. R. Müller, A. Tkatchenko. Quantum-chemical insights from deep tensor neural networks. Nature Communications, vol. 8, no. 1, pp. 1–8, 2017. DOI: https://doi.org/10.1038/ncomms13890.

  234. M. Y. Zhou, Y. P. Liu, Z. Long, L. X. Chen, C. Zhu. Tensor rank learning in CP decomposition via convolutional neural network. Signal Processing: Image Communication, vol. 73, pp. 12–21, 2019. DOI: https://doi.org/10.1016/j.image.2018.03.017.

  235. S. Oymak, M. Soltanolkotabi. End-to-end learning of a convolutional neural network via deep tensor decomposition. [Online], Available: https://arxiv.org/abs/1805.06523, 2018.

  236. L. Grasedyck, D. Kressner, C. Tobler. A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen, vol. 36, no. 1, pp. 53–78, 2013. DOI: https://doi.org/10.1002/gamm.201310004.

  237. A. Cichocki, D. Mandic, L. De Lathauwer, G. X. Zhou, Q. B. Zhao, C. Caiafa, H. A. Phan. Tensor decompositions for signal processing applications: From two-way to multiway component analysis. IEEE Signal Processing Magazine, vol. 32, no. 2, pp. 145–163, 2015. DOI: https://doi.org/10.1109/MSP.2013.2297439.

  238. L. De Lathauwer. Decompositions of a higher-order tensor in block terms — Part II: Definitions and uniqueness. SIAM Journal on Matrix Analysis and Applications, vol. 30, no. 3, pp. 1033–1066, 2008. DOI: https://doi.org/10.1137/070690729.

  239. A. H. Phan, A. Cichocki, P. Tichavský, R. Zdunek, S. Lehky. From basis components to complex structural patterns. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Vancouver, Canada, pp. 3228–3232, 2013. DOI: https://doi.org/10.1109/ICASSP.2013.6638254.

  240. A. H. Phan, A. Cichocki, I. Oseledets, G. G. Calvi, S. Ahmadi-Asl, D. P. Mandic. Tensor networks for latent variable analysis: Higher order canonical polyadic decomposition. IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 6, pp. 2174–2188, 2020. DOI: https://doi.org/10.1109/TNNLS.2019.2929063.

  241. W. H. He, Y. J. Wu, L. Deng, G. Q. Li, H. Y. Wang, Y. Tian, W. Ding, W. H. Wang, Y. Xie. Comparing SNNs and RNNs on neuromorphic vision datasets: Similarities and differences. Neural Networks, vol. 132, pp. 108–120, 2020. DOI: https://doi.org/10.1016/j.neunet.2020.08.001.

  242. L. Deng, Y. J. Wu, X. Hu, L. Liang, Y. F. Ding, G. Q. Li, G. S. Zhao, P. Li, Y. Xie. Rethinking the performance comparison between SNNs and ANNs. Neural Networks, vol. 121, pp. 294–307, 2020. DOI: https://doi.org/10.1016/j.neunet.2019.09.005.

  243. A. Cichocki. Tensor networks for dimensionality reduction, big data and deep learning. In Advances in Data Analysis with Computational Intelligence Methods, A. E. Gawęda, J. Kacprzyk, L. Rutkowski, G. G. Yen, Eds., Cham, Switzerland: Springer, pp. 3–49, 2018. DOI: https://doi.org/10.1007/978-3-319-67946-4_1.

  244. A. Pellionisz, R. Llinás. Tensor network theory of the metaorganization of functional geometries in the central nervous system. Neuroscience, vol. 16, no. 2, pp. 245–273, 1985. DOI: https://doi.org/10.1016/0306-4522(85)90001-6.

  245. I. V. Oseledets, E. E. Tyrtyshnikov. Breaking the curse of dimensionality, or how to use SVD in many dimensions. SIAM Journal on Scientific Computing, vol. 31, no. 5, pp. 3744–3759, 2009. DOI: https://doi.org/10.1137/090748330.

  246. I. V. Oseledets. Tensor-train decomposition. SIAM Journal on Scientific Computing, vol. 33, no. 5, pp. 2295–2317, 2011. DOI: https://doi.org/10.1137/090752286.

  247. B. N. Khoromskij. O(dlog N)-quantics approximation of N−d tensors in high-dimensional numerical modeling. Constructive Approximation, vol. 34, no. 2, pp. 257–280, 2011. DOI: https://doi.org/10.1007/s00365-011-9131-1.

  248. M. Espig, K. K. Naraparaju, J. Schneider. A note on tensor chain approximation. Computing and Visualization in Science, vol. 15, no. 6, pp. 331–344, 2012. DOI: https://doi.org/10.1007/s00791-014-0218-7.

  249. Q. B. Zhao, M. Sugiyama, L. H. Yuan, A. Cichocki. Learning efficient tensor representations with ring-structured networks. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Brighton, UK, pp. 8608–8612, 2019. DOI: https://doi.org/10.1109/ICASSP.2019.8682231.

  250. Q. B. Zhao, G. X. Zhou, S. L. Xie, L. Q. Zhang, A. Cichocki. Tensor ring decomposition. [Online], Available: https://arxiv.org/abs/1606.05535, 2016.

  251. W. Hackbusch, S. Kühn. A new scheme for the tensor representation. Journal of Fourier Analysis and Applications, vol. 15, no. 5, pp. 706–722, 2009. DOI: https://doi.org/10.1007/s00041-009-9094-9.

  252. L. Grasedyck. Hierarchical singular value decomposition of tensors. SIAM Journal on Matrix Analysis and Applications, vol. 31, no. 4, pp. 2029–2054, 2010. DOI: https://doi.org/10.1137/090764189.

  253. N. Lee, A. Cichocki. Regularized computation of approximate pseudoinverse of large matrices using low-rank tensor train decompositions. SIAM Journal on Matrix Analysis and Applications, vol. 37, no. 2, pp. 598–623, 2016. DOI: https://doi.org/10.1137/15M1028479.

  254. N. Lee, A. Cichocki. Fundamental tensor operations for large-scale data analysis using tensor network formats. Multidimensional Systems and Signal Processing, vol. 29, no. 3, pp. 921–960, 2018. DOI: https://doi.org/10.1007/s11045-017-0481-0.

  255. N. Cohen, O. Sharir, A. Shashua. On the expressive power of deep learning: A tensor analysis. In Proceedings of the 29th Annual Conference on Learning Theory, New York, USA, pp. 698–728, 2016.

  256. M. Zhu, S. Gupta. To prune, or not to prune: Exploring the efficacy of pruning for model compression. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  257. H. T. Huang, L. B. Ni, K. W. Wang, Y. G. Wang, H. Yu. A highly parallel and energy efficient three-dimensional multilayer CMOS-RRAM accelerator for tensorized neural network. IEEE Transactions on Nanotechnology, vol. 17, no. 4, pp. 645–656, 2018. DOI: https://doi.org/10.1109/TNANO.2017.2732698.

  258. J. H. Su, J. L. Li, B. Bhattacharjee, F. R. Huang. Tensorial neural networks: Generalization of neural networks and application to model compression. [Online], Available: https://arxiv.org/abs/1805.10352, 2018.

  259. D. H. Wang, G. S. Zhao, H. N. Chen, Z. X. Liu, L. Deng, G. Q. Li. Nonlinear tensor train format for deep neural network compression. Neural Networks, vol. 144, pp. 320–333, 2021. DOI: https://doi.org/10.1016/j.neunet.2021.08.028.

  260. J. Achterhold, J. M. Köhler, A. Schmeink, T. Genewein. Variational network quantization. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  261. C. Leng, H. Li, S. H. Zhu, R. Jin. Extremely low bit neural network: Squeeze the last bit out with ADMM. [Online], Available: https://arxiv.org/abs/1707.09870, 2017.

  262. A. J. Zhou, A. B. Yao, Y. W. Guo, L. Xu, Y. R. Chen. Incremental network quantization: Towards lossless CNNs with low-precision weights. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.

  263. S. Jung, C. Son, S. Lee, J. Son, J. J. Han, Y. Kwak, S. J. Hwang, C. Choi. Learning to quantize deep networks by optimizing quantization intervals with task loss. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4345–4354, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00448.

  264. S. C. Zhou, Y. Z. Wang, H. Wen, Q. Y. He, Y. H. Zou. Balanced quantization: An effective and efficient approach to quantized neural networks. Journal of Computer Science and Technology, vol. 32, no. 4, pp. 667–682, 2017. DOI: https://doi.org/10.1007/s11390-017-1750-y.

  265. Y. Choi, M. El-Khamy, J. Lee. Learning sparse low-precision neural networks with learnable regularization. [Online], Available: https://arxiv.org/abs/1809.00095, 2018.

  266. K. Wang, Z. J. Liu, Y. J. Lin, J. Lin, S. Han. HAQ: Hardware-aware automated quantization with mixed precision. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 8604–8612, 2019. DOI: https://doi.org/10.1109/CVPR.2019.00881.

  267. L. Deng, P. Jiao, J. Pei, Z. Z. Wu, G. Q. Li. GXNOR-Net: Training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework. Neural Networks, vol. 100, pp. 49–58, 2018. DOI: https://doi.org/10.1016/j.neunet.2018.01.010.

  268. R. Banner, I. Hubara, E. Hoffer, D. Soudry. Scalable methods for 8-bit training of neural networks. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 5151–5159, 2018. DOI: https://doi.org/10.5555/3327345.3327421.

  269. C. Sakr, N. R. Shanbhag. Per-tensor fixed-point quantization of the back-propagation algorithm. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.

  270. N. G. Wang, J. Choi, D. Brand, C. Y. Chen, K. Gopalakrishnan. Training deep neural networks with 8-bit floating point numbers. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 7686–7695, 2018. DOI: https://doi.org/10.5555/3327757.3327866.

  271. R. Zhao, Y. W. Hu, J. Dotzel, C. De Sa, Z. R. Zhang. Improving neural network quantization without retraining using outlier channel splitting. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA, pp. 7543–7552, 2019.

  272. Z. C. Liu, Z. Q. Shen, M. Savvides, K. T. Cheng. ReActNet: Towards precise binary neural network with generalized activation functions. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 143–159, 2020. DOI: https://doi.org/10.1007/978-3-030-58568-6_9.

  273. G. Tej Pratap, R. Kumar, N. S. Pradeep. Hybrid and non-uniform quantization methods using retro synthesis data for efficient inference. In Proceedings of International Joint Conference on Neural Networks, IEEE, Shenzhen, China, 2021. DOI: https://doi.org/10.1109/IJCNN52387.2021.9533724.

  274. C. Gong, Y. Chen, Y. Lu, T. Li, C. Hao, D. M. Chen. VecQ: Minimal loss DNN model compression with vectorized weight quantization. IEEE Transactions on Computers, vol. 70, no. 5, pp. 696–710, 2021. DOI: https://doi.org/10.1109/TC.2020.2995593.

  275. C. Z. Zhu, S. Han, H. Z. Mao, W. J. Dally. Trained ternary quantization. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.

  276. R. P. K. Poudel, U. Bonde, S. Liwicki, C. Zach. ContextNet: Exploring context and detail for semantic segmentation in real-time. [Online], Available: https://arxiv.org/abs/1805.04554, 2018.

  277. R. P. K. Poudel, S. Liwicki, R. Cipolla. Fast-SCNN: Fast semantic segmentation network. In Proceedings of the 30th British Machine Vision Conference, Cardiff, UK, 2019.

  278. M. Courbariaux, Y. Bengio, J. P. David. BinaryConnect: Training deep neural networks with binary weights during propagations. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 3123–3131, 2015.

  279. I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, Y. Bengio. Quantized neural networks: Training neural networks with low precision weights and activations. The Journal of Machine Learning Research, vol. 18, no. 1, pp. 6869–6898, 2017. DOI: https://doi.org/10.5555/3122009.3242044.

  280. S. C. Zhou, Y. X. Wu, Z. K. Ni, X. Y. Zhou, H. Wen, Y. H. Zou. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. [Online], Available: https://arxiv.org/abs/1606.06160, 2016.

  281. M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, Y. Bengio. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or −1. [Online], Available: https://arxiv.org/abs/1602.02830, 2016.

  282. K. Weinberger, A. Dasgupta, J. Langford, A. Smola, J. Attenberg. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning, ACM, Montréal, Canada, pp. 1113–1120, 2009. DOI: https://doi.org/10.1145/1553374.1553516.

  283. W. L. Chen, J. T. Wilson, S. Tyree, K. Q. Weinberger, Y. X. Chen. Compressing neural networks with the hashing trick. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, pp. 2285–2294, 2015.

  284. R. Spring, A. Shrivastava. Scalable and sustainable deep learning via randomized hashing. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, Halifax, Canada, pp. 445–454, 2017. DOI: https://doi.org/10.1145/3097983.3098035.

  285. Y. J. Lin, S. Han, H. Z. Mao, Y. Wang, B. Dally. Deep gradient compression: Reducing the communication bandwidth for distributed training. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  286. T. W. Chin, C. Zhang, D. Marculescu. Layer-compensated pruning for resource-constrained convolutional neural networks. [Online], Available: https://arxiv.org/abs/1810.00518, 2018.

  287. Y. H. He, J. Lin, Z. J. Liu, H. R. Wang, L. J. Li, S. Han. AMC: AutoML for model compression and acceleration on mobile devices. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 815–832, 2018. DOI: https://doi.org/10.1007/978-3-030-01234-2_48.

  288. X. F. Xu, M. S. Park, C. Brick. Hybrid pruning: Thinner sparse networks for fast inference on edge devices. [Online], Available: https://arxiv.org/abs/1811.00482, 2018.

  289. J. B. Ye, X. Lu, Z. Lin, J. Z. Wang. Rethinking the smaller-norm-less-informative assumption in channel pruning of convolution layers. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  290. J. H. Luo, J. X. Wu. AutoPruner: An end-to-end trainable filter pruning method for efficient deep model inference. Pattern Recognition, vol. 107, Article number 107461, 2020. DOI: https://doi.org/10.1016/j.patcog.2020.107461.

  291. X. L. Dai, H. X. Yin, N. K. Jha. NeST: A neural network synthesis tool based on a grow-and-prune paradigm. IEEE Transactions on Computers, vol. 68, no. 10, pp. 1487–1497, 2019. DOI: https://doi.org/10.1109/TC.2019.2914438.

  292. Z. Liu, J. G. Li, Z. Q. Shen, G. Huang, S. M. Yan, C. S. Zhang. Learning efficient convolutional networks through network slimming. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2755–2763, 2017. DOI: https://doi.org/10.1109/ICCV.2017.298.

  293. P. Molchanov, A. Mallya, S. Tyree, I. Frosio, J. Kautz. Importance estimation for neural network pruning. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 11256–11264, 2019. DOI: https://doi.org/10.1109/CVPR.2019.01152.

  294. A. Renda, J. Frankle, M. Carbin. Comparing rewinding and fine-tuning in neural network pruning. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.

  295. G. G. Ding, S. Zhang, Z. Z. Jia, J. Zhong, J. G. Han. Where to prune: Using LSTM to guide data-dependent soft pruning. IEEE Transactions on Image Processing, vol. 30, pp. 293–304, 2021. DOI: https://doi.org/10.1109/TIP.2020.3035028.

  296. M. B. Lin, L. J. Cao, S. J. Li, Q. X. Ye, Y. H. Tian, J. Z. Liu, Q. Tian, R. R. Ji. Filter sketch for network pruning. IEEE Transactions on Neural Networks and Learning Systems, to be published. DOI: https://doi.org/10.1109/TNNLS.2021.3084206.

  297. M. B. Lin, R. R. Ji, S. J. Li, Y. Wang, Y. J. Wu, F. Y. Huang, Q. X. Ye. Network pruning using adaptive exemplar filters. IEEE Transactions on Neural Networks and Learning Systems, to be published. DOI: https://doi.org/10.1109/TNNLS.2021.3084856.

  298. S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, E. Shelhamer. cuDNN: Efficient primitives for deep learning. [Online], Available: https://arxiv.org/abs/1410.0759, 2014.

  299. X. L. Dai, H. X. Yin, N. K. Jha. Grow and prune compact, fast, and accurate LSTMs. IEEE Transactions on Computers, vol. 69, no. 3, pp. 441–452, 2020. DOI: https://doi.org/10.1109/TC.2019.2954495.

  300. M. H. Zhu, J. Clemons, J. Pool, M. Rhu, S. W. Keckler, Y. Xie. Structurally sparsified backward propagation for faster long short-term memory training. [Online], Available: https://arxiv.org/abs/1806.00512, 2018.

  301. F. Alibart, E. Zamanidoost, D. B. Strukov. Pattern classification by memristive crossbar circuits using ex situ and in situ training. Nature Communications, vol. 4, no. 1, Article number 2072, 2013. DOI: https://doi.org/10.1038/ncomms3072.

  302. Z. Liu, M. J. Sun, T. H. Zhou, G. Huang, T. Darrell. Rethinking the value of network pruning. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.

  303. J. Frankle, M. Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.

  304. N. Cohen, A. Shashua. Convolutional rectifier networks as generalized tensor decompositions. In Proceedings of the 33rd International Conference on Machine Learning, New York City, USA, pp. 955–963, 2016.

  305. Y. P. Chen, X. J. Jin, B. Y. Kang, J. S. Feng, S. C. Yan. Sharing residual units through collective tensor factorization in deep neural networks. [Online], Available: https://arxiv.org/abs/1703.02180v2, 2017.

  306. S. H. Li, L. Wang. Neural network renormalization group. Physical Review Letters, vol. 121, no. 26, Article number 260601, 2018. DOI: https://doi.org/10.1103/PhysRevLett.121.260601.

  307. G. Evenbly, G. Vidal. Algorithms for entanglement renormalization. Physical Review B, vol. 79, no. 14, Article number 144108, 2009. DOI: https://doi.org/10.1103/PhysRevB.79.144108.

  308. A. S. Morcos, H. N. Yu, M. Paganini, Y. D. Tian. One ticket to win them all: Generalizing lottery ticket initializations across datasets and optimizers. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, Canada, Article number 444, 2019. DOI: https://doi.org/10.5555/3454287.3454731.

  309. H. N. Yu, S. Edunov, Y. D. Tian, A. S. Morcos. Playing the lottery with rewards and multiple languages: Lottery tickets in RL and NLP. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, pp. 1–12, 2020.

  310. E. Malach, G. Yehudai, S. Shalev-Shwartz, O. Shamir. Proving the lottery ticket hypothesis: Pruning is all you need. In Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, pp. 6682–6691, 2020.

  311. L. Orseau, M. Hutter, O. Rivasplata. Logarithmic pruning is all you need. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Vancouver, Canada, Article number 246, 2020. DOI: https://doi.org/10.5555/3495724.3495970.

  312. S. K. Ye, T. Y. Zhang, K. Q. Zhang, J. Y. Li, K. D. Xu, Y. F. Yang, F. X. Yu, J. Tang, M. Fardad, S. J. Liu, X. Chen, X. Lin, Y. Z. Wang. Progressive weight pruning of deep neural networks using ADMM. [Online], Available: https://arxiv.org/abs/1810.07378, 2018.

  313. A. Polino, R. Pascanu, D. Alistarh. Model compression via distillation and quantization. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.

  314. P. Jiang, G. Agrawal. A linear speedup analysis of distributed deep learning with sparse and quantized communication. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 2530–2541, 2018. DOI: https://doi.org/10.5555/3327144.3327178.

  315. G. Tzelepis, A. Asif, S. Baci, S. Cavdar, E. E. Aksoy. Deep neural network compression for image classification and object detection. In Proceedings of the 18th IEEE International Conference on Machine Learning and Applications, IEEE, Boca Raton, USA, pp. 1621–1628, 2019. DOI: https://doi.org/10.1109/ICMLA.2019.00266.

  316. D. Lee, D. H. Wang, Y. K. Yang, L. Deng, G. S. Zhao, G. Q. Li. QTTnet: Quantized tensor train neural networks for 3D object and video recognition. Neural Networks, vol. 144, pp. 420–432, 2021. DOI: https://doi.org/10.1016/j.neunet.2021.05.034.

  317. X. Z. Zhu, J. F. Dai, L. Yuan, Y. C. Wei. Towards high performance video object detection. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 7210–7218, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00753.

  318. J. Lin, Y. M. Rao, J. W. Lu, J. Zhou. Runtime neural pruning. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 2178–2188, 2017. DOI: https://doi.org/10.5555/3294771.3294979.

  319. Y. M. Rao, J. W. Lu, J. Lin, J. Zhou. Runtime network routing for efficient image classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 10, pp. 2291–2304, 2019. DOI: https://doi.org/10.1109/TPAMI.2018.2878258.

  320. X. T. Gao, Y. R. Zhao, L. Dudziak, R. D. Mullins, C. Z. Xu. Dynamic channel pruning: Feature boosting and suppression. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.

  321. J. H. Yu, L. J. Yang, N. Xu, J. C. Yang, T. S. Huang. Slimmable neural networks. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.

  322. Z. D. Zhang, C. Jung. Recurrent convolution for compact and cost-adjustable neural networks: An empirical study. [Online], Available: https://arxiv.org/abs/1902.09809, 2019.

  323. S. C. Liu, Y. Y. Lin, Z. M. Zhou, K. M. Nan, H. Liu, J. Z. Du. On-demand deep model compression for mobile devices: A usage-driven model selection framework. In Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services, ACM, Munich, Germany, pp. 389–400, 2018. DOI: https://doi.org/10.1145/3210240.3210337.

  324. T. Bolukbasi, J. Wang, O. Dekel, V. Saligrama. Adaptive neural networks for efficient inference. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 527–536, 2017.

  325. X. Wang, F. Yu, Z. Y. Dou, T. Darrell, J. E. Gonzalez. SkipNet: Learning dynamic routing in convolutional networks. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 420–436, 2018. DOI: https://doi.org/10.1007/978-3-030-01261-8_25.

  326. A. Ehteshami Bejnordi, R. Krestel. Dynamic channel and layer gating in convolutional neural networks. In Proceedings of the 43rd German Conference on Artificial Intelligence, Springer, Bamberg, Germany, pp. 33–45, 2020. DOI: https://doi.org/10.1007/978-3-030-58285-2_3.

  327. J. Q. Guan, Y. Liu, Q. Liu, J. Peng. Energy-efficient amortized inference with cascaded deep classifiers. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI.org, Stockholm, Sweden, pp. 2184–2190, 2018. DOI: https://doi.org/10.24963/ijcai.2018/302.

  328. H. X. Li, Z. Lin, X. H. Shen, J. Brandt, G. Hua. A convolutional neural network cascade for face detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 5325–5334, 2015. DOI: https://doi.org/10.1109/CVPR.2015.7299170.

  329. R. A. Jacobs, M. I. Jordan, S. J. Nowlan, G. E. Hinton. Adaptive mixtures of local experts. Neural Computation, vol. 3, no. 1, pp. 79–87, 1991. DOI: https://doi.org/10.1162/neco.1991.3.1.79.

  330. A. Veit, S. Belongie. Convolutional networks with adaptive inference graphs. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 3–18, 2018. DOI: https://doi.org/10.1007/978-3-030-01246-5_1.

  331. H. Y. Wang, Z. Q. Qin, S. Y. Li, X. Li. CoDiNet: Path distribution modeling with consistency and diversity for dynamic routing. IEEE Transactions on Pattern Analysis and Machine Intelligence, to be published. DOI: https://doi.org/10.1109/TPAMI.2021.3084680.

  332. J. Hu, L. Shen, G. Sun. Squeeze-and-excitation networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 7132–7141, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00745.

  333. F. Wang, M. Q. Jiang, C. Qian, S. Yang, C. Li, H. G. Zhang, X. G. Wang, X. O. Tang. Residual attention network for image classification. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 6450–6458, 2017. DOI: https://doi.org/10.1109/CVPR.2017.683.

  334. M. Y. Ren, A. Pokrovsky, B. Yang, R. Urtasun. SBNet: Sparse blocks network for fast inference. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8711–8720, 2018. DOI: https://doi.org/10.1109/CVPR.2018.00908.

  335. A. Recasens, P. Kellnhofer, S. Stent, W. Matusik, A. Torralba. Learning to zoom: A saliency-based sampling layer for neural networks. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 52–67, 2018. DOI: https://doi.org/10.1007/978-3-030-01240-3_4.

  336. Z. R. Yang, Y. H. Xu, W. R. Dai, H. K. Xiong. Dynamic-stride-net: Deep convolutional neural network with dynamic stride. In Proceedings of SPIE 11187, Optoelectronic Imaging and Multimedia Technology VI, SPIE, Hangzhou, China, Article number 1118707, 2019. DOI: https://doi.org/10.1117/12.2537799.

  337. W. H. Wu, D. L. He, X. Tan, S. F. Chen, Y. Yang, S. L. Wen. Dynamic inference: A new approach toward efficient video action recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Seattle, USA, pp. 2890–2898, 2020. DOI: https://doi.org/10.1109/CVPRW50498.2020.00346.

  338. B. L. Zhou, A. Khosla, A. Lapedriza, A. Oliva, A. Torralba. Learning deep features for discriminative localization. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2921–2929, 2016. DOI: https://doi.org/10.1109/CVPR.2016.319.

  339. A. H. Phan, A. Cichocki, P. Tichavský, D. P. Mandic, K. Matsuoka. On revealing replicating structures in multiway data: A novel tensor decomposition approach. In Proceedings of the 10th International Conference on Latent Variable Analysis and Signal Separation, Springer, Tel Aviv, Israel, pp. 297–305, 2012. DOI: https://doi.org/10.1007/978-3-642-28551-6_37.

  340. J. Pei, L. Deng, S. Song, M. G. Zhao, Y. H. Zhang, S. Wu, G. R. Wang, Z. Zou, Z. H. Wu, W. He, F. Chen, N. Deng, S. Wu, Y. Wang, Y. J. Wu, Z. Y. Yang, C. Ma, G. Q. Li, W. T. Han, H. L. Li, H. Q. Wu, R. Zhao, Y. Xie, L. P. Shi. Towards artificial general intelligence with hybrid Tianjic chip architecture. Nature, vol. 572, no. 7767, pp. 106–111, 2019. DOI: https://doi.org/10.1038/s41586-019-1424-8.

  341. P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura, B. Brezzo, I. Vo, S. K. Esser, R. Appuswamy, B. Taba, A. Amir, M. D. Flickner, W. P. Risk, R. Manohar, D. S. Modha. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science, vol. 345, no. 6197, pp. 668–673, 2014. DOI: https://doi.org/10.1126/science.1254642.

  342. N. Schuch, I. Cirac, D. Pérez-García. PEPS as ground states: Degeneracy and topology. Annals of Physics, vol. 325, no. 10, pp. 2153–2192, 2010. DOI: https://doi.org/10.1016/j.aop.2010.05.008.

  343. A. Hallam, E. Grant, V. Stojevic, S. Severini, A. G. Green. Compact neural networks based on the multiscale entanglement renormalization ansatz. In Proceedings of British Machine Vision Conference, Newcastle, UK, 2018.

Acknowledgements

This work was supported by National Key R&D Program of China (No. 2018AAA0102600), Beijing Natural Science Foundation, China (No. JQ21015), Beijing Academy of Artificial Intelligence (BAAI), China, and Pengcheng Laboratory, China.

Author information

Authors and Affiliations

  1. Applied Research Center Laboratory, Tencent Platform and Content Group, Shenzhen, 518057, China

    Yang Wu

  2. School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, 710049, China

    Ding-Heng Wang & Man Yao

  3. School of Artificial Intelligence, Xidian University, Xi’an, 710071, China

    Xiao-Tong Lu & Wei-Sheng Dong

  4. Division of Information Science, Nara Institute of Science and Technology, Nara, 6300192, Japan

    Fan Yang

  5. Peng Cheng Laboratory, Shenzhen, 518000, China

    Man Yao

  6. Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, 19104-6389, USA

    Jian-Bo Shi

  7. Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China

    Guo-Qi Li

  8. University of Chinese Academy of Sciences, Beijing, 100190, China

    Guo-Qi Li


Corresponding authors

Correspondence to Yang Wu or Guo-Qi Li.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Yang Wu received the B. Sc. degree in information and the Ph. D. degree in control science and engineering from Xi’an Jiaotong University, China in 2004 and 2010, respectively. He is currently a principal researcher with Applied Research Center (ARC) Laboratory, Tencent Platform and Content Group (PCG), China. From July 2019 to May 2021, he was a program-specific senior lecturer with Department of Intelligence Science and Technology, Kyoto University, Japan. He was an assistant professor of the Nara Institute of Science and Technology (NAIST) International Collaborative Laboratory for Robotics Vision, NAIST, from December 2014 to June 2019. From 2011 to 2014, he was a program-specific researcher with the Academic Center for Computing and Media Studies, Kyoto University, Japan.

His research interests include computer vision, pattern recognition, as well as multimedia content analysis, enhancement and generation.

Ding-Heng Wang received the B. Eng. degree in mechanical engineering and automation, the M. Eng. degree in software engineering from Xi’an Jiaotong University, China in 2010 and 2014, respectively, and the Ph. D. degree in control science and engineering from Xi’an Jiaotong University, China in 2022. From 2014 to 2017, he was a software engineer in China Aerospace Science and Industry Corporation Limited, China.

His research interests include tensor decomposition, neural network compression, and efficient machine learning model.

Xiao-Tong Lu received the B. Sc. degree in electronic engineering from Xidian University, China in 2016, where he is currently a Ph. D. degree candidate in intelligent information processing.

His research interests include deep learning, compressive sensing, image restoration and deep neural network compression.

Fan Yang received the B. Sc. degree in geographical informational system from Nanjing University, China in 2012, and the M. Sc. degree in information science from Nara Institute of Science and Technology, Japan in 2018. He is currently a Ph. D. degree candidate in information science at Nara Institute of Science and Technology, Japan.

His research interest is on video processing.

Man Yao received the M. Eng. degree in electronic and communication engineering from Xi’an Jiaotong University, China in 2018. He is currently a Ph. D. degree candidate in control science and engineering at Xi’an Jiaotong University, China. Since May 2021, he has been an intern at Peng Cheng Laboratory, China.

His research interests include spiking neural network and dynamic neural network.

Wei-Sheng Dong received the B. Sc. degree in electronic engineering from Huazhong University of Science and Technology, China in 2004, and the Ph. D. degree in circuits and system from Xidian University, China in 2010. He was a visiting student with Microsoft Research Asia, China in 2006. From 2009 to 2010, he was a research assistant with Department of Computing, Hong Kong Polytechnic University, China. In 2010, he joined School of Electronic Engineering, Xidian University, China as a lecturer, where he has been a professor since 2016. He was a recipient of the Best Paper Award at the SPIE Visual Communication and Image Processing (VCIP) in 2010. He has served as an Associate Editor of IEEE Transactions on Image Processing and is currently an Associate Editor of SIAM Journal of Imaging Sciences.

His research interests include inverse problems in image processing, deep learning, and sparse representation.

Jian-Bo Shi received the B. A. degree in computer science and mathematics from Cornell University, USA in 1994, and the Ph. D. degree in computer science from the University of California at Berkeley, USA in 1998. He joined The Robotics Institute at Carnegie Mellon University, USA in 1999 as a research faculty member, and in 2003 he moved to the University of Pennsylvania, where he is currently a professor of Computer and Information Science. In 2007, he was awarded the Longuet-Higgins Prize for his work on Normalized Cuts.

His research focuses on first-person vision, human behavior analysis and image recognition-segmentation. His other research interests include image/video retrieval, 3D vision, and vision-based desktop computing. His long-term interests center on the broader area of machine intelligence: he wishes to develop a “visual thinking” module that allows computers not only to understand the environment around us, but also to achieve cognitive abilities such as machine memory and learning.

Guo-Qi Li received the B. Eng. degree in automation from the Xi’an University of Technology, China in 2004, the M. Eng. degree in control engineering from Xi’an Jiaotong University, China in 2007, and the Ph. D. degree in electrical and electronic engineering from Nanyang Technological University, Singapore in 2011. From 2011 to 2014, he was a scientist with the Data Storage Institute and the Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore. From 2014 to 2022, he was an assistant professor and then an associate professor at Tsinghua University, China. Since 2022, he has been with the Institute of Automation, Chinese Academy of Sciences and the University of Chinese Academy of Sciences, where he is currently a full professor. He has authored or co-authored more than 150 journal and conference papers. He has been actively involved in professional services, serving as a Tutorial Chair, an International Technical Program Committee Member, a PC Member, a Publication Chair and a Track Chair for several international conferences. He is an Editorial Board Member of Control and Decision, and has served as an Associate Editor for Journal of Control and Decision and Frontiers in Neuroscience: Neuromorphic Engineering. He is a reviewer for Mathematical Reviews, published by the American Mathematical Society, and serves as a reviewer for a number of prestigious international journals and top AI conferences, including ICLR, NeurIPS, ICML, AAAI, etc. He was the recipient of the First Class Prize in Science and Technology of the Chinese Institute of Command and Control in 2018 and the Second Prize of the Fujian Provincial Science and Technology Progress Award in 2020. He received the Outstanding Young Talent Award of the Beijing Natural Science Foundation in 2021.

His research interests include brain-inspired intelligence, neuromorphic computing and spiking neural networks.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and Permissions

About this article

Cite this article

Wu, Y., Wang, DH., Lu, XT. et al. Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies. Mach. Intell. Res. 19, 366–411 (2022). https://doi.org/10.1007/s11633-022-1340-5

  • Received: 07 April 2022

  • Accepted: 26 May 2022

  • Published: 18 August 2022

  • Issue Date: October 2022

  • DOI: https://doi.org/10.1007/s11633-022-1340-5


Keywords

  • Visual recognition
  • deep neural networks (DNNs)
  • brain-inspired methodologies
  • network compression
  • dynamic inference
  • survey