Exploring the Impact of Convolutions on LSTM Networks for Video Classification

  • Conference paper
Artificial Intelligence, Data Science and Applications (ICAISE 2023)

Part of the book series: Lecture Notes in Networks and Systems ((LNNS,volume 838))

Abstract

Video classification, a foundational task in computer vision, involves categorizing and labeling videos based on their content. It underpins a wide range of applications, including video surveillance, content recommendation, action recognition, and video indexing. The goal is to automatically analyze and understand the visual information in videos, enabling efficient organization, retrieval, and interpretation of large video collections. Combining convolutional neural networks (CNNs) with long short-term memory (LSTM) networks has substantially advanced video classification because the combination captures both spatial and temporal dependencies within video sequences: CNNs extract spatial features, while LSTMs model sequential and temporal information. Two widely adopted architectures built on this combination are ConvLSTM and LRCN (Long-term Recurrent Convolutional Networks). This paper explores the impact of convolutions on LSTM networks in the context of video classification and compares the performance of ConvLSTM and LRCN.
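
One way to see what convolutions change inside an LSTM is to count gate parameters. The sketch below is illustrative and not taken from the paper: it assumes the standard LSTM and ConvLSTM gate formulations without peephole connections, and the frame size (64 x 64 x 3), the 64 hidden units/filters, and the 3 x 3 kernel are hypothetical values chosen only for the comparison.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Parameters of a fully connected LSTM layer.

    Each of the four gates has an input weight matrix (input_dim x units),
    a recurrent weight matrix (units x units), and a bias vector (units).
    """
    return 4 * (units * (input_dim + units) + units)


def convlstm_params(in_channels: int, filters: int, kernel: int) -> int:
    """Parameters of a ConvLSTM layer (no peephole connections assumed).

    Each of the four gates convolves the input (in_channels) and the hidden
    state (filters) with kernel x kernel filters, plus one bias per filter.
    """
    return 4 * (kernel * kernel * (in_channels + filters) * filters + filters)


# Hypothetical setting: 64 x 64 RGB frames, 64 hidden units / filters.
height, width, channels = 64, 64, 3
units = 64

# Feeding flattened raw frames straight into a dense LSTM.
flat = lstm_params(height * width * channels, units)

# Replacing the dense gate transforms with 3 x 3 convolutions.
conv = convlstm_params(channels, units, kernel=3)

print(f"LSTM on flattened frames: {flat:,} parameters")
print(f"ConvLSTM with 3x3 kernels: {conv:,} parameters")
```

Under these assumptions the convolutional gates need far fewer parameters, because the kernel weights are shared across spatial positions and do not grow with frame resolution. An LRCN-style model reaches a similar economy differently: a CNN first reduces each frame to a compact feature vector, so `input_dim` for the LSTM is the length of that vector rather than the raw pixel count.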

Author information

Corresponding author

Correspondence to Manal Benzyane.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Benzyane, M., Azrour, M., Zeroual, I., Agoujil, S. (2024). Exploring the Impact of Convolutions on LSTM Networks for Video Classification. In: Farhaoui, Y., Hussain, A., Saba, T., Taherdoost, H., Verma, A. (eds) Artificial Intelligence, Data Science and Applications. ICAISE 2023. Lecture Notes in Networks and Systems, vol 838. Springer, Cham. https://doi.org/10.1007/978-3-031-48573-2_4
