Exploring the Impact of Convolutions on LSTM Networks for Video Classification

Benzyane, Manal; Azrour, Mourade; Zeroual, Imad; Agoujil, Said

doi:10.1007/978-3-031-48573-2_4

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 838)

Included in the following conference series:

The International Conference on Artificial Intelligence and Smart Environment

Abstract

Video classification is a foundational task in computer vision: it involves categorizing and labeling videos based on their content. Its applications include video surveillance, content recommendation, action recognition, and video indexing. The goal is to automatically analyze and understand the visual information in videos, enabling efficient organization, retrieval, and interpretation of large video collections. Combining convolutional neural networks (CNNs) with long short-term memory (LSTM) networks has substantially advanced video classification by capturing both spatial and temporal dependencies within video sequences: CNNs extract spatial features, while LSTMs model sequential and temporal information. Two widely adopted architectures built on this combination are ConvLSTM and LRCN (Long-term Recurrent Convolutional Networks). This paper explores the impact of convolutions on LSTM networks in the context of video classification and compares the performance of ConvLSTM and LRCN.
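
One way to get a feel for what convolutions change in an LSTM is to compare trainable parameter counts. The sketch below is illustrative only and is not taken from the chapter: it assumes the common formulation with four gates, one bias per gate unit, and no peephole connections, and the frame size, hidden size, kernel size, and CNN feature dimension are hypothetical values chosen for the example. It contrasts a plain LSTM fed flattened frames, a ConvLSTM layer whose gate matrix products are replaced by 2D convolutions, and the LSTM stage of an LRCN-style model that receives per-frame CNN feature vectors.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Trainable parameters of a fully connected LSTM layer.

    Each of the 4 gates has input weights (input_dim x units),
    recurrent weights (units x units), and a bias (units).
    """
    return 4 * (units * (input_dim + units) + units)


def convlstm_params(in_channels: int, filters: int, kernel: int) -> int:
    """Trainable parameters of a 2D ConvLSTM layer (no peepholes assumed).

    Each of the 4 gates convolves the input (in_channels) and the hidden
    state (filters) with kernel x kernel filters, plus one bias per filter.
    """
    return 4 * (kernel * kernel * (in_channels + filters) * filters + filters)


# Hypothetical sizes, chosen only for illustration.
height, width, channels = 64, 64, 3   # one RGB frame
hidden = 64                           # LSTM units / ConvLSTM filters
cnn_feature_dim = 256                 # per-frame CNN output in an LRCN-style model

# Plain LSTM on flattened raw frames: weights grow with height * width.
flat_lstm = lstm_params(height * width * channels, hidden)

# ConvLSTM: weights are shared across spatial positions, so the count
# does not depend on the frame size at all.
conv_lstm = convlstm_params(channels, hidden, kernel=3)

# LRCN-style: a CNN first reduces each frame to a feature vector, and a
# plain LSTM runs over those vectors (CNN parameters not counted here).
lrcn_lstm = lstm_params(cnn_feature_dim, hidden)

print(f"LSTM on flattened frames : {flat_lstm:>9,d}")
print(f"ConvLSTM (3x3 kernels)   : {conv_lstm:>9,d}")
print(f"LRCN LSTM stage only     : {lrcn_lstm:>9,d}")
```

Parameter count is only one axis of comparison. A ConvLSTM keeps a spatial feature map as its hidden state, whereas an LRCN-style model collapses each frame to a vector before the recurrent stage, so the two designs trade off spatial detail and compute differently.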

Author information

Authors and Affiliations

MMIS, MAIS, FST Errachidia, Moulay Ismail University, Meknes, Morocco
Manal Benzyane & Said Agoujil
STI, IDMS, FST Errachidia, Moulay Ismail University, Meknes, Morocco
Mourade Azrour & Imad Zeroual

Authors

Manal Benzyane
Mourade Azrour
Imad Zeroual
Said Agoujil

Corresponding author

Correspondence to Manal Benzyane.

Editor information

Editors and Affiliations

Department of Computer Science, Moulay Ismail University, Errachidia, Morocco
Yousef Farhaoui
Centre of AI and Robotics, Napier University, Edinburgh, UK
Amir Hussain
Prince Sultan University, Riyadh, Saudi Arabia
Tanzila Saba
University Canada West, Vancouver, BC, Canada
Hamed Taherdoost
Institute of Science, Banaras Hindu University, Varanasi, India
Anshul Verma

About this paper

Cite this paper

Benzyane, M., Azrour, M., Zeroual, I., Agoujil, S. (2024). Exploring the Impact of Convolutions on LSTM Networks for Video Classification. In: Farhaoui, Y., Hussain, A., Saba, T., Taherdoost, H., Verma, A. (eds) Artificial Intelligence, Data Science and Applications. ICAISE 2023. Lecture Notes in Networks and Systems, vol 838. Springer, Cham. https://doi.org/10.1007/978-3-031-48573-2_4

DOI: https://doi.org/10.1007/978-3-031-48573-2_4
Published: 30 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48572-5
Online ISBN: 978-3-031-48573-2
eBook Packages: Intelligent Technologies and Robotics (R0)
