CS-RNN: efficient training of recurrent neural networks with continuous skips

Chen, Tianyu; Li, Sheng; Yan, Jun

doi:10.1007/s00521-022-07227-z

CS-RNN: efficient training of recurrent neural networks with continuous skips

Original Article
Published: 24 June 2022

Volume 34, pages 16515–16532, (2022)
Cite this article

Neural Computing and Applications Aims and scope Submit manuscript

296 Accesses
1 Citation
1 Altmetric
Explore all metrics

Abstract

Recurrent neural networks (RNNs) provide powerful tools for sequence problems. However, simple RNN and its variants are prone to high computational cost, for which RNN variants like Skip RNN have been proposed. To further reduce the cost, we introduce a new recurrent network model, continuous skip RNN (CS-RNN), to overcome the limitation of Skip RNN. The model learns to omit relatively continuous elements in a sequence, which are less relevant to the task, allowing it to maintain its inference ability while reducing the training costs significantly. Two hyperparameters are introduced to control the number of skips to balance the efficiency and the accuracy. Six different experiments have been conducted to demonstrate the feasibility and efficiency of the proposed CS-RNN. The model is evaluated by the number of FLOPs, the accuracy, and a new metric proposed for the trade-off between efficiency and accuracy. The results have shown a significant improvement in efficiency by the proposed continuous skips while the performance of RNN has been retained, which is promising for efficient training of RNNs over long sequences.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Learning deep hierarchical and temporal recurrent neural networks with residual learning

Article 29 January 2020

Minimal gated unit for recurrent neural networks

Article 11 June 2016

Euler Recurrent Neural Network: Tracking the Input Contribution to Prediction on Sequential Data

Notes

http://yann.lecun.com/exdb/mnist/.
opihi.cs.uvic.ca/sound/genres.tar.gz.
http://ai.stanford.edu/~amaas/data/sentiment/.
https://code.google.com/archive/p/word2vec/.

References

Zhang H, Wang Z, Liu D (2014) A comprehensive review of stability analysis of continuous-time recurrent neural networks. IEEE Trans Neural Netw Learn Syst 25(7):1229–1262
Article Google Scholar
Weerakody PB, Wong KW, Wang G, Ela W (2021) A review of irregular time series data handling with gated recurrent neural networks. Neurocomputing 441:161–178
Article Google Scholar
Sakar CO, Polat SO, Katircioglu M, Kastro Y (2019) Real-time prediction of online shoppers’ purchasing intention using multilayer perceptron and LSTM recurrent neural networks. Neural Comput Appl 31(10):6893–6908
Article Google Scholar
Jin Z, Yang Y, Liu Y (2020) Stock closing price prediction based on sentiment analysis and LSTM. Neural Comput Appl 32(13):9713–9729
Article Google Scholar
Yen VT, Nan WY, Van Cuong P (2019) Recurrent fuzzy wavelet neural networks based on robust adaptive sliding mode control for industrial robot manipulators. Neural Comput Appl 31(11):6945–6958
Article Google Scholar
Wang L, Ge Y, Chen M, Fan Y (2017) Dynamical balance optimization and control of biped robots in double-support phase under perturbing external forces. Neural Comput Appl 28(12):4123–4137
Article Google Scholar
Chatziagorakis P, Ziogou C, Elmasides C, Sirakoulis GC, Karafyllidis I, Andreadis I, Georgoulas N, Giaouris D, Papadopoulos AI, Ipsakis D, Papadopoulou S, Seferlis P, Stergiopoulos F, Voutetakis S (2016) Enhancement of hybrid renewable energy systems control with neural networks applied to weather forecasting: the case of olvio. Neural Comput Appl 27(5):1093–1118
Article Google Scholar
Lin CH (2017) Retracted article: Hybrid recurrent Laguerre-orthogonal-polynomials neural network control with modified particle swarm optimization application for v-belt continuously variable transmission system. Neural Comput Appl 28(2):245–264
Article Google Scholar
Basterrech S, Krömer P (2020) A nature-inspired biomarker for mental concentration using a single-channel EEG. Neural Comput Appl 32(12):7941–7956
Article Google Scholar
De Boom C, Demeester T, Dhoedt B (2019) Character-level recurrent neural networks in practice: comparing training and sampling schemes. Neural Comput Appl 31(8):4001–4017
Article Google Scholar
Yu Z, Chen F, Deng F (2018) Unification of map estimation and marginal inference in recurrent neural networks. IEEE Trans Neural Netw Learn Syst 29(11):5761–5766
Article MathSciNet Google Scholar
Yu Y, Si X, Hu C, Zhang J (2019) A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput 31(7):1235–1270
Article MathSciNet Google Scholar
Huang T, Shen G, Deng ZH (2019) Leap-LSTM: enhancing long short-term memory for text categorization. In: Proc. international joint conference on artificial intelligence, pp 5017–5023
Yu AW, Lee H, Le Q (2017) Learning to skim text. In: Proc. annual meeting of the association for computational linguistics, pp 1880–1890
Campos V, Jou B, i Nieto XG, Torres J, Chang SF (2018) Skip RNN: Learning to skip state updates in recurrent neural networks. In: Proc. international conference on learning representations, pp 1–17
Jernite Y, Grave E, Joulin A, Mikolov T (2017) Variable computation in recurrent neural networks. arxiv:1611.06188
Seo M, Min S, Farhadi A, Hajishirzi H (2018) Neural speed reading via skim-RNN. In: Proc. international conference on learning representations, pp 1–14
Neil D, Pfeiffer M, Liu SC (2016) Phased LSTM: Accelerating recurrent network training for long or event-based sequences. In: Proc. advances in neural information processing systems, p 3882–3890
Liu L, Shen J, Zhang M, Wang Z, Tang J (2018) Learning the joint representation of heterogeneous temporal events for clinical endpoint prediction. In: Proc. AAAI conference on artificial intelligence, pp 1–9
Koutník J, Greff K, Gomez F, Schmidhuber J (2014) A clockwork RNN. Comput Sci pp 1863–1871
Koch C, Ullman S (1985) Shifts in selective visual attention: towards the underlying neural circuitry. Hum Neurobiol 4(4):219–227
Google Scholar
Molchanov P, Tyree S, Karras T, Aila T, Kautz J (2017) Pruning convolutional neural networks for resource efficient transfer learning. In: Proc. international conference on learning representations, pp 1–17
Zacks R, Hasher L (1994) Inhibitory processes in attention, memory, and language. Directed ignoring pp 241–264
Bengio Y, Léonard N, Courville A (2013) Estimating or propagating gradients through stochastic neurons for conditional computation. arxiv:1308.3432
Chung J, Ahn S, Bengio Y (2016) Hierarchical multiscale recurrent neural networks. arxiv:1609.01704
Yin P, Lyu J, Zhang S, Osher S, Qi Y, Xin J (2019) Understanding straight-through estimator in training activation quantized neural nets. In: Proc. international conference on learning representations, pp 1–30
Williams RJ (1992) Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach Learn 8(3–4):229–256
MATH Google Scholar
Courbariaux M, Bengio Y (2016) Binarynet: training deep neural networks with weights and activations constrained to +1 or -1. arxiv:1602.02830
Vakili M, Ghamsari MK, Rezaei M (2020) Performance analysis and comparison of machine and deep learning algorithms for iot data classification arxiv:2001.09636
Chinchor N (1992) MUC-4 evaluation metrics. In: Proc. conference on message understanding. Assoc Comput Linguist, pp 22–29
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Article Google Scholar
Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proc. conference on empirical methods in natural language processing, pp 1724–1734
Krueger D, Maharaj T, Kramár J, Pezeshki M, Ballas N, Ke NR, Goyal A, Bengio Y, Larochelle H, Courville AC, Pal C (2017) Zoneout: Regularizing rnns by randomly preserving hidden activations. In: Proc. international conference on learning representations, pp 1–11
Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M, et al. (2016) Tensorflow: large-scale machine learning on heterogeneous distributed systems. arxiv:1603.04467
Kingma D, Ba J (2015) Adam: A method for stochastic optimization. In: Proc. international conference for learning representations, pp 1–15
Nagabushanam P, George ST, Radha S (2019) EEG signal classification using LSTM and improved neural network algorithms. Soft Comput, pp 1–23
Hughes TW, Williamson IA, Minkov M, Fan S (2019) Wave physics as an analog recurrent neural network. Sci Adv 5(12):eaay6946
Article Google Scholar
Le QV, Jaitly N, Hinton GE (2015) A simple way to initialize recurrent networks of rectified linear units. arxiv:1504.00941
Arjovsky M, Shah A, Bengio Y (2016) Unitary evolution recurrent neural networks. In: International conference on machine learning, pp 1120–1128
Cao J, Katzir O, Jiang P, Lischinski D, Cohen-Or D, Tu C, Li Y (2018) Dida: disentangled synthesis for domain adaptation. arxiv:1805.08019
Ganin Y, Ustinova E, Ajakan H, Germain P, Larochelle H, Laviolette F, Marchand M, Lempitsky V (2016) Domain-adversarial training of neural networks. J Mach Learn Res 17(1):2096–2030
MathSciNet MATH Google Scholar
Sturm BL (2014) The state of the art ten years after a state of the art: future research in music information retrieval. J New Music Res 43(2):147–172
Article Google Scholar
Tzanetakis G, Cook P (2002) Musical genre classification of audio signals. IEEE Trans Speech Audio Process 10(5):293–302
Article Google Scholar
Acharya J, Basu A (2020) Deep neural network for respiratory sound classification in wearable devices enabled by patient specific model tuning. IEEE Trans Biomed Circuits Syst p 1-1
Maas AL, Daly RE, Pham PT, Dan H, Potts C (2011) Learning word vectors for sentiment analysis. In: Proc. meeting of the association for computational linguistics, human language technologies, pp 142–150

Download references

Author information

Authors and Affiliations

School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan, 430000, China
Tianyu Chen & Sheng Li
Concordia Institute for Information Systems Engineering Montreal, Quebec, Canada
Jun Yan

Authors

Tianyu Chen
View author publications
You can also search for this author in PubMed Google Scholar
Sheng Li
View author publications
You can also search for this author in PubMed Google Scholar
Jun Yan
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Sheng Li.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chen, T., Li, S. & Yan, J. CS-RNN: efficient training of recurrent neural networks with continuous skips. Neural Comput & Applic 34, 16515–16532 (2022). https://doi.org/10.1007/s00521-022-07227-z

Download citation

Received: 01 June 2021
Accepted: 29 March 2022
Published: 24 June 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s00521-022-07227-z

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

CS-RNN: efficient training of recurrent neural networks with continuous skips

Abstract

Access this article

Similar content being viewed by others

Learning deep hierarchical and temporal recurrent neural networks with residual learning

Minimal gated unit for recurrent neural networks

Euler Recurrent Neural Network: Tracking the Input Contribution to Prediction on Sequential Data

Notes

References

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

CS-RNN: efficient training of recurrent neural networks with continuous skips

Abstract

Access this article

Similar content being viewed by others

Learning deep hierarchical and temporal recurrent neural networks with residual learning

Minimal gated unit for recurrent neural networks

Euler Recurrent Neural Network: Tracking the Input Contribution to Prediction on Sequential Data

Notes

References

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation