Abstract
Reservoir computing (RC) is a popular class of recurrent neural networks (RNNs) with untrained dynamics. Recent advances in deep RC architectures have had a great impact on time-series applications, offering a convenient trade-off between predictive performance and training complexity. In this paper, we deepen the analysis of untrained RNNs by studying the quality of the recurrent dynamics developed by the layers of deep RC neural networks. We do so by assessing the richness of the neural representations at the different levels of the architecture, using measures originating from the fields of dynamical systems, numerical analysis, and information theory. Our experiments, on both synthetic and real-world datasets, show that depth, as an architectural factor of RNN design, has a natural effect on the quality of RNN dynamics, even without learning of the internal connections. The interplay between depth and the values of the RC scaling hyper-parameters, especially the scaling of inter-layer connections, is crucial for designing rich untrained recurrent neural systems.
Notes
The maximum among the eigenvalues in modulus.
The size of \({\mathcal {K}}\) is set to 0.3 times the standard deviation of the instantaneous reservoir activations, as in [34].
From LÍngua BRAsileira de Sinais, i.e., Brazilian Sign Language.
The original dataset contained a variable number of time-steps with zero input features at the beginning and at the end of each time-series, which we removed as a preliminary step.
Despite its simplicity, this architectural design choice for deep RC has proved useful in several application contexts (see, e.g., [16, 19]). Notice, however, that the deep RC approach is not restrictive in this sense in general, as it allows different hyper-parameterizations at different levels of the architecture.
Given a vector \({\mathbf {v}}\), \(\text {MAD}({\mathbf {v}}) = \text {MEDIAN}(| {\mathbf {v}}-\text {MEDIAN}({\mathbf {v}})|)\).
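As a concrete illustration of the first note above, the spectral radius of a random reservoir weight matrix can be computed, and imposed by rescaling, along the following lines. This is a minimal NumPy sketch; the function name, matrix size, and target value are illustrative, not taken from the paper's setup.

```python
import numpy as np

def scale_to_spectral_radius(W, rho):
    """Rescale W so that its spectral radius (the maximum
    eigenvalue modulus) equals the target value rho."""
    sr = max(abs(np.linalg.eigvals(W)))
    return W * (rho / sr)

# Example: a random 100x100 reservoir matrix rescaled to spectral radius 0.9.
rng = np.random.default_rng(42)
W = scale_to_spectral_radius(rng.uniform(-1.0, 1.0, (100, 100)), rho=0.9)
```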
References
Atiya AF, Parlos AG (2000) New results on recurrent network training: unifying the algorithms and accelerating convergence. IEEE Trans Neural Netw 11(3):697–709
Bacciu D, Barsocchi P, Chessa S, Gallicchio C, Micheli A (2014) An experimental characterization of reservoir computing in ambient assisted living applications. Neural Comput Appl 24(6):1451–1464
Bengio Y, Simard P, Frasconi P (1994) Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166
Chen Y, Keogh E, Hu B, Begum N, Bagnall A, Mueen A, Batista G (2015) The UCR time series classification archive. www.cs.ucr.edu/~eamonn/time_series_data/
Colla V, Matino I, Dettori S, Cateni S, Matino R (2019) Reservoir computing approaches applied to energy management in industry. In: International conference on engineering applications of neural networks. Springer, pp 66–79
Cover TM (1965) Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Trans Electron Comput 3:326–334
Dettori S, Matino I, Colla V, Speets R (2020) Deep echo state networks in industrial applications. In: IFIP international conference on artificial intelligence applications and innovations. Springer, pp 53–63
Dias DB, Madeo RC, Rocha T, Biscaro HH, Peres SM (2009) Hand movement recognition for brazilian sign language: a study using distance-based neural networks. In: 2009 international joint conference on neural networks, pp. 697–704. IEEE
Dua D, Graff C (2017) UCI machine learning repository. http://archive.ics.uci.edu/ml
Gallicchio C (2019) Chasing the echo state property. In: 27th European symposium on artificial neural networks, computational intelligence and machine learning, ESANN 2019, pp 667–672. ESANN (i6doc.com)
Gallicchio C, Micheli A (2010) A Markovian characterization of redundancy in echo state networks by PCA. In: Proc. of the 18th European symposium on artificial neural networks (ESANN). d-side publications
Gallicchio C, Micheli A (2011) Architectural and markovian factors of echo state networks. Neural Netw 24(5):440–456
Gallicchio C, Micheli A (2017) Deep echo state network (deepesn): a brief survey. arXiv preprint arXiv:1712.04323
Gallicchio C, Micheli A (2017) Echo state property of deep reservoir computing networks. Cogn Comput 9(3):337–350
Gallicchio C, Micheli A (2019) Reservoir topology in deep echo state networks. In: International conference on artificial neural networks. Springer, pp. 62–75
Gallicchio C, Micheli A (2020) Fast and deep graph neural networks. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 3898–3905
Gallicchio C, Micheli A (2021) Deep reservoir computing. In: Nakajima K, Fischer I (eds) Reservoir computing. Springer, pp 77–95
Gallicchio C, Micheli A, Pedrelli L (2017) Deep reservoir computing: a critical experimental analysis. Neurocomputing 268:87–99. https://doi.org/10.1016/j.neucom.2016.12.089
Gallicchio C, Micheli A, Pedrelli L (2018) Design of deep echo state networks. Neural Netw 108:33–47
Gallicchio C, Scardapane S (2020) Deep randomized neural networks. Recent Trends Learn Data 43–68
Graves A, Mohamed Ar, Hinton G (2013) Speech recognition with deep recurrent neural networks. In: 2013 IEEE international conference on acoustics, speech and signal processing, pp 6645–6649. IEEE
Haber E, Ruthotto L (2017) Stable architectures for deep neural networks. Inverse Probl 34(1):014004
Hermans M, Schrauwen B (2013) Training and analysing deep recurrent neural networks. Adv Neural Inf Process Syst 26:190–198
Hu H, Wang L, Lv SX (2020) Forecasting energy consumption and wind power generation using deep echo state network. Renew Energy 154:598–613
Jaeger H (2001) The “echo state” approach to analysing and training recurrent neural networks—with an erratum note. Tech. rep., German National Research Center for Information Technology (GMD), Bonn, Germany
Jaeger H (2002) Short term memory in echo state networks. Tech. rep, GMD-German National Research Institute for Computer Science
Jaeger H (2005) Reservoir riddles: suggestions for echo state network research. In: Proceedings of the 2005 IEEE international joint conference on neural networks (IJCNN), vol 3, pp 1460–1462. IEEE
Jaeger H, Haas H (2004) Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science 304(5667):78–80
Jaeger H, Lukoševičius M, Popovici D, Siewert U (2007) Optimization and applications of echo state networks with leaky-integrator neurons. Neural Netw 20(3):335–352
Kawai Y, Park J, Asada M (2019) A small-world topology enhances the echo state property and signal propagation in reservoir computing. Neural Netw 112:15–23
Kim T, King BR (2020) Time series prediction using deep echo state networks. Neural Comput Appl 32(23):17769–17787
Lukoševičius M, Jaeger H (2009) Reservoir computing approaches to recurrent neural network training. Comput Sci Rev 3(3):127–149
Olszewski RT (2001) Generalized feature extraction for structural pattern recognition in time-series data. Tech. rep., School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
Ozturk M, Xu D, Principe J (2007) Analysis and design of echo state networks. Neural Comput 19(1):111–138
Pascanu R, Gulcehre C, Cho K, Bengio Y (2013) How to construct deep recurrent neural networks. arXiv preprint arXiv:1312.6026
Principe J, Xu D, Fisher J (2000) Information theoretic learning. In: Haykin S (ed) Unsupervised adaptive filtering, vol 1. Wiley
Principe JC (2010) Information theoretic learning: Renyi’s entropy and kernel perspectives. Springer Science & Business Media
Rodan A, Tiňo P (2010) Minimum complexity echo state network. IEEE Trans Neural Netw 22(1):131–144
Scardapane S, Wang D (2017) Randomness in neural networks: an overview. Wiley Interdiscip Rev Data Min Knowl Discov 7(2):e1200
Tiňo P, Hammer B, Bodén M (2007) Markovian bias of neural-based architectures with feedback connections. In: Perspectives of neural-symbolic integration. Springer, pp 95–133
Verstraeten D, Schrauwen B, d’Haene M, Stroobandt D (2007) An experimental unification of reservoir computing methods. Neural Netw 20(3):391–403
Weigend AS (2018) Time series prediction: forecasting the future and understanding the past. Routledge
Werbos PJ (1990) Backpropagation through time: what it does and how to do it. Proc IEEE 78(10):1550–1560
Williams BH, Toussaint M, Storkey AJ (2006) Extracting motion primitives from natural handwriting data. In: International conference on artificial neural networks. Springer, pp 634–643
Xue Y, Yang L, Haykin S (2007) Decoupled echo state networks with lateral inhibition. Neural Netw 20(3):365–376
Yildiz I, Jaeger H, Kiebel S (2012) Re-visiting the echo state property. Neural Netw 35:1–9
Acknowledgements
This work has been partially supported by the European Union’s Horizon 2020 Research and Innovation program, under project TEACHING (Grant agreement ID: 871385), URL https://www.teaching-h2020.eu, and by the project BrAID under the Bando Ricerca Salute 2018—Regional public call for research and development projects aimed at supporting clinical and organisational innovation processes of the Regional Health Service—Regione Toscana.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A Further Results
Here, we report additional experimental results that provide further support to the analysis conducted in the paper.
In particular, in Fig. 10 we show the deviation of the results across the different datasets, expressed in terms of the median absolute deviation (MAD, see footnote 8), for the spectral radius, the input scaling, and the inter-layer scaling. The provided plots match the corresponding median results given in Figs. 7, 8, and 9. We can observe that the deviation is generally small and, across the reservoir configurations and numbers of layers, follows a trend in line with the observations made in Sect. 4.3, with typically lower values for richer reservoirs.
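The MAD statistic of footnote 8 admits a direct implementation; the following minimal NumPy sketch (the function name is ours) makes the definition \(\text {MAD}({\mathbf {v}}) = \text {MEDIAN}(| {\mathbf {v}}-\text {MEDIAN}({\mathbf {v}})|)\) concrete:

```python
import numpy as np

def mad(v):
    """Median absolute deviation: MEDIAN(|v - MEDIAN(v)|)."""
    v = np.asarray(v, dtype=float)
    return np.median(np.abs(v - np.median(v)))

# e.g., mad([1, 2, 3, 4, 5]) -> 1.0 (median 3; absolute deviations 2, 1, 0, 1, 2)
```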
Next, we detail the outcomes of the experiments reported in Sect. 4.3, aggregated on the individual datasets. We provide the values of \(\text {ESP}_{index}\), ASE, LUD, and C achieved under the experimental settings illustrated in Sect. 4.2, varying the spectral radius \(\rho \) in Fig. 11, the input scaling \( \omega _{{{\text{in}}}} \) in Fig. 12, and the inter-layer scaling \( \omega _{{{\text{il}}}} \) in Fig. 13. In each figure, each row corresponds to a different dataset. The values shown in the plots confirm the same trends analyzed in Sect. 4.3 (see Figs. 7, 8, and 9).
Finally, we report the outcomes of further experiments conducted under the same conditions as in Sect. 4.2, but with the input of each dataset preliminarily rescaled to \([-1,1]\) along each dimension individually. The results, given in Fig. 14, are qualitatively in line with those without rescaling shown in Figs. 11, 12, and 13, thereby confirming the analysis discussed in Sect. 4.3.
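The per-dimension rescaling to \([-1,1]\) mentioned above can be sketched as follows, assuming each dataset is arranged as a time-steps × features array; the function name and the handling of constant dimensions are our own choices, not specified in the paper.

```python
import numpy as np

def rescale_to_unit_range(X):
    """Rescale each input dimension (column) of X to [-1, 1] independently,
    mapping the per-column minimum to -1 and the maximum to 1."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant dimensions
    return 2.0 * (X - lo) / span - 1.0
```

In a multi-dataset pipeline, this would be applied to each dataset separately before driving the reservoir, so that every input dimension spans the same range.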
Cite this article
Gallicchio, C., Micheli, A. Architectural richness in deep reservoir computing. Neural Comput & Applic 35, 24525–24542 (2023). https://doi.org/10.1007/s00521-021-06760-7