
Training Echo State Networks with Regularization Through Dimensionality Reduction


Abstract

In this paper, we introduce a new framework to train a class of recurrent neural networks, called Echo State Networks (ESNs), to predict real-valued time series and to provide a visualization of the modeled system dynamics. The method consists of projecting the output of the network's internal layer onto a lower-dimensional space before training the output layer on the target task. Notably, we enforce a regularization constraint that leads to better generalization capabilities. We evaluate the performance of our approach on several benchmark tests, using different techniques to train the readout of the network, and achieve superior predictive performance with the proposed framework. Finally, we provide insight into the effectiveness of the proposed mechanism through a visualization of the trajectory in phase space, relying on the methodologies of nonlinear time-series analysis. By applying our method to well-known chaotic systems, we provide evidence that the lower-dimensional embedding retains the dynamical properties of the underlying system better than the full-dimensional internal states of the network.
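The abstract outlines a three-step pipeline: drive a fixed random reservoir with the input series, compress the resulting internal states to a low-dimensional embedding, and train only the readout on the compressed states. The sketch below is a minimal illustration of that idea, assuming PCA as the dimensionality-reduction step and a ridge-regression readout; the reservoir size, spectral radius, reduced dimension, and regularization strength are illustrative placeholders rather than the paper's actual configuration.

```python
# Minimal ESN-with-dimensionality-reduction sketch (assumed setup:
# PCA projection + ridge readout; hyperparameters are placeholders).
import numpy as np

rng = np.random.default_rng(0)

# Toy one-dimensional input series and one-step-ahead target.
u = np.sin(0.1 * np.arange(2000))[:, None]     # shape (T, 1)
y = u[1:]                                      # predict the next value
u = u[:-1]
T = len(u)

# Fixed random reservoir; rescale to spectral radius 0.9 so the
# echo state property plausibly holds.
N = 300
W_in = rng.uniform(-0.5, 0.5, (N, 1))
W = rng.uniform(-0.5, 0.5, (N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

# Collect the internal states.
X = np.zeros((T, N))
x = np.zeros(N)
for t in range(T):
    x = np.tanh(W_in @ u[t] + W @ x)
    X[t] = x

washout = 100                                  # discard the transient
X, y = X[washout:], y[washout:]

# Project the reservoir states onto the top-d principal components.
d = 20
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:d].T                              # low-dimensional embedding

# Ridge-regression readout trained on the reduced states only.
lam = 1e-6
W_out = np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)
pred = Z @ W_out
print("train MSE:", np.mean((pred - y) ** 2))
```

In this reading, restricting the readout to d ≪ N principal coordinates is what supplies the regularization constraint mentioned in the abstract, on top of the explicit ridge penalty.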



Author information

Corresponding author

Correspondence to Filippo Maria Bianchi.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Informed Consent

All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008. Additional informed consent was obtained from all patients for whom identifying information is included in this article.

Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.


About this article


Cite this article

Løkse, S., Bianchi, F.M. & Jenssen, R. Training Echo State Networks with Regularization Through Dimensionality Reduction. Cogn Comput 9, 364–378 (2017). https://doi.org/10.1007/s12559-017-9450-z

