Abstract
Multivariate time series (MTS) data are widely available in fields including medicine, finance, bioinformatics, science and engineering. Modelling MTS data accurately is important for many decision-making activities. One area that has been largely overlooked so far is the type of time series in which the data set consists of a large number of variables but only a small number of observations. In this paper we describe the development of a novel computational method, based on Natural Computation and sparse matrices, that bypasses the size restrictions of traditional statistical MTS methods, makes no distributional assumptions, and also locates the associated parameters. Extensive results are presented in which the proposed method is compared with both traditional statistical and heuristic search techniques and evaluated on a number of criteria. The results have implications for a wide range of applications involving the learning of short MTS models.
Abbreviations
- GA: genetic algorithm
- HC: Hill Climbing
- LS: Least Squares
- ML: Maximum Likelihood
- MTS: multivariate time series
- SSV: seeded sparse-VARGA
- SVNP: sparse-VARGA-no-padding
- SVP: sparse-VARGA-padding
- VAR: Vector Auto-Regressive
- VARGA: VAR genetic algorithm
- WK: Weighted-Kappa
- YW: Yule–Walker
Acknowledgements
We thank our research partners at Moorfields Eye Hospital and the Institute of Ophthalmology for their advice and the MTS visual field data. We are grateful to Dr Allan Tucker, Dr Steve Counsell and Dr Jason Crampton for their advice and assistance. We are also grateful to the reviewers for their constructive and helpful comments. This research was funded by Moorfields Eye Hospital, London; the Engineering and Physical Sciences Research Council, UK and the Biotechnology and Biological Sciences Research Council, UK.
Cite this article
Swift, S., Kok, J. & Liu, X. Learning short multivariate time series models through evolutionary and sparse matrix computation. Nat Comput 5, 387–426 (2006). https://doi.org/10.1007/s11047-006-9005-9
DOI: https://doi.org/10.1007/s11047-006-9005-9