Abstract
A general nonparametric approach is proposed for identifying similarities in a set of simultaneously observed time series. The trends are estimated via local polynomial regression and classified using standard clustering procedures. The equality of the trends is tested using several nonparametric test statistics whose asymptotic distributions are approximated by a bootstrap procedure. Once the estimated trends are removed, the residual series are grouped by means of a nonparametric clustering method designed specifically for time series. This method is based on a disparity measure between local linear smoothers of the spectra of the series. The proposed methodology is illustrated by applying it to a financial data example. The dependence among the observations is a crucial factor in this work and is taken into account throughout the study.
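The two-stage idea described in the abstract (estimate and remove a trend, then compare smoothed spectra of the residuals) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the fixed bandwidths `h_trend` and `h_spec`, the choice to smooth the log-periodogram, the mean squared difference used as the disparity, and all function names are assumptions made only for illustration. Both series are assumed to have the same length.

```python
import numpy as np

def local_linear(x, y, grid, h):
    """Local linear smoother with a Gaussian kernel (assumed), evaluated on grid."""
    fitted = np.empty(len(grid))
    for i, g in enumerate(grid):
        d = x - g
        w = np.exp(-0.5 * (d / h) ** 2)
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d * d).sum()
        t0, t1 = (w * y).sum(), (w * d * y).sum()
        # Intercept of the kernel-weighted least-squares line fitted around g
        fitted[i] = (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)
    return fitted

def smoothed_log_spectrum(resid, h):
    """Local linear smooth of the log-periodogram at Fourier frequencies in (0, pi)."""
    n = len(resid)
    freqs = 2 * np.pi * np.arange(1, n // 2) / n
    dft = np.fft.fft(resid - resid.mean())[1 : n // 2]
    pgram = np.abs(dft) ** 2 / (2 * np.pi * n)
    return local_linear(freqs, np.log(pgram), freqs, h)

def spectral_disparity(a, b, h_trend=0.1, h_spec=0.3):
    """Mean squared difference between smoothed log-spectra of detrended series.

    The bandwidths are illustrative defaults, not data-driven selectors.
    """
    t = np.arange(len(a)) / len(a)
    resid_a = a - local_linear(t, a, t, h_trend)  # remove the estimated trend
    resid_b = b - local_linear(t, b, t, h_trend)
    diff = smoothed_log_spectrum(resid_a, h_spec) - smoothed_log_spectrum(resid_b, h_spec)
    return float(np.mean(diff ** 2))

def disparity_matrix(series):
    """Pairwise disparities, usable as input to a standard clustering routine."""
    k = len(series)
    out = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            out[i, j] = out[j, i] = spectral_disparity(series[i], series[j])
    return out
```

The matrix returned by `disparity_matrix` could then be passed to any standard hierarchical clustering routine. In practice the bandwidths would be chosen by a selector that accounts for correlated errors rather than fixed in advance, and the bootstrap tests for equality of trends mentioned in the abstract are not covered by this sketch.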
Additional information
The research of the authors was supported by DGICYT Spanish Grants MTM2005-00429 and MTM2008-00166 (ERDF included) and XUGA Grant 07SIN012105PR.
The authors wish to thank the Editor and three anonymous referees for their helpful and constructive comments, which enhanced the presentation of the manuscript.
About this article
Cite this article
Vilar, J.M., Vilar, J.A. & Pértega, S. Classifying Time Series Data: A Nonparametric Approach. J Classif 26, 3–28 (2009). https://doi.org/10.1007/s00357-009-9030-3
DOI: https://doi.org/10.1007/s00357-009-9030-3