Abstract
The problem of determining a normal linear model with possible perturbations, viz. change-points and outliers, is formulated as a problem of testing multiple hypotheses, and a Bayes invariant optimal multi-decision procedure is provided for detecting at most k (k > 1) such perturbations. Under fairly mild assumptions, the asymptotic form of the procedure is a penalized log-likelihood procedure which depends neither on the loss function nor on the prior distribution of the shifts. The term which penalizes too large a number of changes (or outliers) arises mainly from realistic assumptions about their occurrence. It differs from the term which appears in Akaike's or Schwarz's criteria, although it is of the same order as the latter. Some concrete numerical examples are analyzed.
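As a rough illustration of what a penalized log-likelihood rule for the number of change-points can look like, the Python sketch below fits a piecewise-constant mean by least squares and picks the number of shifts that maximizes the Gaussian log-likelihood minus a penalty. It is not the procedure of the paper: the penalty of log n per change is only a placeholder of Schwarz-type order, the function names are hypothetical, and outliers are not handled.

```python
import math


def segment_cost(prefix, prefix_sq, i, j):
    """Residual sum of squares of y[i:j] around its own mean."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def best_rss_by_changes(y, max_changes):
    """Return [(rss, change_points)] for 0..max_changes mean shifts.

    Uses dynamic programming over segment end points; a change-point c
    means that a new segment starts at index c.
    """
    n = len(y)
    prefix, prefix_sq = [0.0], [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)
    inf = float("inf")
    # dp[m][j]: minimal RSS of y[:j] split into m + 1 segments
    dp = [[inf] * (n + 1) for _ in range(max_changes + 1)]
    back = [[0] * (n + 1) for _ in range(max_changes + 1)]
    for j in range(1, n + 1):
        dp[0][j] = segment_cost(prefix, prefix_sq, 0, j)
    for m in range(1, max_changes + 1):
        for j in range(m + 1, n + 1):
            for i in range(m, j):
                c = dp[m - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if c < dp[m][j]:
                    dp[m][j] = c
                    back[m][j] = i
    results = []
    for m in range(max_changes + 1):
        cps, j = [], n
        for level in range(m, 0, -1):
            j = back[level][j]
            cps.append(j)
        results.append((dp[m][n], sorted(cps)))
    return results


def select_num_changes(y, max_changes, penalty_per_change=None):
    """Pick the number of mean shifts by a penalized log-likelihood score."""
    n = len(y)
    if penalty_per_change is None:
        # ASSUMPTION: Schwarz-type order (log n), not the paper's penalty term.
        penalty_per_change = math.log(n)
    best = None
    for m, (rss, cps) in enumerate(best_rss_by_changes(y, max_changes)):
        rss = max(rss, 1e-12)  # guard against log(0) and rounding error
        # Maximized Gaussian log-likelihood (up to a constant) minus penalty
        score = -0.5 * n * math.log(rss / n) - penalty_per_change * m
        if best is None or score > best[0]:
            best = (score, m, cps)
    return best[1], best[2]
```

For a series with one clear jump, `select_num_changes(y, 3)` would be expected to return a single change-point at the index where the second regime starts. Passing a different `penalty_per_change` shows how strongly the selected number of changes depends on the penalty, which is the quantity the paper derives from assumptions about how perturbations occur.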
References
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle, 2nd International Symposium on Information Theory, 267–281, Akademiai Kiado, Budapest.
Alexander, W. P. (1993). Testing the means of independent normal random variables, Comput. Statist. Data Anal., 16, 1–10.
Barry, D. and Hartigan, J. A. (1993). A Bayesian analysis for change point problems, J. Amer. Statist. Assoc., 88, 309–319.
Caussinus, H. and Vaillant, J. (1985). Some geometric tools for the Gaussian linear model, Linear Statistical Inference, Lecture Notes in Statist., 35, 1–19, Springer, Berlin.
Chernoff, H. and Zacks, S. (1964). Estimating the current mean of a normal distribution which is subject to changes in time, Ann. Math. Statist., 35, 999–1018.
Farley, J. U. and Hinich, M. J. (1970). A test for a shifting slope coefficient in a linear model, J. Amer. Statist. Assoc., 65, 1320–1329.
Ferguson, T. S. (1967). Mathematical Statistics: a Decision Theoretic Approach, Academic Press, New York and London.
Freeman, P. R. (1980). On the number of outliers in data from a linear model (with discussion), Bayesian Statistics (eds. J. M. Bernardo et al.), 349–365, University Press, Valencia.
Gardner, L. A. (1969). On detecting changes in the mean of normal variates, Ann. Math. Statist., 40, 116–126.
Hand, D. J., Daly, F., Lunn, A. D., McConway, K. J. and Ostrowski, E. (1994). A Handbook of Small Data Sets, Chapman & Hall, London.
Hannan, E. J. and Quinn, B. G. (1979). The determination of the order of an autoregression, J. Roy. Statist. Soc. Ser. B, 41, 190–195.
Hawkins, D. M. (1977). Testing a sequence of observations for a shift in location, J. Amer. Statist. Assoc., 72, 180–186.
Hinkley, D. V. (1971). Inference in two-phase regression, J. Amer. Statist. Assoc., 66, 736–743.
Jandhyala, V. K. and MacNeill, I. B. (1991). Tests for parameter changes at unknown times in linear regression models, J. Statist. Plann. Inference, 27, 291–316.
Kashiwagi, N. (1991). Bayesian detection of structural changes, Ann. Inst. Statist. Math., 43, 77–93.
Kim, H. and Siegmund, D. (1989). The likelihood ratio test for a change-point in simple linear regression, Biometrika, 76, 409–423.
Leonard, T. (1982). Comment on M. Lejeune and G. D. Faulkenberry, “A simple predictive density function”, J. Amer. Statist. Assoc., 77, 657–658.
Maddala, G. S. (1977). Econometrics, McGraw-Hill, Singapore.
Maronna, R. and Yohai, V. (1978). A bivariate test for the detection of a systematic change in means, J. Amer. Statist. Assoc., 73, 640–645.
Page, E. S. (1955). A test for a change in a parameter occurring at an unknown time point, Biometrika, 42, 523–526.
Pettit, L. I. (1992). Bayes factors for outlier models using the device of imaginary observations, J. Amer. Statist. Assoc., 87, 541–545.
Quandt, R. E. (1958). The estimation of the parameter of a linear regression system obeying two separate regimes, J. Amer. Statist. Assoc., 53, 873–880.
Schwarz, G. (1978). Estimating the dimension of a model, Ann. Statist., 6, 461–464.
Shibata, R. (1981). An optimal selection of regression variables, Biometrika, 68, 45–54.
Smith, A. F. M. (1980). Change-point problems: approaches and applications, Bayesian Statistics (eds. J. M. Bernardo et al.), 83–98, University Press, Valencia.
Smith, A. F. M. and Spiegelhalter, D. J. (1980). Bayes factors and choice criteria for linear models, J. Roy. Statist. Soc. Ser. B, 42, 213–220.
Smith, A. F. M. and West, M. (1983). Monitoring renal transplants: an application of the multiprocess Kalman filter, Biometrics, 39, 867–878.
Taplin, R. H. and Raftery, A. E. (1994). Analysis of agricultural fields trials in the presence of outliers and fertility jumps, Biometrics, 50, 764–781.
Worsley, K. J. (1979). On the likelihood ratio test for a shift in location of normal populations, J. Amer. Statist. Assoc., 74, 365–367.
Worsley, K. J. (1983). Testing for a two-phase multiple regression, Technometrics, 25, 35–42.
Yao, Y. C. (1984). Estimation of a noisy discrete-time step function: Bayes and empirical Bayes approaches, Ann. Statist., 12, 1434–1447.
Yao, Y. C. (1988). Estimating the number of change points by Schwarz's criterion, Statist. Probab. Lett., 6, 181–189.
Cite this article
Caussinus, H., Lyazrhi, F. Choosing a Linear Model with a Random Number of Change-Points and Outliers. Annals of the Institute of Statistical Mathematics 49, 761–775 (1997). https://doi.org/10.1023/A:1003230713770