Annals of the Institute of Statistical Mathematics, Volume 49, Issue 4, pp 761–775

# Choosing a Linear Model with a Random Number of Change-Points and Outliers

• Henri Caussinus
• Faouzi Lyazrhi

## Abstract

The problem of determining a normal linear model with possible perturbations, viz. change-points and outliers, is formulated as a problem of testing multiple hypotheses, and a Bayes invariant optimal multi-decision procedure is provided for detecting at most k (k > 1) such perturbations. The asymptotic form of the procedure is a penalized log-likelihood procedure which, under fairly mild assumptions, depends neither on the loss function nor on the prior distribution of the shifts. The term which penalizes too large a number of changes (or outliers) arises mainly from realistic assumptions about their occurrence. It is different from the term which appears in Akaike's or Schwarz' criteria, although it is of the same order as the latter. Some concrete numerical examples are analyzed.

Key words: Akaike's criterion, Bayes decision procedure, change-point, invariance, maximal invariant, outliers, regression analysis, Schwarz' criterion.
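To make the idea of a penalized log-likelihood procedure concrete, the following is a minimal sketch and not the procedure derived in the article. It assumes the simplest special case (shifts in the mean of a normal sequence, no outliers, no regressors) and uses a placeholder penalty of `log(n)` per change-point, chosen only because it is of Schwarz order; the article's own penalty term is not reproduced here. The function names `segment_rss` and `choose_change_points` are hypothetical.

```python
import numpy as np


def segment_rss(y):
    """rss[i, j] = residual sum of squares of y[i:j] around its own mean."""
    n = len(y)
    c1 = np.concatenate(([0.0], np.cumsum(y)))
    c2 = np.concatenate(([0.0], np.cumsum(y ** 2)))
    rss = np.zeros((n + 1, n + 1))
    for i in range(n):
        for j in range(i + 1, n + 1):
            s = c1[j] - c1[i]
            # Clip at zero to guard against tiny negative rounding errors.
            rss[i, j] = max(c2[j] - c2[i] - s * s / (j - i), 0.0)
    return rss


def choose_change_points(y, k_max):
    """Pick at most k_max mean shifts by maximizing a penalized log-likelihood.

    Returns the sorted list of indices t at which a new segment y[t:] starts.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    rss = segment_rss(y)

    # best[m, j]: minimal RSS when y[:j] is split into m + 1 segments.
    best = np.full((k_max + 1, n + 1), np.inf)
    back = np.zeros((k_max + 1, n + 1), dtype=int)
    best[0, 1:] = rss[0, 1:]
    for m in range(1, k_max + 1):
        for j in range(m + 1, n + 1):
            # The last segment is y[t:j]; the first t points hold m segments.
            cands = best[m - 1, m:j] + rss[m:j, j]
            t = int(np.argmin(cands)) + m
            best[m, j] = cands[t - m]
            back[m, j] = t

    tiny = np.finfo(float).tiny
    # Gaussian profile log-likelihood (up to a constant) minus the penalty.
    # ASSUMPTION: log(n) per change-point is a placeholder, not the article's term.
    crit = [
        -0.5 * n * np.log(max(best[m, n], tiny) / n) - m * np.log(n)
        for m in range(k_max + 1)
    ]
    m_hat = int(np.argmax(crit))

    # Walk the back-pointers to recover the change-point locations.
    cps, j = [], n
    for m in range(m_hat, 0, -1):
        j = int(back[m, j])
        cps.append(j)
    return sorted(cps)
```

For a sequence of length 40 whose mean jumps from about 0 to about 5 at index 20, calling `choose_change_points(y, 3)` compares the models with 0, 1, 2 and 3 shifts and keeps the one with the largest penalized criterion. Swapping in a different penalty only requires changing the `m * np.log(n)` term.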
