Statistics and Computing, Volume 22, Issue 4, pp 917–929

Exact posterior distributions and model selection criteria for multiple change-point detection problems

Article

Abstract

In segmentation problems, inference on change-point position and model selection are two difficult issues due to the discrete nature of change-points. In a Bayesian context, we derive exact, explicit and tractable formulae for the posterior distribution of variables such as the number of change-points or their positions. We also demonstrate that several classical Bayesian model selection criteria can be computed exactly. All these results are based on an efficient strategy to explore the whole segmentation space, which is very large. We illustrate our methodology on both simulated data and a comparative genomic hybridization profile.
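To make the size of the segmentation space concrete: a series of n points can be cut into K contiguous segments in C(n-1, K-1) ways, so enumerating segmentations quickly becomes infeasible. The sketch below is a generic sum-product dynamic programme, not the authors' implementation. It assumes that the quantity of interest factorizes as a product of per-segment weights; the names `sum_over_segmentations`, `brute_force`, and the `weights` table are hypothetical and chosen for illustration only.

```python
from itertools import combinations
from math import comb, isclose


def sum_over_segmentations(weights, n, max_segments):
    """Return totals, where totals[k] is the sum, over every segmentation of
    [0, n) into k contiguous segments, of the product of weights[i][j] taken
    over the segments [i, j). Assumes the weight of a segmentation factorizes
    over its segments (an assumption of this sketch)."""
    # prefix[k][j]: total weight of all ways to cut [0, j) into k segments
    prefix = [[0.0] * (n + 1) for _ in range(max_segments + 1)]
    prefix[0][0] = 1.0
    for k in range(1, max_segments + 1):
        for j in range(k, n + 1):
            # the last segment is [i, j); the first k-1 segments cover [0, i)
            prefix[k][j] = sum(
                prefix[k - 1][i] * weights[i][j] for i in range(k - 1, j)
            )
    return [prefix[k][n] for k in range(max_segments + 1)]


def brute_force(weights, n, k):
    """Same quantity by explicit enumeration; only usable for tiny n."""
    total = 0.0
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        product = 1.0
        for start, end in zip(bounds, bounds[1:]):
            product *= weights[start][end]
        total += product
    return total


if __name__ == "__main__":
    n = 8
    # With unit weights the sum simply counts segmentations.
    ones = [[1.0] * (n + 1) for _ in range(n + 1)]
    counts = sum_over_segmentations(ones, n, 4)
    for k in range(1, 5):
        print(k, counts[k], comb(n - 1, k - 1))
```

The recursion costs on the order of K·n² operations, whereas enumeration grows combinatorially with n. If each weight were a segment's marginal likelihood, multiplying `totals[k]` by a prior on the number of segments and normalizing would give a posterior over that number, under the factorization assumption stated above.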

Keywords

Bayesian model selection; change-point detection; BIC; DIC; ICL; posterior distribution of change-points; posterior distribution of segments

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. AgroParisTech, UMR 518, Paris, France
  2. INRA, UMR 518, Paris, France
  3. Département de Transfert, Institut Curie, Paris, France
  4. Bioinformatics and Statistics, NKI-AVL, Amsterdam, Netherlands
