A Bayesian solution to reconstructing centrally censored distributions

Baker, Peter; Mengersen, Kerrie; Davis, Gerard

doi:10.1198/108571105X28697

Abstract

Bayesian methods are investigated for the reconstruction of mixtures in the case of central censoring. Earlier literature suggested that when the relationship between a continuous and a categorical variable is of interest, a cost-efficient strategy may be to measure the categorical variable only in the tails of the continuous distribution. Such samples occur in population epidemiology and gene mapping. Because central observations are not classified, the mixture component to which each observation belongs is not known. Three cases of censoring, which correspond to differing amounts of available information, are compared. Closed form solutions are not available and so Markov chain Monte Carlo techniques are employed to estimate posterior densities. Evidence for a mixture of two populations is assessed via Bayes factors calculated using a Laplace-Metropolis estimator. Although parameter estimates appear to be satisfactory in most situations, evidence of two populations is only found when the component populations are well separated, tail sizes are not too small, or typing information is available. Extension of these methods to incorporate fixed effects is illustrated by application to a cattle breeding experiment.
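
The sampling design described above, in which the continuous variable is recorded for every unit but the categorical variable is measured only in the tails, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code: the function name, the two-component normal mixture, the default parameter values, and the equal-sized tails are all assumptions made for illustration. The abstract does not detail the three censoring cases, so only this single design is shown.

```python
import random


def simulate_central_censoring(n=1000, p=0.5, mu=(0.0, 2.0), sigma=1.0,
                               tail_frac=0.1, seed=0):
    """Simulate a two-component normal mixture with central censoring.

    Hypothetical illustration: every unit keeps its continuous value y,
    but the component label is kept only for units falling in the lower
    or upper tail (each of size n * tail_frac). Central units get None.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        k = 1 if rng.random() < p else 0   # latent component membership
        y = rng.gauss(mu[k], sigma)        # continuous measurement
        draws.append((y, k))

    draws.sort(key=lambda pair: pair[0])   # order units by continuous value
    m = int(n * tail_frac)                 # number of typed units per tail

    sample = []
    for i, (y, k) in enumerate(draws):
        in_tail = i < m or i >= n - m
        sample.append((y, k if in_tail else None))
    return sample


if __name__ == "__main__":
    data = simulate_central_censoring(n=200, tail_frac=0.1, seed=1)
    typed = [lab for _, lab in data if lab is not None]
    print(len(data), "units,", len(typed), "with an observed label")
```

In such a sample the central observations carry no label, so their component membership would have to be treated as unknown in any subsequent mixture analysis.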

Author information

Peter Baker (Research Statistician)
Present address: CSIRO Mathematical and Information Sciences, Queensland Bioscience Precinct, 306 Carmody Road, St Lucia, QLD 4067, Australia
Gerard Davis (Managing Director of Genetic Solutions Pty Ltd)
Present address: P. O. Box 145, Albion, QLD 4010, Australia

Authors and Affiliations

School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane, QLD, Australia
Kerrie Mengersen (Research Chair in Statistics)

Corresponding author

Correspondence to Peter Baker.

About this article

Cite this article

Baker, P., Mengersen, K. & Davis, G. A Bayesian solution to reconstructing centrally censored distributions. JABES 10, 61–83 (2005). https://doi.org/10.1198/108571105X28697

Received: 15 October 2003
Revised: 15 July 2004
Issue Date: March 2005
DOI: https://doi.org/10.1198/108571105X28697
