A Bayesian solution to reconstructing centrally censored distributions

Journal of Agricultural, Biological, and Environmental Statistics

Abstract

Bayesian methods are investigated for the reconstruction of mixtures in the case of central censoring. Earlier literature suggested that when the relationship between a continuous and a categorical variable is of interest, a cost-efficient strategy may be to measure the categorical variable only in the tails of the continuous distribution. Such samples occur in population epidemiology and gene mapping. Because central observations are not classified, the mixture component to which each of them belongs is not known. Three cases of censoring, which correspond to differing amounts of available information, are compared. Closed-form solutions are not available, so Markov chain Monte Carlo techniques are employed to estimate posterior densities. Evidence for a mixture of two populations is assessed via Bayes factors calculated using a Laplace-Metropolis estimator. Although parameter estimates appear to be satisfactory in most situations, evidence of two populations is found only when the component populations are well separated, tail sizes are not too small, or typing information is available. Extension of these methods to incorporate fixed effects is illustrated by application to a cattle breeding experiment.
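As a rough illustration of the sampling scheme described in the abstract, the following sketch simulates a two-component normal mixture and records the component label only for observations in the two tails. It is not the authors' code. The normal components, the function names, the parameter values, and the choice that central observations keep their measured value but lose their label are all assumptions made for illustration; the abstract does not specify the three censoring cases, and this sketch does not attempt to reproduce them.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def simulate_central_censoring(n, weight, means, sds, tail_fraction, seed=0):
    """Draw n values from a two-component normal mixture (hypothetical setup).

    The component label is kept only for the lowest and highest
    tail_fraction of the ordered values; central labels are set to None.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        k = 0 if rng.random() < weight else 1
        draws.append((rng.gauss(means[k], sds[k]), k))
    draws.sort(key=lambda pair: pair[0])
    n_tail = int(round(tail_fraction * n))
    sample = []
    for i, (y, k) in enumerate(draws):
        in_tail = i < n_tail or i >= n - n_tail
        sample.append((y, k if in_tail else None))
    return sample


def log_likelihood(sample, weight, means, sds):
    """Observed-data log-likelihood under the assumed censoring case.

    Labelled (tail) observations contribute weight_k * f_k(y);
    unlabelled (central) observations contribute the full mixture density.
    """
    weights = (weight, 1.0 - weight)
    total = 0.0
    for y, label in sample:
        if label is None:
            density = sum(
                w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds)
            )
        else:
            density = weights[label] * normal_pdf(y, means[label], sds[label])
        total += math.log(density)
    return total


# Illustrative values only: 200 observations, 10% typed in each tail.
sample = simulate_central_censoring(
    n=200, weight=0.5, means=(0.0, 3.0), sds=(1.0, 1.0), tail_fraction=0.1
)
n_typed = sum(1 for _, label in sample if label is not None)
loglik = log_likelihood(sample, 0.5, (0.0, 3.0), (1.0, 1.0))
```

A log-likelihood of this form, combined with priors, is the kind of target a Markov chain Monte Carlo sampler would explore when no closed-form posterior is available. The sampler and the Bayes factor calculation are left out of this sketch.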



Author information

Corresponding author

Correspondence to Peter Baker.

About this article

Cite this article

Baker, P., Mengersen, K. & Davis, G. A Bayesian solution to reconstructing centrally censored distributions. JABES 10, 61–83 (2005). https://doi.org/10.1198/108571105X28697

  • DOI: https://doi.org/10.1198/108571105X28697
