Abstract
Bayesian methods are investigated for the reconstruction of mixtures in the case of central censoring. Earlier literature suggested that when the relationship between a continuous and a categorical variable is of interest, a cost-efficient strategy may be to measure the categorical variable only in the tails of the continuous distribution. Such samples occur in population epidemiology and gene mapping. Because central observations are not classified, the mixture component to which each observation belongs is not known. Three cases of censoring, which correspond to differing amounts of available information, are compared. Closed-form solutions are not available, so Markov chain Monte Carlo techniques are employed to estimate posterior densities. Evidence for a mixture of two populations is assessed via Bayes factors calculated using a Laplace-Metropolis estimator. Although parameter estimates appear to be satisfactory in most situations, evidence of two populations is found only when the component populations are well separated, tail sizes are not too small, or typing information is available. Extension of these methods to incorporate fixed effects is illustrated by application to a cattle breeding experiment.
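The tail-typing design described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes a two-component normal mixture and assumes that the continuous value is recorded for every observation, with only the component label withheld in the centre. The abstract does not specify its three censoring cases, so this may correspond to only one of them. The function names `simulate_tail_typed` and `log_likelihood`, and all parameter values, are hypothetical.

```python
import math
import random


def simulate_tail_typed(n, p, mu, sigma, n_tail, seed=0):
    """Draw n values from a two-component normal mixture and keep the
    component label only for the n_tail smallest and n_tail largest values.
    Assumption: every continuous value is kept; central labels become None."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        k = 0 if rng.random() < p else 1      # latent component
        data.append((rng.gauss(mu[k], sigma[k]), k))
    ys = sorted(y for y, _ in data)
    lower = ys[n_tail - 1] if n_tail > 0 else -math.inf   # lower cut-off
    upper = ys[n - n_tail] if n_tail > 0 else math.inf    # upper cut-off
    return [(y, k if (y <= lower or y >= upper) else None) for y, k in data]


def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(obs, p, mu, sigma):
    """Typed observations contribute weight * component density;
    untyped (central) observations contribute the full mixture density."""
    w = (p, 1.0 - p)
    total = 0.0
    for y, k in obs:
        if k is None:
            dens = sum(w[j] * normal_pdf(y, mu[j], sigma[j]) for j in (0, 1))
        else:
            dens = w[k] * normal_pdf(y, mu[k], sigma[k])
        total += math.log(dens)
    return total


# Hypothetical example: 200 observations, 20 typed in each tail.
obs = simulate_tail_typed(200, 0.5, (0.0, 3.0), (1.0, 1.0), n_tail=20, seed=1)
print(sum(1 for _, k in obs if k is not None), "observations carry a label")
print(log_likelihood(obs, 0.5, (0.0, 3.0), (1.0, 1.0)))
```

Under these assumptions, whether an observation is typed depends only on the observed continuous values, so the typed and untyped contributions can simply be multiplied. A likelihood of this form is the kind of target a Markov chain Monte Carlo sampler would explore once priors are added.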
Cite this article
Baker, P., Mengersen, K. & Davis, G. A Bayesian solution to reconstructing centrally censored distributions. JABES 10, 61–83 (2005). https://doi.org/10.1198/108571105X28697