Estimating the number of zero-one multi-way tables via sequential importance sampling

Abstract

In 2005, Chen et al. introduced a sequential importance sampling (SIS) procedure to analyze zero-one two-way tables with given fixed marginal sums (row and column sums) via the conditional Poisson (CP) distribution. They showed that, compared with Markov chain Monte Carlo (MCMC)-based approaches, their importance sampling method is more efficient in terms of running time and also provides an easy and accurate estimate of the total number of contingency tables with fixed marginal sums. In this paper, we extend their result to zero-one multi-way (\(d\)-way, \(d \ge 2\)) contingency tables under the no \(d\)-way interaction model, i.e., with fixed \(d-1\) marginal sums. We also show by simulations that the SIS procedure with the CP distribution works very well for estimating the number of zero-one three-way tables with given marginal sums under the no three-way interaction model, even with some rejections. Finally, we apply our method to Sampson’s monks data set.
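
As an illustration of the two-way case only, the following is a minimal sketch of an SIS counting estimator, not the authors' implementation. It fills a zero-one table column by column, draws the rows that receive a 1 from a CP distribution, and averages the importance weights \(1/q(T)\), with rejected (stuck) draws contributing 0. The weights \(r_i/(n' - r_i)\), where \(r_i\) is the remaining sum of row \(i\) and \(n'\) is the number of columns still to fill, are an assumption about the two-way scheme, and the function names `esp_table`, `sample_cp`, `sis_weight`, and `estimate_count` are hypothetical.

```python
import random


def esp_table(w, k):
    """e[i][s]: sum over size-s subsets of items i..m-1 of the product of weights."""
    m = len(w)
    e = [[0.0] * (k + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        e[i][0] = 1.0
    for i in range(m - 1, -1, -1):
        for s in range(1, k + 1):
            e[i][s] = e[i + 1][s] + w[i] * e[i + 1][s - 1]
    return e


def sample_cp(w, k, rng):
    """Draw a size-k subset with probability proportional to the product of its
    weights (conditional Poisson); return (chosen indices, probability of draw)."""
    e = esp_table(w, k)
    chosen, prob, s = [], 1.0, k
    for i in range(len(w)):
        if s == 0:
            break
        p_in = w[i] * e[i + 1][s - 1] / e[i][s]
        if rng.random() < p_in:
            chosen.append(i)
            prob *= p_in
            s -= 1
        else:
            prob *= 1.0 - p_in
    return chosen, prob


def sis_weight(row_sums, col_sums, rng):
    """Build one table column by column; return 1/q(table), or 0.0 on rejection."""
    r = list(row_sums)
    q = 1.0
    for j, c in enumerate(col_sums):
        rem = len(col_sums) - j  # columns still to fill, including this one
        forced = [i for i, ri in enumerate(r) if ri == rem]   # must receive a 1
        free = [i for i, ri in enumerate(r) if 0 < ri < rem]  # may receive a 1
        k = c - len(forced)
        if k < 0 or k > len(free):
            return 0.0  # this column cannot be completed: reject the draw
        # Assumed CP weights: r_i / (rem - r_i) for each free row.
        w = [r[i] / (rem - r[i]) for i in free]
        picked, p = sample_cp(w, k, rng)
        q *= p
        for i in forced + [free[t] for t in picked]:
            r[i] -= 1
    return 1.0 / q


def estimate_count(row_sums, col_sums, n_samples=1000, seed=0):
    """Average the importance weights to estimate the number of zero-one tables."""
    if sum(row_sums) != sum(col_sums) or max(row_sums, default=0) > len(col_sums):
        return 0.0
    rng = random.Random(seed)
    total = sum(sis_weight(row_sums, col_sums, rng) for _ in range(n_samples))
    return total / n_samples
```

For example, `estimate_count([1, 1, 1], [1, 1, 1])` estimates the number of \(3 \times 3\) permutation matrices. The multi-way extension described in the abstract constrains more marginals per cell and is not covered by this sketch.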

Keywords

Categorical data analysis; Conditional Poisson; Counting problem; No three-way interaction

References

  1. Blitzstein, J., Diaconis, P. (2010). A sequential importance sampling algorithm for generating random graphs with prescribed degrees. Internet Mathematics, 6(4), 489–522.
  2. Breiger, R., Boorman, S., Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.
  3. Chen, Y. (2007). Conditional inference on tables with structural zeros. Journal of Computational and Graphical Statistics, 16(2), 445–467.
  4. Chen, Y., Diaconis, P., Holmes, S., Liu, J. S. (2005). Sequential Monte Carlo methods for statistical analysis of tables. Journal of the American Statistical Association, 100, 109–120.
  5. Chen, Y., Dinwoodie, I., Sullivant, S. (2006). Sequential importance sampling for multiway tables. The Annals of Statistics, 34(1), 523–545.
  6. De Loera, J., Haws, D., Hemmecke, R., Huggins, P., Tauzer, J., Yoshida, R. (2005). LattE, version 1.2. http://www.math.ucdavis.edu/~latte/.
  7. De Loera, J., Onn, S. (2006). All linear and integer programs are slim 3-way transportation programs. SIAM Journal on Optimization, 17, 806–821.
  8. Dinwoodie, I. H. (2008). Polynomials for classification trees and applications. Statistical and Applied Mathematical Sciences Institute Technical Report 2008-7.
  9. Dinwoodie, I. H., Chen, Y. (2011). Sampling large tables with constraints. Statistica Sinica, 21, 1591–1609.
  10. Garey, M. R., Johnson, D. S. (1979). Computers and intractability, a guide to the theory of NP-completeness. San Francisco: Freeman & Co.
  11. Huber, M. (2006). Fast perfect sampling from linear extensions. Discrete Mathematics, 306, 420–428.
  12. R-Project-Team. (2011). R project. GNU software. http://www.r-project.org/.
  13. Sampson, S. (1969). Crisis in a cloister. Doctoral dissertation (unpublished).
  14. Snijders, T. A. B. (1991). Enumeration and simulation methods for \(0-1\) matrices with given marginals. Psychometrika, 56, 397–417.

Copyright information

© The Institute of Statistical Mathematics, Tokyo 2012

Authors and Affiliations

  1. Statistics Department, University of Kentucky, Lexington, USA
  2. Statistics Department, University of Kentucky, Lexington, USA
  3. Computational Genetics, IBM Thomas J. Watson Research Center, Yorktown Heights, USA
