AStA Advances in Statistical Analysis, Volume 101, Issue 1, pp 51–65

Estimates for cell counts and common odds ratio in three-way contingency tables by homogeneous log-linear models with missing data

  • Haresh D. Rochani
  • Robert L. Vogel
  • Hani M. Samawi
  • Daniel F. Linder
Original Paper


Abstract

Missing observations often occur in cross-classified data collected during observational, clinical, and public health studies. Inappropriate treatment of missing data can reduce statistical power and give biased results. This work extends the modeling approach of Baker, Rosenberger and DerSimonian to compute maximum likelihood estimates for cell counts in three-way tables with missing data, and studies the association between two dichotomous variables while controlling for a third variable in \( 2\times 2 \times K \) tables. The approach is applied to data from the Behavioral Risk Factor Surveillance System. Simulation studies are used to investigate the efficiency of estimation of the common odds ratio.
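As an illustration of the target quantity only, the following minimal sketch computes a common odds ratio for a fully observed \( 2\times 2 \times K \) table. It uses the classical Mantel-Haenszel estimator, which is a standard complete-data method and not the log-linear missing-data approach developed in the paper. The function name and the example counts are hypothetical.

```python
from typing import Sequence

# One stratum is a 2x2 table [[a, b], [c, d]] of observed counts.
Table2x2 = Sequence[Sequence[float]]


def mantel_haenszel_odds_ratio(strata: Sequence[Table2x2]) -> float:
    """Mantel-Haenszel estimate of the common odds ratio over K strata.

    Assumes complete data (no missing classifications). This is an
    illustrative stand-in, not the estimator proposed in the paper.
    """
    numerator = 0.0
    denominator = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("common odds ratio is undefined: all b*c terms are zero")
    return numerator / denominator


# Hypothetical counts for K = 2 strata; each stratum has odds ratio 4.
example_strata = [
    [[10, 20], [5, 40]],
    [[15, 10], [6, 16]],
]
common_or = mantel_haenszel_odds_ratio(example_strata)
print(f"Mantel-Haenszel common odds ratio: {common_or:.3f}")
```

Because both hypothetical strata share the same stratum-specific odds ratio, the pooled estimate coincides with it, which is the homogeneous-association situation the abstract refers to.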


Keywords

Contingency table, Cross-classified data, Log-linear model, Maximum likelihood method, Missing data, Common odds ratio, Three-way table



Acknowledgements

The authors are grateful to the two anonymous reviewers, whose suggestions significantly improved the manuscript.

Supplementary material

10182_2016_275_MOESM1_ESM.pdf (148 kb)
10182_2016_275_MOESM2_ESM.txt (42 kb)


References

  1. Agresti, A.: Wiley Series in Probability and Mathematical Statistics. Categorical Data Analysis, 3rd edn. Wiley, Hoboken (2002)
  2. Baker, S.G., Rosenberger, W.F., DerSimonian, R.: Closed-form estimates for missing counts in two-way contingency tables. Stat. Med. 11, 643–657 (1992)
  3. Behavioral Risk Factor Surveillance System: Accessed 5 July 2015 (2015)
  4. Bishop, Y.M.M., Fienberg, S.E., Holland, P.W.: Discrete Multivariate Analysis: Theory and Practice. MIT Press, New York (1995)
  5. Byrd, R.H., Lu, P., Nocedal, J., Zhu, C.: A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 16(5), 1190–1208 (1995)
  6. Chen, T., Fienberg, S.E.: Two-dimensional contingency tables with both completely and partially cross-classified data. Int. Biom. Soc. 30, 629–642 (1974)
  7. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B (Methodological), 1–38 (1977)
  8. Fay, R.E.: Causal models for patterns of nonresponse. J. Am. Stat. Assoc. 81, 354 (1986)
  9. Fleischhacker, W.W., Derks, E., Kahn, R.S.: Interpreting treatment trials in schizophrenia patients: lessons learned from EUFEST. Schizophr. Res. 138, 39–40 (2012)
  10. Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis. CRC Press, New York (2003)
  11. Green, R.G., Murphy, K.D., Snyder, S.M.: Should demographics be placed at the end or at the beginning of mailed questionnaires? An empirical answer to a persistent methodological question. Soc. Work Res. 24, 237–241 (2000)
  12. Hamano, T., Yamasaki, M., Fujisawa, Y., Ito, K., Nabika, T., Shiwaku, K.: Social capital and psychological distress of elderly in Japanese rural communities. Stress Health J. Int. Soc. Invest. Stress 27, 163–169 (2011)
  13. Heitjan, D.F.: Incomplete data: What you don’t know might hurt you. Cancer Epidemiol. Biomark. Prevent. 20, 1567–1570 (2011)
  14. Hocking, R.R., Oxspring, H.H.: The analysis of partially categorized contingency data. Biometrics 30, 469–483 (1974)
  15. Jansen, I.: Flexible model strategies and sensitivity analysis tools for non monotone incomplete categorical data. Thesis and dissertations (2005)
  16. Little, R.J.: Models for nonresponse in sample surveys. J. Am. Stat. Assoc. 77(378), 237–250 (1982)
  17. Little, R.J., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (2002)
  18. Meng, X.-L., Rubin, D.B.: Using EM to obtain asymptotic variance–covariance matrices: the SEM algorithm. J. Am. Stat. Assoc. 86(416), 899–909 (1991)
  19. Nordheim, E.V.: Inference from nonrandomly missing categorical data: an example from a genetic study on Turner’s syndrome. J. Am. Stat. Assoc. 79, 772–780 (1984)
  20. Powers, J.R., Young, A.F., Russell, A., Pachana, N.A.: Implications of non-response of older women to a short form of the Center for Epidemiologic Studies Depression Scale. Int. J. Aging Hum. Dev. 57, 37–54 (2003)
  21. Pregibon, D.: Typical survey data: estimation and imputation. Surv. Methodol. 2, 70–102 (1977)
  22. Rubin, D.: Inference and missing data. Biometrika 63(3), 581–592 (1976)
  23. Rubin, D.: Multiple Imputation for Nonresponse in Surveys. Wiley, New York (2004)
  24. Schafer, J.L.: Analysis of Incomplete Multivariate Data. CRC Press, New York (2010)

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Haresh D. Rochani (1)
  • Robert L. Vogel (1)
  • Hani M. Samawi (1)
  • Daniel F. Linder (2)

  1. Jiann-Ping Hsu College of Public Health, Department of Biostatistics, Georgia Southern University, Statesboro, Georgia
  2. Medical College of Georgia, Department of Biostatistics and Epidemiology, Augusta University, Augusta, Georgia
