Journal of Classification, Volume 35, Issue 1, pp 124–146

A Multivariate Logistic Distance Model for the Analysis of Multiple Binary Responses

  • Hailemichael M. Worku
  • Mark de Rooij
Open Access


We propose a Multivariate Logistic Distance (MLD) model for the analysis of multiple binary responses in the presence of predictors. The MLD model can be used to simultaneously assess the dimensional/factorial structure of the data and to study the effect of the predictor variables on each of the response variables. To enhance interpretation, the results of the proposed model can be graphically represented in a biplot, showing predictor variable axes, the categories of the response variables and the subjects’ positions. The interpretation of the biplot uses a distance rule. The MLD model belongs to the family of marginal models for multivariate responses, as opposed to latent variable models and conditionally specified models. By setting the distance between the two categories of every response variable to be equal, the MLD model becomes equivalent to a marginal model for multivariate binary data estimated using a GEE method. In that case the MLD model can be fitted using existing statistical packages with a GEE procedure, e.g., the genmod procedure from SAS or the geepack package from R. Without the equality constraint, the MLD model is a general model that can be fitted in its own right. We applied the proposed model to empirical data to illustrate its advantages.
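The distance rule mentioned above can be illustrated with a minimal sketch. The abstract does not state the model's formula, so everything below is an assumption made for illustration: a subject's position is taken to be a linear combination of the predictors, each response variable has one point per category in the same low-dimensional space, and the probability of a category decreases with the squared Euclidean distance to its point through a logistic form. The function names and example numbers are hypothetical and are not taken from the article.

```python
import math


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def subject_position(x, B):
    """Map predictor values x (length P) to a position in M dimensions.

    B is a P x M nested list of weights (assumed linear mapping).
    """
    n_dims = len(B[0])
    return [sum(x[p] * B[p][m] for p in range(len(x))) for m in range(n_dims)]


def category_probability(position, point0, point1):
    """Assumed distance rule: probability of category 1 of one response.

    Equals exp(-d1) / (exp(-d0) + exp(-d1)), where d0 and d1 are the squared
    distances from the subject's position to the two category points.
    """
    d0 = squared_distance(position, point0)
    d1 = squared_distance(position, point1)
    return 1.0 / (1.0 + math.exp(d1 - d0))


# Hypothetical example: two predictors, one dimension, one binary response.
B = [[1.0], [2.0]]                      # illustrative weights, not estimates
position = subject_position([1.0, 0.5], B)
prob = category_probability(position, point0=[0.0], point1=[2.0])
print(position, round(prob, 3))
```

Under this assumed form, a subject closer to a category point has a higher probability for that category, and a subject equidistant from both points has probability 0.5. The log odds are the difference of the two squared distances, which is linear in the subject's position, so the sketch is at least in line with the abstract's remark that a constrained version of the model matches a marginal logistic model.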


Multivariate binary data; Biplots; Multidimensional scaling; Multidimensional unfolding; Marginal model; Clustered bootstrap; Generalized estimating equations



Copyright information

© The Author(s) 2018

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Psychology Institute, Methodology and Statistics Unit, Leiden University, Leiden, The Netherlands
