
Modelling heterogeneity: on the problem of group comparisons with logistic regression and the potential of the heterogeneous choice model


The comparison of coefficients of logit models obtained for different groups is widely considered problematic because the residual variances of the underlying latent variables may differ across groups. It is shown that the heterogeneous logit model can be used to account for this type of heterogeneity by considering reduced models that are identified. A model selection strategy is proposed that can distinguish between effects that are due to heterogeneity and substantial interaction effects. In contrast to the common understanding, the heterogeneous logit model is considered as a model that contains effect-modifying terms, which are not necessarily linked to variances but can also represent other types of heterogeneity in the population. This alternative interpretation of the parameters makes the heterogeneous logit model a flexible tool that can account for various sources of heterogeneity. Although the model is typically derived from latent variables, the interpretation of its parameters does not require any reference to latent variables. Latent variables serve as a motivation for binary models, but the effects in the models can be interpreted as effects on the binary response.




Author information

Corresponding author

Correspondence to Gerhard Tutz.


Proof Proposition 3.1

(a) Let us assume that the general heterogeneous choice model with predictor

$$\begin{aligned} \eta _i = ({\alpha _{00}+x_{i0}\alpha _0+{\varvec{x}}_i^T\varvec{\alpha }}+ x_{i0}({\varvec{x}}_i^{S})^T \varvec{\alpha }^{S})/\exp (x_{i0}\gamma ) \end{aligned}$$

holds, where for \(S=\{j_1,\ldots ,j_m\}\) one has \(({\varvec{x}}_i^{S})^T=(x_{ij_1},\ldots ,x_{ij_m})\), and interaction effects \((\varvec{\alpha }^{S})^T=(\alpha _{0j_1},\ldots , \alpha _{0j_m})\). One can define new parameters

$$\begin{aligned} \beta _{00}&= \alpha _{00}, \quad \beta _{0} = \alpha _{00}\frac{1-e^{\gamma }}{e^{\gamma }}+ \frac{\alpha _{0}}{e^{\gamma }},\\ \beta _j&=\alpha _j,\quad j=1,\ldots ,p,\\ \beta _{0j}&=\alpha _{j} \frac{1-e^{\gamma }}{e^{\gamma }}+\frac{\alpha _{0j}}{e^{\gamma }}\;\; \text {for}\; j \in S, \quad \beta _{0j}=\alpha _{j} \frac{1-e^{\gamma }}{e^{\gamma }}\;\; \text {for}\; j \notin S. \end{aligned}$$

Using these as the parameters of the interaction model

$$\begin{aligned} {\text {logit}}(\pi _i)=\beta _{00}+x_{i0}\beta _0+x_{i1}\beta _1+\cdots +x_{ip}\beta _p+x_{i0}x_{i1}\beta _{01}+\cdots +x_{i0}x_{ip}\beta _{0p}, \end{aligned}$$

one obtains that the linear predictor in (14) is the same as the linear predictor in the general heterogeneous choice model (13). Thus, the interaction model holds.

In addition, for \(j \notin S\) with \(\beta _j \ne 0\) the relation \(\beta _{0j}/\beta _j= (1-e^{\gamma })/e^{\gamma }\) holds. Let \(\{1,\ldots ,p\}\) be partitioned into the disjoint subsets S and \({\tilde{S}}=\{1,\ldots ,p\} {\setminus } S\). Then for pairs \(j,s \in {\tilde{S}}\) the constraints

$$\begin{aligned} \beta _{0j}/\beta _{j}=\cdots =\beta _{0s}/\beta _{s}\quad \text {for all}\;\; j,s \in {\tilde{S}} \end{aligned}$$

hold. Of course, this is a constraint only if \(|{\tilde{S}}| \ge 2\).
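Part (a) can be checked numerically. The following is a minimal sketch, not code from the article: it assumes a binary group indicator \(x_{i0} \in \{0,1\}\), obtains the \(\beta \) parameters by matching the two predictors separately in each group, and uses hypothetical parameter values and function names throughout.

```python
import math


def hetero_predictor(x0, x, a00, a0, a, a_int, gamma):
    """Predictor of the heterogeneous choice model for group x0 in {0, 1}.

    a_int[j] plays the role of alpha_0j and is 0.0 for every j outside S.
    """
    lin = a00 + x0 * a0
    lin += sum(xj * (aj + x0 * a0j) for xj, aj, a0j in zip(x, a, a_int))
    return lin / math.exp(x0 * gamma)


def interaction_predictor(x0, x, b00, b0, b, b_int):
    """Predictor of the logit model with group-by-covariate interactions."""
    return b00 + x0 * b0 + sum(
        xj * (bj + x0 * b0j) for xj, bj, b0j in zip(x, b, b_int)
    )


def to_interaction_params(a00, a0, a, a_int, gamma):
    """Map (alpha, gamma) to beta by matching both predictors in each group."""
    e_g = math.exp(gamma)
    c = (1.0 - e_g) / e_g
    b00 = a00
    b0 = a00 * c + a0 / e_g
    b = list(a)
    b_int = [aj * c + a0j / e_g for aj, a0j in zip(a, a_int)]
    return b00, b0, b, b_int


# Hypothetical values with p = 3 and S = {1}, so alpha_02 = alpha_03 = 0.
alpha = dict(a00=0.5, a0=-1.0, a=[1.0, -2.0, 0.7], a_int=[0.4, 0.0, 0.0], gamma=0.3)
b00, b0, b, b_int = to_interaction_params(**alpha)

points = [[0.0, 0.0, 0.0], [1.0, -0.5, 2.0], [-3.0, 4.0, 0.25]]
max_gap = max(
    abs(hetero_predictor(x0, x, **alpha)
        - interaction_predictor(x0, x, b00, b0, b, b_int))
    for x0 in (0, 1)
    for x in points
)
ratios_outside_S = [b_int[j] / b[j] for j in (1, 2)]

print(max_gap)           # expected to be at rounding-error level
print(ratios_outside_S)  # both entries should equal (1 - e^gamma) / e^gamma
```

The two predictors agree at every test point, and the ratios \(\beta _{0j}/\beta _j\) coincide for the covariates outside S, which is the constraint stated above.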

(b) Let us now assume that the interaction model (14) with constraints (15) holds.

Case 1 If \(|S|=p, |{\tilde{S}}|=0\), one obtains with the parameters defined by \(\alpha _{00}=\beta _{00}\), \(\alpha _{0}=\beta _{0}\), \(\alpha _{j}=\beta _{j}\), \(\alpha _{0j}=\beta _{0j}\), \(j=1,\ldots , p\), that the linear predictor \(\eta _i = ({\alpha _{00}+x_{i0}\alpha _0+{\varvec{x}}_i^T\varvec{\alpha }}+ x_{i0}{\varvec{x}}_i^T \varvec{\alpha }^{S})/\exp (x_{i0}\gamma )\) is equivalent to the predictor in (14). This means that the heterogeneous choice model holds with \(\gamma \) fixed at \(\gamma =0\), that is \(e^{\gamma }=1\), since \(\gamma \) is not identified in this case.

Case 2 Let \(|S|=p-1, |{\tilde{S}}|=1\) hold and parameters be defined by

$$\begin{aligned} \alpha _{00}&=\beta _{00}, \quad \alpha _{0} = \beta _{00}(e^{\gamma }-1)+e^{\gamma }\beta _{0}, \\ \alpha _j&=\beta _j, \quad j=1,\ldots ,p-1, \quad \alpha _p=e^{\gamma }(\beta _{p}+\beta _{0p}),\\ \alpha _{0j}&=\beta _{j}(e^{\gamma }-1) + e^{\gamma }\beta _{0j}, \quad j=1,\ldots ,p-1, \\ e^{\gamma }&= \beta _{p}/(\beta _{p}+\beta _{0p}). \end{aligned}$$

Using these parameters in the predictor \(\eta _i = ({\alpha _{00}+x_{i0}\alpha _0+{\varvec{x}}_i^T\varvec{\alpha }}+ x_{i0}(x_{i1},\ldots , x_{i,p-1}) \varvec{\alpha }^{S})/\exp (x_{i0}\gamma )\) yields the predictor in (14). Thus the interaction model is represented as a heterogeneous choice model in which \(\alpha _{0p}=0\). It should be noted that one could have omitted another interaction parameter. Without loss of generality we chose the parameter \(\alpha _{0p}\).

Case 3 Let \(|S| \le p-2, |{\tilde{S}}|=m \ge 2\) hold. Without loss of generality let \({\tilde{S}}=\{p-m+1,\ldots ,p\}\). Let parameters be defined by

$$\begin{aligned} \alpha _{00}&=\beta _{00}, \quad \alpha _{0} = \beta _{00}(e^{\gamma }-1)+e^{\gamma }\beta _{0}, \\ \alpha _j&=\beta _j, \quad j=1,\ldots ,p, \\ \alpha _{0j}&=\beta _{j}(e^{\gamma }-1) + e^{\gamma }\beta _{0j},\quad \text {for}\;\; j \in S. \end{aligned}$$

In addition, \(\gamma \) is defined by

$$\begin{aligned} e^{\gamma } = \beta _{j}/(\beta _{j}+\beta _{0j}) \quad \text {for}\;\; j \in {\tilde{S}}, \end{aligned}$$

which is possible since \((1-e^{\gamma })/e^{\gamma }= \beta _{0j}/\beta _{j}\), and \(\beta _{0j}/\beta _{j}\) has the same value for all \(j \in {\tilde{S}}\). Using these parameters in the predictor \(\eta _i = ({\alpha _{00}+x_{i0}\alpha _0+{\varvec{x}}_i^T\varvec{\alpha }}+ x_{i0}(x_{i1},\ldots , x_{i,p-m}) \varvec{\alpha }^{S})/\exp (x_{i0}\gamma )\) yields the predictor in (14). Therefore, it is shown that the heterogeneous choice model with interactions \(\alpha _{0,p-m+1}=\cdots =\alpha _{0,p}=0\) holds.
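The reverse direction in Cases 2 and 3 can be sketched in the same way. The code below is an illustration under stated assumptions rather than the article's implementation: the group indicator is binary, \(\beta _j \ne 0\) and \(\beta _j/(\beta _j+\beta _{0j}) > 0\) for \(j \in {\tilde{S}}\), and all values and function names are hypothetical.

```python
import math


def from_interaction_params(b00, b0, b, b_int, s_tilde):
    """Recover (alpha, gamma) from beta; s_tilde lists 0-based indices outside S."""
    ratios = [b_int[j] / b[j] for j in s_tilde]
    if max(ratios) - min(ratios) > 1e-12:
        raise ValueError("beta_0j / beta_j must be constant over S-tilde")
    j = s_tilde[0]
    e_g = b[j] / (b[j] + b_int[j])  # e^gamma from any covariate outside S
    if e_g <= 0.0:
        raise ValueError("e^gamma must be positive")
    a00 = b00
    a0 = b00 * (e_g - 1.0) + e_g * b0
    a = list(b)
    a_int = [
        0.0 if k in s_tilde else b[k] * (e_g - 1.0) + e_g * b_int[k]
        for k in range(len(b))
    ]
    return a00, a0, a, a_int, math.log(e_g)


def hetero_predictor(x0, x, a00, a0, a, a_int, gamma):
    """Predictor of the heterogeneous choice model for group x0 in {0, 1}."""
    lin = a00 + x0 * a0
    lin += sum(xj * (aj + x0 * a0j) for xj, aj, a0j in zip(x, a, a_int))
    return lin / math.exp(x0 * gamma)


def interaction_predictor(x0, x, b00, b0, b, b_int):
    """Predictor of the logit model with group-by-covariate interactions."""
    return b00 + x0 * b0 + sum(
        xj * (bj + x0 * b0j) for xj, bj, b0j in zip(x, b, b_int)
    )


# Hypothetical interaction model with p = 3 and S-tilde = {2, 3} (0-based 1, 2);
# the ratio beta_0j / beta_j equals -0.25 for both covariates outside S.
beta = dict(b00=0.5, b0=-0.8, b=[1.0, -2.0, 0.5], b_int=[0.3, 0.5, -0.125])
a00, a0, a, a_int, gamma = from_interaction_params(**beta, s_tilde=[1, 2])

points = [[0.0, 0.0, 0.0], [1.0, -0.5, 2.0], [-3.0, 4.0, 0.25]]
max_gap = max(
    abs(hetero_predictor(x0, x, a00, a0, a, a_int, gamma)
        - interaction_predictor(x0, x, **beta))
    for x0 in (0, 1)
    for x in points
)
print(math.exp(gamma), a_int, max_gap)
```

With \(|{\tilde{S}}|=1\) the constancy check is trivially satisfied, so the same function also covers Case 2.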

Proof Proposition 3.2

Let us consider the model (10) and assume that one of the interaction parameters is zero. Without loss of generality we assume \(\alpha _{0p}=0\). Then one has the model

$$\begin{aligned} {\text {logit}}(\pi _i)=\frac{\alpha _{00} +x_{i0}\alpha _0+x_{i1}\alpha _1+\cdots +x_{ip}\alpha _p+x_{i0}x_{i1}\alpha _{01}+\cdots +x_{i0}x_{i,p-1}\alpha _{0,p-1}}{\exp (x_{i0}\gamma )}. \end{aligned}$$

Let \(\alpha _{00},\ldots ,\alpha _{0,p-1},\gamma \) and \({{\tilde{\alpha }}}_{00},\ldots ,{{\tilde{\alpha }}}_{0,p-1},{{\tilde{\gamma }}}\) be two parameterizations of the model. It has to be shown that the two parameterizations are identical.

Let \(\pi (x_{ij})\) denote the probability of observing \(Y_i=1\) when the jth covariate has value \(x_{ij}\), and let \(\pi (x_{ij}+1)\) denote the probability when the jth covariate has value \(x_{ij}+1\); all other variables are kept fixed. In addition, let \(\pi (x_{ij}, x_{i0}=g)\) denote the probability of observing \(Y_i=1\) when the jth covariate has value \(x_{ij}\) and \(x_{i0}=g\); correspondingly, \(\pi (x_{ij}+1, x_{i0}=g)\) denotes the probability when the jth covariate has value \(x_{ij}+1\) and \(x_{i0}=g\), again with all other variables kept fixed.

(1) One obtains immediately

$$\begin{aligned} {\text {logit}}(\pi (x_{ip}+1))-{\text {logit}}(\pi (x_{ip}))= e^{-x_{i0}\gamma }\alpha _p \end{aligned}$$

and therefore, provided \(\alpha _p \ne 0\),

$$\begin{aligned} \frac{{\text {logit}}(\pi (x_{ip}+1,x_{i0}=1))-{\text {logit}}(\pi (x_{ip},x_{i0}=1))}{{\text {logit}}(\pi (x_{ip}+1,x_{i0}=0))-{\text {logit}}(\pi (x_{ip},x_{i0}=0))}= e^{-\gamma }. \end{aligned}$$

Since the equations hold for both parameterizations one obtains \(e^{\gamma }=e^{{{\tilde{\gamma }}}}\) and therefore \(\gamma ={{\tilde{\gamma }}}\).
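As a small worked illustration with hypothetical values (not taken from the article), suppose \(\alpha _p=2\) and \(\gamma =\log 2\). A unit increase in the pth covariate then changes the logit by

$$\begin{aligned} {\text {logit}}(\pi (x_{ip}+1,x_{i0}=0))-{\text {logit}}(\pi (x_{ip},x_{i0}=0))&= \alpha _p = 2,\\ {\text {logit}}(\pi (x_{ip}+1,x_{i0}=1))-{\text {logit}}(\pi (x_{ip},x_{i0}=1))&= e^{-\gamma }\alpha _p = 1, \end{aligned}$$

so the ratio of the two differences is \(1/2=e^{-\gamma }\), which returns \(\gamma =\log 2\). The value of \(\alpha _p\) cancels in the ratio, which is why only \(\alpha _p \ne 0\) is required.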

(2) For all variables \(j \ne p\) one has

$$\begin{aligned} {\text {logit}}(\pi (x_{ij}+1))-{\text {logit}}(\pi (x_{ij}))= e^{-x_{i0}\gamma }(\alpha _j+x_{i0}\alpha _{0j}). \end{aligned}$$

This yields for \(x_{i0}=0\) that \(\alpha _j={{\tilde{\alpha }}}_j\) holds, and for \(x_{i0}=1\) that \(\alpha _{0j}={\tilde{\alpha }}_{0j}\) holds.

(3) The only remaining parameters to be investigated are \(\alpha _{00}\) and \(\alpha _0\). By using for \(x_{i0}=0\)

$$\begin{aligned} {\text {logit}}(\pi _i)=\alpha _{00}+x_{i1}\alpha _1+\cdots +x_{ip}\alpha _p \end{aligned}$$

and for \(x_{i0}=1\)

$$\begin{aligned} {\text {logit}}(\pi _i)=\frac{\alpha _{00} +x_{i0}\alpha _0+x_{i1}\alpha _1+\cdots +x_{ip}\alpha _p+x_{i0}x_{i1}\alpha _{01}+\cdots +x_{i0}x_{i,p-1}\alpha _{0,p-1}}{\exp (x_{i0}\gamma )} \end{aligned}$$

one obtains \(\alpha _{00}={\tilde{\alpha }}_{00}\) and \(\alpha _{0}={\tilde{\alpha }}_{0}\), which concludes the proof.
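The three steps of the proof amount to a recipe for reading all parameters off the logits at a few covariate settings. The sketch below illustrates this under stated assumptions: the logit function is treated as known exactly (no estimation is involved), \(x_{i0}\) is binary, \(\alpha _{0p}=0\) and \(\alpha _p \ne 0\). The parameter values and function names are hypothetical.

```python
import math


def logit_model(x0, x, a00, a0, a, a_int, gamma):
    """Logit of the model with alpha_0p = 0 (last entry of a_int is 0.0)."""
    lin = a00 + x0 * a0
    lin += sum(xj * (aj + x0 * a0j) for xj, aj, a0j in zip(x, a, a_int))
    return lin / math.exp(x0 * gamma)


def recover(logit, p):
    """Recover all parameters from logits at a few design points (steps 1-3)."""
    zero = [0.0] * p

    def unit(j):
        return [1.0 if k == j else 0.0 for k in range(p)]

    def slope(j, g):
        # Change in the logit for a unit increase of covariate j in group g.
        return logit(g, unit(j)) - logit(g, zero)

    # Step 1: the ratio of the slopes of x_p across groups equals exp(-gamma).
    gamma = -math.log(slope(p - 1, 1) / slope(p - 1, 0))
    # Step 2: slopes in group 0 give alpha_j, slopes in group 1 give alpha_0j.
    a = [slope(j, 0) for j in range(p)]
    a_int = [math.exp(gamma) * slope(j, 1) - a[j] for j in range(p - 1)] + [0.0]
    # Step 3: the intercepts of the two groups give alpha_00 and alpha_0.
    a00 = logit(0, zero)
    a0 = math.exp(gamma) * logit(1, zero) - a00
    return a00, a0, a, a_int, gamma


# Hypothetical true parameters with p = 3 and alpha_03 = 0.
truth = dict(a00=0.5, a0=-1.0, a=[1.0, -2.0, 0.7], a_int=[0.4, -0.3, 0.0], gamma=0.3)


def logit(x0, x):
    return logit_model(x0, x, **truth)


recovered = recover(logit, p=3)
print(recovered)  # should reproduce the values in `truth` up to rounding
```

Because every parameter is determined by the logits alone, two parameter vectors that produce the same probabilities must coincide, which is the identifiability statement of the proposition.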

Cite this article

Tutz, G. Modelling heterogeneity: on the problem of group comparisons with logistic regression and the potential of the heterogeneous choice model. Adv Data Anal Classif 14, 517–542 (2020).

Keywords

  • Heterogeneous choice model
  • Location–scale model
  • Heterogeneity of variances
  • Logit model
  • Group comparisons
  • Non-contingent response style

Mathematics Subject Classification

  • 62J12
  • 62H99
  • 62P25