
A latent class model for individual differences in the interpretation of conditionals


Abstract.

We investigated the hypothesis that there are three levels of performance associated with conditional reasoning: (1) Unsophisticated reasoners solve a modus tollens by accepting the invited inferences, treating the conditional as if it were a biconditional. (2) Reasoners of an intermediate level can resist the invited inferences, but cannot find the line of reasoning needed to endorse modus tollens. (3) Sophisticated reasoners do not draw the invited inferences either, but they do master the strategy to solve a modus tollens.

On a first set of six problems, solved by 214 adolescents, an unrestricted latent class analysis revealed the existence of a large subgroup of reasoners with a biconditional interpretation of the conditional, and a smaller subgroup with a conditional interpretation.

On a second set of 24 problems, solved by the same participants, a restricted latent class model corroborated the existence of a large subgroup of unsophisticated reasoners and a smaller subgroup of reasoners of an intermediate level. No evidence was found for the existence of a subgroup of sophisticated reasoners.

As expected, the class of biconditional reasoners was associated with the class of unsophisticated reasoners, and the class of conditional reasoners was associated with the class of reasoners of an intermediate level. Furthermore, the former showed a biconditional response pattern on truth table tasks, whereas the latter showed a conditional response pattern.


Notes

  1. The authors opted for two different types of latent class analyses on the two problem sets, instead of one joint analysis on all 30 problems together (the six problems of the first set and the 24 problems of the second set), for the following two reasons. First, for the set of six DA+ and AC+ problems, different interpretations of the conditional should lead to qualitatively different answers: conditional reasoners should answer 'undecidable', whereas biconditional reasoners should answer 'necessarily true'. For the set of 24 MP+ and MT+ problems, the correct answer is independent of the interpretation of the conditional; interpretation is only supposed to affect the difficulty of those problems. The consequence is that the answers on the MP+ and MT+ problems can be dichotomized (correct/incorrect) without much loss of information, unlike the answers on the DA+ and AC+ problems, calling for a different modeling approach for the two sets of problems. Second, the two different types of analyses reflect two stages in modeling. We first check whether there are latent classes corresponding to the interpretation of a conditional by conducting an unrestricted latent class analysis on the DA+ and AC+ problems (a minimal sketch of such an unrestricted analysis is given after these notes). If the existence of different interpretations is revealed, we can be confident that it is appropriate to construct a confirmatory model for the set of MP+ and MT+ problems, based on the three-level hypothesis for conditional reasoning.

  2. We would like to thank an anonymous reviewer for suggesting effect coding for the independent variables of the logistic regression, so that the intercept corresponds to the mean on a logistic scale of the solution probabilities.
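
For readers who want to see what such an unrestricted analysis looks like in practice, the following is a minimal sketch of EM estimation for a standard unconstrained latent class model on dichotomously coded items (Lazarsfeld & Henry, 1968; Goodman, 1974). The 0/1 coding of the DA+ and AC+ answers, the function name, and the array layout are assumptions made for illustration; this is not the software used for the analyses reported in the paper.

```python
import numpy as np

def lca_em(Y, T, n_iter=500, tol=1e-8, seed=0):
    """Unrestricted latent class model for binary items, fitted with EM.
    Y: (N, I) array of 0/1 responses; T: number of latent classes."""
    rng = np.random.default_rng(seed)
    p = np.full(T, 1.0 / T)                           # latent class sizes
    pi = rng.uniform(0.2, 0.8, size=(T, Y.shape[1]))  # P(item i endorsed | class t)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior class memberships under the current p and pi
        log_lik = Y @ np.log(pi).T + (1 - Y) @ np.log(1 - pi).T      # (N, T)
        log_joint = np.log(p) + log_lik
        shift = log_joint.max(axis=1, keepdims=True)
        ll = (shift.ravel() + np.log(np.exp(log_joint - shift).sum(axis=1))).sum()
        post = np.exp(log_joint - shift)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: closed-form updates of class sizes and response probabilities
        p = post.mean(axis=0)
        pi = ((post.T @ Y) / post.sum(axis=0)[:, None]).clip(1e-6, 1 - 1e-6)
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return p, pi, post, ll
```

For instance, lca_em(Y, T=2) fits a two-class solution; solutions with different numbers of classes can then be compared on the returned loglikelihood or on information criteria derived from it.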

References

  • Barrouillet, P., Grosset, N., & Lecas, J.F. (2000). Conditional reasoning by mental models: Chronometric and developmental evidence. Cognition, 75, 237–266.

  • Braine, M.D.S. (1978). On the relation between the natural logic of reasoning and standard logic. Psychological Review, 85, 1–21.

  • Braine, M.D.S., Reiser, B.J., & Rumain, B. (1984). Some empirical justification for a theory of natural propositional logic. In G. Bower (Ed.), The psychology of learning and motivation (Vol. 18, pp. 313–371). New York: Academic Press.

  • Braine, M.D.S., & O'Brien, D.P. (1991). A theory of if: A lexical entry, reasoning program, and pragmatic principles. Psychological Review, 98, 182–203.

  • Dayton, C.M. (1998). Latent class scaling analysis. Thousand Oaks, CA: Sage.

  • Dempster, A.P., Laird, N.M., & Rubin, D.B. (1977). Maximum likelihood estimation from incomplete data via the EM-algorithm (with discussion). Journal of the Royal Statistical Society, B, 39, 1–38.

  • Efron, B., & Tibshirani, R.J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.

  • Evans, J. St. B.T., Newstead, S.E., & Byrne, R.M.J. (1993). Human reasoning: The psychology of deduction. Hove, UK: Erlbaum.

  • Geis, M.C., & Zwicky, A.M. (1971). On invited inferences. Linguistic Inquiry, 2, 561–566.

  • Goodman, L.A. (1974). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika, 61, 215–231.

  • Hedeker, D., & Gibbons, R.D. (1994). A random-effects ordinal regression model for multilevel analysis. Biometrics, 50, 933–944.

  • Heinen, T. (1993). Discrete latent variable models. Series on work and organization. Tilburg, The Netherlands: Tilburg University Press.

  • Johnson-Laird, P.N., & Byrne, R.M.J. (1991). Deduction. Hillsdale, NJ: Erlbaum.

  • Johnson-Laird, P.N., Byrne, R.M.J., & Schaeken, W. (1992). Propositional reasoning by model. Psychological Review, 99, 418–439.

  • Kodroff, J.K., & Roberge, J.J. (1975). Developmental analysis of the conditional reasoning abilities of primary grade children. Developmental Psychology, 13, 342–353.

  • Lazarsfeld, P.F., & Henry, N.W. (1968). Latent structure analysis. Boston: Houghton Mifflin.

  • McLachlan, G., & Krishnan, T. (1997). The EM algorithm and extensions. New York: Wiley.

  • Mislevy, R.J., & Verhelst, N. (1990). Modeling item responses when different subjects employ different solution strategies. Psychometrika, 55, 195–215.

  • Mislevy, R.J., & Wilson, M. R. (1996). Marginal maximum likelihood estimation for a psychometric model of discontinuous development. Psychometrika, 61, 41–71.

  • Osherson, D.N. (1975). Logical abilities in children: Vol. 3. Reasoning in adolescence: Deductive inference. Hillsdale, NJ: Erlbaum.

  • Rijmen, F., & De Boeck, P. (2001). Propositional reasoning: The differential contribution of "rules" to the difficulty of complex reasoning problems. Memory & Cognition, 29, 165–175.

  • Rijmen, F., Tuerlinckx, F., & De Boeck, P. A nonlinear mixed model framework for IRT models. Submitted for publication.

  • Rips, L.J. (1983). Cognitive processes in propositional reasoning. Psychological Review, 90, 38–71.

  • Rips, L.J. (1994). The psychology of proof. Cambridge, MA: MIT Press.

  • Roberge, J.J. (1970). A study of children's ability to reason with basic principles of deductive reasoning. American Educational Research Journal, 7, 583–596.

  • Rost, J. (1990). Rasch models in latent classes: An integration of two approaches to item analysis. Applied Psychological Measurement, 14, 271–282.

  • Rumain, B., Connell, J., & Braine, M.D.S. (1983). Conversational comprehension processes are responsible for reasoning fallacies in children as well as adults: If is not the biconditional. Developmental Psychology, 19, 471–481.

  • Sclove, S.L. (1987). Application of model selection criteria to some problems in multivariate analysis. Psychometrika, 52, 333–343.

  • Vermunt, J.K. (1997). lEM: A general program for the analysis of categorical data. Tilburg, The Netherlands: Tilburg University.

  • Von Davier, M. (1997). Bootstrapping goodness-of-fit statistics for sparse categorical data: Results of a Monte Carlo study. Methods of Psychological Research Online, 2, 29–48.

  • Wildman, T.M., & Fletcher, H.J. (1977). Developmental increases and decreases in solutions of conditional syllogism problems. Developmental Psychology, 13, 630–636.

  • Wilson, M. (1989). Saltus: A psychometric model of discontinuity in cognitive development. Psychological Bulletin, 105, 276–289.

  • Yamamoto, K. (1987). A hybrid model for item responses. Unpublished dissertation, University of Illinois.

  • Zeger, S.L., & Karim, M.R. (1991). Generalized linear models with random effects: A Gibbs sampling approach. Journal of the American Statistical Association, 86, 79–86.

Author information

Corresponding author

Correspondence to Frank Rijmen.

Additional information

Frank Rijmen was supported by the Fund for Scientific Research Flanders (FWO).

Appendix

EM algorithm for constrained latent class analysis.

Our use of the EM algorithm for the discrete mixture model of Equation 9 resembles its use as proposed by Mislevy and Verhelst (1990) and by Mislevy and Wilson (1996) for these kinds of models.

The marginal likelihood function for the model of Equation 9 is

$$ L\left( {\varvec{\delta} ,{\bf p},\sigma ^2 \left| {\bf Y} \right.} \right) = \prod\limits_{n = 1}^N {\sum\limits_{t = 1}^T {p_t \left\{ {\int\limits_{\theta _{nt} } {\left( {\prod\limits_{i = 1}^I {{{\exp \left[ {y_{ni} \left( {{\bf x}_{nit}^{\bf '} \varvec{\delta} + \theta _{nt} } \right)} \right]} \over {1 + \exp \left( {{\bf x}_{nit}^{\bf '} \varvec{\delta} + \theta _{nt} } \right)}}} } \right)} N\left( {\theta _{nt} \left| {0,\sigma ^2 } \right.} \right)d\theta _{nt} } \right\}} ,} $$
(10)

where \(\varvec{\delta}\) is a parameter vector containing both the class-specific logistic regression weights and the logistic regression weights constant across classes; \({\bf x}_{nit}^{\prime}\) is a class-specific covariate vector, defined such that \({\bf x}_{nit}^{\prime}\varvec{\delta} = {\bf x}_{ni}^{\prime}\varvec{\beta}_t\); \({\bf p} = (p_1, \ldots, p_T)\) with \(p_T = 1 - \sum\limits_{t = 1}^{T - 1} p_t\); and \({\bf Y}\) is the \(N \times I\) data matrix.

Using \({\bf x}_{nit}^{\prime}\) and \(\varvec{\delta}\) instead of \({\bf x}_{ni}^{\prime}\) and \(\varvec{\beta}_t\) is done for computational convenience: it allows all logistic regression weights to be treated as weights that are constant across classes. If a logistic regression weight \(\beta_{tj}\) is specific to a particular class t, this can be taken into account by setting \(x_{nit'j}\) to zero for all \(t' \neq t\).
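
As a concrete illustration of Equation 10 and of the zeroed-covariate device just described, the following sketch evaluates the marginal log-likelihood with Gauss-Hermite quadrature over \(\theta_{nt}\). The array layout (a covariate array X of shape \(N \times I \times T \times J\) holding the \({\bf x}_{nit}\), with entries tied to other classes set to zero) and the function name are assumptions made for the sketch, not the authors' implementation.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss   # probabilists' Gauss-Hermite rule

def marginal_loglik(Y, X, delta, p, sigma2, n_quad=15):
    """Marginal log-likelihood of Eq. 10.
    Y: (N, I) binary responses; X: (N, I, T, J) class-specific covariates x_nit;
    delta: (J,) regression weights; p: (T,) class probabilities; sigma2: Var(theta_nt)."""
    nodes, wq = hermegauss(n_quad)
    wq = wq / np.sqrt(2.0 * np.pi)                   # weights for the N(0, 1) density
    theta = nodes * np.sqrt(sigma2)                  # nodes rescaled to N(0, sigma2)

    eta = np.einsum('nitj,j->nit', X, delta)         # x_nit' delta
    z = eta[..., None] + theta                       # add theta_nt at each node: (N, I, T, Q)
    log_item = Y[:, :, None, None] * z - np.logaddexp(0.0, z)   # Bernoulli loglik per item
    log_cls = log_item.sum(axis=1)                   # product over items, on the log scale

    shift = log_cls.max(axis=(1, 2), keepdims=True)  # per-person shift for numerical stability
    lik_t = np.exp(log_cls - shift) @ wq             # quadrature integral over theta_nt: (N, T)
    return (np.log(lik_t @ p) + shift[:, 0, 0]).sum()   # mix over classes, sum over persons
```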

The idea behind the EM algorithm is that maximizing (10) with respect to the parameters would be easier if the \(\varsigma_n\) and \(\theta_{nt}\) were known. In that case, the complete data likelihood \(L_c\) (based on the observed data \({\bf Y}\) together with the "missing data" \(\varsigma_n\) and \(\theta_{nt}\)) reduces to:

$$ L_c \left( {\varvec{\delta} ,{\bf p},\sigma ^2 \left| {{\bf Y},\varvec{\varsigma} ,\varvec{\theta} _\varsigma } \right.} \right) = \prod\limits_{n = 1}^N {p_t \left( {\prod\limits_{i = 1}^I {{{\exp \left[ {y_{ni} \left( {{\bf x}_{nit}^{\prime} \varvec{\delta} + \theta _{nt} } \right)} \right]} \over {1 + \exp \left( {{\bf x}_{nit}^{\prime} \varvec{\delta} + \theta _{nt} } \right)}}} } \right)N\left( {\theta _{nt} \left| {0,\sigma ^2 } \right.} \right)} $$
(11)

where \(\varvec{\varsigma}\) and \(\varvec{\theta}_\varsigma\) are \(N \times 1\) vectors containing, for each person n, the class \(t = \varsigma_n\) to which he or she belongs and the ability \(\theta_{nt}\) of that person within that class.

The missing data being unknown (by definition), the EM algorithm iterates as follows: at iteration k + 1,

  1. Compute the expected complete data loglikelihood, given the observed data and the provisional parameter estimates obtained in iteration k (E-step).

  2. Maximize the expected complete data loglikelihood with respect to the parameters of interest, resulting in new provisional parameter estimates for iteration k + 1 (M-step).

Iterate until some convergence criterion is met. For the first iteration, initial parameter estimates can be chosen at random or based on some preliminary analysis. Dempster et al. (1977) show that, under regularity conditions, the likelihood is not decreased after an iteration.
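
Schematically, the whole procedure can be written as the loop below. This is a sketch only: the helper names (e_step, update_class_probs, update_sigma2, newton_raphson_step) are assumptions, with minimal versions sketched after the corresponding equations later in this appendix, and marginal_loglik is the sketch given after Equation 10 above.

```python
import numpy as np

def fit_gem(Y, X, T, n_iter=200, tol=1e-6, seed=0):
    """Schematic generalised EM loop for the model of Eq. 9 (helper functions assumed)."""
    rng = np.random.default_rng(seed)
    delta = rng.normal(scale=0.1, size=X.shape[-1])     # random starting values
    p = np.full(T, 1.0 / T)
    sigma2 = 1.0
    prev_ll = -np.inf
    for _ in range(n_iter):
        post = e_step(Y, X, delta, p, sigma2)           # E-step: Eqs. 19-20
        p = update_class_probs(post)                    # closed-form M-step: Eq. 16
        sigma2 = update_sigma2(post)                    # closed-form M-step: Eq. 17
        delta = newton_raphson_step(Y, X, delta, post)  # one Newton-Raphson step: Eqs. 15, 18
        ll = marginal_loglik(Y, X, delta, p, sigma2)    # Eq. 10
        if ll - prev_ll < tol:                          # GEM: the likelihood never decreases
            break
        prev_ll = ll
    return delta, p, sigma2
```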

The expected complete data loglikelihood, given the observed data and the provisional parameter estimates is:

$$ \matrix{ {Q\left( {\varvec{\alpha} \left| {\varvec{\alpha} ^{(k)} } \right.} \right)} \hfill & = \hfill & {E_{\varvec{\alpha} ^{(k)} } \left[ {\log L_c \left( {\varvec{\alpha} \left| {{\bf Y}{\bf ,}\varvec{\sigma} ,\varvec{\theta} _\sigma } \right.} \right)\left| {\bf Y} \right.} \right]} \hfill \cr {} \hfill & = \hfill & {\sum\limits_{n = 1}^N {\sum\limits_{t = 1}^T {p\left( {\varsigma _n = t\left| {{\bf y}_n ,\varvec{\alpha} ^{\left( k \right)} } \right.} \right)\int\limits_{\theta _{nt} } {\log g\left( {{\bf y}_n {\bf ,}\varsigma _n = t,\theta _{nt} \left|\varvec{\alpha} \right.} \right)f\left( {\theta _{nt} \left| {\varsigma _n = t,{\bf y}_n ,\varvec{\alpha} ^{(k)} } \right.} \right)d\theta _{nt} } } } } \hfill \cr } $$
(12)

where

\(\varvec{\alpha}\) is the vector of all parameters \((\varvec{\delta}, {\bf p}, \sigma^2)\),

\(\varvec{\alpha}^{(k)}\) is the vector of provisional parameter estimates obtained in iteration k, and

\(E_{\varvec{\alpha}^{(k)}}\) denotes the expectation using parameter vector \(\varvec{\alpha}^{(k)}\).

Differentiating \(Q(\varvec{\alpha} \,|\, \varvec{\alpha}^{(k)})\) with respect to the parameters \(\varvec{\alpha}\) results in the following score equations:

$${{\partial Q\left( {\varvec{\alpha} \left| {\varvec{\alpha} ^{(k)} } \right.} \right)} \over {\partial p_t }}=\sum\limits_{n=1}^N {\left[ {p\left( {\varsigma _n =t\left| {{\bf y}_n ,\varvec{\alpha} ^{\left( k \right)} } \right.} \right)/p_t - p\left( {\varsigma _n =T\left| {{\bf y}_n ,\varvec{\alpha} ^{\left( k \right)} } \right.} \right)/\left( {p_T } \right)} \right]} $$
(13)

for t = 1,..., T-1

$$ {{\partial Q\left( {\varvec{\alpha} \left| {\varvec{\alpha} ^{(k)} } \right.} \right)} \over {\partial \sigma ^2 }} = \sum\limits_{n = 1}^N {\sum\limits_{t = 1}^T {p\left( {\varsigma _n = t\left| {{\bf y}_n ,\varvec{\alpha} ^{\left( k \right)} } \right.} \right)\int\limits_{\theta _{nt} } {\left[ { - 1/2\sigma ^2 + \theta _{nt}^2 /2\sigma ^4 } \right]} f\left( {\theta _{nt} \left| {\varsigma _n = t,{\bf y}_n ,\varvec{\alpha} ^{(k)} } \right.} \right)d\theta _{nt} } } , $$
(14)

and

$$ {{\partial Q\left( {\varvec{\alpha} \left| {\varvec{\alpha} ^{(k)} } \right.} \right)} \over {\partial \delta _j }} = \sum\limits_{n = 1}^N {\sum\limits_{t = 1}^T {p\left( {\varsigma _n = t\left| {{\bf y}_n ,\varvec{\alpha} ^{\left( k \right)} } \right.} \right)\left\{ {\sum\limits_{i = 1}^I {y_{ni} x_{nitj} } - \int\limits_{\theta _{nt} } {\sum\limits_{i = 1}^I {{{\exp \left( {\theta _{nt} + {\bf x}_{nit}^{\prime} \varvec{\delta} } \right)x_{nitj} } \over {1 + \exp \left( {\theta _{nt} + {\bf x}_{nit}^{\prime} \varvec{\delta} } \right)}}} } f\left( {\theta _{nt} \left| {\varsigma _n = t,{\bf y}_n ,\varvec{\alpha} ^{(k)} } \right.} \right)d\theta _{nt} } \right\}} } . $$
(15)

By equating the partial derivatives to zero, closed form solutions can be found for \(p_1, \ldots, p_{T-1}\) and \(\sigma^2\) that maximize \(Q(\varvec{\alpha} \,|\, \varvec{\alpha}^{(k)})\), serving as provisional parameter estimates for the E-step of iteration k + 2:

$$ \matrix{ {p_t^{(k + 1)} = {{\sum\limits_{n = 1}^N {p\left( {\varsigma _n = t\left| {{\bf y}_n ,\varvec{\alpha} ^{\left( k \right)} } \right.} \right)} } \over N}} & {{\rm for}\;t\; = \;1,...,\;T - 1} \cr } , $$
(16)

and

$$ \sigma ^{2(k + 1)} = {{\sum\limits_{n = 1}^N {\sum\limits_{t = 1}^T {p\left( {\varsigma _n = t\left| {{\bf y}_n ,\varvec{\alpha} ^{\left( k \right)} } \right.} \right)\int\limits_{\theta _{nt} } {\theta _{nt}^2 } f\left( {\theta _{nt} \left| {\varsigma _n = t,{\bf y}_n ,\varvec{\alpha} ^{(k)} } \right.} \right)d\theta _{nt} } } } \over N}. $$
(17)
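
A minimal sketch of these two closed-form updates, assuming the E-step returns a dictionary `post` holding the posterior class probabilities \(p(\varsigma_n = t \,|\, {\bf y}_n, \varvec{\alpha}^{(k)})\) in post['w'] and the posterior expectations \(E[\theta_{nt}^2 \,|\, \varsigma_n = t, {\bf y}_n, \varvec{\alpha}^{(k)}]\) in post['e_theta2']:

```python
import numpy as np

def update_class_probs(post):
    # Eq. 16: p_t = (1/N) * sum_n p(varsigma_n = t | y_n, alpha^(k))
    return post["w"].mean(axis=0)

def update_sigma2(post):
    # Eq. 17: sigma^2 = (1/N) * sum_n sum_t w[n, t] * E[theta_nt^2 | varsigma_n = t, y_n]
    return np.sum(post["w"] * post["e_theta2"]) / post["w"].shape[0]
```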

For the logistic regression weights \(\delta_j\), no closed form solution exists. Instead, the parameter estimates are updated using one Newton-Raphson iteration; hence the algorithm we implemented is actually a generalised EM algorithm (Dempster et al., 1977). The second derivatives needed for the Newton-Raphson step are

$$ {{\partial ^2 Q\left( {\varvec{\alpha} \left| {\varvec{\alpha} ^{(k)} } \right.} \right)} \over {\partial \delta _j \partial \delta _{j'} }} = - \sum\limits_{n = 1}^N {\sum\limits_{t = 1}^T {p\left( {\varsigma _n = t\left| {{\bf y}_n ,\varvec{\alpha} ^{\left( k \right)} } \right.} \right)\int\limits_{\theta _{nt} } {\left\{ {\sum\limits_{i = 1}^I {{{\exp \left( {\theta _{nt} + {\bf x}_{nit}^{\prime} \varvec{\delta} } \right)x_{nitj} x_{nitj'} } \over {\left[ {1 + \exp \left( {\theta _{nt} + {\bf x}_{nit}^{\prime} \varvec{\delta} } \right)} \right]^2 }}} } \right\}} f\left( {\theta _{nt} \left| {\varsigma _n = t,{\bf y}_n ,\varvec{\alpha} ^{(k)} } \right.} \right)d\theta _{nt} } } . $$
(18)
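
The single Newton-Raphson update of \(\varvec{\delta}\) can then be sketched as below, assuming `post` also holds the quadrature nodes (post['theta']) and the posterior node weights (post['fq']), i.e. the discretised \(f(\theta_{nt} \,|\, \varsigma_n = t, {\bf y}_n, \varvec{\alpha}^{(k)})\); these names and the covariate layout are illustrative assumptions.

```python
import numpy as np

def newton_raphson_step(Y, X, delta, post):
    """One Newton-Raphson update of delta using the score (Eq. 15) and Hessian (Eq. 18)."""
    w, theta, fq = post["w"], post["theta"], post["fq"]
    eta = np.einsum('nitj,j->nit', X, delta)                   # x_nit' delta
    mu = 1.0 / (1.0 + np.exp(-(eta[..., None] + theta)))       # success probabilities: (N, I, T, Q)

    # score (Eq. 15): observed minus posterior-expected sufficient statistics
    obs = np.einsum('nt,ni,nitj->j', w, Y, X)
    exp_mu = np.einsum('ntq,nitq->nit', fq, mu)                # E[mu | class t, y_n]
    score = obs - np.einsum('nt,nitj,nit->j', w, X, exp_mu)

    # expected Hessian (Eq. 18); negative definite, so -H^{-1} s moves uphill
    var = np.einsum('ntq,nitq->nit', fq, mu * (1.0 - mu))
    hess = -np.einsum('nt,nit,nitj,nitk->jk', w, var, X, X)

    return delta - np.linalg.solve(hess, score)
```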

The posterior densities \(f(\theta_{nt} \,|\, \varsigma_n = t, {\bf y}_n, \varvec{\alpha}^{(k)})\) and the posterior probabilities \(p(\varsigma_n = t \,|\, {\bf y}_n, \varvec{\alpha}^{(k)})\) are computed using Bayes' theorem:

$$ f\left( {\theta _{nt} \left| {\varsigma _n = t,{\bf{y}}_n ,\varvec{\alpha} ^{(k)} } \right.} \right) = {{p\left( {{\bf{y}}_n \left| {\varsigma _n = t,\theta _{nt} ,\varvec{\delta} } \right.} \right)N\left( {\theta _{nt} \left| {0,\sigma ^2 } \right.} \right)} \over {\int\limits_{\theta _{nt} } {p\left( {{\bf{y}}_n \left| {\varsigma _n = t,\theta _{nt} ,\varvec{\delta} } \right.} \right)N\left( {\theta _{nt} \left| {0,\sigma ^2 } \right.} \right)d\theta _{nt} } }} $$
(19)
$$ p\left( {\varsigma _n = t\left| {{\bf y}_n ,\varvec{\alpha} ^{\left( k \right)} } \right.} \right) = {{p_t \int\limits_{\theta _{nt} } {p\left( {{\bf y}_n \left| {\varsigma _n = t,\theta _{nt} ,\varvec{\delta} } \right.} \right)N\left( {\theta _{nt} \left| {0,\sigma ^2 } \right.} \right)d\theta _{nt} } } \over {\sum\limits_{t = 1}^T {p_t \int\limits_{\theta _{nt} } {p\left( {{\bf y}_n \left| {\varsigma _n = t,\theta _{nt} ,\varvec{\delta} } \right.} \right)N\left( {\theta _{nt} \left| {0,\sigma ^2 } \right.} \right)d\theta _{nt} } } }} $$
(20)

The integrals over \(\theta_{nt}\) are approximated with Gaussian quadrature.
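
A sketch of this E-step on a Gauss-Hermite grid, with the same assumed names and covariate layout as above, could look as follows; it returns the quantities used by the M-step sketches given earlier.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def e_step(Y, X, delta, p, sigma2, n_quad=15):
    """Posterior class probabilities (Eq. 20) and posterior node weights (Eq. 19)."""
    nodes, wq = hermegauss(n_quad)
    wq = wq / np.sqrt(2.0 * np.pi)                   # weights for the N(0, 1) density
    theta = nodes * np.sqrt(sigma2)                  # quadrature nodes for N(0, sigma2)

    eta = np.einsum('nitj,j->nit', X, delta)
    z = eta[..., None] + theta                       # (N, I, T, Q)
    log_lik = (Y[:, :, None, None] * z - np.logaddexp(0.0, z)).sum(axis=1)   # (N, T, Q)

    shift = log_lik.max(axis=(1, 2), keepdims=True)  # per-person shift; cancels in the ratios
    joint = np.exp(log_lik - shift) * wq             # ~ p(y_n | t, theta_q) N(theta_q | 0, sigma2)
    lik_t = joint.sum(axis=-1)                       # integral over theta_nt (denominator of Eq. 19)
    fq = joint / lik_t[..., None]                    # Eq. 19, discretised on the quadrature grid
    w = p * lik_t
    w /= w.sum(axis=1, keepdims=True)                # Eq. 20

    e_theta2 = (fq * theta**2).sum(axis=-1)          # E[theta_nt^2 | ...], needed for Eq. 17
    return {"w": w, "fq": fq, "theta": theta, "e_theta2": e_theta2}
```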

After convergence, the observed information matrix for the maximum likelihood estimates \(\hat {\alpha}\) can be approximated by

$$ I\left( {\hat {\alpha} \left| {\bf Y} \right.} \right) \approx \sum\limits_{n = 1}^N {{\bf s}\left( {{\bf y}_n \left| {\hat {\alpha} } \right.} \right){\bf s}^{\prime} \left( {{\bf y}_n \left| {\hat {\alpha} } \right.} \right)} $$
(21)

where \({\bf s}\left( {{\bf y}_n \left| {\hat {\alpha} } \right.} \right) = \partial \log L_n \left( {\varvec{\delta} ,{\bf p},\sigma ^2 \left| {{\bf y}_n } \right.} \right)/\partial \varvec{\alpha}\) is the score function based on the single observation \({\bf y}_n\). The latter can be computed using the complete data loglikelihood as follows (McLachlan & Krishnan, 1997):

$${\bf{s}}\left( {{\bf{y}}_n \left| {\bf{\varvec{\alpha} }} \right.} \right) = E_{\bf{\varvec{\alpha} }} \left[ {\partial \log L_{cn} \left( {{\bf{\varvec{\alpha} }}\left| {{\bf{y}}_n {\bf{,}}\varsigma _n ,\theta _{nt} } \right.} \right)/\partial {\bf{\varvec{\alpha} }}\left| {{\bf{y}}_n } \right.} \right]$$
(22)

where \(L_{cn}(\varvec{\alpha} \,|\, {\bf y}_n, \varsigma_n, \theta_{nt})\) is the complete data likelihood formed from the single observation \({\bf y}_n\).
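
The standard-error computation of Equations 21 and 22 can be mimicked as follows. Rather than going through the complete data loglikelihood of Equation 22, this sketch obtains the per-person score vectors by finite differences of the per-person marginal loglikelihoods and then applies the outer-product approximation of Equation 21; the helper `person_loglik` (returning the N-vector of \(\log L_n\)) and the parameter stacking are assumptions.

```python
import numpy as np

def observed_information(alpha_hat, person_loglik, eps=1e-5):
    """alpha_hat: stacked estimates (delta, p_1..p_{T-1}, sigma2); person_loglik:
    function returning the (N,) vector of per-person marginal loglikelihoods."""
    P = alpha_hat.size
    base = person_loglik(alpha_hat)                       # (N,)
    scores = np.empty((base.size, P))
    for j in range(P):                                    # forward-difference score vectors
        step = np.zeros(P)
        step[j] = eps
        scores[:, j] = (person_loglik(alpha_hat + step) - base) / eps
    info = scores.T @ scores                              # outer-product approximation (Eq. 21)
    se = np.sqrt(np.diag(np.linalg.inv(info)))            # asymptotic standard errors
    return info, se
```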

Cite this article

Rijmen, F., De Boeck, P. A latent class model for individual differences in the interpretation of conditionals. Psychological Research 67, 219–231 (2003). https://doi.org/10.1007/s00426-002-0092-7
