1 Introduction

Longitudinal ordered categorical data are affected by response styles (RS) when respondents are asked to evaluate items on Likert scales at different time occasions and decide to use only a few of the given options of the rating scale (usually the extreme or the middle categories), irrespective of the content of the item. A large body of psychometric and statistical literature has been devoted to the different types of response style and to the causes and effects of RS (e.g. Van Vaerenbergh and Thomas 2013). The most widely recognised response styles are acquiescence RS (ARS), disacquiescence RS (DRS), extreme RS (ERS) and middle RS (MRS). Clear and concise descriptions of common response styles are given in Baumgartner and Steenkamp (2001), Table 1 p. 145, and Van Vaerenbergh and Thomas (2013), Table 1 p. 197; see also Roberts (2016) and Dolnicar and Grün (2009). The RS mechanism cannot be neglected since it introduces heterogeneity in the responses and bias in the estimated parameters of the models, and consequently leads to misleading results (e.g. Tutz and Berger 2016; Colombi et al. 2019, 2021).

In this work, the challenge is to account for the time evolution of the RS behaviour in contrast to previous approaches where it is ignored or modelled as a time invariant latent trait or random effect (Billiet and Davidov 2008; Schauberger and Tutz 2022).

Acknowledging the importance of RS and of its temporal dynamics, and considering that, at every time occasion, respondents can either answer according to a RS or use the rating scale properly to give a correct representation of their own feelings, we introduce a Markov switching model (Frühwirth-Schnatter 2001) driven by a bivariate latent Markov chain. One component of the chain has k states or regimes, is named the k-regime switching indicator and accommodates serial dependence and respondent heterogeneity due to unobserved covariates. The other component, which is binary, is named the response style regime switching indicator and determines the way of responding (dictated by RS or no-RS). Conditionally on the k-regime switching indicator, the dependence of the observed categorical responses on time-varying and subject specific covariates is modelled under the no-RS regime by a stereotype logit model (Anderson 1984), and under the RS regime by a parallel local logit model with restricted intercepts, which can cope with the tendency of respondents to select categories due to RS. Since the parallel local logit model for RS respondents can be seen as a restricted stereotype model, the name Markov switching stereotype logit (MSSL) model is adopted for our proposal. To emphasize that the focus is on parameter variation in a logit regression model, the Markov switching model terminology (Frühwirth-Schnatter 2001) is preferred to that of hidden Markov models.

Our proposal contributes to the literature on multivariate Markov chains in the context of Markov switching models (see Farcomeni 2015; Pohle et al. 2021, among others) and extends the literature on models for longitudinal categorical data (e.g. Bartolucci et al. 2012; Molenberghs and Verbeke 2005).

The proposed methodology can find useful applications in all longitudinal surveys that collect opinions on state of health and risk of illness, on economic difficulties, on the impact of climatic events, on discriminatory and racist beliefs and on political attitudes, because perceptions can be revealed with a bias due to response styles that can vary over time according to the inconstancy and inconsistency of human behaviour.

The rest of the paper is organized as follows. In Sect. 2, the model is introduced. In Sect. 3, the choice of the logit models for no-RS and RS responses is motivated by comparing their allocation sets (intervals of values of the linear predictor over which a category is the most probable). In Sect. 4, maximum likelihood estimation of the model parameters is examined and, in Sect. 5, the model is applied to data from the Bank of Italy about the economic vulnerability of Italian citizens. In Sect. 6, a simulation study that illustrates the consequences of ignoring the RS component is presented. Final remarks are reported in the last section, and technical details are given in the Appendix and in the supplementary material. R code for the estimation of the model parameters and the determination of the allocation sets is available from the authors.

2 A Markov switching model with two latent variables

Consider J ordinal responses observed on n units (subjects/respondents) at T time occasions. In particular, let \(R_{jit}\), \(R_{jit}\in \mathcal R_j=\{1,\ldots , c_j\}\), denote the j-th ordinal response variable, \(j\in \mathcal J=\{1,\ldots ,J\}\), of the i-th unit, \(i\in \mathcal I =\{1,\ldots ,n\}\), at the t-th occasion, \(t\in \mathcal T =\{1,\ldots ,T\}\).

The probability functions of the responses are assumed to depend on the k-regime switching indicators \(L_{it}\), \(i\in \mathcal I\), \(t\in \mathcal T\), with finite discrete state space \(\mathcal S_L=\{1,\ldots ,k\}\), and on the response style regime switching indicators \(U_{it}\), \(i\in \mathcal I\), \(t\in \mathcal T\), with state space \(\mathcal S_U=\{1,2\}\), where 1 and 2 denote no-RS and RS regimes, respectively.

The proposed MSSL model is defined for every unit i, \(i\in \mathcal I\), by the bivariate Markov chain of the latent variables \((L_{it},U_{it})\), \({t\in \mathcal T}\), and by the conditional distributions of the responses given the latent variables. The next subsections are devoted to specifying these two model components. For simplicity, hereafter the supports \(\mathcal J\), \(\mathcal I\) and \(\mathcal T\) of the variable, unit and time indices will be omitted when not strictly necessary.

2.1 The latent bivariate Markov chain

The latent variables \(L_{it}\) and \(U_{it}\) are independent across units and, for every i, the process \(\{L_{it},U_{it}\}_{t\in \mathcal T}\) is assumed to evolve in time according to a first order bivariate Markov chain with states \((l,u)\), \(l\in \mathcal S_L\), \(u\in \mathcal S_U\). In the sequel, let us consider states \(l, \ \bar{l} \in \mathcal S_L\) and \(u, \ \bar{u} \in \mathcal S_U\).

The latent component of the model is specified through its initial and transition probabilities that are assumed to be time invariant and subject independent.

The initial probabilities (\(t=1\)) of the latent bivariate process \(\{L_{it},U_{it}\}_{t\in \mathcal T}\) are denoted by \(\pi _{1}(l,u) =P(L_{i1}=l, U_{i1}=u),\) and the transition probabilities are \(\pi (l,u| \bar{l}, \bar{u}) = P(L_{it}=l, U_{it}=u|L_{it-1}=\bar{l}, U_{it-1}=\bar{u}), t=2,\ldots ,T\). Furthermore, \(\pi ^L(l|\bar{l},\bar{u}) = P(L_{it}=l|L_{it-1}=\bar{l}, U_{it-1}=\bar{u})\) denote the marginal transition probabilities for the latent variables \(L_{it}\) and \(\pi ^{U|L}(u|l,\bar{l},\bar{u}) =P(U_{it}=u|L_{it}=l, L_{it-1}=\bar{l}, U_{it-1}=\bar{u})\) are the transition probabilities of the response style regime switching indicators \(U_{it}\), conditioned on the transition \((\bar{l},l)\) of the k-regime switching indicator.

Moreover, the introduced probabilities are required to satisfy the following conditions:

$$\begin{aligned} a) \, \pi ^L(l|\bar{l},\bar{u})=\pi ^L(l|\bar{l}), \quad b) \, \pi ^{U|L}(u|l,\bar{l},\bar{u}) = \pi ^{U|L}(u|l,\bar{u}). \end{aligned}$$
(1)

Condition a) states that the k-regime switching indicator, given its past, does not depend on the past of the response style regime switching indicator. This assumption ensures that the k-regime switching indicator is marginally a Markov chain (see Florens et al. 1993; Colombi and Giordano 2012) and, together with assumption b), imposes a hierarchy on the two latent variables according to which the response style indicator depends on the k-regime switching indicator. Condition b) states that the current way of answering, given the current state of the k-regime switching indicator and the previous way of answering, is independent of the past of the k-regime switching indicator. At the cost of a loss of simplicity, it can be relaxed by allowing the response style regime switching indicator to depend also on the past of the k-regime switching indicator. Alternatively, an even more restrictive assumption takes the response style regime switching indicator to be independent of its own past as well, that is \(\pi ^{U|L}(u|l,\bar{l},\bar{u}) = \pi ^{U|L}(u|l)\). Conditions a) and b) characterize our model, according to which the k-regime switching indicator affects both the response style indicator and the observable responses, while the response style indicator affects the observable responses only. The dependence structure of the latent components in the MSSL model is represented in Fig. 1, where the nodes indicate the latent variables and the oriented arcs describe the conditions in (1).

Fig. 1 Directed acyclic graph that encodes the dependence structure of the latent components of the MSSL model

Assumptions (1) simplify the transition probabilities of the bivariate Markov chain to \(\pi (l,u|\bar{l}, \bar{u})=\pi ^{U|L}(u|l,\bar{u})\pi ^L(l|\bar{l}).\) To reduce the number of parameters, we also assume that \(\pi _{1}(l,u)=\pi _{1}^{U}(u)\pi ^L_{1}(l)\).

Logit parameterizations can be conveniently adopted for initial and transition probabilities. The number of parameters of the latent bivariate Markov chain is \(2k+k^2\).
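As an illustration of how assumptions (1) determine the joint transition matrix, the following Python sketch assembles the \(2k \times 2k\) matrix from \(\pi ^L\) and \(\pi ^{U|L}\) and reproduces the parameter count \(2k+k^2\). The function names, the array layout and the numerical values are assumptions made for this example only; they are not taken from the authors' R code.

```python
import numpy as np

def joint_transition(pi_L, pi_UL):
    """Transition matrix of the bivariate chain under conditions (1).

    pi_L  : (k, k) array, pi_L[l_bar, l]        = P(L_t = l | L_{t-1} = l_bar)
    pi_UL : (k, 2, 2) array, pi_UL[l, u_bar, u] = P(U_t = u | L_t = l, U_{t-1} = u_bar)
    Returns a (2k, 2k) matrix; state (l, u) is stored at index 2*l + u (0-based).
    """
    k = pi_L.shape[0]
    P = np.zeros((2 * k, 2 * k))
    for l_bar in range(k):
        for u_bar in range(2):
            for l in range(k):
                for u in range(2):
                    P[2 * l_bar + u_bar, 2 * l + u] = pi_UL[l, u_bar, u] * pi_L[l_bar, l]
    return P

def n_latent_parameters(k):
    initial = (k - 1) + 1      # free parameters of pi_1^L and pi_1^U
    trans_L = k * (k - 1)      # k rows of pi^L, each with k - 1 free entries
    trans_U = 2 * k            # one free probability for each pair (l, u_bar)
    return initial + trans_L + trans_U

# Hypothetical values with k = 2 regimes
pi_L = np.array([[0.9, 0.1], [0.2, 0.8]])
pi_UL = np.array([[[0.95, 0.05], [0.30, 0.70]],
                  [[0.80, 0.20], [0.10, 0.90]]])
P = joint_transition(pi_L, pi_UL)
```

Each row of `P` sums to one because \(\pi ^{U|L}(\cdot |l,\bar{u})\) sums to one for every l before the sum over l is taken.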

In this paper, we assume that covariates do not affect the transition probabilities of the latent components because the main interest is in logit regression models with time-varying parameters for the observable variables, and the latent component is an artefact to model unit heterogeneity and time dependence due to unobserved covariates. Within the framework of this paper, covariate effects make less sense on the initial and transition probabilities of the latent component. Nevertheless, it would be interesting to consider a non-homogeneous latent process where the initial probabilities depend on time invariant regressors and the transition probabilities on time specific covariates. Models of this kind have been considered by Colombi et al. (2023) under the restriction of subject and time invariant observation probability functions. In the mentioned paper, the observed variables are indicators of a latent construct of interest and thus covariates naturally affect only the latent component of the model. An interesting extension of our proposal is the case where the initial and transition probabilities of the response style indicator depend on subject specific time invariant discrete random effects (Altman 2007; Bartolucci et al. 2012). This generalization would be useful to take into account time invariant heterogeneity in RS attitudes.

2.2 Logit models for the observable variables

Let \(\textbf{R}_i\) be the vector of the ordinal responses \(R_{jit}, j\in \mathcal J, \, t\in \mathcal T\) of unit i. Some independence assumptions specify the observation model: a) the vectors \(\textbf{R}_i\) are independent random vectors; b) for every unit i and occasion t, given \(\{L_{it},U_{it}\}_{t\in \mathcal T}\), the responses \(R_{jit}\), \(j \in \mathcal J\), are independent from their past and depend on \((L_{it},U_{it})\) only, moreover they are contemporaneously independent given the current state \((L_{it},U_{it})\).

The marginal probability functions of \(R_{jit}\), conditioned on the latent states \((l,u)\), are denoted by \(f_{luj}(r|{\varvec{x}}_{it}),\,r\in \mathcal R_j,\) where \({\varvec{x}}_{it}\) is a vector of p covariates.

Under the previous assumptions, given the latent state \((l,u)\), every probability function \(f_{luj}(r|{\varvec{x}}_{it})\) is parameterized by the stereotype logit model:

$$\begin{aligned} \eta _{rluj} ({\varvec{x}}_{it})=\log \frac{f_{luj}(r+1|{\varvec{x}}_{it})}{f_{luj}(r|{\varvec{x}}_{it})}= & {} \alpha _{rluj}+\mu _{rluj} {{\varvec{\gamma }}}_{luj}^{\prime }{\varvec{x}}_{it}, \quad r=1,2,\ldots ,c_j-1, \end{aligned}$$
(2)

where \(\mu _{rluj}\) are logit-specific and covariate-independent multiplicative factors, called scores, to be estimated. For identifiability purposes, since the model is invariant under scale transformations of the scores, we set \(\mu _{1luj}=1\).
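To make the link between the adjacent-category logits in (2) and the category probabilities concrete, the following Python sketch recovers \(f(1),\ldots ,f(c)\) from given intercepts, scores and a value z of the linear predictor \({{\varvec{\gamma }}}^{\prime }{\varvec{x}}\). The function name and the numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

def adjacent_logit_probs(alpha, mu, z):
    """Category probabilities f(1), ..., f(c) implied by the adjacent-category
    logits eta_r = alpha_r + mu_r * z, r = 1, ..., c-1, as in model (2)."""
    alpha = np.asarray(alpha, dtype=float)
    mu = np.asarray(mu, dtype=float)
    eta = alpha + mu * z
    # log f(r) - log f(1) equals the sum of the first r-1 logits
    log_unnorm = np.concatenate(([0.0], np.cumsum(eta)))
    log_unnorm -= log_unnorm.max()      # guard against overflow
    f = np.exp(log_unnorm)
    return f / f.sum()

alpha = [0.5, 0.0, -0.5]     # hypothetical intercepts, c = 4
mu = [1.0, 1.8, 0.6]         # hypothetical scores, with mu_1 = 1
f = adjacent_logit_probs(alpha, mu, z=0.3)
```

Setting all the scores to 1 in this sketch gives the parallel adjacent-category logit model, while letting some scores be negative leaves the probabilities well defined, which is why the non-negativity constraint is not needed for the model to be valid.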

In the literature, since the original paper by Anderson (1984), the scores are constrained to be non-negative. In our view, such constraints are too restrictive and we propose to estimate unconstrained scores to make the model more flexible, as shown hereafter. In particular, non-negative values of the scores imply that the responses are stochastically ordered by the linear predictor, according to the likelihood ratio dominance criterion, whereas this ordering is not imposed a priori if the scores are left unconstrained.

Given the no-RS regime (\(u=1\)), the stereotype logit model (2) has \(k ( \sum _{j=1}^J (2c_j - 3)+pJ )\) parameters and is conveniently used since it is less restrictive than a parallel logit model and is not overparameterized like the unrestricted multinomial logit model.

The difference is that in the parallel logit model, the linear predictor \({{\varvec{\gamma }}}_{l1j}^{\prime }{\varvec{x}}_{it}\) is the same for all logits \(\eta _{rl1j} ({\varvec{x}}_{it})\), \(r=1,2,\ldots ,c_j-1\); in the stereotype model, the linear predictor \(\mu _{rl1j} {{\varvec{\gamma }}}_{l1j}^{\prime }{\varvec{x}}_{it}\) changes according to the logit-specific and covariate-independent scores \(\mu _{rl1j}\); in the multinomial logit model, each logit has its own specific linear predictor \({{\varvec{\gamma }}}_{rl1j}^{\prime }{\varvec{x}}_{it}\).

Some restrictions on the scores \(\mu _{rl1j}\) provide special cases as follows. In model (2), the scores \(\mu _{rl1j}\), \(l\in \mathcal S_L,\) switch according to the states l of the k-regime switching indicator; a more parsimonious version assumes that they are fixed, i.e. \(\mu _{rl1j}=\mu _{r1j}\), \(l\in \mathcal S_L\). If the scores \(\mu _{rl1j}\), \(r=1,2,\ldots ,c_j-1\), \(j \in \mathcal J,\) \(l\in \mathcal S_L\), are all equal to 1, the model for no-RS responses is equivalent to a parallel adjacent category logit model.

Given the RS regime (\(u=2\)), the stereotype logit model (2) is specified by the intercepts \(\alpha _{rl2j}\) and the scores \(\mu _{rl2j}\) satisfying the constraints:

$$\begin{aligned} \alpha _{rl2j}=\phi _{0lj}+\phi _{1lj}s_{rj}, \, \mu _{rl2j}=1, \quad r=1,2,\ldots ,c_j-1,\, j \in \mathcal J,\,l\in \mathcal S_L, \end{aligned}$$
(3)

where \(\phi _{0lj}\) and \(\phi _{1lj}\) are parameters to be estimated and the scores \(s_{rj}\) are known constants defined as: \(s_{rj}=1\) for \(r<c_j/2\), \(s_{rj}=0\) for \(r=c_j/2\), \(s_{rj}=-1\) for \(r>c_j/2\), \(r = 1,2,\ldots , c_j-1\), in line with the proposal by Tutz and Berger (2016). Based on this model, the modal categories can be on the middle or on the extremes as will be shown in Sect. 3.2.

Note that the previous constraints describe parallel local logit models with restricted intercepts depending on two parameters only. See Yee (2015) for a general discussion on this kind of linear constraints on the parameters of a link function. The number of parameters of these models is \((2+p)Jk\). The adequacy of this model, to describe responses affected by RS, is examined in the next section.
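The parameter counts given so far can be combined to obtain the size of a fully switching MSSL model. The short Python sketch below simply encodes the three formulas stated in the text (\(2k+k^2\) for the latent chain, \(k ( \sum _{j} (2c_j - 3)+pJ )\) for the no-RS regime and \((2+p)Jk\) for the RS regime); the function names and the input values are hypothetical.

```python
def n_observation_parameters(k, c, p):
    """c: list with the number of categories c_j of each response,
    p: number of covariates, k: number of states of the k-regime indicator."""
    J = len(c)
    no_rs = k * (sum(2 * cj - 3 for cj in c) + p * J)   # stereotype model (u = 1)
    rs = (2 + p) * J * k                                # restricted parallel model (u = 2)
    return no_rs, rs

def n_total_parameters(k, c, p):
    latent = 2 * k + k ** 2                             # bivariate latent Markov chain
    no_rs, rs = n_observation_parameters(k, c, p)
    return latent + no_rs + rs

# Hypothetical setting: two responses with 6 and 3 categories, 4 covariates, k = 3
no_rs, rs = n_observation_parameters(k=3, c=[6, 3], p=4)
total = n_total_parameters(k=3, c=[6, 3], p=4)
```

Versions of the model with fixed scores or fixed regression coefficients have fewer parameters than this count, which refers to the case where every parameter switches with l.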

The regression coefficients \({{\varvec{\gamma }}}_{luj}\), given \(u\in \mathcal S_U\), vary across latent states or regimes l, \(l\in \mathcal S_L,\) so they are named switching coefficients with respect to the k-regime switching indicator. On the contrary, when \({{\varvec{\gamma }}}_{luj}= {\varvec{\gamma }}_{uj}\), \(l\in \mathcal S_L,\) they are fixed coefficients with respect to the k-regime switching indicator.

3 Allocation sets for RS and no-RS regime models

In this section, the indices i and t, associated with units and time occasions, are omitted and the dependence on the latent variables is ignored to simplify the notation. Moreover, for simplicity, we consider the case of one ordinal response R with categories \(r = 1,2,\ldots ,c\). As the interest is in the relationship between R and a vector of covariates \({\varvec{x}}\), the probability function of the ordinal categorical variable R, given \({\varvec{x}}\), is denoted by f(r|z), where \(z={{\varvec{\gamma }}}^{\prime }{\varvec{x}}\) is a linear predictor.

For predicting r given z, one method is to take the category that maximizes the probability f(r|z). This amounts to determining the mode of the probability function for a given value of the linear predictor z, and the allocation sets \(A_r=\{z: f(r|z)\ge f(s|z);\, s=1,2,\ldots ,c, r\ne s\}\), \(r=1,2,\ldots ,c\), become relevant in this regard. An allocation set may be empty, in which case the corresponding response category is never the mode. It is relevant to investigate which categories can be modal for some z and in which order the categories become modal as z increases. The allocation sets have an important role in this direction as they describe how the most probable category changes as a function of the linear predictor z.

Subsections 3.1 and 3.2 will highlight the differences between the models for no-RS and RS regimes in terms of allocation sets by showing how the mode of the probability functions, induced by the logit models specified by (2) and (3), changes according to the linear predictor. This approach clarifies why the assumptions of restricted intercepts and parallel covariate effects in (3) are useful to describe the propensity towards various RSs and justifies the stereotype model as a flexible and parsimonious model for the responses not dictated by RS. The relevance of the allocation sets to give a useful interpretation of parameter estimates is illustrated on real data in Sect. 5.

3.1 Allocation sets under no-RS regime

Let us consider the stereotype regression model for the probability function \(f(r|z), \, r=1,2,\ldots ,c,\)

$$\begin{aligned} \eta _r=\log \frac{f(r+1|z)}{f(r|z)}=\alpha _r + \mu _r{{\varvec{\gamma }}}^{\prime }{\varvec{x}}, \quad r=1,2,\ldots ,c-1, \quad \mu _1=1, \end{aligned}$$
(4)

where \(z={{\varvec{\gamma }}}^{\prime }{\varvec{x}}\). Note that the dependence of the logits on the covariates is left implicit for simplicity. In Anderson’s ordered stereotype model, the scores are non-negative, that is \(\mu _{r} \ge 0, r=1,\ldots ,c-1.\) This implies that the categories become modal for increasing z coherently with the response ordering (Anderson 1984). If the scores \(\mu _{r}\) are allowed to be negative, the order in which the categories become the most probable as z increases is not necessarily consistent with the ordered scale of the response. In this regard, it is shown in the Appendix that for unconstrained scores there exists a permutation \(\iota (r), r=1,2,\ldots ,c,\) of the c categories of R such that model (4) is equivalent to the stereotype model:

$$\begin{aligned} \eta ^*_{\iota (r)}=\log \frac{f(\iota (r+1)|z)}{f(\iota (r)|z)}=\alpha _{\iota (r)}^*+\mu _{\iota (r)}^*{{\varvec{\gamma }}}^{\prime }{\varvec{x}}, \quad r=1,2,\ldots ,c-1, \end{aligned}$$
(5)

where all the scores are non-negative, that is \(\mu _{\iota (r)}^*\ge 0, r=1,2,\ldots ,c-1.\)

Formulation (5) of model (4) is useful to study the family of allocation sets of model (4).

In the Appendix, it is proved that there exists a set

$$\begin{aligned} I^* =\{r^*_1,r^*_2,\ldots ,r^*_m\},\quad r^*_1<r^*_2<\cdots <r^*_m,\quad I^*\subseteq I=\{1,2,\ldots ,c\}, \end{aligned}$$

and \(m-1\) cut points

$$\begin{aligned} z_{\iota (r^*_1)}\le z_{\iota (r^*_2)}\le \cdots \le z_{\iota (r ^*_{m-1})} \end{aligned}$$

such that the allocation sets

$$\begin{aligned} A_{\iota (r^*_i)}=\{z: f(\iota (r^*_i)|z)\ge f(s|z);\, \iota (r^*_i)\ne s\},\quad r^*_i \in I^*,\quad i=1,2,\ldots ,m, \end{aligned}$$

have the following properties:

  1. \(\sup ( z: z \in A_{\iota (r^*_i)})=\inf ( z: z \in A_{\iota (r^*_{i+1})})=z_{\iota (r^*_i)}\), \(i=1,2,\ldots ,m-1\),

  2. \(\inf (z:z \in A_{\iota (r^*_1)})=-\infty\), \(\sup ( z: z \in A_{\iota (r^*_m)})=\infty ,\)

  3. at the boundary \(z_{\iota (r^*_i)}\) of two allocation sets, the modal category is not unique,

  4. if \(i \notin I^*\), the category \(\iota (i)\) is not the mode for any value of z.

From Properties 1) and 2) it follows that

$$\begin{aligned} A_{\iota (r^*_i)}=\{z: z_{\iota (r^*_{i-1})}\le z \le z_{\iota (r^*_{i})}\},\quad i=1,2,\ldots ,m, \end{aligned}$$

if \(\, z_{\iota (r^*_0)}=-\infty ,\,z_{\iota (r^*_m)}=\infty\).

The cut points \(z_{\iota (r^*_1)}\le z_{\iota (r^*_2)}\le \cdots \le z_{\iota (r^*_{m-1})}\), which describe how the modal category changes according to the linear predictor z, are functions of the scores \(\mu _{\iota (r)}^*\) and the intercepts \(\alpha _{\iota (r)}^*\), as shown in the Appendix, and can be computed as described in the last part of the Appendix.

Some special cases where \(I^*\equiv I\) or \(\iota (r)=r,\,r=1,2,\ldots ,c,\) deserve further comments. If \(\mu _{\iota (r)}^*>0, \,r=1,2,\ldots ,c-1\), and \(\frac{\alpha _{\iota (r)}^*}{\mu _{\iota (r)}^*}\ge \frac{\alpha _{\iota (r+1)}^*}{\mu _{\iota (r+1)}^*}\), \(r=1,2,\ldots ,c-2,\) then \(I^*\equiv I\), \(z_{\iota (r^*_i)}=-\frac{\alpha _{\iota (r^*_i)}^*}{\mu _{\iota (r^*_i)}^*}\) and all the categories can be modal. If \(\mu _{r}>0, \,r=1,2,\ldots ,c-1\), as in Anderson’s proposal, then \(\iota (r)=r,\,r=1,2,\ldots ,c\), and if in addition it holds that \(\frac{\alpha _r}{\mu _r}\ge \frac{\alpha _{r+1}}{\mu _{r+1}}\), \(r=1,2,\ldots ,c-2\), then \(I \equiv I^*\).
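The behaviour of the allocation sets can also be explored numerically. Under model (4), \(\log f(r|z)\) equals, up to a term common to all categories, \(a_r+b_rz\) with \(a_r=\sum _{s<r}\alpha _s\) and \(b_r=\sum _{s<r}\mu _s\), so the modal category at z is the one whose line is highest. The Python sketch below uses this fact on a grid of z values; it is not the procedure of the Appendix, the function name and the numerical values are hypothetical, and a category that is modal only on an interval narrower than the grid step, or outside the grid range, would be missed.

```python
import numpy as np

def allocation_sets(alpha, mu, z_grid):
    """Order in which categories become modal as z increases, and the
    cut points between consecutive modal categories, for model (4)."""
    a = np.concatenate(([0.0], np.cumsum(alpha)))   # a_r: intercept of line r
    b = np.concatenate(([0.0], np.cumsum(mu)))      # b_r: slope of line r
    lines = a[:, None] + b[:, None] * z_grid[None, :]
    modes = lines.argmax(axis=0) + 1                # modal category (1-based) at each z
    order = [int(modes[0])]
    for m in modes[1:]:
        if m != order[-1]:
            order.append(int(m))
    # the lines of two consecutive modal categories p and q cross at the cut point
    cuts = [-(a[q - 1] - a[p - 1]) / (b[q - 1] - b[p - 1])
            for p, q in zip(order[:-1], order[1:])]
    return order, cuts

z_grid = np.linspace(-10.0, 10.0, 2001)

# Positive scores with decreasing ratios alpha_r / mu_r: all categories are modal
order_pos, cuts_pos = allocation_sets([1.0, 0.0, -1.0], [1.0, 1.0, 1.0], z_grid)

# One negative score: the natural order is lost and some categories are never modal
order_neg, cuts_neg = allocation_sets([1.0, -1.0, 0.2], [1.0, -2.0, 1.5], z_grid)
```

In the first case the categories become modal in the order 1, 2, 3, 4 with cut points \(-\alpha _r/\mu _r\), that is \(-1\), 0 and 1. In the second case only categories 3 and 2 are modal, in this order, with a single cut point at \(-0.5\).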

3.2 Allocation sets under RS regime

When, in (4), \(\alpha _r =\phi _0+\phi _1\) for \(r<c/2\), \(\alpha _r =\phi _0\) for \(r=c/2\), \(\alpha _r =\phi _0-\phi _1\) for \(r>c/2\), and \(\mu _r=1, \, r=1,2,\ldots ,c-1\), the parallel local logit model for the RS regime is obtained. In this case, when \(\phi _1 > 0\), the sequence \(\alpha _r, \,r=1,2,\ldots ,c-1,\) is non-increasing (\(\alpha _r \ge \alpha _{r+1}\)) and every category r, \(r=1,2,\ldots ,c,\) is modal if z belongs to the allocation set \(\{z:-\alpha _{r-1}\le z\le -\alpha _{r}\}\), with \(\alpha _0=\infty ,\,\alpha _c=-\infty\). Conversely, when \(\phi _1 < 0\), the sequence \(\alpha _r, \,r=1,2,\ldots ,c-1,\) is non-decreasing (\(\alpha _r \le \alpha _{r+1}\)), and only the first and the last categories can be modal, with a cut point at \(-\frac{\sum _{r=1}^{c-1}\alpha _r}{c-1}\) in the range of the linear predictor.
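The cut point for the case \(\phi _1<0\) can be checked directly. Since all the scores are equal to 1, summing the \(c-1\) adjacent logits gives

$$\begin{aligned} \log \frac{f(c|z)}{f(1|z)}=\sum _{r=1}^{c-1}\left( \alpha _r+z\right) =\sum _{r=1}^{c-1}\alpha _r+(c-1)z, \end{aligned}$$

which is non-negative if and only if \(z\ge -\frac{1}{c-1}\sum _{r=1}^{c-1}\alpha _r\). Moreover, under the constraints on the intercepts stated above, the numbers of logits with \(r<c/2\) and with \(r>c/2\) coincide, so that \(\sum _{r=1}^{c-1}\alpha _r=(c-1)\phi _0\) and the cut point reduces to \(-\phi _0\).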

These results are used to obtain Table 1 that illustrates the allocation sets for the RS regime and shows the great flexibility of the model that, with only the two parameters \(\phi _0,\,\phi _1,\) captures several recognized RS attitudes. Thus, the suitability of model (2) under the constraints (3) for describing different RS, based on the tendency towards extreme or middle categories, is justified by the fact that the probability function defined by (3) can have a unique mode only at the middle or the extreme categories of the response scale.
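A small numerical check of this property is sketched below in Python. For hypothetical values of \(\phi _0\) and \(\phi _1\) and a response with \(c=5\) categories, it builds the restricted intercepts, computes the probability function of the RS regime over a grid of values of z and collects the categories that turn out to be modal. The function names, the grid and the parameter values are assumptions made for this illustration.

```python
import numpy as np

def rs_intercepts(c, phi0, phi1):
    """Intercepts alpha_r = phi0 + phi1 * s_r of the RS regime, r = 1, ..., c-1."""
    r = np.arange(1, c)
    s = np.where(r < c / 2, 1.0, np.where(r > c / 2, -1.0, 0.0))
    return phi0 + phi1 * s

def rs_probs(c, phi0, phi1, z):
    """Probability function of the parallel local logit model (all scores equal 1)."""
    eta = rs_intercepts(c, phi0, phi1) + z
    log_unnorm = np.concatenate(([0.0], np.cumsum(eta)))
    f = np.exp(log_unnorm - log_unnorm.max())
    return f / f.sum()

def modal_categories(c, phi0, phi1, z_grid):
    """Sorted list of the categories that are modal for at least one grid value."""
    return sorted({int(np.argmax(rs_probs(c, phi0, phi1, z))) + 1 for z in z_grid})

z_grid = np.linspace(-6.0, 6.0, 241)
modes_middle = modal_categories(5, 0.0, 1.0, z_grid)     # phi1 > 0
modes_extreme = modal_categories(5, 0.0, -1.0, z_grid)   # phi1 < 0
```

With these values, a positive \(\phi _1\) yields the modal categories 1, 3 and 5 (first, middle and last), whereas a negative \(\phi _1\) yields only the extreme categories 1 and 5; the intermediate categories 2 and 4 are never the unique mode.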

Table 1 Probability functions under RS regime

Finally, it is worth stressing that when a parallel logit model is used also for the no-RS regime, the difference between the two regimes relies only on the restrictions on the intercepts. However, also in this case, in the no-RS regime every category can be the unique mode, while in the RS regime only the middle or the extreme categories can be the unique mode. This is the crucial difference between RS and no-RS models.

4 Parameter estimation

Let \({\varvec{\theta }}\) denote the vector of all the parameters of the latent and observation models introduced in the previous sections.

The latent binary variable \(d_{it}^{(1)} (l,u)\) is equal to 1 when the i-th unit (subject) is at time t in state \((l,u)\), and the latent binary variable \(d_{it}^{(2)} (l,u;\bar{l},\bar{u})\) is equal to 1 if at time t, \(t>1\), the i-th subject is in state \((l,u)\) while at occasion \(t-1\) it was in state \((\bar{l}, \bar{u})\). Moreover, the observable binary variable \(d_{jit}(r)\) is equal to 1 if at time t the category r of \(R_{jit}\), \(j\in \mathcal J\), is observed on the i-th individual, \(i\in \mathcal I\).

If the above binary latent variables were observable, the parameters could be estimated by maximizing the following complete log-likelihood (i.e. the joint log-likelihood of the observations and the latent variables):

$$\begin{aligned} \ell ^*({\varvec{\theta }})= & {} \sum _{l=1}^{k}\left[ \sum _{u=1}^{2}\sum _{i=1}^{n}d_{i1}^{(1)}(l,u)\right] \log \pi ^L_{1}(l) \nonumber \\{} & {} \quad +\sum _{u=1}^{2}\left[ \sum _{l=1}^{k}\sum _{i=1}^{n}d_{i1}^{(1)}(l,u) \right] \log \pi _{1}^{U}(u)\nonumber \\{} & {} \quad +\sum _{\bar{l}=1}^{k}\left\{ \sum _{l=1}^{k}\left[ \sum _{u=1}^{2}\sum _{\bar{u}=1}^{2}\sum _{i=1}^{n}\sum _{t=2}^{T}d_{it}^{(2)} (l,u;\bar{l},\bar{u}) \right] \log \pi ^L(l|\bar{l})\right\} \nonumber \\{} & {} \quad +\sum _{\bar{u}=1}^{2}\sum _{l=1}^{k}\left\{ \sum _{u=1}^{2}\left[ \sum _{\bar{l}=1}^{k}\sum _{i=1}^{n}\sum _{t=2}^{T}d_{it}^{(2)}(l,u;\bar{l},\bar{u}) \right] \log \pi ^{U|L}(u|l,\bar{u})\right\} \nonumber \\{} & {} \quad +\sum _{j=1}^{J}\sum _{l=1}^{k}\left\{ \sum _{i=1}^{n}\sum _{t=1}^{T}\sum _{r=1}^{c_j}\left[ d_{it}^{(1)}(l,1)d_{jit}(r)\right] \log f_{l1j}(r|{\varvec{x}}_{it})\right\} \nonumber \\{} & {} \quad +\sum _{j=1}^{J}\sum _{l=1}^{k}\left\{ \sum _{i=1}^{n}\sum _{t=1}^{T}\sum _{r=1}^{c_j}\left[ d_{it}^{(1)}(l,2)d_{jit}(r)\right] \log f_{l2j}(r|{\varvec{x}}_{it})\right\} \end{aligned}$$
(6)

where \(f_{l1j}\) and \(f_{l2j}\) are provided in Sect. 2.2.

As the latent variables are not observable, it is convenient to use the EM algorithm to compute the maximum likelihood estimates (Bartolucci and Farcomeni 2015). Alternatively, the parameters can be estimated by maximizing the log-likelihood function of the observed variables through the commonly used Fisher scoring or Newton–Raphson algorithms; among these numerical approaches, however, the EM algorithm seems to be more stable.

Every iteration of the EM algorithm is composed of two steps: the Expectation (E) step and the Maximization (M) step. With respect to our model, in the E step the following expected values are computed:

$$\begin{aligned} \delta _{it}^{(1)} (l,u;\bar{{\varvec{\theta }}})=E_{obs}(d_{it}^{(1)} (l,u)), \quad \delta _{it}^{(2)} (l,u;\bar{l},\bar{u};\bar{{\varvec{\theta }}})=E_{obs}(d_{it}^{(2)} (l,u;\bar{l},\bar{u})), \end{aligned}$$
(7)

where \(E_{obs}()\) is the expected value taken conditionally on the observed values of the responses \(R_{jit}\) and on the covariates and given the current value \(\bar{{\varvec{\theta }}}\) of the parameters. The previous expected values can be computed by the Baum-Welch forward-backward algorithm (Zucchini and MacDonald 2009, Ch. 4).
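For a single unit, the quantities in (7) can be obtained with the standard scaled forward-backward recursions. The Python sketch below is a generic textbook version of these recursions, not the authors' implementation: the \(2k\) joint states \((l,u)\) are stacked into a single index, and the matrix `B` is assumed to contain, for every occasion and state, the product over j of the probabilities \(f_{luj}\) of the observed responses. All names and numerical values are hypothetical.

```python
import numpy as np

def forward_backward(init, P, B):
    """Posterior state probabilities for one unit.

    init : (m,) initial probabilities of the m joint states
    P    : (m, m) transition matrix, P[s_bar, s]
    B    : (T, m) likelihood of the responses observed at each occasion, by state
    Returns gamma (T, m), xi (T-1, m, m) and the log-likelihood of the unit.
    gamma[t, s] plays the role of delta^(1); xi[t-1, s_bar, s] that of delta^(2)
    for the transition from occasion t-1 to occasion t.
    """
    T, m = B.shape
    a = np.zeros((T, m))       # scaled forward probabilities
    b = np.ones((T, m))        # scaled backward probabilities
    scale = np.zeros(T)
    a[0] = init * B[0]
    scale[0] = a[0].sum()
    a[0] /= scale[0]
    for t in range(1, T):
        a[t] = (a[t - 1] @ P) * B[t]
        scale[t] = a[t].sum()
        a[t] /= scale[t]
    for t in range(T - 2, -1, -1):
        b[t] = (P @ (B[t + 1] * b[t + 1])) / scale[t + 1]
    gamma = a * b
    xi = (a[:-1, :, None] * P[None, :, :]
          * (B[1:] * b[1:])[:, None, :] / scale[1:, None, None])
    return gamma, xi, np.log(scale).sum()

# Hypothetical example with m = 2 states and T = 3 occasions
init = np.array([0.6, 0.4])
P = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.1], [0.4, 0.3], [0.2, 0.6]])
gamma, xi, loglik = forward_backward(init, P, B)
```

The scaling by the normalizing constants of the forward step avoids numerical underflow for long sequences, and the sum of their logarithms gives the contribution of the unit to the observed log-likelihood.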

In the M step, the conditional expectation \(Q({\varvec{\theta }}|\bar{{\varvec{\theta }}})=E_{obs}(\ell ^*({\varvec{\theta }}))\) of the complete log-likelihood function is maximized in order to obtain an updated \(\bar{{\varvec{\theta }}}\). Note that \(Q({\varvec{\theta }}|\bar{{\varvec{\theta }}})\) is obtained from the complete log-likelihood by replacing \(d_{it}^{(1)} (l,u)\) and \(d_{it}^{(2)} (l,u;\bar{l},\bar{u})\) with their expected values (7). The six addends of \(Q({\varvec{\theta }}|\bar{{\varvec{\theta }}})\) depend on disjoint subsets of the vector \({\varvec{\theta }}\) and can be maximized separately. The maximization of the first four addends of \(Q({\varvec{\theta }}|\bar{{\varvec{\theta }}})\), corresponding to the initial and transition probabilities specified in Sect. 2.1, is simple as there is a closed form for the maxima. The fifth and sixth addends depend only on the parameters of the stereotype logit models and of the parallel logit models of Sect. 2.2, respectively, and can be maximized by using the functions rrvglm and vglm, respectively, of the R package VGAM (Yee 2015, 2021).
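As an illustration of the closed forms, assume that the initial and transition probabilities are left unrestricted apart from summing to one. Each of the first four addends is then a multinomial-type log-likelihood in expected counts, and its maximizer is the corresponding normalized expected count; for instance,

$$\begin{aligned} \hat{\pi }^L_{1}(l)=\frac{1}{n}\sum _{u=1}^{2}\sum _{i=1}^{n}\delta _{i1}^{(1)} (l,u;\bar{{\varvec{\theta }}}), \quad \hat{\pi }^L(l|\bar{l})=\frac{\sum _{u=1}^{2}\sum _{\bar{u}=1}^{2}\sum _{i=1}^{n}\sum _{t=2}^{T}\delta _{it}^{(2)} (l,u;\bar{l},\bar{u};\bar{{\varvec{\theta }}})}{\sum _{l'=1}^{k}\sum _{u=1}^{2}\sum _{\bar{u}=1}^{2}\sum _{i=1}^{n}\sum _{t=2}^{T}\delta _{it}^{(2)} (l',u;\bar{l},\bar{u};\bar{{\varvec{\theta }}})}, \end{aligned}$$

and, analogously,

$$\begin{aligned} \hat{\pi }^{U|L}(u|l,\bar{u})=\frac{\sum _{\bar{l}=1}^{k}\sum _{i=1}^{n}\sum _{t=2}^{T}\delta _{it}^{(2)} (l,u;\bar{l},\bar{u};\bar{{\varvec{\theta }}})}{\sum _{u'=1}^{2}\sum _{\bar{l}=1}^{k}\sum _{i=1}^{n}\sum _{t=2}^{T}\delta _{it}^{(2)} (l,u';\bar{l},\bar{u};\bar{{\varvec{\theta }}})}. \end{aligned}$$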

If the model is correctly specified, the estimates of the standard errors can be based on the outer products of the individual contributions to the score functions (outer product information matrix), as shown in Colombi et al. (2023). To assess the adequacy of the selected model, full-conditional residuals, introduced in the context of hidden Markov models by Buckby et al. (2020) as exvisive residuals or predictive residuals, can be used. Some computational aspects of the mentioned residuals are discussed in Colombi et al. (2023).

5 Modelling financial ability and risk perception

We fit the proposed model to data on 1109 Italian households from the 2006–2016 waves of the Survey on Household Income and Wealth (SHIW), conducted by the Bank of Italy every two years to collect information about the income, wealth and saving of Italian households. Here, the analysis focuses on financial capability which, according to OECD (2020), is a broad term encompassing the behaviour, knowledge, skills and attitudes of people with regard to managing their financial resources. To this aim, we focus on the household’s financial ability to make ends meet and on financial risk perception. Understanding the self-assessment of ability and risk perception reveals how aware people are of their real financial capability, and may offer useful prompts for the design of effective education programmes and for orienting them towards more vulnerable individuals.

The observed ordinal responses are \(R_1\), the perception of the household’s financial ability to make ends meet, based on the answer of the head of the household to the question: Is your household’s income sufficient to see you through to the end of the month... very easily (ve), easily (e), fairly easily (fe), fairly difficultly (fd), with some difficulty (sd), very difficulty (vd); and \(R_2\), the risk perception in managing financial investments, measured through the response of the head of the household to the question: in managing your financial investments, would you say you have a preference for investments that offer: low returns, with no risk of losing the invested capital (risk averse, a); a fair return, with a good degree of protection for the invested capital (risk tolerant, t); good-high returns, but with a fair-high risk of losing part of the capital (risk lover, l).

Fig. 2 Map of frequencies per response categories and time occasions

Figure 2 illustrates the frequencies of the response categories over time.

Factors that may explain the responses are listed below with the reference categories being in italics. Some demographic characteristics that can influence the responses of the householders are: gender (G): female, male; age (A): up to 40 years old, over 40 years old; marital status of the head of household (M): married, not married; job (J): self-employed (se), housekeeper/retired/student (hrs), employee (e); education (E): up to secondary school, over secondary school. Economic features are collected through the covariates: household income (I): up to 20000 euro per year, over 20000 euro per year; savings (S): with savings, no savings; home ownership (H): owned home, rented/under redemption agreement/in usufruct. Interviewers are invited to rate (on a scale from 1 to 10, in which 1 is the lowest and 10 the highest) the reliability of the answers provided by the householder. We consider the evaluation of the interviewer as a binary factor R with categories not completely reliable (with evaluation up to 8), and reliable (with evaluation 9 or 10).

Table 2 The maximum value of the log-likelihood function Loglike, the number of parameters \(\texttt {\#}\) par, BIC values and the number k of states of the latent k-regime switching indicator are reported for models defined by different hypotheses on the observation probability functions

Models illustrated in Sect. 2.2, specified by different restrictions on regression coefficients and fixed scores, are compared for various numbers k of states of the k-regime switching indicator, and Table 2 reports the results. All covariates are included in both the no-RS and the RS regime models (2) and (3) in all the compared MSSL models. The compared models in Table 2 are identified by the symbol \(A-B-C\), where \(A \in \{SWitching,FIxed\}\) refers to the regression coefficients under the RS regime, \(B \in \{SWitching,FIxed\}\) to the regression coefficients under the no-RS regime, and \(C \in \{S,P\}\) indicates whether a stereotype (S) or parallel (P) logit model is used under the no-RS regime. According to BIC, the model offering the best fit is FI-FI-S with \(k=3\). Note that, according to the selected model, the scores and the regression coefficients are fixed parameters and the intercepts are switching with respect to the 3 regimes of the k-regime indicator. Full-conditional residuals confirm its good performance: their values are concentrated around zero at every time occasion and for every configuration of covariates and responses (\(Q_1 = -0.342\), \(Me = -0.176\), \(Q_3 = -0.041\)), 96\(\%\) are in the range \((-2,2)\) and 0.23\(\%\) exceed 5 in absolute value; boxplots are reported in the supplementary material. Moreover, the selected model shows a better fit than the corresponding hidden Markov model (3 states) that does not have a latent variable to account for the answering behaviour (BIC = 29549.18). This result confirms, for the data at hand, the relevance of taking response style into account.

Let us now go into the details of the results related to the selected MSSL model. Since scores and regression coefficients are fixed with respect to the k-regime switching indicator, the differences among the conditional distributions depend on the intercepts. In particular, as the intercepts are switching parameters (whose estimates are in the supplementary material), the corresponding allocation sets, introduced in Sect. 3.1, are different in the three states of the k-regime switching indicator, under every responding behavioral regime. Differences based on the intercepts would be difficult to interpret, but the allocation sets provide a simple way to explain them.

The allocation sets and the most probable responses, for every state of the latent regime and response style indicators, are reported in Table 3 for both \(R_1\) and \(R_2\). They are also illustrated in Figs. 3 and 4.

Here, \(k=3\) suggests distinguishing the households into three latent states, interpreted according to whether households are financially confident, fair or distressed. Table 3 gives valuable insights in this direction, looking at the results from the perspective of the two responding regimes.

Table 3 Allocation sets of the response categories: very easily (ve), easily (e), fairly easily (fe), fairly difficultly (fd), with some difficulty (sd), very difficultly (vd), risk averse (a), risk tolerant (t), risk lover (l)

The estimated scores of the no-RS regime model (2) for \(R_1\) are all positive, i.e. \(\hat{\mu }_{211}= 1.178\,(se = 0.258)\), \(\hat{\mu }_{311}= 2.587 \,(se = 0.286)\), \(\hat{\mu }_{411}= 1.281 \,(se = 0.240)\), \(\hat{\mu }_{511}= 1.058 \,(se = 0.250)\) and the ratios between intercepts and scores are decreasing, so that, as the linear predictor z increases, all the response categories become modal one after the other respecting the natural order of the categories. Moreover, confident households have the widest allocation set \((-\infty , -0.85]\) for the three categories of ease of making ends meet ve, e, fe, while distressed households have the narrowest allocation set \((-\infty , -2.26]\). On the other hand, when considering the categories that indicate difficulties (fd, sd, vd), the result is opposite.
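The ordering argument can be checked numerically. The following is a minimal Python sketch, not the authors' code. It assumes that the local logits of model (2) are adjacent-category logits, \(\log [P(R=r+1)/P(R=r)] = \alpha _r + \mu _r z\), with the first score fixed at 1. Since the estimated intercepts are only given in the supplementary material, it borrows the first-regime intercepts and the scores of the simulation design of Sect. 6 purely as illustrative values. All function names are hypothetical.

```python
import math

def category_probs(z, alpha, mu):
    """Category probabilities implied by adjacent-category logits
    log(p[r+1] / p[r]) = alpha[r] + mu[r] * z (assumed form of model (2))."""
    log_p = [0.0]
    for a, m in zip(alpha, mu):
        log_p.append(log_p[-1] + a + m * z)
    top = max(log_p)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_p]
    total = sum(weights)
    return [w / total for w in weights]

def cut_points(alpha, mu):
    """Values of z at which category r+1 overtakes category r (positive scores)."""
    return [-a / m for a, m in zip(alpha, mu)]

def modal_category(z, alpha, mu):
    """Index (starting from 1) of the most probable category at z."""
    probs = category_probs(z, alpha, mu)
    return probs.index(max(probs)) + 1

# Illustrative values: first-regime intercepts and scores of the Sect. 6 design;
# the leading score 1.0 is an assumption made here for identifiability.
alpha = [2.3, -1.5, -2.3, -3.3]
mu = [1.0, 1.4, 1.7, 1.3]

cuts = cut_points(alpha, mu)
modes = [modal_category(z, alpha, mu) for z in (-3.0, 0.0, 1.2, 2.0, 3.0)]
print([round(c, 3) for c in cuts])  # [-2.3, 1.071, 1.353, 2.538]
print(modes)                        # [1, 2, 3, 4, 5]
```

Because the ratios \(\alpha _r/\mu _r\) decrease with r in this illustration, the cut points increase, and consecutive cut points delimit the allocation set of each category; a category whose interval is empty would never be modal.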

The unique estimated score for question \(R_2\) under no-RS regime is negative (\(\hat{\mu }_{212}= -0.187,\, se = 0.068\)), thus the natural ordering of the categories is not respected. In particular, for distressed households, the three categories of \(R_2\) become modal in the sequence risk averse, risk lover, risk tolerant. Moreover, category risk lover is never modal for confident and fair households. The allocation set for the risk averse category decreases as we move from confident households to distressed households, suggesting that this category is more commonly observable among those who are financially confident. Similarly, distressed households have the widest allocation set for the risk lover category, while fair households have the widest allocation set for the risk tolerant option. This implies that individuals experiencing financial distress tend to exhibit a higher inclination towards risk, while those with a fair stance display a more cautious attitude. Those who have greater financial confidence show a strong preference for risk aversion. Again from Table 3, we can note which categories (among middle or extreme categories) are more probable when the responses are driven by RS regime according to Sect. 3.2, and how such preferences differ in the three states of the latent financial condition, for both items \(R_1\) and \(R_2\).

Confident households, when utilizing the RS behavior to address \(R_1\), exhibit a stronger inclination towards the middle response style (MRS) compared to households with fair or distressed financial capabilities. This preference for the middle categories fe and fd is emphasized by the largest allocation set, of both fe and fd, corresponding to the confident regime among the three latent regimes. On the other hand, fair households possess the widest allocation set for the category ve. In contrast, the negative estimate \(\hat{\phi }_{131} = -0.6414\) (se = 0.4478) indicates a greater propensity towards the ERS attitude (as shown in Table 1) among households facing financial distress. In particular, the extreme category vd is more commonly selected by households in this distressed group, as the allocation set for vd in the distressed regime exceeds the allocation sets for vd in the fair and confident regimes. Furthermore, it is crucial to note that households suffering financial distress do not exhibit a preference for the MRS, as the corresponding allocation set, which would include the middle response points as the mode, is empty. Conversely, while fair households have the option to choose the MRS, they are less inclined to do so compared to confident respondents.

Regarding question \(R_2\), the estimated values, \(\hat{\phi }_{132} = -0.4666\) (se = 0.3705) and \(\hat{\phi }_{122} = -0.8324\) (se = 0.4844), are both negative. These results, according to the findings presented in Table 1, indicate that when distressed and fair households answer with a response style behavior while expressing their perception of financial risk, they tend to select the scale’s endpoints (ERS). Specifically, households in financial distress lean more easily towards the risk lover endpoint, while those in more moderate financial situations towards the risk averse endpoint. This observation is supported by comparing the allocation sets between the fair and distressed regimes. Additionally, the widest allocation set for the category risk averse is associated with the confident regime indicating that households who are confident in their financial capability tend to prefer a risk averse stance (DRS) more than the households in the other two regimes. There is a narrow allocation set where risk tolerant is the mode for confident households, indicating that confident individuals may also exhibit MRS. However, this tendency is specific to the confident regime and is not observed among fair and distressed households, as the allocation set of t is empty.

Figures 3 and 4 show with what probability the categories are modal as the linear predictor varies and how far the probabilities of the other categories are from that of the mode. Moreover, the plots highlight the categories that are never modal. Note that, when only categorical covariates are used, as in the current example, the linear predictor takes values in a finite set rather than on a continuous scale. Table 4 reports estimates and standard errors of the fixed regression coefficients in model (2), which are useful to interpret the covariate effects on the linear predictor under the two responding regimes.

For item \(R_1\) in no-RS regime, all the scores are non-negative. Thus, the direction of the effect of a regressor is the same for each pair of adjacent categories of \(R_1\). This reveals that the higher the value of \(z=\hat{{{\varvec{\gamma }}}}^{\prime }_{11} {\varvec{z}}\), the more the distribution of \(R_1\) tends to move toward the higher end of the response scale (indicating difficulty in making ends meet). As all the significant regression coefficients listed in \(\hat{{{\varvec{\gamma }}}}^{\prime }_{11}\) are negative, except that of covariate H, the value of z is the highest for not completely reliable households, with an employee householder who has a low level of education, lower disposable income, no savings and no owned home. The effect of covariates is most pronounced at the cut point between the positive and the negative side of the response scale, since the estimated score \(\hat{\mu }_{311} = 2.857\) is the highest. So, considering the regressor income for example, the odds of handling own finances fairly difficultly instead of fairly easily for households with low income is about 17 times the odds for households with more income.

For item \(R_2\) in no-RS regime, the estimated score is negative, so that the effect of covariates is opposite on the two local logits. For instance, the odds of being risk lover versus tolerant for highly educated householders is 0.83 times the same odds for less educated respondents, while the odds of being tolerant instead of averse for highly educated householders is 2.55 times the same odds for those with a lower education level. As z increases, the odds of being risk tolerant instead of lover or averse increases. The value z is greater for young and married men, in career, with higher education and income, homeowners, according to the signs of the non-null estimated regression coefficients (last row of Table 4).
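The opposite direction of the two effects follows from the stereotype structure, in which the odds ratio for a pair of adjacent categories is \(\exp (\mu _r \gamma )\). The short Python sketch below illustrates this under two assumptions that are not stated in this section: the score of the first local logit (tolerant versus averse) is fixed at 1, and the education coefficient is backed out from the reported odds ratio 2.55 rather than read from Table 4. The function name is hypothetical.

```python
import math

def local_odds_ratios(gamma, scores):
    """Odds ratios exp(mu_r * gamma) for each pair of adjacent categories."""
    return [math.exp(m * gamma) for m in scores]

# Assumed first score 1.0; second score is the reported estimate -0.187.
scores_r2 = [1.0, -0.187]

# Education coefficient backed out from the reported odds ratio 2.55
# (an assumption for illustration, not the Table 4 value).
gamma_education = math.log(2.55)

or_tolerant_vs_averse, or_lover_vs_tolerant = local_odds_ratios(
    gamma_education, scores_r2
)
print(round(or_tolerant_vs_averse, 2), round(or_lover_vs_tolerant, 2))
```

With these inputs the second odds ratio is roughly 0.84, in line with the reported 0.83; the small gap is presumably due to rounding of the published estimates.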

When the answers are considered in RS regime, according to the parallel local logit model (2) with parameters (3), the effect of a regressor is the same for each pair of adjacent categories of \(R_j\) and it is described only by the regression coefficients in \(\hat{{{\varvec{\gamma }}}}^{\prime }_{2j}, j=1,2\), shown in Table 4.

As implied by the hypothesis of parallel effects, the distribution of \(R_j\) tends to move towards the high end of the scale for increasing values of z, but the characteristics of respondents in RS regime implying an increase of z are different from those raising z under the no-RS regime. In particular, based on the signs of the significant estimated regression coefficients, in answering question \(R_1\) following an RS, young and low educated women, employees, with no high salary, no savings to count on, but with owned home, are more likely to choose categories indicating difficulties. On the other hand, all the mentioned household characteristics that have an impact on the perception of financial risk (\(R_2\)) in no-RS regime no longer seem useful to explain the heterogeneity in that perception when the answers are driven by RS. In general, in the RS regime, there are no significant effects of covariates on \(R_2\).

Table 4 Maximum likelihood estimates (\(\hat{{{\varvec{\gamma }}}}^{\prime }_{1j},\hat{{{\varvec{\gamma }}}}^{\prime }_{2j}, j=1,2\)) and standard errors (se) of parameters of the selected model

The allocation sets help to match the linear predictor values, which identify groups of respondents with certain demographic and economic profiles, with the modal answer for such groups. For the householders with one of such profiles, the corresponding linear predictor z is determined on the basis of the estimated regression coefficients (Table 4), and the allocation set (Table 3) to which the calculated z belongs reveals the most likely response for that profile, for every item and latent state. For instance, consider a householder with covariates: \(R =\) reliable, \(G=\) male, \(Age =\) over 40, \(M =\) not married, \(J =\) self-employed/employee, \(E =\) up to secondary school, \(I =\) less than 20000 per year, \(S =\) with savings, \(H =\) owned home. This profile corresponds to the linear predictor \(z = -1.34\), in answering \(R_1\) under no-RS regime, and \(z= -1.03\) under RS regime. So the most probable response for householders with this profile, not driven by RS, would be fairly easily, fairly difficultly, and with some difficulty, respectively, according to whether they have a confident, fair, or distressed attitude to face financial issues. When their answering behavior follows an RS, instead, they tend to take refuge in the middle (fairly difficultly) with higher probability when financially confident, while their most likely choice is very difficultly when they are in a fair financial condition or suffer financial distress. Other exemplifying profiles are reported in the supplementary material.

Fig. 3 Allocation sets of the response \(R_1\) function under no RS (2) and RS (3) regimes, in the three latent states (financially confident, fair, distressed). Response categories are ve = very easily, e = easily, fe = fairly easily, fd = fairly difficultly, d = with some difficulty, vd = very difficultly

Fig. 4 Allocation sets of the response \(R_2\) function under no RS (2) and RS (3) regimes, in the three latent states (financially confident, fair, distressed). Response categories are a = risk averse, t = risk tolerant, l = risk lover

6 Simulation study

To illustrate the consequences of ignoring the RS attitude in the responses, we conducted a Monte Carlo simulation study under different scenarios.

The main interest is in showing the consequences of model misspecification on the parameter estimates when an MSSL model that ignores the RS regime switching indicator is erroneously adopted. Such a model is specified only by logit model (2) with \(u=1\) and by the initial and transition probabilities \(\pi ^L_{1}(l)\), \(\pi ^L(l|\bar{l})\) of the Markov chain of the \(k\)-regime switching indicator. Data are generated from the MSSL model proposed in Sect. 2 for different parameter scenarios. For each scenario and on each sample, we fitted the correct model and the wrongly specified switching stereotype model without RS component. The performance of the estimators of the switching intercepts, the fixed scores and the vector of fixed regression coefficients that parameterize the stereotype model for no-RS responses is examined.

In the simulation study, one response variable R with 5 categories and two explanatory variables, one binary (0/1 with probabilities 0.3/0.7) and one standardized Normal, are considered. The parameter setting, common to each scenario, is specified by: \(k=3\) regimes, fixed scores \(\mu _{21}=1.4,\,\mu _{31}=1.7, \,\mu _{41}=1.3\); fixed and switching RS intercept-parameters \(\phi _{0l}=0, \, l=1,2,3\) and \(\phi _{11}=0.7, \, \phi _{12}=1,\phi _{13}=1.3;\) the matrix of switching no-RS intercept-parameters with rows \((\alpha _{r11},\delta _{r21},\delta _{r31})=(\alpha _{r11}, \alpha _{r21}-\alpha _{r11},\alpha _{r31}-\alpha _{r11})\), \(r=1,2,3,4,\)

$$\begin{aligned} {\varvec{\Delta }}=\left( \begin{array}{ccc} 2.3 &{}\quad 0.9&{}\quad 1 \\ -1.5 &{}\quad 0.9 &{}\quad 1 \\ -2.3 &{}\quad 0.6 &{}\quad 0.9\\ -3.3 &{}\quad 0.6 &{}\quad 0.9 \\ \end{array} \right) ;\end{aligned}$$

the initial probabilities \(\pi ^L_{1}(l)=1/3, \, l=1,2,3,\) and the matrix of the transition probabilities \(\pi ^L(l|\bar{l})\) of the 3-regime switching indicator

$$\begin{aligned} {\varvec{\Pi }}= \left( \begin{array}{ccc} 1/2 &{}\quad 1/4 &{}\quad 1/4 \\ 1/8 &{}\quad 2/3 &{}\quad 5/24 \\ 1/6 &{}\quad 1/6 &{}\quad 2/3 \\ \end{array} \right) .\end{aligned}$$

The simulation study encompasses 16 scenarios that consider four levels of relevance regarding the RS attitude, two settings for the regression coefficients and two sample sizes. As initial probabilities of the RS regime switching indicator, we assume \(\pi _{1}^{U}(u)=0.05, 0.1, 0.2, 0.4\), \(u=2\), which are associated with labels low, low-medium, medium, and high, respectively, of the RS importance. The transition probabilities \(\pi ^{U|L}(u|l,\bar{u})\), \(u=2\), are computed as \(\pi ^{U|L}(u|l,\bar{u})=\pi ^{U|L}(u|l)=\pi _{1}^{U}(u)+0.1(l-1), \, u=2.\) Setting A: \({\varvec{\gamma }}_{1}^{\prime }=(\gamma _{11},\gamma _{12})=(1,1),\, {\varvec{\gamma }}_{2}^{\prime }=(\gamma _{21},\gamma _{22})=(0.2,0.2)\) and setting B: \({\varvec{\gamma }}_{1}^{\prime }=(-1,-1),\, {\varvec{\gamma }}_{2}^{\prime }=(-0.5,-0.5)\) are considered for the fixed regression coefficients. Data is generated 300 times for \(n=500, 1000\) and \(T=6\).
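To make the latent part of the design concrete, the following Python sketch draws one path of the bivariate latent chain for a single respondent. It is only an illustration under stated assumptions, not the code used for the study: the state l entering \(\pi ^{U|L}(u|l)\) is taken to be the current state of the 3-regime indicator, the initial RS probability is taken not to depend on l, and the function and variable names are hypothetical. Generating the responses given the latent states is omitted.

```python
import random

# Initial and transition probabilities of the 3-regime switching indicator.
INIT_L = [1 / 3, 1 / 3, 1 / 3]
PI_L = [
    [1 / 2, 1 / 4, 1 / 4],
    [1 / 8, 2 / 3, 5 / 24],
    [1 / 6, 1 / 6, 2 / 3],
]
STATES = [1, 2, 3]

def rs_probability(initial_rs, l):
    """P(U_t = 2 | L_t = l) for t > 1: initial RS probability plus 0.1 * (l - 1)."""
    return initial_rs + 0.1 * (l - 1)

def simulate_latent_path(initial_rs, T=6, rng=None):
    """Draw (L_t, U_t), t = 1..T, with U = 2 denoting the RS regime."""
    rng = rng or random.Random()
    l = rng.choices(STATES, weights=INIT_L)[0]
    u = 2 if rng.random() < initial_rs else 1  # assumed independent of l at t = 1
    path = [(l, u)]
    for _ in range(T - 1):
        l = rng.choices(STATES, weights=PI_L[l - 1])[0]
        u = 2 if rng.random() < rs_probability(initial_rs, l) else 1
        path.append((l, u))
    return path

# One respondent at the "medium" level of RS relevance (initial probability 0.2).
path = simulate_latent_path(initial_rs=0.2, rng=random.Random(2024))
print(path)
```

Repeating the draw for n respondents and each of the four initial RS probabilities gives the latent states for one replication of a scenario.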

The simulation results are presented below for settings A and B of regression coefficients, along with all four levels of RS relevance, and sample size \(n=1000\). More detailed tables for \(n=500\) and \(n=1000\) can be found in the supplementary material.

Figures 5 and 6 clearly illustrate the consequences of the model misspecification. They show the boxplots of the Monte Carlo errors, calculated as the difference between the parameter estimate and the true value, for scores and regression coefficients, under the proposed model accounting for RS (colored) and the model which ignores RS (grey). Estimates from the model ignoring RS differ substantially from the true values and underestimate or overestimate the true parameters. The two settings A and B serve to illustrate opposite situations of underestimation and overestimation. Moreover, estimation errors tend to increase as the importance of the RS component increases, from low to high level.

Fig. 5 Boxplots of Monte Carlo errors of scores and fixed regression coefficients, in models which include (colored) and ignore (grey) RS component, at the four levels of relevance of the RS attitude, with sample size \(n=1000\), and scenario A coefficients: \({\varvec{\gamma }}_1^{\prime }=(1,1)\), \({\varvec{\gamma }}_2^{\prime }=(0.2,0.2)\) (color figure online)

Fig. 6 Boxplots of Monte Carlo errors of scores and fixed regression coefficients, in models which include (colored) and ignore (grey) RS component, at the four levels of relevance of the RS attitude, with sample size \(n=1000\), and scenario B coefficients \({\varvec{\gamma }}_1^{\prime }=(-1,-1)\), \({\varvec{\gamma }}_2^{\prime }=(-0.5,-0.5)\) (color figure online)

Tables 5 and 6 report biases and standard deviations of the products of the estimated scores and regression coefficients, which describe the effects of covariates in model (2), obtained by fitting the misspecified model without RS, under different scenarios. Analogous results for the parameters of interest, \(\alpha _{rl1}\), \(\mu _{r1}\) and \({\varvec{\gamma }}_{1}^{\prime }\), are reported in the supplementary material. As can be seen from these tables, both bias and variability increase from lower to higher levels of RS relevance.
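The summaries in these tables can be obtained from the replicated estimates as in the following Python sketch. The estimates listed are made-up numbers for illustration only, not results of the study; the target is the product \(\gamma _{11}\mu _{21} = 1 \times 1.4\) of setting A. The use of the sample standard deviation (divisor equal to the number of replications minus one) is an assumption, and the function name is hypothetical.

```python
from statistics import mean, stdev

def monte_carlo_summary(estimates, true_value):
    """Monte Carlo errors, bias and standard deviation of an estimator."""
    errors = [e - true_value for e in estimates]
    return {
        "errors": errors,            # values summarised by the boxplots
        "bias": mean(errors),        # average error over replications
        "stdev": stdev(estimates),   # sample standard deviation (assumed)
    }

# Hypothetical estimates of gamma_11 * mu_21 over five replications.
estimates = [1.1, 1.2, 1.0, 1.3, 1.15]
summary = monte_carlo_summary(estimates, true_value=1.4)
print(round(summary["bias"], 3), round(summary["stdev"], 3))
```

A negative bias, as in this made-up example, corresponds to the underestimation case, while a positive bias corresponds to overestimation.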

Table 5 Bias and standard deviation (StDev) of estimators of the products of scores and regression coefficients in the RS-ignored model
Table 6 Bias and standard deviation (StDev) of estimators of the products of scores and regression coefficients in the RS-ignored model

Important regularities can be detected: if the true value of a parameter (intercepts \(\alpha _{rl1}\), fixed regression coefficients \({\varvec{\gamma }}_{1}^{\prime }\), and products \({\varvec{\gamma }}_{1}^{\prime }\mu _{r1}\)) in the no-RS regime model is larger (smaller) compared to that of the RS regime model, using the incorrect model will lead to underestimation (overestimation) of that parameter. Tables 5 and 6, as well as Figs. 5 and 6, display both cases.

When fitting the correctly specified model, including the RS switching indicator, bias tends to decrease and precision to increase as the sample size rises, as shown by Fig. 7 for sample sizes \(n=500, 1000\), at the low-medium level of relevance of the RS attitude (other scenarios give similar results).

Fig. 7 Bias and standard deviation of parameter estimators in model including the RS switching indicator, at the low-medium level of relevance of the RS attitude, with sample sizes \(n=500,1000\), and setting A coefficients: \({\varvec{\gamma }}_1^{\prime }=(1,1)\), \({\varvec{\gamma }}_2^{\prime }=(0.2,0.2)\)

7 Final remarks

The novelty of our approach lies in providing a Markov switching regression model for ordered responses that takes into account simultaneously: attitude towards response style, unobserved heterogeneity, serial dependence and impact of time-varying covariates. The proposed Markov switching stereotype logit model allows us to take advantage of:

  • adopting an underlying bivariate Markov chain to model temporal dependence of responses, attitudes toward RS and unobserved heterogeneity among respondents

  • modelling the RS attitude as a time-varying effect because respondents can switch between RS and no-RS regimes at every time occasion

  • using a stereotype logit model as a flexible and parsimonious way to introduce covariate effects on no-RS responses and a useful approach for distinguishing no-RS from RS responding behavior

  • introducing the use of allocation sets as a tool for describing differences among the distributions of RS and no-RS responses.

Another aspect that is worth mentioning is the opportunity offered by the allocation sets as tools for the joint prediction of an observable and latent future realization, given the past history of responses and covariates. The joint predictor of \((R_{jit},U_{it},L_{it})\) can be preferable to the usual marginal predictors of \(R_{jit}\) and \((U_{it},L_{it})\), when the interest is on classifying every unit with respect to the response and the latent state simultaneously. Such aspects deserve further attention and give even more emphasis and prominence to the use of the allocation sets here proposed.

Our approach may have a limitation stemming from the assumption that transition probabilities are subject independent, since unit-specific covariate effects are considered only at the observation level. To overcome this drawback, subject specific time invariant discrete random effects can be introduced on initial and transition probabilities (conditioned on the states of the k-regime switching indicator) of the response style indicator as suggested in Sect. 2.1. Alternatively, our proposal can be extended by introducing subject-specific random effects in the logit models for the observable variables to take into account time invariant heterogeneity of RS and no-RS respondents, which the bivariate latent process does not account for. This generalization considers time invariant heterogeneity, allowing for a more comprehensive understanding of the variability in the data, and does not assume that responses are independent given the bivariate latent chain, which can be a significant improvement in capturing the complexity of the underlying dynamics. However, it is important to note that this generalization may introduce computational challenges, especially when dealing with continuous random effects (Altman 2007). A possible extension of MSSL models, that accounts for RS or no-RS tendencies also through time invariant subject-specific discrete random effects and keeps the number of random effects to a minimum, replaces the intercept term \(\phi _{0lj}+\phi _{1lj}s_{rj}\), in the model proposed for RS respondents, with \(\phi _{0lj}+(b_i+\phi _{1lj})s_{rj}\) where \(b_i\) is a (discrete) subject-specific random effect.
Furthermore, the linear predictor of the stereotype logit model for no-RS respondents can be modified by introducing a (discrete) random intercept \(a_i\) so that the local logits for no-RS respondents are \(\eta _{irl1j}({\varvec{x}}_{it})=\alpha _{rl1j}+\mu _{rl1j} (a_i+{\varvec{\gamma }}_{l1j}^{\prime }{\varvec{x}}_{it}).\) The subject specific random terms \(b_i\) model time invariant heterogeneity of RS respondents, similarly to the approach of Schauberger and Tutz (2022) and the random intercepts \(a_i\) do a similar task for no-RS respondents.

Further advancement on the possibility of including time invariant effects in MSSL models needs a more in-depth study.