A Latent Hidden Markov Model for Process Data

Psychometrika


Response process data from computer-based problem-solving items describe respondents’ problem-solving processes as sequences of actions. Such data provide a valuable source for understanding respondents’ problem-solving behaviors. Recently, data-driven feature extraction methods have been developed to compress the information in unstructured process data into relatively low-dimensional features. Although the extracted features can be used as covariates in regression or other models to understand respondents’ response behaviors, the results are often not easy to interpret because the relationship between the extracted features and the original response process is often not explicitly defined. In this paper, we propose a statistical model for describing response processes and how they vary across respondents. The proposed model assumes that a response process follows a hidden Markov model given the respondent’s latent traits. The structure of hidden Markov models resembles problem-solving processes, with the hidden states interpreted as problem-solving subtasks or stages. Incorporating the latent traits in hidden Markov models enables us to characterize the heterogeneity of response processes across respondents in a parsimonious and interpretable way. We demonstrate the performance of the proposed model through simulation experiments and case studies of PISA process data.



Data Availability

The datasets analyzed in the current study are available at



Author information

Corresponding author

Correspondence to Xueying Tang.

Ethics declarations

Conflict of interest

The author has no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was funded by National Science Foundation Grant DMS-2310664.


Appendix A LHMM Likelihood Computation

The likelihood for a set of response processes \({\mathcal {Y}}_n\) following an LHMM is

$$\begin{aligned} L(\varvec{\eta }\mid {\mathcal {Y}}_n) = \prod _{i=1}^n P(\varvec{Y}^{(i)} = \varvec{y}^{(i)} \mid \varvec{\eta }) = \prod _{i = 1}^n \left\{ \int \phi (\theta _i) P(\varvec{Y}^{(i)} = \varvec{y}^{(i)} \mid \varvec{\eta }, \theta _i)d\theta _i\right\} . \end{aligned}$$

We demonstrate here how to compute \(L_i\left( \varvec{\eta }\mid \varvec{y}^{(i)}\right) = P\left( \varvec{Y}^{(i)} = \varvec{y}^{(i)} \mid \varvec{\eta }\right) \). For notational simplicity, the superscripts and subscripts denoting different respondents are suppressed hereafter. We first explain how to compute \(f(\varvec{\eta }, \theta ) = P(\varvec{Y} = \varvec{y} \mid \varvec{\eta }, \theta )\) given \((\varvec{\eta }, \theta )\) and then how to numerically integrate \(\phi (\theta )f(\varvec{\eta }, \theta )\) with respect to \(\theta \) to obtain \(L(\varvec{\eta }\mid \varvec{y})\).

For \(k = 1, \ldots , K\) and \(t = 1, \ldots , T\), define the forward probability

$$\begin{aligned} \alpha _t(k \mid \theta ) = P( \varvec{Y}_{1:t} = \varvec{y}_{1:t}, S_t = k \mid \varvec{\eta }, \theta ). \end{aligned} \tag{A1}$$

Given \(\varvec{\eta }\) and \(\theta \), we can obtain \(f(\varvec{\eta }, \theta )\) from the forward probabilities \(\alpha _T(k \mid \theta )\) since \(f(\varvec{\eta }, \theta ) = \sum _{k = 1}^K \alpha _T(k \mid \theta )\). According to HMM assumptions (1)–(4), it is easy to verify that \(\alpha _1(k \mid \theta ) = \pi _k (\theta ) q_{k, y_1}(\theta )\) and

$$\begin{aligned} \alpha _{t}(k \mid \theta ) = \sum _{l = 1}^K \alpha _{t-1}(l \mid \theta ) p_{lk}(\theta )q_{k, y_t}(\theta ), ~t = 2, \ldots , T, \end{aligned} \tag{A2}$$

where \(\pi _k(\theta )\), \(p_{kl}(\theta )\), and \(q_{kj}(\theta )\) are defined in (5)–(7). Therefore, \(\alpha _T(k\mid \theta )\) can be computed by first calculating \(\alpha _1(k \mid \theta )\) and then applying (A2) recursively.
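As an illustration, the forward recursion for a fixed \(\theta \) can be sketched in Python with NumPy. This is a minimal sketch rather than the implementation used in this paper: the function name `forward`, the toy values of \(\varvec{\pi }\), \(\varvec{P}\) and \(\varvec{Q}\), and the zero-based coding of states and actions are assumptions made for the example.

```python
import numpy as np

def forward(pi, P, Q, y):
    """Forward probabilities alpha[t, k] for a fixed theta.

    pi: (K,) initial state probabilities
    P:  (K, K) state transition probabilities, P[l, k] = p_{lk}
    Q:  (K, M) state-action probabilities, Q[k, j] = q_{kj}
    y:  observed actions coded 0..M-1 (an assumption of this sketch)
    """
    T, K = len(y), len(pi)
    alpha = np.zeros((T, K))
    alpha[0] = pi * Q[:, y[0]]                      # alpha_1(k) = pi_k q_{k, y_1}
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ P) * Q[:, y[t]]  # forward recursion
    return alpha

# Hypothetical toy model with K = 2 states and M = 3 actions.
pi = np.array([0.6, 0.4])
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
Q = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
y = [0, 2, 1]

alpha = forward(pi, P, Q, y)
f = alpha[-1].sum()   # f(eta, theta) = sum_k alpha_T(k)
print(f)
```

Each step costs \(O(K^2)\) operations, so the whole pass is \(O(TK^2)\) instead of the \(O(K^T)\) needed to enumerate all hidden state paths.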

Besides the forward probabilities, one can also define the backward probability

$$\begin{aligned} \beta _t(k \mid \theta ) = P(\varvec{Y}_{(t+1):T} = \varvec{y}_{(t+1):T} \mid S_t = k, \varvec{\eta }, \theta ), ~k = 1, \ldots , K, ~t = 1, \ldots , T-1. \end{aligned} \tag{A3}$$

Letting \(\beta _T(k \mid \theta ) = 1\), we have the recursive relation

$$\begin{aligned} \beta _t(k \mid \theta ) = \sum _{l = 1}^K p_{kl}(\theta )q_{l,y_{t+1}}(\theta ) \beta _{t+1}(l \mid \theta ), ~t = T-1, \ldots , 1. \end{aligned} \tag{A4}$$

Although computing \(f(\varvec{\eta }, \theta )\) does not require the backward probabilities, we still compute them when evaluating the likelihood because they, together with the forward probabilities, are essential components for computing the derivatives of the likelihood function. See Appendix B for details.
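The backward pass can be sketched in the same way. The example below is a hedged illustration with assumed toy parameter values and a hypothetical function name `forward_backward`. It also checks the standard identity \(\sum _{k} \alpha _t(k \mid \theta )\beta _t(k \mid \theta ) = f(\varvec{\eta }, \theta )\), which holds for every \(t\) and is a convenient sanity check on both recursions.

```python
import numpy as np

def forward_backward(pi, P, Q, y):
    """Return forward (alpha) and backward (beta) probabilities as T x K arrays."""
    T, K = len(y), len(pi)
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))                            # beta_T(k) = 1
    alpha[0] = pi * Q[:, y[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ P) * Q[:, y[t]]    # forward recursion
    for t in range(T - 2, -1, -1):
        beta[t] = P @ (Q[:, y[t + 1]] * beta[t + 1])  # backward recursion
    return alpha, beta

# Hypothetical toy model with K = 2 states and M = 3 actions.
pi = np.array([0.6, 0.4])
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
Q = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
y = [0, 2, 1, 2]

alpha, beta = forward_backward(pi, P, Q, y)
f = alpha[-1].sum()
f_by_t = (alpha * beta).sum(axis=1)   # one value per t; all should equal f
print(f, f_by_t)
```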

Given that \(f(\varvec{\eta }, \theta )\) is computable, we can approximate

$$\begin{aligned} L(\varvec{\eta }\mid \varvec{y}) = \int \phi (\theta ) f(\varvec{\eta }, \theta ) d\theta = \frac{1}{\sqrt{\pi }} \int e^{-x^2} f(\varvec{\eta }, \sqrt{2}x)dx \end{aligned}$$

using Gauss–Hermite quadrature by \(\frac{1}{\sqrt{\pi }} \sum _{u = 1}^U w_u f(\varvec{\eta }, \sqrt{2}x_u)\), where \(x_1, \ldots , x_U\) are the U quadrature points and \(w_1, \ldots , w_U\) are the associated weights. The quadrature points and the corresponding weights for a given U can be computed from the Hermite polynomials. We use the function gauss.quad in the R package statmod for this purpose.
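The change of variables \(\theta = \sqrt{2}x\) and the resulting quadrature rule can be checked on integrands with known expectations under the standard normal density. The sketch below uses NumPy's `numpy.polynomial.hermite.hermgauss` as a stand-in for gauss.quad; the helper name `normal_expectation` and the choice \(U = 21\) are assumptions of the example.

```python
import numpy as np

def normal_expectation(g, U=21):
    """Approximate E[g(theta)] for theta ~ N(0, 1) by Gauss-Hermite quadrature."""
    # Nodes and weights for the weight function exp(-x^2).
    x, w = np.polynomial.hermite.hermgauss(U)
    # Substituting theta = sqrt(2) * x turns phi(theta) d(theta)
    # into exp(-x^2) / sqrt(pi) dx.
    return np.sum(w * g(np.sqrt(2.0) * x)) / np.sqrt(np.pi)

second_moment = normal_expectation(lambda th: th ** 2)                    # exact value is 1
logistic_mean = normal_expectation(lambda th: 1.0 / (1.0 + np.exp(-th)))  # exact value is 0.5
print(second_moment, logistic_mean)
```

A rule with U points integrates polynomials of degree up to \(2U - 1\) exactly against \(e^{-x^2}\), so the accuracy for \(f(\varvec{\eta }, \theta )\) depends on how well it is approximated by a polynomial in \(\theta \).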

The algorithm for computing the likelihood function for LHMM is summarized in Algorithm 1.

Algorithm 1

(LHMM likelihood computation) The likelihood function \(L(\varvec{\eta }\mid \varvec{y})\) for a response process \(\varvec{y}\) following LHMM is computed in the following steps.

  1. Obtain Gauss–Hermite quadrature points \(x_1, \ldots , x_U\) and the associated weights \(w_1, \ldots , w_U\).

  2. For \(u = 1, \ldots , U\), compute \(f(\varvec{\eta }, \sqrt{2}x_u)\) as follows.

    (a) Compute \(\alpha _1(k \mid \sqrt{2}x_u) = \pi _k(\sqrt{2}x_u) q_{k, y_1}(\sqrt{2}x_u)\) and set \(\beta _T(k \mid \sqrt{2}x_u) = 1\) for \(k = 1, \ldots , K\).

    (b) For \(t = 2, \ldots , T\) and \(k = 1, \ldots , K\), compute

      $$\begin{aligned} \alpha _t(k \mid \sqrt{2}x_u) = \sum _{l = 1}^K \alpha _{t-1} (l \mid \sqrt{2}x_u) p_{lk}(\sqrt{2}x_u)q_{k, y_t}(\sqrt{2}x_u) \end{aligned}$$

      and

      $$\begin{aligned} \beta _{T-t+1}(k \mid \sqrt{2} x_u) = \sum _{l = 1}^K p_{kl}(\sqrt{2}x_u) q_{l,y_{T-t+2}}(\sqrt{2}x_u) \beta _{T-t+2}(l \mid \sqrt{2}x_u). \end{aligned}$$

    (c) Compute \(f(\varvec{\eta }, \sqrt{2}x_u) = \sum _{k = 1}^K \alpha _T(k\mid \sqrt{2}x_u)\).

  3. Compute \(L(\varvec{\eta }\mid \varvec{y}) = \frac{1}{\sqrt{\pi }}\sum _{u=1}^U w_u f(\varvec{\eta }, \sqrt{2}x_u)\).
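The steps of Algorithm 1 can be put together as follows. This is a sketch under stated assumptions, not the implementation used in this paper. In particular, the multinomial-logit form of \(\pi _k(\theta )\), \(p_{kl}(\theta )\) and \(q_{kj}(\theta )\) used in `lhmm_likelihood` is inferred from the gradient formulas in Appendix B and is not restated in this appendix, any identifiability constraints on the parameters are ignored, the parameter values are random, and the backward pass is omitted because only \(f\) is needed here.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax (works for vectors and matrices)."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def hmm_likelihood(pi, P, Q, y):
    """f(eta, theta): sum of the final forward probabilities."""
    alpha = pi * Q[:, y[0]]
    for a in y[1:]:
        alpha = (alpha @ P) * Q[:, a]
    return alpha.sum()

def lhmm_likelihood(eta, y, U=21):
    """Marginal likelihood of one action sequence y (actions coded 0..M-1)."""
    x, w = np.polynomial.hermite.hermgauss(U)            # step 1
    total = 0.0
    for x_u, w_u in zip(x, w):                           # step 2
        theta = np.sqrt(2.0) * x_u
        # Assumed parametrization: logits are linear in theta.
        pi = softmax(eta["mu"] + eta["tau"] * theta)
        P = softmax(eta["b"] + eta["a"] * theta)
        Q = softmax(eta["d"] + eta["c"] * theta)
        total += w_u * hmm_likelihood(pi, P, Q, y)
    return total / np.sqrt(np.pi)                        # step 3

# Hypothetical parameters for K = 2 states and M = 3 actions.
rng = np.random.default_rng(0)
K, M = 2, 3
eta = {"mu": rng.normal(size=K), "tau": rng.normal(size=K),
       "a": rng.normal(size=(K, K)), "b": rng.normal(size=(K, K)),
       "c": rng.normal(size=(K, M)), "d": rng.normal(size=(K, M))}

L = lhmm_likelihood(eta, [0, 2, 1])
# Sanity check: the likelihoods of all sequences of length 2 sum to one.
total_prob = sum(lhmm_likelihood(eta, [i, j]) for i in range(M) for j in range(M))
print(L, total_prob)
```

For long sequences the forward probabilities underflow, so a practical implementation would typically rescale \(\varvec{\alpha }_t\) at each step or work on the log scale.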

Appendix B Gradient of LHMM Log-Likelihood Function

For a given element \(\eta \) in \(\varvec{\eta }\),

$$\begin{aligned} \frac{\partial \log L(\varvec{\eta })}{\partial \eta } = \sum _{i=1}^n \frac{1}{L_i(\varvec{\eta }\mid \varvec{y}^{(i)})} \frac{\partial L_i(\varvec{\eta }\mid \varvec{y}^{(i)})}{\partial \eta }. \end{aligned}$$

The algorithm for calculating \(L_i(\varvec{\eta }\mid \varvec{y}^{(i)})\) is presented in Appendix A. We explain here how to compute \(\frac{\partial L_i(\varvec{\eta }\mid \varvec{y}^{(i)})}{\partial \eta }\). The superscripts and the subscripts denoting different respondents are suppressed hereafter to simplify notation. Let \(f(\varvec{\eta }, \theta ) = P(\varvec{Y} = \varvec{y} \mid \varvec{\eta },\theta )\). Then

$$\begin{aligned} \frac{\partial L(\varvec{\eta }\mid \varvec{y})}{\partial \eta } = \int \phi (\theta ) \frac{\partial f(\varvec{\eta }, \theta )}{\partial \eta }d\theta . \end{aligned} \tag{A5}$$

If \(\frac{\partial f(\varvec{\eta }, \theta )}{\partial \eta }\) is computable given \((\varvec{\eta }, \theta )\), then the integral on the right-hand side of (A5) can be approximated using Gauss–Hermite quadrature in the same way as for the likelihood function. In the remaining part, we focus on deriving \(\frac{\partial f(\varvec{\eta }, \theta )}{\partial \eta }\). In the following calculations, the initial state probabilities \(\pi _k\), the state transition probabilities \(p_{kl}\), and the state-action probabilities \(q_{kj}\) all depend on \(\theta \) as defined in (5)–(7). To simplify notation, we do not explicitly write them as functions of \(\theta \).

First, consider taking the derivatives of f with respect to \(\pi _k\), \(p_{kl}\), and \(q_{kj}\). Define \(\varvec{\alpha }_t = (\alpha _t(1), \ldots , \alpha _t(K))^\top \) and \(\varvec{\beta }_t = (\beta _t(1), \ldots , \beta _t(K))^\top \), where \(\alpha _t(k)\) and \(\beta _t(k)\) are the forward and backward probabilities defined in (A1) and (A3), respectively. Then, the relationships in (A2) and (A4) can be expressed compactly as

$$\begin{aligned} \varvec{\alpha }_t^\top = \varvec{\alpha }_{t-1}^\top \varvec{P} \tilde{\varvec{Q}}_t, ~\text {and}~ \varvec{\beta }_t = \varvec{P} \tilde{\varvec{Q}}_{t+1} \varvec{\beta }_{t+1}, \end{aligned}$$

where \(\varvec{P}\) is the state transition probability matrix and \(\tilde{\varvec{Q}}_t = {{\,\textrm{diag}\,}}\{q_{1, y_t}, \ldots , q_{K, y_t}\}\). Recursively applying the above relationship, we get

$$\begin{aligned} \varvec{\alpha }_t^\top = \varvec{\pi }^\top \tilde{\varvec{Q}}_1 \varvec{P} \tilde{\varvec{Q}}_2 \cdots \varvec{P} \tilde{\varvec{Q}}_t ~\text {and}~ \varvec{\beta }_t = \varvec{P}\tilde{\varvec{Q}}_{t+1} \cdots \varvec{P} \tilde{\varvec{Q}}_T \varvec{1}, \end{aligned}$$

where \(\varvec{1}\) is a column vector of K ones. Let x denote a generic element of \(\varvec{\pi }\), \(\varvec{P}\) or \(\varvec{Q}\). Then,

$$\begin{aligned} \frac{\partial f}{\partial x} = \frac{\partial \varvec{\pi }^\top \tilde{\varvec{Q}}_1}{\partial x} \varvec{P} \tilde{\varvec{Q}}_2 \cdots \varvec{P} \tilde{\varvec{Q}}_T \varvec{1} + \sum _{t=1}^{T-1} \varvec{\alpha }_t^\top \frac{\partial \varvec{P} \tilde{\varvec{Q}}_{t+1}}{\partial x} \varvec{\beta }_{t+1}. \end{aligned}$$

Replacing x with \(\pi _k\), \(p_{kl}\), and \(q_{kj}\) and simplifying the expression, we obtain

$$\begin{aligned} \frac{\partial f}{\partial \pi _k}&= q_{k,y_1} \beta _1(k), ~ k =1, \ldots , K\\ \frac{\partial f}{\partial p_{kl}}&= \sum _{t=1}^{T-1} \alpha _t(k) \beta _{t+1}(l)q_{l, y_{t+1}}, ~ k,l = 1, \ldots , K\\ \frac{\partial f}{\partial q_{kj}}&= \sum _{t: y_t = j} \alpha _t(k) \beta _t(k) /q_{kj}, ~k = 1, \ldots , K, ~ j = 1, \ldots , M. \end{aligned} \tag{A6}$$
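These three expressions can be verified numerically. The sketch below computes them from the forward and backward probabilities and compares selected entries with central finite differences; the function names and toy values are assumptions of the example. As in the derivation, each entry of \(\varvec{\pi }\), \(\varvec{P}\) and \(\varvec{Q}\) is treated as a free variable, so the sum-to-one constraints are ignored at this stage.

```python
import numpy as np

def forward_backward(pi, P, Q, y):
    """Return forward (alpha) and backward (beta) probabilities as T x K arrays."""
    T, K = len(y), len(pi)
    alpha, beta = np.zeros((T, K)), np.ones((T, K))
    alpha[0] = pi * Q[:, y[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ P) * Q[:, y[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = P @ (Q[:, y[t + 1]] * beta[t + 1])
    return alpha, beta

def f_value(pi, P, Q, y):
    return forward_backward(pi, P, Q, y)[0][-1].sum()

def raw_gradients(pi, P, Q, y):
    """Partial derivatives of f with respect to the entries of pi, P and Q."""
    alpha, beta = forward_backward(pi, P, Q, y)
    d_pi = Q[:, y[0]] * beta[0]
    d_P = np.zeros_like(P)
    for t in range(len(y) - 1):
        # Entry (k, l) accumulates alpha_t(k) * q_{l, y_{t+1}} * beta_{t+1}(l).
        d_P += np.outer(alpha[t], Q[:, y[t + 1]] * beta[t + 1])
    d_Q = np.zeros_like(Q)
    for t, a in enumerate(y):
        # Only the column of the observed action receives a contribution.
        d_Q[:, a] += alpha[t] * beta[t] / Q[:, a]
    return d_pi, d_P, d_Q

# Hypothetical toy model with K = 2 states and M = 3 actions.
pi = np.array([0.6, 0.4])
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
Q = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
y = [0, 2, 1, 2]
d_pi, d_P, d_Q = raw_gradients(pi, P, Q, y)

def central_difference(arr, index, eps=1e-6):
    """Numerical derivative of f with respect to one entry of pi, P or Q."""
    up, down = arr.copy(), arr.copy()
    up[index] += eps
    down[index] -= eps
    args_up = [up if a is arr else a for a in (pi, P, Q)]
    args_down = [down if a is arr else a for a in (pi, P, Q)]
    return (f_value(*args_up, y) - f_value(*args_down, y)) / (2 * eps)

fd_pi0 = central_difference(pi, 0)
fd_P01 = central_difference(P, (0, 1))
fd_Q12 = central_difference(Q, (1, 2))
print(d_pi[0], fd_pi0, d_P[0, 1], fd_P01, d_Q[1, 2], fd_Q12)
```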

According to the chain rule,

$$\begin{aligned} \frac{\partial f}{\partial \mu _k}&= \sum _{k'=1}^K \frac{\partial f}{\partial \pi _{k'}} \frac{\partial \pi _{k'}}{\partial \mu _{k}} = \pi _{k} \left( \frac{\partial f}{\partial \pi _{k}} - \sum _{k' = 1}^K \frac{\partial f}{\partial \pi _{k'}}\pi _{k'} \right) , ~\frac{\partial f}{\partial \tau _k} = \sum _{k'=1}^K \frac{\partial f}{\partial \pi _{k'}} \frac{\partial \pi _{k'}}{\partial \tau _{k}} = \theta \frac{\partial f}{\partial \mu _k},\\ \frac{\partial f}{\partial b_{kl}}&= \sum _{l'=1}^K\frac{\partial f}{\partial p_{kl'}} \frac{\partial p_{kl'}}{\partial b_{kl}} = p_{kl} \left( \frac{\partial f}{\partial p_{kl}} - \sum _{l' = 1}^K \frac{\partial f}{\partial p_{kl'}}p_{kl'} \right) , ~\frac{\partial f}{\partial a_{kl}} = \sum _{l'=1}^K\frac{\partial f}{\partial p_{kl'}} \frac{\partial p_{kl'}}{\partial a_{kl}} = \theta \frac{\partial f}{\partial b_{kl}},\\ \frac{\partial f}{\partial d_{kj}}&= \sum _{j'=1}^M\frac{\partial f}{\partial q_{kj'}} \frac{\partial q_{kj'}}{\partial d_{kj}} = q_{kj} \left( \frac{\partial f}{\partial q_{kj}} - \sum _{j' = 1}^M \frac{\partial f}{\partial q_{kj'}}q_{kj'} \right) , ~\frac{\partial f}{\partial c_{kj}} = \sum _{j'=1}^M\frac{\partial f}{\partial q_{kj'}} \frac{\partial q_{kj'}}{\partial c_{kj}} = \theta \frac{\partial f}{\partial d_{kj}}. \end{aligned} \tag{A7}$$

Combining (A6) and (A7) gives \(\frac{\partial f}{\partial \eta }\) for \(\eta = \tau _k, \mu _k, a_{kl}, b_{kl}, c_{kj}, d_{kj}\).
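The chain-rule step can be illustrated for the initial state parameters. The sketch below assumes the multinomial-logit form \(\pi _k(\theta ) \propto \exp (\mu _k + \tau _k\theta )\), which is consistent with the expressions above but is defined in the main text rather than in this appendix; the function names and numerical values are hypothetical. The analytic derivatives are compared with central finite differences.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def f_from_pi(pi, P, Q, y):
    """f(eta, theta) via the forward recursion."""
    alpha = pi * Q[:, y[0]]
    for a in y[1:]:
        alpha = (alpha @ P) * Q[:, a]
    return alpha.sum()

def beta_first(P, Q, y):
    """Backward probabilities at t = 1."""
    beta = np.ones(P.shape[0])
    for a in reversed(y[1:]):
        beta = P @ (Q[:, a] * beta)
    return beta

# Hypothetical values for K = 2 states and M = 3 actions.
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
Q = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
y = [0, 2, 1, 2]
theta = 0.7
mu = np.array([0.2, -0.1])
tau = np.array([0.5, -0.3])

pi = softmax(mu + tau * theta)
d_pi = Q[:, y[0]] * beta_first(P, Q, y)   # df/dpi_k = q_{k, y_1} beta_1(k)
d_mu = pi * (d_pi - d_pi @ pi)            # chain rule through the softmax
d_tau = theta * d_mu                      # the logit is mu_k + tau_k * theta

def numerical_gradient(fun, v, eps=1e-6):
    g = np.zeros_like(v)
    for i in range(len(v)):
        e = np.zeros_like(v)
        e[i] = eps
        g[i] = (fun(v + e) - fun(v - e)) / (2 * eps)
    return g

fd_mu = numerical_gradient(lambda m: f_from_pi(softmax(m + tau * theta), P, Q, y), mu)
fd_tau = numerical_gradient(lambda s: f_from_pi(softmax(mu + s * theta), P, Q, y), tau)
print(d_mu, fd_mu, d_tau, fd_tau)
```

The derivatives with respect to \(a_{kl}, b_{kl}\) and \(c_{kj}, d_{kj}\) follow the same pattern, applied to each row of \(\varvec{P}\) and \(\varvec{Q}\).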

Appendix C Viterbi Algorithm

Let \(\varvec{y}\) be a sequence following the LHMM with parameters \(\varvec{\eta }\) and latent trait \(\theta \). The most probable hidden state sequence \(\hat{\varvec{s}}\) can be found using the Viterbi algorithm. For \(k = 1, \ldots , K\) and \(t = 2, \ldots , T\), define

$$\begin{aligned} v_t(k) = \max _{\varvec{s}_{1:(t-1)}} P(\varvec{Y}_{1:t} = \varvec{y}_{1:t}, \varvec{S}_{1:(t-1)} = \varvec{s}_{1:(t-1)}, S_t = k \mid \theta , \varvec{\eta }). \end{aligned}$$

According to HMM assumptions (1)–(4), we have the recursive relation

$$\begin{aligned} v_t(k) = \max _{l=1, \ldots , K} v_{t-1}(l) p_{lk}(\theta ) q_{k, y_t}(\theta ), \end{aligned}$$

where \(v_1(k) = \pi _k(\theta ) q_{k,y_1}(\theta )\). Let

$$\begin{aligned} u_t(k) = \mathop {\textrm{argmax}}\limits _{l=1, \ldots , K} v_{t-1}(l) p_{lk}(\theta )q_{k, y_t}(\theta ). \end{aligned}$$

After computing \(v_t(k)\) and \(u_t(k)\) for \(k=1, \ldots , K\) and \(t=2, \ldots , T\) sequentially, the most probable hidden state sequence can be obtained by backtracing:

$$\begin{aligned} {\hat{s}}_T = \mathop {\textrm{argmax}}\limits _{k=1, \ldots , K} v_T(k), ~ {\hat{s}}_{t} = u_{t+1}({\hat{s}}_{t+1}), \text {~for~} t = T-1, \ldots , 1. \end{aligned}$$

The algorithm is summarized in Algorithm 2.

Algorithm 2

(Viterbi Algorithm) The most probable hidden state sequence \(\hat{\varvec{s}}\) for a response process \(\varvec{y}\) following the LHMM with latent trait \(\theta \) is obtained in the following steps.

  1. For \(k = 1, \ldots , K\), compute \(v_1(k) = \pi _k(\theta ) q_{k, y_1}(\theta )\).

  2. For \(t = 2, \ldots , T\),

    (a) Compute \(w_t(l, k) = v_{t-1}(l) p_{lk}(\theta ) q_{k, y_t}(\theta )\) for \(k,l = 1, \ldots , K\);

    (b) Record \(v_t(k) = \max _{l} w_t(l, k)\) and \(u_t(k) = \mathop {\textrm{argmax}}\limits _{l} w_t(l, k)\) for \(k = 1, \ldots , K\).

  3. Obtain \(\hat{\varvec{s}}\) by backtracing:

    (a) \({\hat{s}}_T = \mathop {\textrm{argmax}}\limits _k v_T(k)\);

    (b) For \(t = T-1, \ldots , 1\), set \({\hat{s}}_t = u_{t+1}({\hat{s}}_{t+1})\).
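A compact Python sketch of the Viterbi recursion for a fixed \(\theta \) is given below. The function name `viterbi`, the toy values, and the zero-based coding of states and actions are assumptions of the example. The backtracing step follows the stored pointers, so the state at step \(t\) is the recorded best predecessor of the state chosen at step \(t+1\).

```python
import numpy as np

def viterbi(pi, P, Q, y):
    """Most probable hidden state path and its joint probability with y."""
    T, K = len(y), len(pi)
    v = np.zeros((T, K))                # v[t, k]: best path probability ending in state k
    u = np.zeros((T, K), dtype=int)     # u[t, k]: best predecessor of state k at time t
    v[0] = pi * Q[:, y[0]]
    for t in range(1, T):
        # w[l, k] = v_{t-1}(l) * p_{lk} * q_{k, y_t}
        w = v[t - 1][:, None] * P * Q[:, y[t]][None, :]
        v[t] = w.max(axis=0)
        u[t] = w.argmax(axis=0)
    s = np.zeros(T, dtype=int)
    s[-1] = v[-1].argmax()
    for t in range(T - 2, -1, -1):      # backtracing along the stored pointers
        s[t] = u[t + 1][s[t + 1]]
    return s, v[-1].max()

# Hypothetical toy model with K = 2 states and M = 3 actions.
pi = np.array([0.6, 0.4])
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
Q = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
y = [0, 2, 1]

path, best = viterbi(pi, P, Q, y)
print(path, best)
```

For long sequences the products in \(v_t(k)\) underflow; replacing products by sums of logarithms leaves the maximizing path unchanged.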

Appendix D Estimated LHMM Parameters in Case Studies

Tables 3 and 4 present the LHMM parameter estimates for the CC item and the TICKET item, respectively.

Table 3 Estimated LHMM parameters for the CC item.
Table 4 Estimated LHMM parameters for the TICKET item.

Appendix E True Parameters in Simulation Studies

Table 5 presents the parameters of LHMM for generating the action sequences in the simulation study. The values are chosen so that the resulting state transition and state-action probability curves are similar to those obtained in the TICKET item.

Table 5 Parameters used for generating action sequences in the simulation study.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Tang, X. A Latent Hidden Markov Model for Process Data. Psychometrika (2023).
