Background

One of the most widely used methods of cost-effectiveness modelling in healthcare is Markov modelling [1]. Markov models are state transition models in which the life course of a cohort of subjects or a series of individuals is modelled by placing patients into discrete and mutually exclusive health states. Utilities and costs are assigned to each state and time period so that expected utilities and costs can be estimated [2].
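
To make the mechanics concrete, the following minimal sketch runs a toy three-state cohort model. The states, transition probabilities, utility weights and cycle length are all hypothetical and are not taken from the model described in this paper; the sketch only illustrates how a cohort distribution is propagated and how expected QALYs accumulate.

```python
# Toy three-state cohort model; every number below is hypothetical.
states = ["well", "sick", "dead"]

# Row i holds the per-cycle probabilities of moving from state i to each state.
P = [
    [0.90, 0.09, 0.01],
    [0.15, 0.83, 0.02],
    [0.00, 0.00, 1.00],
]
utility = [0.90, 0.60, 0.00]   # utility weight of each state (per year)
cycle_years = 1 / 52           # weekly cycles


def step(dist, P):
    """Advance the cohort distribution by one cycle (row vector times matrix)."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def run_cohort(start, P, utility, n_cycles, cycle_years):
    """Return the final state distribution and the accumulated QALYs per patient."""
    dist, qalys = list(start), 0.0
    for _ in range(n_cycles):
        dist = step(dist, P)
        qalys += cycle_years * sum(d * u for d, u in zip(dist, utility))
    return dist, qalys


final_dist, total_qalys = run_cohort([1.0, 0.0, 0.0], P, utility, 6, cycle_years)
print([round(d, 3) for d in final_dist], round(total_qalys, 4))
```

Because every row of the matrix sums to one, the cohort distribution also sums to one after each cycle; a zero in a row means that the corresponding transition can never occur in the model.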

In these models, the health state transition probabilities (from one time period to the next) are usually estimated from empirical studies (clinical trials, observational epidemiological studies, or a meta-analysis of these studies). Sometimes data on some transition probabilities might be lacking entirely, simply because those transitions were not observed or reported in the clinical studies. An easy modelling solution for unobserved transitions is to assume that they never happen and assign a zero value to these transition probabilities. Nevertheless, this zero-value assignment approach may not be reasonable from the clinical perspective since such health state transitions occur in clinical practice within large samples or long time periods.

A natural way to address this missing evidence issue comes from Bayesian statistics, which allows us to combine prior beliefs and evidence formally and quantitatively to estimate posterior probabilities [3]. Briggs et al. proposed a Bayesian estimation of the transition probability matrix for models with multibranch nodes [4]. In their paper, flat Dirichlet prior distributions with a high level of uncertainty were assumed. These flat prior probabilities of moving from each state to the other states were combined with the clinical trial data to obtain the posterior probabilities of the transition probability matrix with Markov chain Monte Carlo (MCMC) simulation using WinBUGS [5].

The method has several advantages: (1) it solves the problem of having zero values in the transition probability matrix derived from the clinical trial, (2) assuming a high level of uncertainty of the prior probabilities ensures that the clinical trial data influences most of the posterior probabilities, (3) estimation and probabilistic sensitivity analysis can be performed in one step if the Markov modelling process is also performed within the same framework, and (4) MCMC methods allow samples to be drawn from the joint posterior density, fully considering parameter uncertainty and the correlations between parameters.
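
Advantage (1) can be illustrated with a short sketch. For a single row of the matrix with multinomial count data and a Dirichlet prior, the posterior is again a Dirichlet distribution whose parameters are the prior parameters plus the observed counts, so the posterior mean is available in closed form without MCMC. The counts below are hypothetical, and the choice of Dirichlet(1, ..., 1) as the flat prior is an assumption of this sketch rather than a value reported here.

```python
def dirichlet_posterior(prior_alpha, counts):
    """Conjugate update: posterior parameters are prior parameters plus counts."""
    if len(prior_alpha) != len(counts):
        raise ValueError("prior and counts must have the same length")
    return [a + n for a, n in zip(prior_alpha, counts)]


def dirichlet_mean(alpha):
    """Mean of a Dirichlet distribution: each parameter divided by their sum."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical counts of weekly transitions out of one health state;
# two of the four destination states were never observed.
counts = [0, 18, 2, 0]
flat_prior = [1.0] * len(counts)   # assumption: flat Dirichlet(1, ..., 1)

raw = [n / sum(counts) for n in counts]   # relative frequencies from the data
post = dirichlet_mean(dirichlet_posterior(flat_prior, counts))

print(raw)    # unobserved transitions are exactly zero
print(post)   # unobserved transitions become small but positive
```

The sketch also hints at the limitation discussed next: for the unobserved destinations, the posterior mean is determined entirely by the prior parameters and the total number of observations.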

The major limitation of the method applied by Briggs et al. is that the transition probabilities that did not occur in the clinical trials are influenced only by the prior probabilities, and in this case, applying vague priors might result in inappropriate estimates. A solution for the problem may be the use of informative priors based on expert opinions. Unfortunately, eliciting a Dirichlet prior distribution is not straightforward [6]. A key challenge is satisfying all the constraints of mathematical coherence. For example, the probabilities of each category must sum to one. Another challenge is that clinical experts cannot easily and directly estimate the probabilities of multiple outcomes, as “human limitations of memory and information processing capacity often lead to subjective probabilities that are poorly calibrated or internally inconsistent, even when assessed by experts”, so the primary questions must be divided into simple questions that are easy to understand and answer [7].

Recently, a freeware application was published by Elfadaly and Garthwaite to aid in eliciting Dirichlet and Gaussian copula prior distributions [8]. The proposed method elicits hyperparameters of the Dirichlet distribution from those of its marginal beta distributions through forms of reconciliation that use least-squares techniques.

Using both of the abovementioned methods, we estimated the transition probability matrix of patients with predominantly negative symptoms of schizophrenia in a cost-effectiveness analysis comparing the effect of cariprazine to that of risperidone. We illustrated case-specific differences between the results obtained with priors elicited from clinical experts and those obtained with more common approaches (i.e., applying non-informative priors or assuming unobserved transitions have zero probability).

Methods

The context of the case study is as follows

The cost-effectiveness model has been described elsewhere (Fig. 1) [9]. Briefly, a Markov cohort model was built in Microsoft Excel with eight health states for schizophrenia (hereinafter referred to as the Mohr-Lenert health states) defined by Mohr et al. in 2004 and a death state [10]. The definition of the health states can also be found in a concise format in Table 2 of the paper presenting the cost-effectiveness model [9]. Because no participants died during the study period, the pivotal clinical trial on which the model was based provided no data on mortality; therefore, the age- and sex-specific mortality rates of the general population were used in the model, and no difference in mortality between the two treatment groups was assumed [11]. Considering the pharmacokinetic properties of cariprazine [12], the modelled time period was split into two periods: an initial 6-week time period with weekly cycles and 12-week long cycles occurring thereafter. Because the full clinical effects of cariprazine are expected to occur after the first 6 weeks, different transition probabilities were necessary for the first 6 weeks and for the subsequent model time period. Because the aim of this paper is to present a method of estimating the transition probability matrix for Markov models based on both expert opinion and clinical trial data, we used only the first 6-week period as an illustration of the method. The same method was applied for the subsequent period but with different observed data and prior probabilities.

Fig. 1 Cost-effectiveness model structure [9]. Source: Németh et al. 2017

Data to estimate the weekly transition probabilities for the first 6 weeks of the modelled time period for both the cariprazine and the risperidone arms were available from the first 4-week period of the Németh et al. clinical trial [11]. In the original publication, the model was used to estimate the utility of the cariprazine treatment compared to the risperidone treatment.

Prior elicitation

The prior probabilities in the transition matrix were elicited from three leading clinical experts from Hungary (IB, BM, JR) who are actively involved in treating patients with schizophrenia, have deep insight into the typical courses of the disease, and took part in the project. An additional selection criterion was considerable research experience. The prior probabilities were elicited using Prior Elicitation Graphical Software (PEGS) [8]. The feasibility of using the application and the elicitation process were pilot tested without the involvement of the experts. The definitions of the Mohr-Lenert health states were explained to the experts. Apart from this verbal information and the technical description of the exercise, no additional information was given to the experts regarding the transition probabilities between the studied health states. Then, the experts were asked individually and face-to-face to give their opinion about the probabilities of patients with predominantly negative symptoms of schizophrenia moving from a given Mohr-Lenert health state to another state conditional on the comparator treatment (i.e., risperidone). The experts were aware of the main results of the clinical trial regarding the efficacy of cariprazine, but they were unaware of the trial results about the transition probabilities between different Mohr-Lenert health states. At each step, we verified whether the experts understood what they had to assess (i.e., the description of the patients in a certain state, which was used as the current state, and that they had to estimate the proportion of patients moving from this state to another state). In the assessment process, all experts were asked to assess the marginal medians and quartiles of the probability of each transition. Because directly estimating these statistics is difficult for clinical experts, we elicited this information by asking the following questions.
For the median, the question was phrased as follows: “Considering patients in state A, what is the most likely proportion of patients moving to state B within one week during the first six weeks of treatment? Consider it equally likely that the true proportion is above this value or below this value. For example, suppose you assess this value as 0.4, you should think it is equally likely that the true proportion will be above 0.4 as it will be below 0.4.” For the lower quartiles, the question was phrased as follows: “If you were told that the true proportion was smaller, what is the value that you think is still reasonable? Estimate a proportion for which it holds that the probability that the true proportion is below equals the probability that the true proportion is between it and 0.4 (the median)”. Similar questions were asked to assess the upper quartiles.

In the next step, the experts were told that “the probabilities of the different patient movements from a given health state must add to one, and the assessments must also meet certain other requirements to be internally consistent. The application gives you three options for reconciling your assessments to meet these requirements. Select the one which best represents your opinion.” Once a clinical expert made a choice about these options, he or she was confronted with the result. Then, he or she still had an opportunity to change any quartiles. After the modifications, the application again calculated the coherent marginal quartiles for the Dirichlet distribution and presented them together with the expert’s revised values. Finally, when the expert thought that the proposition was in line with his or her view, the application presented the estimated transition probabilities with their variance and the hyperparameters of the Dirichlet distribution.

As the same prior was later used for modelling patients’ paths in the cariprazine arm, the difference in treatment efficacy originated only from the trial data. The transition probabilities estimated by the three experts were averaged and scaled to one person in each source state, so that the prior for each source state carried the weight of a single observation. This ensured high uncertainty of the prior probabilities and thereby allowed a large influence of the clinical trial data.
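
The pooling step can be sketched as follows. The elicited probabilities are hypothetical, and reading "scaled to one person" as Dirichlet parameters that sum to one for each source state is this sketch's interpretation; the function name and its `prior_weight` argument are introduced here for illustration only.

```python
def pool_experts(expert_rows, prior_weight=1.0):
    """Average the experts' probabilities for one source state and rescale
    them into Dirichlet parameters that sum to prior_weight."""
    n_experts = len(expert_rows)
    n_states = len(expert_rows[0])
    mean = [sum(row[j] for row in expert_rows) / n_experts for j in range(n_states)]
    total = sum(mean)   # guards against rounding in the elicited values
    return [prior_weight * m / total for m in mean]


# Hypothetical elicited probabilities of leaving one state
# (three experts, four destination states).
experts = [
    [0.10, 0.70, 0.15, 0.05],
    [0.12, 0.68, 0.15, 0.05],
    [0.05, 0.85, 0.08, 0.02],
]
alpha = pool_experts(experts)
print([round(a, 4) for a in alpha], round(sum(alpha), 6))
```

Increasing `prior_weight` would give the experts' opinion the weight of more than one hypothetical patient and would therefore reduce the influence of the trial data on the posterior.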

Estimation of the transition probability matrix

Finally, the transition probability matrix was estimated in WinBUGS based on the priors and the clinical evidence from the trial, with 1000 burn-in samples and 50,000 estimation samples; see the code in Additional file 1. Two chains were run, and convergence was assessed by visual inspection of the trace plots and by tracking the Brooks-Gelman-Rubin diagnostics.
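
The idea behind the convergence check can be sketched outside WinBUGS. The example below is not the code of Additional file 1: it draws independent samples from a hypothetical Dirichlet posterior by normalising gamma variates (so no MCMC is involved), and it computes the classic Gelman-Rubin potential scale reduction factor, which is not necessarily the exact variant that WinBUGS reports. The posterior parameters and the reduced number of draws are assumptions chosen to keep the sketch fast.

```python
import random
import statistics


def dirichlet_draw(alpha, rng):
    """One Dirichlet draw built from independent gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [x / total for x in g]


def gelman_rubin(chains):
    """Classic potential scale reduction factor for equally long chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Hypothetical posterior parameters (prior parameters plus observed counts).
posterior_alpha = [0.09 + 0, 0.74 + 18, 0.13 + 2, 0.04 + 0]

chains = []
for seed in (1, 2):                      # two chains, as in the text
    rng = random.Random(seed)
    draws = [dirichlet_draw(posterior_alpha, rng)[1] for _ in range(6000)]
    chains.append(draws[1000:])          # discard the first 1000 draws as burn-in

r_hat = gelman_rubin(chains)
print(round(r_hat, 3))
```

Values of the factor close to one indicate that the between-chain and within-chain variability agree, which is what is expected here because the draws are independent.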

We generated the treatment-specific transition probability matrices for the Markov model in three different ways: (1) based only on the observed clinical trial data (transition probabilities not observed within the trial were set to 0); (2) based on Bayesian estimation where prior transition probabilities came from experts’ opinions (as described previously); and (3) based on Bayesian estimation with vague prior transition probabilities (flat Dirichlet prior distributions). Furthermore, we compared the transition probability matrices and the incremental quality-adjusted life years (QALYs) across the three approaches.

Results

The experts found determining the priors mentally exhausting, as the phrasing of the questions was the same for each health state. It took approximately one hour for each expert to complete the two transition probability matrices (for the first 6 weeks and thereafter). The experts found the feedback loops in the application to be very helpful, as reviewing the results of their estimations helped them correct their initial judgements when they felt it was necessary. The estimates of experts 1 and 2 showed considerable consistency (Fig. 2), whereas expert 3 consistently estimated higher probabilities of patients staying in their current state than did the other two experts. Table 1 shows the mean values of the estimates of the three experts.

Fig. 2 Weekly transition probability estimates determined by the experts

Table 1 Weekly transition probability estimates (%) by the experts (mean values)

In Table 2, the observed relative frequencies of the weekly transitions in the first four weeks of the pivotal clinical trial are shown for both treatment arms [11]. There were quite a few transitions that were not observed, and there were no patients in states 7 and 8 (the two most severe Mohr-Lenert health states) in the first four weeks of the trial. In contrast to this observation, the experts generally believed that any of the possible transitions could happen, but some of these transitions were rather unlikely according to them.

Table 2 Observed weekly transition relative frequencies (%) between the eight health states from the first four weeks of the RGH-188-005 clinical trial

The average estimates of the experts showed reasonable consistency with the trial data, with some attenuation of the extreme values observed in the trial (Tables 1 and 2). For example, both patients in the risperidone arm who were in state 3 at the beginning of the trial stayed in state 3, resulting in a 100% transition probability, whereas the mean of the expert estimates was 42.9%. Note that if one of the two patients had moved to a different state, the trial transition probability estimate would have dropped to 50%. Similarly, the estimates for the transitions from state 1 to state 1 were 100% in both treatment arms based on the trial data, while the experts’ mean estimate was 82.39%. The experts did not exclude the possibility of any particular transition; thus, there were no zero estimates in their estimated transition probability matrix.

The estimates of the two chains converged on the trace plot, and the Brooks-Gelman-Rubin diagnostics did not show evidence for non-convergence either.

As expected, the posterior values of the transition probability matrix were very similar to the trial data relative frequencies in cells where there were a reasonably large number of events (states 2, 4, and 6) (Table 3). For example, in the clinical trial, the proportion of patients who stayed in state 2 was 80.99% in the risperidone arm, whereas the estimated posterior probabilities (expressed in %) were 80.72 and 80.43%, depending on the choice of the prior probability. In the case of the mildest disease state (state 1), the pattern of the posterior estimates clearly followed the pattern of the clinical trial data estimates when the informative prior was used. The posterior probability of staying in the mildest state, however, was much lower when the uninformative (flat) prior was used. This result was expected, as the experts’ opinions were in line with the observations in the clinical trial, and only 2 and 3 cases occurred in the risperidone arm and in the cariprazine arm, respectively, in the initial 4 weeks. This behaviour also occurred for the patient movements from state 3 in the risperidone arm. As the prior probability and the relative frequency from the trial had the same level of uncertainty (both were based on one case) in the cariprazine arm for state 3, the posterior probability distribution reflects equal influence of the prior probabilities and clinical trial data. As the same prior was used for both treatment arms, and there were no observations in states 7 and 8 in the initial 4 weeks of the trial, the posterior distributions of the transition probabilities from these health states were essentially the same in the two arms.
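
The balance between prior and data described above can be written as a weighted average. With a Dirichlet prior, the posterior mean of a cell equals the observed relative frequency weighted by the number of observations plus the prior mean weighted by the sum of the prior parameters. The sketch below uses the 42.9% expert mean for staying in state 3 from the text; the prior weight of one, the two observed transitions and the 100% frequency in the one-observation case are assumptions of the sketch, so the printed values are illustrative and are not the entries of Table 3.

```python
def posterior_mean(prior_mean, prior_weight, observed_freq, n_obs):
    """Dirichlet posterior mean of one cell as a weighted average of data and prior."""
    data_share = n_obs / (n_obs + prior_weight)
    return data_share * observed_freq + (1 - data_share) * prior_mean


# Expert mean for staying in state 3 (from the text): 42.9%.
# Assumptions: prior weight of one observation; two observed transitions,
# both staying in state 3 (observed frequency of 100%).
two_obs = posterior_mean(0.429, 1.0, 1.0, 2)

# With a single observation, data and prior each contribute one half.
# The observed frequency of 100% is hypothetical here.
one_obs = posterior_mean(0.429, 1.0, 1.0, 1)

print(round(two_obs, 4), round(one_obs, 4))
```

As the number of observations grows, the data share approaches one, which is why the posterior estimates for states 2, 4 and 6 stay close to the observed relative frequencies regardless of the prior.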

Table 3 Posterior estimates of the weekly transition probabilities (%) estimated with the use of uninformative (normal text) and informative (bold text) prior probabilities and clinical trial data

When the informative priors were used, these distributions reflected the opinions of the experts and were not at all uniform, avoiding the bias that would have resulted from the use of vague priors that assumed that the probability of each transition was the same from these health states (i.e., 1/8).

Table 4 shows the results regarding the health benefits with the use of the three different transition probability matrices. The estimated difference in QALYs was small but not negligible over a 54-week-long time period. Compared to the model that incorporated expert opinions, the model including only the clinical trial data estimated a 2.8% larger QALY difference between the treatment arms, while the model using vague priors estimated a 4% smaller one. Nevertheless, these differences were very small compared to the precision of the QALY difference estimate, as its standard deviation was 0.014 after 1000 model runs in the probabilistic sensitivity analysis of the cost-effectiveness analysis.

Table 4 Health benefit results of cost-effectiveness modelling with different transition probability matrices

Discussion

Substituting zeros with plausible and valid values in a transition probability matrix of a state transition model is challenging. In a Bayesian analysis, we combined clinical trial data with informative Dirichlet prior probabilities of transitions between health states defined by Mohr et al. in patients with predominantly negative symptoms of schizophrenia [10]. The elicitation of Dirichlet prior probabilities proved to be feasible and reliable with the application developed by Elfadaly and Garthwaite, as the average values of the experts’ estimates showed considerable consistency with the observed relative frequencies from the clinical trial for transitions that were observed in the trial. The opinions of the different experts could be pooled by linear combinations or by the supra Bayesian procedure [13]. The simple averaging method, which we used, is a widely used and recommended method, especially with small sample sizes [14].

A strength of our approach was that the evidence about treatment efficacy was derived entirely from the clinical trial, as the same prior distribution was used for the two arms. The use of informative priors for treatment efficacy might be justifiable in some cases, but we think that extracting efficacy estimates from clinical trials or effectiveness estimates from well-designed observational studies provides more valid results [15].

Different methods of estimating the transition probability matrix may lead to considerably different results. In our case, the size of the difference between the three results in the incremental QALY was not large. The size of this difference, however, depends on the specifics of the actual model. When the forecasting of the model includes significantly long periods of time in health states or transitions that were not observed or when there are large differences in the quality of life or costs of health states, the size of the difference in the results is likely to be larger. In our case, for example, when the time period of the model was extended to 258 weeks, the 4% difference in the incremental health gain between the approaches using vague and informative priors increased to approximately 6% (QALY difference of 0.0825 and 0.0875, respectively). The differences in the QALY estimates by estimation method were in line with our expectations, as the clinical experts estimated that transitions to the more severe states were less frequent than assumed with the use of the non-informative priors. Thus, we expected larger QALY estimates when informative priors were used.
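
The approximately 6% figure for the 258-week horizon can be checked from the two reported QALY differences. Which of the two values serves as the denominator is not stated, so expressing the gap relative to the informative-prior result is an assumption of this sketch.

```python
qaly_vague = 0.0825        # incremental QALYs with vague priors (from the text)
qaly_informative = 0.0875  # incremental QALYs with informative priors (from the text)

absolute_gap = qaly_informative - qaly_vague
# Assumption: the percentage is expressed relative to the informative-prior result.
relative_gap = absolute_gap / qaly_informative

print(round(absolute_gap, 4), f"{relative_gap:.1%}")
```

Using the vague-prior result as the denominator instead gives a slightly larger percentage, which also rounds to 6%.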

Although several guidelines have been published about how to perform health economic modelling, the estimation of the transition probability matrix is a neglected area of study, as there are no available guidelines regarding this topic [16]. The Bayesian approach provides more plausible and valid results than simply assuming zero probabilities for the unobserved transitions, when these unobserved transitions still occur in clinical practice. Our case study showed that using vague priors different from the experts’ prior beliefs resulted in different model results. When the selection of the priors largely influences the model results (e.g., when the clinical trial data have large uncertainty), sensitivity analyses with plausible ranges of the priors can help determine the robustness of the results.

Conclusions

In summary, the method proposed by Briggs et al. provides a conceptual framework and practical solution to estimate transition probability matrices when some of the possible transitions did not occur in the empirical studies [4]. Using informative priors rather than vague priors when the Bayesian approach is applied could be an option when the transition probabilities without empirical estimates have a significant impact on the model results. The software application developed by Elfadaly and Garthwaite is a good practical aid to elicit Dirichlet and Gaussian copula priors by expert interviews [8].