# Population-level intervention and information collection in dynamic healthcare policy

## Abstract

We develop a general framework for optimal health policy design in a dynamic setting. We consider a hypothetical medical intervention for a cohort of patients where one parameter varies across cohorts with imperfectly observable linear dynamics. We seek to identify the optimal time to change the current health intervention policy and the optimal time to collect decision-relevant information. We formulate this problem as a discrete-time, infinite-horizon Markov decision process and we establish structural properties in terms of first- and second-order monotonicity. We demonstrate that it is generally optimal to delay information acquisition until an effect on decisions is sufficiently likely. We apply this framework to the evaluation of hepatitis C virus (HCV) screening in the general population, determining which birth cohorts to screen for HCV and when to collect information about HCV prevalence.

## Keywords

Medical decision making · Markov decision processes · Hepatitis C virus · Optimal stopping · Dynamic programming

## 1 Introduction

There is currently no guidance for determining the optimal schedule for collecting additional information regarding a decision to invest in a health program or technology [1, 2]. Current practice in the health decision science literature assumes that model parameters are fixed across cohorts and the value of additional information is calculated assuming the information-collection effort is initiated immediately [3, 4, 5]. However, in many cases the cost-effectiveness of a health program or technology – and, therefore, the value of additional information about one or more model parameters – may be changing over time because of trends affecting the cohort or the intervention [6]. In these cases, collecting additional information immediately may not be optimal and value-of-information calculations based on static parameter assumptions are likely to be biased. Planning over longer horizons is particularly important in health policy because, once established, clinical practice is difficult to change due to high switching costs (re-training and potentially new capital equipment expenditures), particularly if it appears that the level of service is being reduced [7].

In this paper we apply a stochastic dynamic programming approach to identify both the optimal time to change the current health intervention policy and the optimal time to collect decision-relevant information. We consider a hypothetical medical intervention for a cohort of patients. At each time, a new cohort of patients becomes eligible for the intervention and one parameter varies across the cohorts with imperfectly observable linear dynamics. We assume that the value of the intervention is linear in the dynamic parameter. In general, the (incremental) net monetary benefit of an intervention is linear in parameters with a one-time effect (e.g., the prevalence of a disease at one point in time or the outcome of a one-time screening test). When an effect accrues over time, such as for a reduction in the annual transition rate of a disease complication or death, linearity is often used as an approximation (see, e.g., [8]). At each time, the policy-maker can choose to invest in the medical intervention and/or to purchase sample information about the uncertain dynamic parameter. We demonstrate that information acquisition is best delayed until the signal is sufficiently likely to affect the optimal policy decision.

We apply this framework to the evaluation of hepatitis C virus (HCV) screening. Prior to the development of highly-effective treatments, HCV screening in the general population was not considered cost-effective [9] and universal screening was not recommended [10]. The advent of more effective therapy has changed the value of identifying infected individuals early to initiate treatment [11, 12, 13, 14, 15]. Recently released guidance by the Centers for Disease Control and Prevention (CDC) and the US Preventive Services Task Force (USPSTF) recommends one-time HCV screening for all individuals born between 1945 and 1965 [16, 17] although screening individuals born after 1965 may also be cost effective [13, 14, 15]. Based on our primary analysis of the National Health and Nutrition Examination Survey (NHANES), in the US general population, HCV prevalence is highest in people born around 1956 and declines thereafter at a rate of approximately 11% per birth year. Since HCV prevalence is decreasing across birth cohorts, HCV screening will only be cost-effective for a limited time or for a limited set of birth cohorts. We apply our model to simultaneously evaluate the optimal HCV-screening and information-acquisition policy.

Specifically, we apply our model to the policy decision of whether or not to perform one-time HCV screening in successive cohorts of healthy 50-year olds, who have not previously been tested for HCV, at a routine preventive health visit. Applying a traditional health economics framework, the policy-maker could decide today how many cohorts will be screened (e.g., each cohort of 50-year olds until those born in 1965 turn 50) or, to inform this decision, the policy-maker may seek additional information to be collected immediately. Our framework differs from the traditional paradigm in that each year the policy-maker makes a decision about whether to continue the one-time HCV screening program (whether or not to screen the new cohort of healthy 50-year olds) and whether to collect information about disease prevalence in this current cohort. If information is never collected, the optimal policy does not differ across frameworks. However, in our framework, the immediate decision is not limited to the decision of when to change policies, but it also includes when to collect information to inform a future change of policy. For example, the (immediately) optimal policy might be to screen each cohort of 50-year olds for the next 6 years and then collect information about HCV prevalence to inform future decision making. Delaying information acquisition until a time that the information is sufficiently likely to affect the decision increases the value of the information. In addition, from a practical perspective, collecting information years before it is likely to influence a policy change wastes immediate resources and, should something occur in the lag-time between the information-acquisition effort and the policy change, implementing the pre-determined policy change may not be optimal.

### 1.1 Related literature and contribution

The relevant literature spans technology adoption, dynamic decisions in healthcare, and the value of information in healthcare.

### **Technology adoption**

In technology-adoption models, a decision-maker considers the adoption of a technology of unknown profitability. Jensen [18] introduced a model in which information about a new technology is costlessly observed and the decision-maker can decide to adopt the new technology at any point in time. McCardle [19] presented a model in which collecting information is associated with a fixed cost; in each period the decision-maker can defer and collect information, or make a final decision to accept or reject the new technology. The optimal policy in each period is characterized by two thresholds: if the expected benefit is above the upper threshold, it is optimal to adopt the technology; if the expected benefit is below the lower threshold, it is optimal to reject the technology; and, if the expected benefit is between these two thresholds the optimal strategy is to gather information. Uncertainty about the technology’s value decreases over time and the two thresholds converge to the cost of adoption. Smith and McCardle [20] provided several meta-results, some of which we use, describing how properties of the value function of a stochastic dynamic program are preserved and propagated through finite-horizon Markov-reward and decision processes. Ulu and Smith [21] extended this work by relaxing the assumption that the decision-maker’s value of the technology can be summarized by the expected benefit, and they use more general monotone-comparative-statics techniques in terms of likelihood orders to generalize the class of signals that are observed prior to making an adoption decision.

Another line of research considered technologies, like ours, with uncertain and changing value. Rosenberg [22] found that expectation of technological improvement may delay a firm’s irreversible technology investments. Bessen [23] calculated the option value of delay for such a problem. Kornish [24] considered the choice between two uncertain technologies where each is subject to a positive network effect and explored the impact of the network effect on the optimal adoption policy. Chambers and Kouvelis [25] formulated a technology-adoption problem incorporating expected learning-curve effects.

### **Stochastic dynamic programs in healthcare**

Sequential decisions under uncertainty are common in healthcare [26, 27]. Most healthcare applications of stochastic dynamic programs have focused on optimizing the timing of interventions for an individual patient: the decision to accept or reject an offered kidney for transplantation [28]; the optimal treatment plan for mild spherocytosis [29]; the optimal surveillance and management of ischemic heart disease [30]; the optimal time to perform a living-donor liver transplant [31, 32]; the optimal time to initiate treatment for HIV [33, 34]; the optimal timing and frequency of HCV testing from the patient perspective [35]; the optimal use of statins in patients with type 2 diabetes [36, 37]; the optimal prostate biopsy referral [38]; and optimal cancer screening programs [39, 40]. Dynamic programming has also been applied to complex appointment scheduling problems in healthcare, including problems with patients of different clinical types/priority [41, 42]; problems incorporating patient no-shows [43]; and problems of sequential appointment scheduling with the objective of closely adhering to a prescribed schedule (e.g., sequential chemotherapy appointments [44]) or with the objective of satisfying patient preferences [45, 46]. Fewer examples of application to population-level policy exist. Kornish and Keeney [47] and Özaltın et al. [48] formulated the influenza-strain selection problem in a finite-horizon optimal-stopping framework. Similar to our problem, the influenza-vaccine composition decision is also an optimal-stopping problem with information acquisition; however, it has many unique characteristics that distinguish it from the problem discussed here, such as an inventory deadline (finite horizon), a product useful for one season only, and a time-consuming production process. Similar to many of the technology-adoption models discussed above but unlike our framework, in the influenza-vaccine composition models, information is collected in every period in which a final decision has not yet been made.

### **Health economics and value of information in healthcare**

Cost-effectiveness analysis is an economic method for comparing the lifetime discounted costs and health benefits associated with two or more medical interventions or health programs [1, 2]. In theory, the optimal allocation of resources across a portfolio of health interventions is determined by solving a constrained optimization problem with the objective of maximizing health benefits subject to a budget constraint [49, 50, 51]. In reality, regional and national health policy bodies routinely compare the incremental cost effectiveness ratio of candidate interventions to a pre-determined threshold intended to approximate the shadow price of the budget to determine if the intervention is ‘cost-effective’ as one component of their policy-making process [52]. Cost-effectiveness analysis is widely used to evaluate general population screening for relatively rare conditions because these programs impose a small cost on everyone who is screened and provide substantive healthcare gains for only a small number of individuals who are identified (or identified earlier than they would be otherwise); calculating the population-level costs and benefits can require detailed natural history models, extensive model calibration and validation, and thorough analysis.

Bayesian decision theory approaches to value-of-information assessment were first introduced by Raiffa and Schlaifer [53]. Weinstein [54] proposed the widespread adoption of value-of-information analysis to research priority setting in health policy and medicine. Hornberger et al. [55], Claxton and Posnett [56], and Claxton [57] introduced a Bayesian approach to identifying the optimal trial sample size and to assessing the value of additional information for technology-adoption assessments. Several approaches to increasing the accuracy of value-of-information calculations continued to relax assumptions implicit in the original formulation (see examples in [58, 59, 60, 61, 62]). One common assumption in these studies is that the currently estimated per-person value of information can be applied to individuals in all future cohorts. Recognizing some of the implications of this assumption, Philips et al. [6] discussed the impact that intervention-horizon uncertainty, price changes, and technological development can have on the per-person value of information for future cohorts. They find that delaying information collection may be desirable but do not provide a framework for determining the optimal time to collect information.

### **Contribution**

In this paper, we extend the technology-adoption literature by allowing for a technology that is changing in value over time, for the opportunity to ‘wait’ without collecting information, and for the possibility of optimally determining the collected amount of information in each period. We also incorporate the possibility of an imperfect information-collection technology. We broaden the scope of applications of stochastic dynamic programs in the area of healthcare in an important way – focusing on population policy rather than patient-level decisions. We extend the health decision science literature on value-of-information assessment by developing an approach to identify the optimal information-acquisition policy when model parameters are varying across cohorts. Finally, as an example, we apply our framework to the timely public policy problem of developing a population screening program for HCV. We find that considering the opportunity to collect information in the future leads to a substantially different policy recommendation than current guidelines because it explicitly considers and addresses the parameter uncertainty which is changing over time.

## 2 The model

A policy-maker faces recurring decisions for cohorts arriving at times *t* ∈ {0,1,2,…} about whether to invest in a health intervention delivered once per cohort (of size *N*). By cohort we mean a group of individuals with a certain medical presentation (e.g., individuals with a new diagnosis of cancer) or of a certain status (e.g., individuals who turned 50 this year). The policy-maker’s objective is to maximize net monetary benefit from a societal perspective. The per-person incremental net monetary benefit (INMB) of performing the intervention compared to the *status quo* is assumed to be affine in an uncertain parameter \(\tilde {p}_{t}\) that varies across the cohorts, with realizations in [0,1] and known dynamics. So \(\text {INMB}_{t} = \theta \tilde {p}_{t} - \gamma \), for all *t* ≥ 0, where \(\theta\) is the marginal INMB (with respect to the parameter \(\tilde {p}_{t}\)) and \(-\gamma\) is the fixed INMB, both measured on a per-person basis.

At each time *t*, the policy-maker simultaneously decides whether to invest in a medical intervention for the individuals in cohort *t* and whether to conduct a study of sample size \(n_{t}\) over the period to obtain a better estimate of the uncertain parameter \(\tilde {p}_{t}\). Information, if sought, arrives at the end of the current period and is used, together with the known dynamics of \(\tilde {p}_{t}\), to inform the intervention decision for future cohorts. Let \(d_{t}\in \mathcal {D} = \{0, 1\}\) denote the intervention decision at time *t*, where \(d_{t} = 0\) indicates *‘No intervention’* and \(d_{t} = 1\) indicates *‘Intervention.’* The amount of information collected is measured in terms of the sample size \(n_{t} \in {\mathcal N}=\{0,\ldots ,N\}\); it is obtained at the cost \(K(n_{t})\), where \(K(\cdot)\) is an increasing function including a fixed and a variable cost when \(n_{t} > 0\) and \(K(0) = 0\). Thus, at each time *t* the policy-maker implements the control \(u_{t}=(d_{t}, n_{t})\in \mathcal {D} \times \mathcal {N}\). The per-person current reward for the cohort in period *t* is

\[ g(d_{t}, \tilde {p}_{t}) = d_{t}\,(\theta \tilde {p}_{t} - \gamma ). \]

In our leading example, \(\tilde {p}_{t}\) is the prevalence of HCV in the *t*-th cohort which, in expectation, is geometrically decreasing over time; \(\theta > 0\) denotes the marginal benefit of early diagnosis and treatment for an affected individual, \(\gamma > 0\) is the per-person cost of the program, and the current-period INMB \(g\) is increasing in \(\tilde {p}_{t}\).

Beyond our leading example, the framework can accommodate a wide variety of problems. As formulated, the uncertain parameter needs to lie in a compact interval (which can be mapped via bijection to [0,1]). Thus, the parameter can represent not only a probability but also other model parameters, such as a quality-of-life weight or cost. Additionally, our analysis assumes that the parameter value is decreasing over time. To model a situation where the expectation of the uncertain parameter is increasing (e.g., obesity prevalence), the problem can be formulated as one in which a parameter of opposite definition is decreasing (e.g., prevalence of individuals who are not obese). Our exposition involves an example of when to stop a health intervention. However, the framework can also be used in situations in which the decision-maker wishes to identify the optimal time to initiate a new intervention (e.g., when to adopt a new surgical technique). More broadly, our framework can be applied in settings in which the decision-maker wishes to identify the optimal time to stop the current intervention or initiate a new intervention; the uncertain parameter is geometrically increasing or decreasing across intervention cohorts; and the current-period reward function is linearly increasing or decreasing in the uncertain parameter. Examples are shown in Table 1.

**Table 1** Examples of alternative cases in which our framework applies

Case | Uncertain time-varying parameter \(\tilde {p}_t = 1- \tilde {q}_t\) | INMB | Setting | Example |
---|---|---|---|---|
A | \(\tilde {p}_t\), decreasing in *t* | INMB increasing in \(\tilde {p}_t\) | \(\mu _p(x_0)>\frac {\gamma }{\theta }\). “Intervention” is currently implemented. Optimal stopping problem. | “Intervention”: General population HCV screening at age 50 (Section 4). Period reward function: \(\text {INMB}_t =\theta \tilde {p}_t - \gamma \); \(\tilde {p}_t\), prevalence of HCV in cohort *t* |
B | \(\tilde {p}_t\), decreasing in *t* | INMB decreasing in \(\tilde {p}_t\) | \(\mu _p(x_0)>\frac {\gamma }{\theta }\). “Intervention” not currently implemented. Optimal starting problem. | “Intervention”: New surgical device vs. old device. Period reward function: \(\hat {\text {INMB}}_t = -\hat {\theta } \tilde {p}_t + \hat {\gamma }\); \(\tilde {p}_t\), probability of an adverse event in device iteration *t* |
C | \(\tilde {q}_t\), increasing in *t* | INMB decreasing in \(\tilde {q}_t\) | \(\mu _q(x_0)<1-\frac {\gamma }{\theta }\) \(\Rightarrow \mu _p(x_0)>\frac {\gamma }{\theta }\). “Intervention” is currently implemented. Optimal stopping problem. | “Intervention”: Pap smear for early identification of pre-cancerous lesions on the cervix from HPV infection. Period reward function: \(\hat {\text {INMB}}_t = - \hat {\theta } \tilde {q}_t + \hat {\gamma }\); \(\tilde {q}_t\), prevalence of HPV vaccine coverage in cohort *t* |
D | \(\tilde {q}_t\), increasing in *t* | INMB increasing in \(\tilde {q}_t\) | \(\mu _q(x_0)<1-\frac {\gamma }{\theta }\) \(\Rightarrow \mu _p(x_0)>\frac {\gamma }{\theta }\). “Intervention” not currently implemented. Optimal starting problem. | “Intervention”: Peanut-free spaces regulation (in schools, airplanes, etc.). Period reward function: \(\hat {\text {INMB}}_t =\hat {\theta } \tilde {q}_t - \hat {\gamma }\); \(\tilde {q}_t\), prevalence of severe peanut allergy at time *t* |
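To make the reward structure of this section concrete, the following minimal Python sketch encodes the affine per-person INMB and the zero reward under the status quo. The class name `InmbModel` and its method names are hypothetical, introduced only for illustration, and any numerical values passed to it are placeholders rather than estimates from our analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InmbModel:
    """Affine per-person INMB: theta * p - gamma (names are illustrative)."""

    theta: float  # marginal INMB with respect to the uncertain parameter p
    gamma: float  # per-person fixed cost of the program

    def reward(self, d: int, p: float) -> float:
        """Per-person current reward: the INMB if d == 1, zero if d == 0."""
        return d * (self.theta * p - self.gamma)

    def expected_reward(self, d: int, a: float, b: float) -> float:
        """Expected reward under a beta(a, b) belief about p.

        Because the reward is linear in p, only the mean a / (a + b) matters.
        """
        return self.reward(d, a / (a + b))

    def break_even(self) -> float:
        """Parameter value gamma / theta at which the INMB is exactly zero."""
        return self.gamma / self.theta
```

Because the reward is linear in the uncertain parameter, the expected current-period reward depends on the belief only through its mean; the variance matters only through the value of future information.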

### 2.1 The information-acquisition problem

The policy-maker’s prior belief about \(\tilde {p}_{t}\) at *t* = 0 is beta-distributed with distribution parameters \(x_{0} = (a_{0}, b_{0})\). The posterior distribution when a beta-density is updated in a Bayesian manner with information collected using an imperfect information-collection technology is a mixture of beta-densities. Thus, in general, the policy-maker’s prior beliefs about \(\tilde {p}_{t}\) at time *t* are in \(\mathcal {P}\), where \(\mathcal {P}\) denotes the set of measures which are a mixture of beta-densities. Specifically, if the distribution of \(\tilde {p}_{t}\) is in \(\mathcal {P}\), then there exist an integer \(m \geqslant 1\), parameters \(x_{t,i} = (a_{t,i}, b_{t,i}) \in \mathbb {R}_{++}^{2}\) for all \(i \in \{1,\ldots ,m\}\), and non-negative weights \(\omega _{i}\) with \(\sum _{i=1}^{m} \omega _{i} = 1\), such that the distribution of \(\tilde {p}_{t}\) is a mixture of beta-densities of the form \(\sum _{i=1}^{m} \omega _{i} \text { beta}(a_{t,i}, b_{t,i})\).

The policy-maker has the option to update his beliefs about the parameter \(\tilde {p}_{t}\) by testing \(n_{t}\) individuals at cost \(K(n_{t})\). The information-collection technology has binary test characteristics \(q = (q_{1}, q_{2})\), where \(q_{1}\) is the sensitivity, \(q_{2}\) is the specificity, and \(q_{1} + q_{2} > 1\) (indicating the test is properly labeled). The terms ‘sensitivity’ and ‘specificity’ are often used to describe test accuracy in the medical literature. For clarity, we state their relationship to Type I and Type II error: ‘Specificity’ = 1 − ‘Type I error’ = 1 − ‘False positive rate’ and ‘Sensitivity’ = 1 − ‘Type II error’ = 1 − ‘False negative rate’. The number of positive samples is a random variable \(\tilde {v}_{t}\) with realization \(v_{t} \in \{0,\ldots ,n_{t}\}\). Based on the collected information the policy-maker updates his beliefs about \(\tilde {p}_{t}\) in a Bayesian manner.

### **Proposition 1**

*If the policy-maker’s prior belief \(f_{p}(\cdot )\) is a mixture of beta-densities, i.e., \(f_{p} \in \mathcal {P}\), then for any number of positive observations \(\tilde {v}_{t} = v_{t}\) from \(n_{t}\) samples, the Bayesian posterior belief \(f_{p|v}(\cdot |v_{t})\) is also a mixture of beta-densities, i.e., \(f_{p|v}\) is in \(\mathcal {P}\).*

### *Proof*

See Appendix A.1. □

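If the prior is a mixture of *m* ≥ 1 beta-densities and if the information-collection technology is imperfect (i.e., \(\min \{q_{1},q_{2}\}<1\)), then the true posterior distribution is also a mixture of beta-densities, containing between \(m + n_{t}\) and \(m \times (n_{t} + 1)\) unique beta-distributions (see Appendix A.2.1). The resulting probability density function (pdf) is proportional to the product of the prior density and the likelihood of observing \(v_{t}\) positive results in \(n_{t}\) tests,

\[ f_{p|v}(p\,|\,v_{t}) \;\propto \; \left [q_{1} p + (1-q_{2})(1-p)\right ]^{v_{t}} \left [(1-q_{1}) p + q_{2}(1-p)\right ]^{n_{t}-v_{t}} \sum _{i=1}^{m} \omega _{i}\, \frac {p^{a_{t,i}-1}(1-p)^{b_{t,i}-1}}{B(a_{t,i}, b_{t,i})}, \]

where \(B(\cdot ,\cdot )\) denotes the beta function; expanding the two binomial powers yields the individual beta-densities of the mixture.

Explicit expressions for the conditional mean and variance, \(\mu _{p|v}\) and \(\sigma _{p|v}^{2}\), are provided in Appendix A.2.2.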
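The exact update can be sketched numerically. The following Python fragment is an illustration under stated assumptions, not the implementation used for our results: it expands the likelihood of \(v_t\) positives in \(n_t\) imperfect tests into powers of \(p\) and \(1-p\), reweights each prior beta component accordingly, and computes the mixture mean and variance. The function names (`posterior_mixture`, `mixture_moments`) and the representation of a mixture as a list of `(weight, a, b)` tuples are hypothetical choices.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def posterior_mixture(prior, n, v, q1, q2):
    """Exact posterior after v positives in n tests (sensitivity q1, specificity q2).

    prior and the result are lists of (weight, a, b) beta components.
    """
    # coef[l] is the coefficient of p**l * (1 - p)**(n - l) in the likelihood
    # [q1 p + (1 - q2)(1 - p)]**v * [(1 - q1) p + q2 (1 - p)]**(n - v).
    coef = [0.0] * (n + 1)
    for j in range(v + 1):            # true positives among the v positive tests
        for k in range(n - v + 1):    # false negatives among the n - v negative tests
            coef[j + k] += (math.comb(v, j) * q1**j * (1 - q2)**(v - j)
                            * math.comb(n - v, k) * (1 - q1)**k * q2**(n - v - k))
    components = []
    for w, a, b in prior:
        for l, c in enumerate(coef):
            if c > 0.0:
                # Integrating p**l (1-p)**(n-l) against beta(a, b) gives this ratio.
                ratio = math.exp(log_beta(a + l, b + n - l) - log_beta(a, b))
                components.append((w * c * ratio, a + l, b + n - l))
    total = sum(w for w, _, _ in components)
    return [(w / total, a, b) for w, a, b in components]


def mixture_moments(components):
    """Mean and variance of a mixture of beta-densities."""
    mean = sum(w * a / (a + b) for w, a, b in components)
    second = sum(w * a * (a + 1) / ((a + b) * (a + b + 1)) for w, a, b in components)
    return mean, second - mean**2
```

With a single beta prior (\(m = 1\)) and an imperfect test with both characteristics strictly between 0 and 1, the sketch returns \(n_t + 1\) components, in line with the bounds stated above. With \(q_1 = q_2 = 1\) it collapses to the familiar conjugate update with one component.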

### *Remark 1*

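If the prior is a mixture of *m* ≥ 1 beta-densities and the information-collection technology is perfect (i.e., \(q_{1} = q_{2} = 1\)), then the distribution of sample information, \(\tilde {v}_{t}\), is a mixture of *m* beta-binomial distributions with the same weights \(\omega _{i}\). Updating results in a posterior distribution that is a mixture of *m* beta-densities with pdf

\[ f_{p|v}(\cdot \,|\,v_{t}) = \sum _{i=1}^{m} \hat {\omega }_{i} \text { beta}(a_{t,i}+v_{t},\, b_{t,i}+n_{t}-v_{t}), \qquad \hat {\omega }_{i} = \frac {\omega _{i}\, B(a_{t,i}+v_{t},\, b_{t,i}+n_{t}-v_{t})/B(a_{t,i}, b_{t,i})}{\sum _{j=1}^{m} \omega _{j}\, B(a_{t,j}+v_{t},\, b_{t,j}+n_{t}-v_{t})/B(a_{t,j}, b_{t,j})}, \]

for *i* ∈ {1,…,*m*}, where \(B(\cdot ,\cdot )\) denotes the beta function.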

### 2.2 Approximate Bayesian inference

For practically relevant sample sizes \(n_{t}\) and an imperfect information-collection technology, the number of beta-densities in the posterior distribution can become very large, thus requiring approximation. The need for distributional approximations in decision models has been recognized by Smith, who proposed moment matching to replace continuous distributions by appropriate discrete ones [63]. More recently, moment-matching methods have also been used in a Markovian setting to approximate vector-autoregressions [64]. In our Markov dynamic programming setting, we apply moment matching to approximate the exact posterior distribution (a mixture of beta-densities) by a single beta-distribution. This greatly simplifies the belief propagation compared to dealing with mixtures of beta-densities, which feature an increasingly large number of coefficients with each information-collection effort and ultimately an infinite-dimensional state space.

The prior belief at time *t* is represented by the parameters \(x_{t} = (a_{t}, b_{t})\) and the posterior belief incorporating any information collected at time *t* is represented by the updated parameters \(\hat {x}_{t} = (\hat {a}_{t}, \hat {b}_{t})\). Using the mean and variance of the exact posterior distribution, \(\mu _{p|v}\) and \(\sigma _{p|v}^{2}\), the approximate posterior belief parameters are determined using the one-to-one relationship between the standard parameters of the beta-distribution and its mean and variance^{1}. We let \(\psi (x_{t}, n_{t}, v_{t}, q)\) denote the function that generates the approximating parameters, with

\[ \hat {x}_{t} = \psi (x_{t}, n_{t}, v_{t}, q) = \left (\mu _{p|v}\, c,\; (1-\mu _{p|v})\, c\right ), \qquad c = \frac {\mu _{p|v}(1-\mu _{p|v})}{\sigma _{p|v}^{2}} - 1. \]

Mixtures of beta-distributions can be fitted to any continuous distribution on [0,1]. Thus, a single beta-distribution with the same mean and variance as a distribution formed by the mixture of beta-densities will not always provide a satisfactory approximation. However, we focus on the special case where the time-*t* belief about \(\tilde {p}_{t}\) has been obtained via Bayesian updating from a single beta-prior. In this special case, approximating the mixture of beta-densities with a single beta-distribution with the same mean and variance maintains unimodality^{2} and stationarity of the state space over time.

We assessed the approximation quality using simulation in the policy-relevant region for our application (Appendix A.3). We found that the maximum distance between the cumulative distribution function of the exact posterior distribution and that of the approximation with matching mean and variance was generally small (< 2%), but became large when the mean was approaching zero and the standard deviation was relatively large. The quality of the approximation was very good (< 0.5%) when the mean was greater than 2%. We deemed the approximation to be of sufficiently high quality for our numerical analysis because our initial conditions and predicted trajectory without information acquisition rely on the regions in which the approximation is good. Also, because of relatively high fixed costs associated with information acquisition, optimal sample sizes in our numerical analysis tended to be sufficiently large that information would likely only be collected once, which reduces concerns about compounding the approximation error over successive information-collection efforts.
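The moment-matching step relies only on the standard one-to-one relationship between the parameters of a beta-distribution and its mean and variance. A minimal Python sketch of that mapping follows; the function names are hypothetical and the code is meant as an illustration rather than as the routine used in our numerical analysis.

```python
def beta_moments(a, b):
    """Mean and variance of a beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var


def beta_from_moments(mean, var):
    """Parameters (a, b) of the beta distribution with the given mean and variance.

    A beta distribution exists only if 0 < mean < 1 and 0 < var < mean * (1 - mean).
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("variance must lie strictly between 0 and mean * (1 - mean)")
    total = mean * (1.0 - mean) / var - 1.0  # equals a + b
    return mean * total, (1.0 - mean) * total
```

Applying `beta_from_moments` to the mean and variance of the exact posterior mixture yields the approximating parameters \((\hat a_t, \hat b_t)\); applying it to the moments of a single beta-distribution recovers that distribution's own parameters.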

### 2.3 System dynamics

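The state \(x_{t}\), containing the parameters of the distribution of \(\tilde {p}_{t}\), represents the policy-maker’s current beliefs about the uncertain parameter and follows a law of motion of the form

\[ x_{t+1} = \phi (\hat {x}_{t}) = \left (z\, \hat {a}_{t},\; \hat {b}_{t} + (1-z)\, \hat {a}_{t}\right ), \tag{4} \]

where *z* ∈ (0,1) is the decay rate. These dynamics imply a geometrically decreasing expected value, increasing coefficient of variation, and decreasing variance for \(\mu (x_{0}) \leqslant \frac {1}{1+z}\) (Fig. 1). In the mean-variance space, the equivalent state dynamics become

\[ \mu (x_{t+1}) = z\, \mu (\hat {x}_{t}), \qquad \sigma ^{2}(x_{t+1}) = \frac {z\left (1 - z\, \mu (\hat {x}_{t})\right )}{1 - \mu (\hat {x}_{t})}\, \sigma ^{2}(\hat {x}_{t}). \]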

Derivations of these equations are presented in Appendix A.4. The features of these dynamics can represent a wide variety of settings in which the expectation of a parameter is geometrically decreasing over time (e.g., a health condition that is decreasing in prevalence over time; see Section 4). To model a situation where the expectation (and variance) of the uncertain parameter is increasing (e.g., obesity prevalence), the problem can be re-formulated as one in which a parameter of opposite definition is decreasing (e.g., prevalence of individuals who are not obese).
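The qualitative properties of the dynamics can be checked numerically. The sketch below assumes the explicit law of motion \(\phi(a, b) = (za,\, b + (1-z)a)\), which scales the mean by \(z\) and leaves \(a + b\) unchanged; this explicit form and the function names are assumptions of the illustration, chosen because they reproduce the properties described in this section.

```python
import math


def decay_step(a, b, z):
    """One period of the assumed law of motion: mean scaled by z, a + b unchanged."""
    return z * a, b + (1.0 - z) * a


def summary(a, b):
    """Mean, variance, and coefficient of variation of a beta(a, b) belief."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var, math.sqrt(var) / mean


def trajectory(a0, b0, z, periods):
    """Belief summaries for periods 0..periods when no information is collected."""
    a, b = a0, b0
    path = [summary(a, b)]
    for _ in range(periods):
        a, b = decay_step(a, b, z)
        path.append(summary(a, b))
    return path
```

For a starting mean below \(1/(1+z)\), each step of `trajectory` lowers the mean geometrically, lowers the variance, and raises the coefficient of variation; for a starting mean above that bound the variance initially rises.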

### 2.4 The policy-maker’s problem

With discount factor *δ* ∈ (0,1), the policy-maker’s objective is to maximize the net present value of the stream of expected INMBs, given the initial belief \(x_{0} = (a_{0}, b_{0})\) and admissible policy decisions \(U\in {\mathcal {U}} = \left \{(u_{t})_{t\in {\mathbb N}} : u_{t}=(d_{t},n_{t})\in {\mathcal D}\times {\mathcal N} \right \}\). To achieve the objective, the policy-maker seeks to find the best of all possible policies \(\pi _{t}(\cdot )\), *t* ≥ 0, with \(u_{t} = \pi _{t}(x_{t})\) for all \(x_{t}\in \mathbb {R}_{++}^{2}\), which at each time *t* maps the state space to admissible current-period actions \(u_{t}\), so that the implemented path of actions \(U = (u_{0}, u_{1}, \ldots )\) lies in the control-constraint set \(\mathcal {U}\). The number of positive observations in the testing sample of \(n_{t}\) is a random variable \(\tilde {v}_{t}(n_{t})\) with realization \(v_{t} \in \{0,\ldots ,n_{t}\}\). Based on the collected information the policy-maker updates his beliefs about \(\tilde {p}_{t}\) in an (approximate) Bayesian manner using the function \(\psi (x_{t}, n_{t}, v_{t}, q)\). Because of the decreasing trend of the uncertain parameter (*z* < 1), it is never optimal to restart an optimally stopped program.^{3} We consider stationary policies \(\pi :\mathbb {R}_{++}^{2}\to {\mathcal D}\times {\mathcal N}\) to solve the optimal control problem

\[ V(x_{0}) = \max _{\pi }\; \mathbb {E}\left [\, \sum _{t=0}^{\infty } \delta ^{t} \left (N\, g(d_{t}, \tilde {p}_{t}) - K(n_{t})\right ) \,\Big |\, x_{0} \right ], \quad \text {s.t.}\quad x_{t+1} = \phi \left (\psi (x_{t}, n_{t}, \tilde {v}_{t}, q)\right ),\quad (d_{t}, n_{t}) = \pi (x_{t}),\quad t \geq 0, \]

where the per-person reward \(g(d_{t}, \tilde {p}_{t}) = d_{t}(\theta \tilde {p}_{t} - \gamma )\) is scaled by the cohort size *N*. The value function \(V(x)\) satisfies the Bellman equation,

\[ V(x) = \max _{(d,n)\in {\mathcal D}\times {\mathcal N}} \left \{ N\, d\, (\theta \mu (x) - \gamma ) - K(n) + \delta \, \mathbb {E}\left [ V\left (\phi (\psi (x, n, \tilde {v}, q))\right ) \,\big |\, x, n \right ] \right \}, \tag{6} \]

and the maximizer \(\pi ^{*}(x)\) on the right-hand side defines an optimal policy.

### *Remark 2*

To reflect the policy-maker’s ongoing concern for the health-intervention decision, the problem is formulated in an infinite-horizon setting. Given a time-invariant system, this implies that the optimal policy can be described as a mapping from states to actions, without explicit consideration of time. If more information about the system becomes available over time, for example, relating to the decay rate in the system dynamics (see Eq. 4), then it is possible for the policy-maker to re-solve the problem and update the policy accordingly.

## 3 Dynamic healthcare decisions

### 3.1 Policies without information acquisition

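When no information is collected (\(n_{t} = 0\) for all *t* ≥ 0), \(\psi \) reduces to an identity map. For all states *x* for which the optimal strategy is to not do the intervention, this action remains optimal in the future because of the decreasing trend of \(\tilde {p}_{t}\). Indeed, since for *z* ∈ (0,1), \(\mu (\phi (x)) = z\mu (x) < \mu (x)\), we have that for all states where \(V(x) = 0\), it is also the case that \(V(\phi (x)) = 0\). Hence, for \(\mu (x) \leq \frac {\gamma }{\theta }\) it is optimal to stop the intervention. This defines a threshold policy of the form

\[ d_{t}^{*} = \begin{cases} 1, & \text {if } \mu (x_{t}) > \frac {\gamma }{\theta }, \\ 0, & \text {otherwise}, \end{cases} \]

for all *t* ≥ 0. Restricting attention to the interesting case where \(\mu (x_{0})\geq \frac {\gamma }{\theta }\) and using the fact that \(\mu (x_{t}) = z^{t} \mu (x_{0})\), we can identify the optimal time \(T(x_{0})\) to stop the intervention, which is the first period in which the intervention has a nonpositive expected INMB (see Appendix A.5):

\[ T(x_{0}) = \left \lceil \frac {\ln \left (\gamma / (\theta \mu (x_{0}))\right )}{\ln z} \right \rceil . \]

Given any state *x*, the value of implementing the optimal stopping policy for *t* ∈ {0,...,\(T(x)\) − 1} is

\[ V_{\text {NoInfo}}(x) = N \sum _{t=0}^{T(x)-1} \delta ^{t} \left (\theta z^{t} \mu (x) - \gamma \right ) = N \left [ \theta \mu (x)\, \frac {1-(\delta z)^{T(x)}}{1-\delta z} - \gamma \, \frac {1-\delta ^{T(x)}}{1-\delta } \right ]. \]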

### **Proposition 2**

*When information is prohibitively costly or practically infeasible to collect, the optimal value function \(V_{\text {NoInfo}}(x_{t})\) is non-decreasing and convex in \(\mu (x_{t})\).*

### *Proof*

See Appendix A.6. □

### *Remark 3*

The above result depends only on the decay in the mean of the uncertain parameter distribution and is otherwise distribution-free. In other words, it does not depend on the policy-maker’s beliefs other than that \(\tilde {p}_{t}\) is expected to decrease over time.
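The no-information policy can be evaluated directly from the mean dynamics \(\mu(x_t) = z^t \mu(x_0)\). The Python sketch below computes the first period with nonpositive expected INMB and the discounted value of intervening up to that period. The function names, the `cohort_size` argument (standing in for *N*), and any numerical inputs are illustrative assumptions, not values from our application.

```python
import math


def stopping_time(mu0, theta, gamma, z):
    """First period t at which theta * z**t * mu0 - gamma <= 0."""
    if theta * mu0 <= gamma:
        return 0
    t = math.ceil(math.log(gamma / (theta * mu0)) / math.log(z))
    # Guard against floating-point round-off at the break-even boundary.
    while theta * z**t * mu0 > gamma:
        t += 1
    while t > 0 and theta * z ** (t - 1) * mu0 <= gamma:
        t -= 1
    return t


def value_no_info(mu0, theta, gamma, z, delta, cohort_size=1.0):
    """Discounted value of intervening in periods 0..T-1 and stopping at T."""
    T = stopping_time(mu0, theta, gamma, z)
    benefit = theta * mu0 * (1.0 - (delta * z) ** T) / (1.0 - delta * z)
    cost = gamma * (1.0 - delta**T) / (1.0 - delta)
    return cohort_size * (benefit - cost)
```

Evaluating `value_no_info` on a grid of initial means gives a piecewise-linear curve that is flat at zero up to \(\gamma/\theta\) and increasing thereafter, which is the shape described in Proposition 2.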

### 3.2 Policies with information acquisition

When the policy-maker has the option to acquire information, the value function is determined by the Bellman equation (Eq. 6). Its properties in the no-information case (Proposition 2) carry over to the more general situation.

### **Proposition 3**

*The optimal value function* *V* (*x* _{ t }) *is nondecreasing and convex in* *μ*(*x* _{ t }) *,* *and nondecreasing in* *σ* ^{2}(*x* _{ t }) *.*

### *Proof*

See Appendix A.7. □

#### 3.2.1 Special case: one-time information collection

*η* experiment (with *η* ≥ 1) and, briefly, ignoring the cost of information collection *K*(*η*), the value with information exceeds the no-information value,

*η*, at cost *κ*(*η*), is obtained by finding a period *k* where information acquisition is preferred to waiting until the next period, *k* + 1. In other words, find the smallest *k* for which

#### 3.2.2 General case: information collection in any period

Based on Proposition 3, the intervention is desirable for greater *μ*(*x* _{ t }) and greater *σ*(*x* _{ t }); the latter increases the upside of the policy-maker’s asymmetric (convex) payoffs, as if holding a call option. The dynamics presented in Eq. 4, with decreasing expectation and decreasing variance, imply monotonicity of the intervention decision, \(d_{t+1} \leqslant d_{t}\).
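The call-option intuition can be illustrated with a deliberately simplified, one-period calculation. The sketch below is not the paper's value function: it assumes a beta-distributed belief with a given mean and standard deviation and a payoff of max{0, *θp* − *γ*}, and it integrates that payoff numerically with a midpoint rule. The function names and the example parameter values are illustrative.

```python
from math import exp, lgamma, log


def beta_pdf(p, a, b):
    """Density of a Beta(a, b) distribution at p in (0, 1)."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1.0) * log(p) + (b - 1.0) * log(1.0 - p))


def expected_upside(mu, sigma, theta, gamma, steps=20000):
    """E[max(0, theta * p - gamma)] for a beta belief with mean mu and std. dev. sigma."""
    scale = mu * (1.0 - mu) / sigma**2 - 1.0  # equals a + b by moment matching
    if scale <= 0.0:
        raise ValueError("sigma is too large for a beta distribution with this mean")
    a, b = mu * scale, (1.0 - mu) * scale
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h  # midpoints avoid the endpoints 0 and 1
        total += max(0.0, theta * p - gamma) * beta_pdf(p, a, b) * h
    return total


# Same mean, two levels of uncertainty (illustrative numbers only).
low_spread = expected_upside(0.3, 0.05, theta=1.0, gamma=0.3)
high_spread = expected_upside(0.3, 0.15, theta=1.0, gamma=0.3)
```

With the mean held fixed, the wider belief yields the larger expected payoff because losses are truncated at zero while gains are not.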

### **Corollary 1**

*Consider* \(x_{t}^{(1)}\) *and* \(x_{t}^{(2)}\) *with* \(\mu (x_{t}^{(1)}) < \mu (x_{t}^{(2)})\) *and* \(\sigma ^{2}(x_{t}^{(1)}) = \sigma ^{2}(x_{t}^{(2)})\). *If it is optimal to do the intervention with* \(\mu (x_{t}^{(1)})\)*, then it is also optimal to do the intervention with* \(\mu (x_{t}^{(2)})\)*.*

### *Proof*

See Appendix A.8. □

### **Corollary 2**

*Consider* \(x_{t}^{(1)}\) *and* \(x_{t}^{(2)}\) *with* \(\mu (x_{t}^{(1)}) = \mu (x_{t}^{(2)})\) *and* \(\sigma ^{2}(x_{t}^{(1)}) < \sigma ^{2}(x_{t}^{(2)})\). *If it is optimal to do the intervention with* \(\sigma (x_{t}^{(1)})\)*, then it is also optimal to do the intervention with* \(\sigma (x_{t}^{(2)})\)*.*

### *Proof*

See Appendix A.9. □

*‘no intervention (and do not sample).’* In region II, an optimal policy is *‘do intervention and sample* *n* _{ t } *individuals.’* In region III, an optimal policy is to *‘do intervention and do not sample.’*

*‘no intervention (and do not sample)’* and *‘do intervention and sample* *n* _{ t } *individuals’* when the rewards of the two regions are equal:

*‘do intervention and sample* *n* _{ t } *individuals’* and *‘do intervention and do not sample’* when the rewards of the two regions are equal. Removing common terms from each side, this occurs when

For each *σ* ^{2}(*x* _{ t }), there can exist more than one *μ*(*x* _{ t }) with \(\frac {\gamma }{\theta } < \mu (x_{t}) \leqslant 1\) satisfying Eq. 11 because \(V(\phi (\psi (x_{t},n_{t},\tilde {v}_{t},q)))\) is increasing, but neither concave nor convex, in *v* _{ t }. The existence of the section of region III between regions I and II (the location of point A) can be understood intuitively. Consider two points, A and B, with the same standard deviation (Fig. 2). Compared to point B, if information were to be gathered at point A, the distribution of possible posterior states includes a higher proportion of states in region I (with a reward of 0) and a lower proportion of high-reward states (those with high mean and high standard deviation) and, therefore, information acquisition is less likely to yield a value exceeding its cost. Now consider two points, A and C, with the same mean. Compared to point C, if information were to be gathered at point A, the distribution of possible posterior states is narrower. In both of these cases, increased spread on the side of low mean has no impact on the expectation, whereas increased spread into the high-reward states substantially increases the expectation. Therefore, information acquisition is more likely to yield a value exceeding its cost for the state with higher standard deviation.

### **Proposition 4**

*For a fixed sample size* *η* *(so* *n* _{ t } ∈ {0, *η*} *for all* *t* *), misclassification in the information-collection technology decreases the value function and reduces the number of states for which information acquisition is optimal.*

### *Proof*

See Appendix A.10. □

This result is consistent with Blackwell’s result that a less informative signal cannot increase the value of a single-person decision problem [66].
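The effect of misclassification on the posterior belief can be sketched for a beta prior, following the worked example in footnote 2. The sketch below reads *q* as the pair (sensitivity, specificity), which is an assumption about the notation; under that reading it reproduces the mixture weights reported in the footnote for the prior (3, 7), *n* _{ t } = 5, *v* _{ t } = 3, and *q* = (0.9, 0.85). The function names are illustrative.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials (0 outside the support)."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def true_positive_weights(a, b, n, v, sensitivity, specificity):
    """Posterior weights over the actual number of positives W = 0..n.

    Weight w multiplies the component Beta(a + w, b + n - w) in the posterior mixture.
    """
    weights = []
    for w in range(n + 1):
        # Prior predictive (beta-binomial) probability of w actual positives.
        prior_w = comb(n, w) * exp(log_beta(a + w, b + n - w) - log_beta(a, b))
        # Probability of observing v positives: detected true positives plus false positives.
        likelihood = sum(
            binom_pmf(tp, w, sensitivity) * binom_pmf(v - tp, n - w, 1.0 - specificity)
            for tp in range(w + 1)
        )
        weights.append(prior_w * likelihood)
    total = sum(weights)
    return [x / total for x in weights]


weights = true_positive_weights(3, 7, n=5, v=3, sensitivity=0.9, specificity=0.85)
print([round(x, 3) for x in weights])  # [0.028, 0.141, 0.346, 0.414, 0.067, 0.004]
```

With a perfect test (sensitivity and specificity both equal to 1), all weight falls on the component with *W* = *v* _{ t }, and the update reduces to the standard beta-binomial posterior.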

## 4 Application

### 4.1 Background and motivation

Recent model-based analyses concluded that one-time screening of individuals born between 1945 and 1965 is cost-effective [11, 12, 13, 14, 15] and the CDC and USPSTF recently released new guidance in support of one-time screening of these birth cohorts [16, 17]. Several studies indicate that screening individuals born later than 1965 is also likely to be cost-effective [13, 14, 15]. Since HCV prevalence is decreasing in birth year after the 1956 birth cohort (Fig. 3), there may be a time at which screening is no longer cost-effective. To improve the decision about the best time to stop screening, additional information about prevalence of the current and future cohorts may be desirable. However, standard approaches to finding the value of information do not usually include the option to delay the information acquisition.

Note that the individuals in the population we model were predominantly infected decades ago [72, 73] and do not have ongoing risk factors for HCV re-infection. Many historically significant modes of disease incidence have been virtually eliminated, including transmission by surgical or other hospital equipment prior to modern sterilization procedures and blood transfusion [73, 74]. Injection equipment sharing among people who actively use injection drugs (PWID) is currently the principal cause of HCV transmission [76]. Although a history of injection drug use is relatively common among individuals with chronic HCV infection (approximately 40% [71]), re-infection and disease transmission to others via injection drug use are not an ongoing risk for a large proportion of these individuals: three-quarters of HCV-infected individuals with a self-reported history of injection drug use report last injecting more than 5 years ago (median time since last injection = 20 years) [75]. Our model does not include PWID and so we do not consider the possibility of re-infection. PWID are a high-risk population and guidelines, separate from those otherwise discussed here, recommend routine annual HCV screening in this population [77].

We now apply the stochastic dynamic programming framework developed in Section 2 to the case of one-time HCV screening at a routine medical appointment at age 50 for successive birth cohorts. We consider screening at age 50 because one-time screening at this age had the lowest incremental cost-effectiveness ratio in an analysis of single birth cohort screening [14]. Waiting to perform a one-time screening in older individuals is less cost-effective because their disease may have progressed further and treatment is less effective in more severe disease states. One-time screening of younger individuals is less cost-effective because younger individuals are further away from the long-term consequences of HCV which screening and treatment hope to avoid. We transform the unbounded state space in terms of *x* _{ t } = (*a* _{ t },*b* _{ t }) to the compact policy-relevant space *μ*(*x* _{ t }) and *σ*(*x* _{ t }). Using value iteration implemented in R version 2.15.0 [78], we numerically determine an optimal HCV-screening and information-collection policy for US adults.
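The change of variables between the beta parameters (*a* _{ t }, *b* _{ t }) and the policy-relevant pair (*μ*, *σ*) follows the moment formulas given in footnote 1. A minimal sketch, with illustrative function names, is shown below; it uses the initial belief for men reported in Table 3 as an example input.

```python
def beta_from_moments(mu, sigma):
    """Beta parameters (a, b) with mean mu and standard deviation sigma."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie in (0, 1)")
    variance = sigma**2
    if not 0.0 < variance < mu * (1.0 - mu):
        raise ValueError("sigma**2 must lie in (0, mu * (1 - mu))")
    scale = mu * (1.0 - mu) / variance - 1.0  # equals a + b
    return mu * scale, (1.0 - mu) * scale


def moments_from_beta(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, variance**0.5


# Initial belief for men (Table 3): mean 0.0310, standard deviation 0.0035.
a0, b0 = beta_from_moments(0.0310, 0.0035)
mu_check, sigma_check = moments_from_beta(a0, b0)
```

The round trip recovers the original mean and standard deviation, so a grid over (*μ*, *σ*) can be mapped to beta parameters and back without loss.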

*‘do not screen for HCV and do not collect information about HCV prevalence in the current cohort;’*

*‘screen for HCV and collect sample information about HCV prevalence in the current cohort;’*

*‘screen for HCV and do not collect information about HCV prevalence in the current cohort.’*

We compare this optimal strategy to the policies identified by various alternative approaches: a slightly modified version of the new CDC and USPSTF recommendation; an optimal policy without information acquisition; and an optimal policy with (possibly immediate) information acquisition. A policy of HCV screening does not inherently provide additional information about HCV prevalence to policy-makers, because only positive test outcomes are reported to the CDC and the reason for the medical test is private health information (the test may have been performed for a reason other than routine screening at age 50). Estimating prevalence among asymptomatic individuals seeking routine preventive medical care therefore requires a study with random sampling of those individuals. The (quasi-)linearity of INMB _{ t } for this example is established in Appendix A.11.1. Parameter values and ranges used in sensitivity analysis are presented in Table 2. Details of parameter estimation are presented in Appendices A.11.2 and A.11.3.

Table 2. Model parameter values and ranges used in sensitivity analysis

| Variable, Description | Males: Value (Range) | Females: Value (Range) | Sources |
|---|---|---|---|
| Eligible for a preventive health exam (PHE) | 2.1 million (1.8–2.3 million) | 2.1 million (1.9–2.4 million) | [79] |
| Proportion who attend a PHE | 24.4% (19.3–29.5%) | 43.3% (37.3–49.3%) | [80] |
| | 508,222 (386,600–630,000) | 920,706 (753,000–1,100,000) | Calculated |
| | 0.97 (0.950–0.999) | 0.97 (0.950–0.999) | [81] |
| | 0.9996 (0.990–1.0) | 0.9996 (0.990–1.0) | [82] |
| | $28 ($20–40) | $28 ($20–40) | [83] |
| | $230 ($200–250) | $230 ($200–250) | [83] |
| | 0 (not varied) | 0 (not varied) | Assumed |
| | 0 (not varied) | 0 (not varied) | Assumed |
| | $146,928 ($140,000–154,000) | $161,121 ($153,000–170,000) | [14] |
| | $126,943 ($120,000–133,000) | $143,900 ($136,000–152,000) | [14] |
| | $181,314 ($172,250–190,000) | $192,135 ($182,500–202,000) | [14] |
| | 10.42 (10.16–10.68) | 11.71 (11.41–12.0) | [14] |
| | 10.06 (9.80–10.31) | 11.44 (11.15–11.72) | [14] |
| | 15.69 (15.30–16.08) | 16.24 (15.83–16.64) | [14] |
| | $7,030 ($4,000–12,000) | $2,962 ($3,000–10,000) | Eq. 20 |
| | $28.05 ($22–40) | $28.05 ($22–40) | Eq. 21 |
| | $50,000 ($25,000–250,000) | $50,000 ($25,000–250,000) | Estimated |
| | $100 ($50–500) | $100 ($50–500) | [83] |
| | ( | | Assumed |
| (1960 birth cohort in 2010) | | | Appendix A.11.3 |
| | 0.893 (0.871–0.915) | 0.893 (0.871–0.915) | Appendix A.11.3 |
| | $75,000/QALY gained ($50,000–100,000/QALY gained) | $75,000/QALY gained ($50,000–100,000/QALY gained) | [84] |
| | 0.03 (0–0.05) | 0.03 (0–0.05) | [1] |

### 4.2 Results

For the purposes of our analysis, we assume the current time to be the year 2010 and the initial cohort to be born in 1960.

#### 4.2.1 Policies identified by alternative approaches

*T* = 6 into Eq. 9. The sum of the discounted expected INMBs for screening 6 cohorts at age 50, until the 1965 birth cohort turns 50 years of age, is $399.1 million for men and $15.4 million for women (Table 3). The large difference between men and women is attributable to higher HCV prevalence and higher marginal INMB of early diagnosis and treatment in men.

Table 3. Comparison of optimal policies indicated by various analytic approaches for men with initial belief *μ*(*x* _{0}) = 0.0310 and *σ*(*x* _{0}) = 0.0035 and women with initial belief *μ*(*x* _{0}) = 0.0135 and *σ*(*x* _{0}) = 0.0019

| Case | Optimal Policy | Value (Expected INMB) | Increase in Expected INMB |
|---|---|---|---|
| Males | | | |
| CDC/USPSTF recommendation | Screen until 1965 birth cohort turns 50 | $399,140,000 | Reference |
| No information available | Screen until 1978 birth cohort turns 50 | $566,470,000 | $167,330,000 |
| Information only available immediately | Sample 910 men now, then identify optimal action | $566,490,000 | $167,350,000 |
| Information available in all periods | Sample 4,000 men in 16 years (1976 birth cohort), then identify optimal action | $567,940,000 | $168,800,000 |
| Females | | | |
| CDC/USPSTF recommendation | Screen until 1965 birth cohort turns 50 | $15,390,000 | Reference |
| No information available | Screen until 1963 birth cohort turns 50 | $21,720,000 | $6,330,000 |
| Information only available immediately | Sample 4,930 women now, then identify optimal action | $22,320,000 | $6,930,000 |
| Information available in all periods | Sample 4,500 women in 1 year (1961 birth cohort), then identify optimal action | $22,500,000 | $7,110,000 |

Using Eqs. 7–8 and assuming no opportunity to collect information, we identify the threshold prevalence value below which the HCV-screening program should be terminated and the best time to terminate it. In men, the program should be terminated when prevalence falls below 0.4%, which will occur in 18 years (95% CI: 16–19 years). In women, the program should be terminated when prevalence falls below 0.1%, which will occur in 3 years (95% CI: 0–5 years). The expected INMB of these policies is $566.5 million for men and $21.7 million for women (Table 3).
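As a rough check of the stopping time for men, the calculation below combines the initial belief *μ*(*x* _{0}) = 0.0310 with the reported 0.4% threshold. It assumes that the value 0.893 listed in Table 2 is the decay parameter *z*; this is a reading of the table and is not stated explicitly in the text.

```latex
z^{t}\,\mu(x_{0}) \le \frac{\gamma}{\theta}
\;\Longleftrightarrow\;
t \ge \frac{\ln\!\left(\frac{\gamma/\theta}{\mu(x_{0})}\right)}{\ln z}
= \frac{\ln(0.004/0.0310)}{\ln(0.893)}
\approx \frac{-2.048}{-0.113}
\approx 18.1
```

Under this reading, the last period with positive expected INMB is *t* = 18, which corresponds to the 1978 birth cohort and is in line with the no-information policy for men in Table 3.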

#### 4.2.2 Model results

Implementing the full model, we considered the possibility of collecting sample information at each decision period. For computational and illustrative reasons, we restricted the policy-maker’s choice to two sample sizes, \(n_{t} \in \mathcal {N} = \{0, \eta \}\). We considered several possible values for *η* (2000, 2500, 3000, ..., 8000) and we present the results for the sample size that maximized the value at the initial condition for each gender. We also performed analyses in which multiple study sample sizes were available in each period. We do not present these analyses because they led to the same optimal policies, indicating that our restriction to two sample sizes was not material for this application.

For each state in the region where it is optimal to screen without information acquisition, we can identify the optimal next action and the time when it should occur (Fig. 5b). We subdivide this region by a solid line. Above the solid line, which is the region with higher uncertainty, it is optimal to screen without information acquisition for a specified number of periods and then to collect information. In the region with lower uncertainty, it is optimal to screen without information acquisition for a specified number of periods and then to stop screening without ever collecting information. The current prevalence estimates for men and women indicate that it is optimal to screen without information collection for 16 years and 1 year, respectively, and then to collect sample information to inform the next action. The expected INMBs of these policies are $567.9 million and $22.5 million for men and women, respectively (Table 3).

For each state, we also computed the marginal value of collecting a specific amount of information (Fig. 5c). The marginal value of information in the current period is near-zero for states in which collecting information in the future is optimal. Consistent with our expectations, in the ‘Screen and Collect Information’ region, the marginal value of information is greatest close to the \(\frac {\gamma }{\theta }\)-threshold and increases with uncertainty. In the ‘Screen and Do Not Collect Information’ region, the value of information is highest along the boundary that divides the region into points with trajectories leading to information collection and points with trajectories leading to ‘No Screening’ without information collection.

Sensitivity analysis identified that the general conclusions of our numerical analysis are robust to uncertainty in the inputs (details in Appendix A.11.4).

### 4.3 Discussion of application

Evaluating an HCV-screening policy over its entire lifecycle using a stochastic dynamic programming approach has led to several important policy-relevant insights. Our analysis indicates that recommendations by the CDC and USPSTF to screen individuals born between 1945 and 1965 at their next routine medical visit are conservative for men. Specifically, our analysis shows that, for men, screening should continue until at least the 1976 birth cohort turns 50 (in 2026), at which point 4,000 individuals should be sampled to inform about the continuation of the program. Screening men at least 10 years longer will enable early diagnosis in an estimated 50,500 additional individuals, thus preventing an expected 767 additional liver cancers and about 212 additional liver transplants. For women, we find that a large information-acquisition effort should take place when the 1961 birth cohort turns 50 (in 2011),^{4} as it is likely not cost-effective to screen women, per guidance, to the 1965 cohort because of relatively low prevalence (Fig. 3) and slower disease progression in women [85]. Compared to the CDC and USPSTF recommendation, our model increases the expected INMB by $168.8 million in men and $7.1 million in women.

Our analysis has several limitations. First, we assume only the current cohort can be sampled to learn about subsequent cohorts, relying on the correlation between cohorts (as implied by the system dynamics). In practice, for our example, it is possible to sample the next cohort (49-year-olds) directly. We chose this assumption because the individuals who make up the ‘next cohort’ are typically unknown (e.g., the next cohort of patients with a heart attack, the next cohort of pregnant women, or the next cohort of cancer patients). Second, we consider one-time screening at age 50 based on a cost-effectiveness analysis of once-in-a-lifetime HCV screening [14]. However, this analysis (and, consequently, ours) assumed that the cohort being screened has not been previously screened. Our model does not identify the optimal age at which to perform one-time screening. Third, we assumed that the individuals who attend a preventive health exam and participate in recommended HCV screening are an unbiased sample from the cohort; that is, individuals are not more or less likely to attend their preventive health exam if they are HCV-positive. However, if individuals at higher risk of HCV disproportionately self-select for general population screening, then we have underestimated the duration for which screening will be cost-effective. If individuals at lower risk disproportionately self-select for screening (often called the “worried well”), then we have overestimated the duration for which screening will be cost-effective. Fourth, we focus on HCV screening policy in the non-injection-drug-using population only, because this population was the focus of the recent change in HCV screening policy. Finally, while uncertainty (and related information acquisition) with respect to model parameters other than prevalence can be treated in an analogous manner, the details are left for future work.

## 5 Conclusion

Our analysis shows that when parameters vary across intervention cohorts, it may be optimal to delay information acquisition. This is a significant improvement over the current paradigm, which considers only one-time immediate information collection. More specifically, we provide a framework for optimal information acquisition in terms of the timing and precision (sample size) of the acquired signal. Further, we incorporate misclassification from an imperfect information-collection technology into our framework, an important real-life complexity of information gathering that adds substantial analytical difficulty.

The common assumption that the per-person value-of-information remains constant for future cohorts may result in a significant error when estimating the population value of additional information. It may indicate immediate expensive information collection when, incorporating the system dynamics, the optimal action is to collect information in the future or never at all. When a parameter is evolving across intervention cohorts, ignoring the opportunity to wait and collect information in the future, when the information collected is more likely to result in action, is a missed opportunity for increased efficiency. As seen in our example, adding the option of delaying information acquisition until a time when the signal is more likely to justify a policy shift can increase the expected value compared to a policy of immediate information collection. The dynamic programming framework developed in this paper enables an accurate assessment of the marginal value of additional information and identifies an optimal information-acquisition policy.

In this work, we assumed that the dynamics are monotonically increasing or decreasing and that they are deterministic. In future work, we plan to consider the more realistic assumption of uncertainty in the dynamics. This would then enable learning about the evolution of the parameters, rather than just their current state. Furthermore, our model does not consider the possibility of intervening on a cohort at a different time in the course of their disease or lives (i.e., at an earlier or later age) or the possibility of the intervention modifying the population-level dynamics. Although true for our application, this latter assumption does not hold in general for an infectious disease. Including the additional benefits of reduced disease transmission from prevention and treatment interventions may generate more near-term benefits and may dramatically alter the value of the intervention over time.

With strained resources for health programs and population-health monitoring, this type of analysis may ensure an optimal implementation horizon for health programs together with guidance on when and how much information should be collected to inform health-program adjustments. Beyond health, many application areas face limited resources for investment and information acquisition, high-quality decision-relevant information is often difficult or expensive to collect, and population or environmental trends influence the preferences and behavior of customers across industries. Facing a dynamic consumer, competitive, or physical environment, the optimal timing of high-quality information acquisition may provide competitive advantage.

## Footnotes

- 1. A beta-distribution with parameters (*a*, *b*) has mean \(\mu = \frac {a}{a + b}\) and variance \(\sigma ^{2}=\frac {a b}{(a + b)^{2} (a + b + 1)}\). Through direct substitution and rearrangement, it can be shown that a beta-distribution with mean *μ* and variance *σ* ^{2} has parameters \(a=\mu \left (\frac {\mu (1-\mu )}{\sigma ^{2}} -1 \right )\) and \(b =(1-\mu ) \left (\frac {\mu (1-\mu )}{\sigma ^{2}} -1 \right )\).
- 2. The posterior distribution is a weighted sum of component beta-distributions; see Eq. 2. The weights of the exact posterior distribution are generated by the convolution of two binomial distributions; see Eq. 13. The convolution of two binomial distributions creates unimodal weights over the ordered set of component beta-distributions in the mixture (ordered in terms of increasing first parameter). For example, consider a prior *x* _{ t } = (3, 7). Given a sample size *n* _{ t } = 5, there are 6 possible true outcomes, by which we mean the potentially unobservable number of actual positive samples in the study. These true outcomes correspond to 6 possible unique beta-distributions with parameters (3, 12), (4, 11), (5, 10), (6, 9), (7, 8), and (8, 7) forming the components of the exact posterior distribution. Because the information technology is imperfect, we have a belief over these possible outcomes equal to the distribution of the actual number of positives given a specific number of observed positives. We can compute this distribution using the weights in Eq. 2, where *j* is the number of actual positives among the *v* _{ t } observed positives, and *k* is the number of actual positives among the *n* _{ t } − *v* _{ t } observed negatives. The probability that the actual number of positives in the sample is *W* is determined by summing all the weights for which *j* + *k* = *W*. The unimodality of the weights over the ordered component distributions ensures the unimodality of the posterior distribution. To complete the numerical example, consider the case where there are 3 observed positives, *v* _{ t } = 3, and *q* = (0.9, 0.85), which results in weights of 0.028, 0.141, 0.346, 0.414, 0.067, 0.004 over the component beta-distributions in the mixture.
- 3. Weber [65] uses global optimization to consider the general problem of switching between arbitrary streams of expected benefits allowing for multiple switches, which can be viewed as a deterministic equivalent of the multi-armed bandit problem.
- 4. Our initial cohort is individuals born in 1960. This result can be interpreted as a recommendation for immediate information collection.
- 5. The monotone-likelihood-ratio property is satisfied [89].

## Notes

### Acknowledgments

The authors thank Laurie Barker, MSPH, Division of Viral Hepatitis, Centers for Disease Control and Prevention for assistance with NHANES analysis.

## References

- 1.Gold MR, Siegel JE, Russell LB, Weinstein MC (1996) Cost-Effectiveness in Health and Medicine. Oxford University Press, OxfordGoogle Scholar
- 2.Drummond MF, Sculpher MJ, Torrance GW (2005) Methods for the Economic Evaluation of Health Care Programs, 3rd edn. Oxford University Press, OxfordGoogle Scholar
- 3.Ades AE, Lu G, Claxton KP (2004) Expected value of sample information calculations in medical decision modeling. Med Decis Making 24(2):207–227CrossRefGoogle Scholar
- 4.Claxton KP, Sculpher MJ (2006) Using value of information analysis to prioritise health research: Some lessons from recent UK experience. PharmacoEconomics 24(11):1055–1068CrossRefGoogle Scholar
- 5.Eckermann S, Karnon J, Willan AR (2010) The value of value of information. PharmacoEconomics 28 (9):699–709CrossRefGoogle Scholar
- 6.Philips Z, Claxton K, Palmer S (2008) The half-life of truth: what are appropriate time horizons for research decisions?. Med Decis Making 28(3):287–299CrossRefGoogle Scholar
- 7.Eckermann S, Willan AR (2008) The option value of delay in health technology assessment. Med Decis Making 28(3):300–305CrossRefGoogle Scholar
- 8.Juusola JL, Brandeau ML (2016) HIV treatment and prevention: a simple model to determine optimal investment. Med Decis Making 36(3):391–409CrossRefGoogle Scholar
- 9.Singer ME, Younossi ZM (2001) Cost effectiveness of screening for hepatitis C virus in asymptomatic, average-risk adults. Am J Med 111(8):614–621CrossRefGoogle Scholar
- 10.Chou R, Clark EC, Helfand M (2004) Screening for hepatitis C virus infection: a review of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med 140(6):465–479CrossRefGoogle Scholar
- 11.Rein DB, Smith BD, Wittenborn JS, Lesesne SB, Wagner LD, Roblin DW, Patel N, Ward JW, Weinbaum CM (2012) The cost-effectiveness of birth-cohort screening for hepatitis C antibody in US primary care settings. Ann Intern Med 156(4):263–270CrossRefGoogle Scholar
- 12.Coffin PO, Scott JD, Golden MR, Sullivan SD (2012) Cost-effectiveness and population outcomes of general population screening for hepatitis C. Clin Infect Dis 54(9):1259–1271CrossRefGoogle Scholar
- 13.McGarry LJ, Pawar VS, Panchmatia HR, Rubin JL, Davis GL, Younossi ZM, Capretta JC, O’Grady MJ, Weinstein MC (2012) Economic model of a birth cohort screening program for hepatitis C virus. Hepatology 55(5):1344–1355CrossRefGoogle Scholar
- 14.Liu S, Cipriano LE, Holodniy M, Goldhaber-Fiebert JD (2013) Cost-effectiveness analysis of risk-factor guided and birth-cohort screening for chronic hepatitis C infection in the United States. PLoS One 8(3):e58975CrossRefGoogle Scholar
- 15.Eckman MH, Talal AH, Gordon SC, Schiff E, Sherman KE (2013) Cost-effectiveness of screening for chronic hepatitis C infection in the United States. Clin Infect Dis 56(10):1382–1393CrossRefGoogle Scholar
- 16.Smith BD, Morgan RL, Beckett GA, Falck-Ytter Y, Holtzman D, Ward JW (2012) Hepatitis C virus testing of persons born during 1945–1965: Recommendations from the Centers for Disease Control and Prevention. Ann Intern Med 157(11):817–822CrossRefGoogle Scholar
- 17.Moyer VA, on behalf of the U.S. Preventive Services Task Force (2013) Screening for Hepatitis C Virus Infection in Adults: U.S. Preventive Services Task Force Recommendation Statement. Ann Intern Med 159(5):349–357CrossRefGoogle Scholar
- 18.Jensen R (1983) Innovation adoption and diffusion when there are competing innovations. J Econ Theory 29(1):161–171CrossRefGoogle Scholar
- 19.McCardle KF (1985) Information acquisition and the adoption of new technology. Manag Sci 31(11):1372–1389CrossRefGoogle Scholar
- 20.Smith JE, McCardle KF (2002) Structural properties of stochastic dynamic programs. Oper Res 50(5):796–809CrossRefGoogle Scholar
- 21.Ulu C, Smith JE (2009) Uncertainty, information acquisition, and technology adoption. Oper Res 57(3):740–752CrossRefGoogle Scholar
- 22.Rosenberg N (1982) Inside the black box: technology and economics. Cambridge University Press, CambridgeGoogle Scholar
- 23.Bessen J (1999) Real options and the adoption of new technologies. Research on Innovation. http://www.researchoninnovation.org
- 24.Kornish LJ (2006) Technology choice and timing with positive network effects. Eur J Oper Res 173(1):268–282CrossRefGoogle Scholar
- 25.Chambers C, Kouvelis P (2003) Competition, learning and investment in new technology. IIE Trans 35(9):863–878CrossRefGoogle Scholar
- 26.Schaefer AJ, Bailey MD, Shechter SM, Roberts MS (2005) Chapter 23: Modeling medical treatment using Markov decision processes. In: Operations Research and Health Care: A handbook of methods and applications, Springer US, volume 70 of International Series in Operations Research and Management Science. pp. 593–612Google Scholar
- 27.Alagoz O, Hsu H, Schaefer AJ, Roberts MS (2010) Markov decision processes: a tool for sequential decision making under uncertainty. Med Decis Making 30(4):474–483CrossRefGoogle Scholar
- 28.Ahn JH, Hornberger JC (1996) Involving patients in the cadaveric kidney transplant allocation process: A decision-theoretic perspective. Manag Sci 42(5):629–641CrossRefGoogle Scholar
- 29.Magni P, Quaglini S, Marchetti M, Barosi G (2000) Deciding when to intervene: a Markov decision process approach. Int J Med Inform 60(3):237–253CrossRefGoogle Scholar
- 30.Hauskrecht M, Fraser H (2000) Planning treatment of ischemic heart disease with partially observable Markov decision processes. Artif Intell Med 18(3):221–244CrossRefGoogle Scholar
- 31.Alagoz O, Maillart LM, Schaefer AJ, Roberts MS (2004) The optimal timing of living–donor liver transplantation. Manag Sci 50(10):1420–1430CrossRefGoogle Scholar
- 32.Alagoz O, Maillart LM, Schaefer AJ, Roberts MS (2007) Choosing among cadaveric and living–donor livers. Manag Sci 53(11):1702–1715CrossRefGoogle Scholar
- 33.Shechter SM, Bailey MD, Schaefer AJ, Roberts MS (2008) The optimal time to initiate HIV therapy under ordered health states. Oper Res 56(1):20–33CrossRefGoogle Scholar
- 34.Shechter SM, Bailey MD, Schaefer AJ, Roberts MS (2008) A modeling framework for replacing medical therapies. IIE Trans 40(9):861–869CrossRefGoogle Scholar
- 35.Kırkızlar E, Faissol DM, Griffin PM, Swann JL (2010) Timing of testing and treatment for asymptomatic diseases. Math Biosci 226(1):28–37CrossRefGoogle Scholar
- 36.Kurt M, Denton B, Schaefer AJ, Shah N, Smith S (2011) The structure of optimal statin initiation policies for patients with type 2 diabetes. IIE Trans 1(1):49–65Google Scholar
- 37.Mason JE, Denton BT, Shah ND, Smith SA (2014) Optimizing the simultaneous management of blood pressure and cholesterol for type 2 diabetes patients. Eur J Oper Res 233(3):727–738CrossRefGoogle Scholar
- 38.Zhang J, Denton BT, Balasubramanian H, Shah ND, Inman BA (2012) Optimization of prostate biopsy referral decisions. Manuf Serv Op 14(4):529–547CrossRefGoogle Scholar
- 39.Ayer T, Alagoz O, Stout NK, Burnside ES (2014) Designing a new breast cancer screening program considering adherence. Manag. Sci. ForthcomingGoogle Scholar
- 40.Erenay FS, Alagoz O, Said A (2014) Optimizing colonoscopy screening for colorectal cancer prevention and surveillance. Manuf Serv Op 16(3):381–400CrossRefGoogle Scholar
- 41. Patrick J, Puterman ML, Queyranne M (2008) Dynamic multipriority patient scheduling for a diagnostic resource. Oper Res 56(6):1507–1525
- 42. Gocgun Y, Bresnahan BW, Ghate A, Gunn ML (2011) A Markov decision process approach to multi-category patient scheduling in a diagnostic facility. Artif Intell Med 53:73–81
- 43. Patrick J (2012) A Markov decision model for determining optimal outpatient scheduling. Health Care Manag Sci 15:91–102
- 44. Gocgun Y, Puterman ML (2014) Dynamic scheduling with due dates and time windows: an application to chemotherapy patient appointment booking. Health Care Manag Sci 17:60–76
- 45. Gupta D, Wang L (2008) Revenue management for a primary-care clinic in the presence of patient choice. Oper Res 56(3):576–592
- 46. Wang J, Fung RYK (2015) Adaptive dynamic programming algorithms for sequential appointment scheduling with patient preferences. Artif Intell Med 63:33–40
- 47. Kornish LJ, Keeney RL (2008) Repeated commit-or-defer decisions with a deadline: the influenza vaccine composition. Oper Res 56(3):527–541
- 48. Özaltın OY, Prokopyev OA, Schaefer AJ, Roberts MS (2011) Optimizing the societal benefits of the annual influenza vaccine: a stochastic programming approach. Oper Res 59(5):1131–1143
- 49. Weinstein M, Zeckhauser R (1973) Critical ratios and efficient allocation. J Public Econ 2:147–157
- 50. Culyer AJ (1989) The normative economics of health care finance and provision. Oxf Rev Econ Policy 5:34–58
- 51. Stinnett AA, Paltiel AD (1996) Mathematical programming for the efficient allocation of health care resources. J Health Econ 15:641–653
- 52. Williams I, McIver S, Moore D, Bryan S (2008) The use of economic evaluations in NHS decision-making: a review and empirical investigation. Health Technol Assess 12(7):1–175
- 53. Raiffa H, Schlaifer R (1961) Applied Statistical Decision Theory. Harvard University Press, Cambridge
- 54. Weinstein MC (1983) Cost-effective priorities for cancer prevention. Science 221(4605):17–23
- 55. Hornberger JC, Brown BW, Halpern J (1995) Designing a cost-effective clinical trial. Stat Med 14(20):2249–2259
- 56. Claxton K, Posnett J (1996) An economic approach to clinical trial design and research priority-setting. Health Econ 5(6):513–524
- 57. Claxton K (1999) The irrelevance of inference: a decision-making approach to the stochastic evaluation of health care technologies. J Health Econ 18(3):341–364
- 58. Eckermann S, Willan AR (2008) Time and expected value of sample information wait for no patient. Value Health 11(3):522–526
- 59. McKenna C, Claxton K (2011) Addressing adoption and research design decisions simultaneously: the role of value of sample information analysis. Med Decis Making 31(6):853–865
- 60. Hall PS, Edlin R, Kharroubi S, Gregory W, McCabe C (2012) Expected net present value of sample information from burden to investment. Med Decis Making 32(3):E11–E21
- 61. Fenwick E, Claxton K, Sculpher M (2008) The value of implementation and the value of information: combined and uneven development. Med Decis Making 28(1):21–32
- 62. Willan AR, Eckermann S (2010) Optimal clinical trial design using value of information methods with imperfect implementation. Health Econ 19(5):549–561
- 63. Smith JE (1993) Moment methods for decision analysis. Manag Sci 39(3):340–358
- 64. Gospodinov N, Lkhagvasuren D (2014) A moment-matching method for approximating vector autoregressive processes by finite-state Markov chains. J Appl Econom 29(5):843–859
- 65. Weber TA (2017) Optimal switching between cash-flow streams. Math Method Oper Res. Forthcoming. https://doi.org/10.1007/s00186-017-0586-0
- 66. Blackwell D (1951) Comparison of experiments. In: Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability. University of California Press. pp 93–102
- 67. Kim W (2002) The burden of hepatitis C in the United States. Hepatology 36(Suppl 1):S30–S34
- 68. Ghany MG, Strader DB, Thomas DL, Seeff LB (2009) Diagnosis, management, and treatment of hepatitis C: an update. Hepatology 49(4):1335–1374
- 69. Armstrong GL, Wasley A, Simard EP, McQuillan GM, Kuhnert WL, Alter MJ (2006) The prevalence of hepatitis C virus infection in the United States, 1999 through 2002. Ann Intern Med 144(10):705–714
- 70. Chak E, Talal AH, Sherman KE, Schiff ER, Saab S (2011) Hepatitis C virus infection in USA: an estimate of true prevalence. Liver Int 31(8):1090–1101
- 71. Denniston MM, Klevens RM, McQuillan GM, Jiles RB (2012) Awareness of infection, knowledge of hepatitis C, and medical follow-up among individuals testing positive for hepatitis C: National Health and Nutrition Examination Survey 2001–2008. Hepatology 55(6):1652–1661
- 72. Armstrong GL (2007) Injection drug users in the United States, 1979–2002: an aging population. Arch Intern Med 167(2):166–173
- 73. Armstrong GL, Alter MJ, McQuillan GM, Margolis HS (2000) The past incidence of hepatitis C virus infection: implications for the future burden of chronic liver disease in the United States. Hepatology 31(3):777–782
- 74. Joy JB, McCloskey RM, Nguyen T, Liang RH, Khudyakov Y, Olmstead A, Krajden M, Ward JW, Harrigan PR, Montaner JS, Poon AF (2016) The spread of hepatitis C virus genotype 1a in North America: a retrospective phylogenetic study. Lancet Infect Dis 16(6):698–702
- 75. Barker L. Personal communication, August 2, 2016. Based on analyses using the Centers for Disease Control and Prevention, National Health and Nutrition Examination Survey (NHANES) (2005–2012)
- 76. Klevens RM, Hu DJ, Jiles R, Holmberg SD (2012) Evolving epidemiology of hepatitis C virus in the United States. Clin Infect Dis Suppl 55(1):S3–9
- 77. American Association for the Study of Liver Diseases and the Infectious Diseases Society of America (AASLD-IDSA). HCV testing and linkage to care. Recommendations for testing, managing, and treating hepatitis C. Available at: http://www.hcvguidelines.org/full-report/hcv-testing-and-linkage-care. Accessed: February 23, 2017
- 78. R Development Core Team (2012) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org/
- 79. US Census Bureau (2010) QT-P1: Age groups and sex: 2010. Available at: http://www.census.gov/2010census/
- 80. Mehrotra A, Zaslavsky AM, Ayanian JZ (2007) Preventive health examinations and preventive gynecological examinations in the United States. Arch Intern Med 167(17):1876
- 81. Gretch DR (1997) Diagnostic tests for hepatitis C. Hepatology 26(Suppl 3):43S–47S
- 82. Hyland C, Kearns S, Young I, Battistutta D, Morgan C (1992) Predictive markers for hepatitis C antibody ELISA specificity in Australian blood donors. Transfusion Med 2(3):207–213
- 83. Center for Medicare and Medicaid Services (CMS) (2010) Medicare fee schedule. U.S. Department of Health and Human Services. http://www.cms.gov/home/medicare.asp
- 84. Weinstein MC, Skinner JA (2010) Comparative effectiveness and health care spending – implications for reform. New Engl J Med 362(5):460–465
- 85. Thein HH, Yi Q, Dore GJ, Krahn MD (2008) Estimation of stage-specific fibrosis progression rates in chronic hepatitis C virus infection: A meta-analysis and meta-regression. Hepatology 48(2):418–431
- 86. Centers for Disease Control and Prevention (CDC) (2006) National Health and Nutrition Examination Survey (NHANES): analytic and reporting guidelines. http://www.cdc.gov/nchs/nhanes/nhanes2003-2004/analytical_guidelines.htm. Accessed: August 27, 2012. Last updated: September 2006
- 87. Centers for Disease Control and Prevention (CDC) (2011) Analytic note regarding 2007–2010 survey design changes and combining data across other survey cycles. http://www.cdc.gov/nchs/nhanes/nhanes2003-2004/analytical_guidelines.htm. Accessed: August 27, 2012. Last updated: September 2011
- 88. Centers for Disease Control and Prevention (CDC) (2012) National Health and Nutrition Examination Survey (NHANES) (1999–2010). http://www.cdc.gov/nchs/nhanes.htm. Accessed: August 27, 2012
- 89. Milgrom PR (1981) Good news and bad news: Representation theorems and applications. Bell J Econ 12(12):380–391
- 90. US Bureau of Labor Statistics (2011) Consumer Price Index (CPI): 1913–present. Division of consumer prices and price indexes. Available at: http://www.bls.gov/cpi/

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.