Population-level intervention and information collection in dynamic healthcare policy

Cipriano, Lauren E.; Weber, Thomas A.

doi:10.1007/s10729-017-9415-5

Population-level intervention and information collection in dynamic healthcare policy

Open access
Published: 08 September 2017

Volume 21, pages 604–631, (2018)
Cite this article

Download PDF

You have full access to this open access article

Health Care Management Science Aims and scope Submit manuscript

Population-level intervention and information collection in dynamic healthcare policy

Download PDF

3215 Accesses
6 Citations
Explore all metrics

Abstract

We develop a general framework for optimal health policy design in a dynamic setting. We consider a hypothetical medical intervention for a cohort of patients where one parameter varies across cohorts with imperfectly observable linear dynamics. We seek to identify the optimal time to change the current health intervention policy and the optimal time to collect decision-relevant information. We formulate this problem as a discrete-time, infinite-horizon Markov decision process and we establish structural properties in terms of first and second-order monotonicity. We demonstrate that it is generally optimal to delay information acquisition until an effect on decisions is sufficiently likely. We apply this framework to the evaluation of hepatitis C virus (HCV) screening in the general population determining which birth cohorts to screen for HCV and when to collect information about HCV prevalence.

A Multi-Fidelity Rollout Algorithm for Dynamic Resource Allocation in Population Disease Management

Article 07 September 2018

Optimizing patient treatment decisions in an era of rapid technological advances: the case of hepatitis C treatment

Article 19 July 2015

Markov modeling in hepatitis B screening and linkage to care

Article Open access 18 May 2017

Discover the latest articles, news and stories from top researchers in related subjects.

Medical Ethics

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

There is currently no guidance for determining the optimal schedule for collecting additional information regarding a decision to invest in a health program or technology [1, 2]. Current practice in the health decision science literature assumes that model parameters are fixed across cohorts and the value of additional information is calculated assuming the information-collection effort is initiated immediately [3,4,5]. However, in many cases the cost-effectiveness of a health program or technology – and, therefore, the value of additional information about one or more model parameters – may be changing over time because of trends affecting the cohort or the intervention [6]. In these cases, collecting additional information immediately may not be optimal and value-of-information calculations based on static parameter assumptions are likely to be biased. Planning over longer horizons is particularly important in health policy because, once established, clinical practice is difficult to change due to high switching costs (re-training and potentially new capital equipment expenditures), particularly if it appears that the level of service is being reduced [7].

In this paper we apply a stochastic dynamic programming approach to identify both the optimal time to change the current health intervention policy and the optimal time to collect decision-relevant information. We consider a hypothetical medical intervention for a cohort of patients. At each time, a new cohort of patients becomes eligible for the intervention and one parameter varies across the cohorts with imperfectly observable linear dynamics. We assume that the value of the intervention is linear in the dynamic parameter. In general, the (incremental) net monetary benefit of an intervention is linear in parameters with a one-time effect (e.g., the prevalence of a disease at one point in time or the outcome of a one-time screening test). When an effect accrues over time, such as for a reduction in the annual transition rate of a disease complication or death, linearity is often used as an approximation (see, e.g., [8]). At each time, the policy-maker can choose to invest in the medical intervention and/or to purchase sample information about the uncertain dynamic parameter. We demonstrate that information acquisition is best delayed until the signal is sufficiently likely to affect the optimal policy decision.

We apply this framework to the evaluation of hepatitis C virus (HCV) screening. Prior to the development of highly-effective treatments, HCV screening in the general population was not considered cost-effective [9] and universal screening was not recommended [10]. The advent of more effective therapy has changed the value of identifying infected individuals early to initiate treatment [11,12,13,14,15]. Recently released guidance by the Centers for Disease Control and Prevention (CDC) and the US Preventive Services Task Force (USPSTF) recommends one-time HCV screening for all individuals born between 1945 and 1965 [16, 17] although screening individuals born after 1965 may also be cost effective [13,14,15]. Based on our primary analysis of the National Health and Nutrition Examination Survey (NHANES), in the US general population, HCV prevalence is highest in people born around 1956 and declines thereafter at a rate of approximately 11% per birth year. Since HCV prevalence is decreasing across birth cohorts, HCV screening will only be cost-effective for a limited time or for a limited set of birth cohorts. We apply our model to simultaneously evaluate the optimal HCV-screening and information-acquisition policy.

Specifically, we apply our model to the policy decision of whether or not to perform one-time HCV screening in successive cohorts of healthy 50-year olds, who have not previously been tested for HCV, at a routine preventive health visit. Applying a traditional health economics framework, the policy-maker could decide today how many cohorts will be screened (e.g., each cohort of 50-year olds until those born in 1965 turn 50) or, to inform this decision, the policy-maker may seek additional information to be collected immediately. Our framework differs from the traditional paradigm in that each year the policy-maker makes a decision about whether to continue the one-time HCV screening program (whether or not to screen the new cohort of healthy 50-year olds) and whether to collect information about disease prevalence in this current cohort. If information is never collected, the optimal policy does not differ across frameworks. However, in our framework, the immediate decision is not limited to the decision of when to change policies, but it also includes when to collect information to inform a future change of policy. For example, the (immediately) optimal policy might be to screen each cohort of 50-year olds for the next 6 years and then collect information about HCV prevalence to inform future decision making. Delaying information acquisition until a time that the information is sufficiently likely to affect the decision increases the value of the information. In addition, from a practical perspective, collecting information years before it is likely to influence a policy change wastes immediate resources and, should something occur in the lag-time between the information-acquisition effort and the policy change, implementing the pre-determined policy change may not be optimal.

1.1 Related literature and contribution

The relevant literature spans technology adoption, dynamic decisions in healthcare, and the value of information in healthcare.

Technology adoption

In technology-adoption models, a decision-maker considers the adoption of a technology of unknown profitability. Jensen [18] introduced a model in which information about a new technology is costlessly observed and the decision-maker can decide to adopt the new technology at any point in time. McCardle [19] presented a model in which collecting information is associated with a fixed cost; in each period the decision-maker can defer and collect information, or make a final decision to accept or reject the new technology. The optimal policy in each period is characterized by two thresholds: if the expected benefit is above the upper threshold, it is optimal to adopt the technology; if the expected benefit is below the lower threshold, it is optimal to reject the technology; and, if the expected benefit is between these two thresholds the optimal strategy is to gather information. Uncertainty about the technology’s value decreases over time and the two thresholds converge to the cost of adoption. Smith and McCardle [20] provided several meta-results, some of which we use, describing how properties of the value function of a stochastic dynamic program are preserved and propagated through finite-horizon Markov-reward and decision processes. Ulu and Smith [21] extended this work by relaxing the assumption that the decision-maker’s value of the technology can be summarized by the expected benefit, and they use more general monotone-comparative-statics techniques in terms of likelihood orders to generalize the class of signals that are observed prior to making an adoption decision.

Another line of research considered technologies, like ours, with uncertain and changing value. Rosenberg [22] found that expectation of technological improvement may delay a firm’s irreversible technology investments. Bessen [23] calculated the option value of delay for such a problem. Kornish [24] considered the choice between two uncertain technologies where each is subject to a positive network effect and explored the impact of the network effect on the optimal adoption policy. Chambers and Kouvelis [25] formulated a technology-adoption problem incorporating expected learning-curve effects.

Stochastic dynamic programs in healthcare

Sequential decisions under uncertainty are common in healthcare [26, 27]. Most healthcare applications of stochastic dynamic programs have focused on optimizing the timing of interventions for an individual patient: the decision to accept or reject an offered kidney for transplantation [28]; the optimal treatment plan for mild spheroctosis [29]; the optimal surveillance and management of ischemic heart disease [30]; the optimal time to perform a living-donor liver transplant [31, 32]; the optimal time to initiate treatment for HIV [33, 34]; the optimal timing and frequency of HCV testing from the patient perspective [35]; the optimal use of statins in patients with type 2 diabetes [36, 37]; the optimal prostate biopsy referral [38]; and, optimal cancer screening programs [39, 40]. Dynamic programming has also been applied to complex appointment scheduling problems in healthcare, including problems with patients of different clinical types/priority [41, 42]; incorporating patient no-shows [43]; and problems of sequential appointment scheduling with the objective of closely adhering to a prescribed schedule (e.g., sequential chemotherapy appointments [44]) or with the objective of satisfying patient preferences [45, 46]. Fewer examples of application to population-level policy exist. Kornish and Keeney [47] and Özaltın et al. [48] formulated the influenza-strain selection problem in a finite-horizon optimal-stopping framework. Similar to our problem, the influenza-vaccine composition decision is also an optimal-stopping problem with information acquisition; however, it has many unique characteristics that distinguish it from the problem discussed here such as an inventory deadline (finite horizon), a product useful for one season only, and a time-consuming production process. Similar to many of the technology-adoption models discussed above but unlike our framework, in the influenza-vaccine composition models, information is collected in every period in which a final decision has not yet been made.

Health economics and value of information in healthcare

Cost-effectiveness analysis is an economic method for comparing the lifetime discounted costs and health benefits associated with two or more medical interventions or health programs [1, 2]. In theory, the optimal allocation of resources across a portfolio of health interventions is determined by solving a constrained optimization problem with the objective of maximizing health benefits subject to a budget constraint [49,50,51]. In reality, regional and national health policy bodies routinely compare the incremental cost effectiveness ratio of candidate interventions to a pre-determined threshold intended to approximate the shadow price of the budget to determine if the intervention is ‘cost-effective’ as one component of their policy-making process [52]. Cost-effectiveness analysis is widely used to evaluate general population screening for relatively rare conditions because these programs impose a small cost on everyone who is screened and provide substantive healthcare gains for only a small number of individuals who are identified (or identified earlier than they would be otherwise); calculating the population-level costs and benefits can require detailed natural history models, extensive model calibration and validation, and thorough analysis.

Bayesian decision theory approaches to value-of-information assessment were first introduced by Raiffa and Schlaifer [53]. Weinstein [54] proposed the widespread adoption of value-of-information analysis to research priority setting in health policy and medicine. Hornberger et al. [55], Claxton and Posnett [56], and Claxton [57] introduced a Bayesian approach to identifying the optimal trial sample size and to assessing the value of additional information for technology-adoption assessments. Several approaches to increasing the accuracy of value-of-information calculations continued to relax assumptions implicit in the original formulation (see examples in [58,59,60,61,62]). One common assumption in these studies is that the currently estimated per-person value of information can be applied to individuals in all future cohorts. Recognizing some of the implications of this assumption, Philips et al. [6] discussed the impact that intervention-horizon uncertainty, price changes, and technological development can have on the per-person value of information for future cohorts. They find that delaying information collection may be desirable but do not provide a framework for determining the optimal time to collect information.

Contribution

In this paper, we extend the technology-adoption literature by allowing for a technology that is changing in value over time, for the opportunity to ‘wait’ without collecting information, and for the possibility of optimally determining the collected amount of information in each period. We also incorporate the possibility of an imperfect information-collection technology. We broaden the scope of applications of stochastic dynamic programs in the area of healthcare in an important way – focusing on population policy rather than patient-level decisions. We extend the health decision science literature on value-of-information assessment by developing an approach to identify the optimal information-acquisition policy when model parameters are varying across cohorts. Finally, as an example, we apply our framework to the timely public policy problem of developing a population screening program for HCV. We find that considering the opportunity to collect information in the future leads to a substantially different policy recommendation than current guidelines because it explicitly considers and addresses the parameter uncertainty which is changing over time.

2 The model

A policy-maker faces recurring decisions for cohorts arriving at times t ∈{0,1,2,…} about whether to invest in a health intervention delivered once per cohort (of size N). By cohort we mean a group of individuals with a certain medical presentation (i.e., individuals with a new diagnosis of cancer) or of a certain status (i.e., individuals who turned 50 this year). The policy-maker’s objective is to maximize net monetary benefit from a societal perspective. The per-person incremental net monetary benefit (INMB) of performing the intervention compared to the status quo is assumed to be affine in an uncertain parameter $\tilde {p}_{t}$ that varies across the cohorts, with realizations in [0,1] and known dynamics. So $\text {INMB}_{t} = \theta \tilde {p}_{t} - \gamma $, for all t ≥ 0, where 𝜃 is the marginal INMB (with respect to the parameter $\tilde {p}_{t}$) and − γ is the fixed INMB, both measured on a per-person basis.

At the beginning of period t, the policy-maker simultaneously decides whether to invest in a medical intervention for the individuals in cohort t and whether to conduct a study of sample size n _t over the period to obtain a better estimate of the uncertain parameter $\tilde {p}_{t}$. Information, if sought, arrives at the end of the current period and is used, together with the known dynamics of $\tilde {p}_{t}$, to inform the intervention decision for future cohorts. Let $d_{t}\in \mathcal {D} = \{0, 1\}$ denote the intervention decision at time t, where d _t = 0 indicates ‘No intervention’ and d _t = 1 indicates ‘Intervention.’ The amount of information collected is measured in terms of the sample size $n_{t} \in {\mathcal N}=\{0,\ldots ,N\}$; it is obtained at the cost K(n _t), where K(⋅) is an increasing function including a fixed and a variable cost when n _t > 0 and K(0) = 0. Thus, at each time t the policy-maker implements the control $u_{t}=(d_{t}, n_{t})\in \mathcal {D} \times \mathcal {N}$. The per-person current reward for the cohort in period t is

$$ g(\tilde{p}_{t}, u_{t}) = d_{t} (\theta \tilde{p}_{t} - \gamma) - \frac{K(n_{t})}{N}. $$

(1)

The application in Section 4 features the decision problem of when to stop a once-in-a-lifetime disease-screening program where $\tilde {p}_{t}$ is the uncertain disease prevalence in the t-th cohort which, in expectation, is geometrically decreasing over time; 𝜃 > 0 denotes the marginal benefit of early diagnosis and treatment for an affected individual, γ > 0 is the per-person cost of the program, and the current-period INMB g is increasing in $\tilde {p}_{t}$. Beyond our leading example, the framework can accommodate a wide variety of problems. As formulated, the uncertain parameter needs to lie in a compact interval (which can be mapped via bijection to [0,1]). Thus, the parameter can represent not only a probability but also other model parameters, such as a quality-of-life weight or cost. Additionally, our analysis assumes that the parameter value is decreasing over time. To model a situation where the expectation of the uncertain parameter is increasing (e.g., obesity prevalence), the problem can be formulated as one in which a parameter of opposite definition is decreasing (e.g., prevalence of individuals who are not obese). Our exposition involves an example of when to stop a health intervention. However, the framework can also be used in situations in which the decision-maker wishes to identify the optimal time to initiate a new intervention (e.g., when to adopt a new surgical technique). More broadly, our framework can be applied in settings in which the decision-maker wishes to identify the optimal time to stop the current intervention or initiate a new intervention; the uncertain parameter is geometrically increasing or decreasing across intervention cohorts; and the current-period reward function is linearly increasing or decreasing in the uncertain parameter. Examples are shown in Table 1.

Table 1 Examples of alternative cases in which our framework applies

Full size table

2.1 The information-acquisition problem

The policy-maker’s prior belief about $\tilde {p}_{t}$ at t = 0 is beta-distributed with distribution parameters x ₀ = (a ₀,b ₀). The posterior distribution when a beta-density is updated in a Bayesian manner with information collected using an imperfect information-collection technology is a mixture of beta-densities. Thus, in general, the policy-maker’s prior beliefs about $\tilde {p}_{t}$ at time t are in $\mathcal {P}$ where $\mathcal {P}$ denotes the set of measures which are a mixture of beta-densities. Specifically if $\tilde {p}_{t} \in \mathcal {P}$, then there exists parameters $x_{t,i} = (a_{t,i}, b_{t,i}) \in \mathbb {R}_{++}^{2}$ for all i where $ 1 \leqslant i \leqslant m, m \in \mathbb {R}_{++}$, and a set of non-negative weights ω _i such that ${\sum }_{i=1}^{m} \omega _{i} = 1$, where the distribution of $\tilde {p}_{t}$ is a mixture of beta-densities of the form ${\sum }_{i=1}^{m} \omega _{i} \text { beta}(a_{t,i}, b_{t,i})$.

The policy-maker has the option to update his beliefs about the parameter $\tilde {p}_{t}$ by testing n _t individuals at cost K(n _t). The information-collection technology has binary test characteristics q = (q ₁,q ₂), where q ₁ is the sensitivity, q ₂ is the specificity, and q ₁ + q ₂ > 1 (indicating the test is properly labeled). The terms ‘sensitivity’ and ‘specificity’ are often used to describe test accuracy in the medical literature. For clarity, we state their relationship to Type I and Type II error: ‘Specificity’ = 1 −‘Type I error’ = 1 −‘False positive rate’ and ‘Sensitivity’ = 1 −‘Type II error’ = 1 −‘False negative rate’. The number of positive samples is an uncertainty $\tilde {v_{t}}$ with realization v _t ∈{0,…,n _t}. Based on the collected information the policy-maker updates his beliefs about $\tilde {p}_{t}$ in a Bayesian manner.

Proposition 1

If the policy-maker’s prior belief f _p(⋅)is a mixture of beta-densities, i.e., $f_{p} \in \mathcal {P}$ , then for any number of positive observations $\tilde {v}_{t} = v_{t}$ from n _t samples, the Bayesian posterior belief f _p|v(⋅|v _t) is also a mixture of beta-densities, i.e., f _p|v is in $\mathcal {P}$ .

Proof

See Appendix A.1. □

If $\tilde {p}_{t}$ is a mixture of m ≥ 1 beta-densities and if the information-collection technology is imperfect (i.e., $\min \{q_{1},q_{2}\}<1$), then the true posterior distribution is also a mixture of beta-densities, containing between m + n _t and m × (n _t + 1) unique beta-distributions (see Appendix A.2.1). The resulting probability density function (pdf) is

$$\begin{array}{@{}rcl@{}} f_{p|v}\!(p | x_{t},\! n_{t},\!v_{t},\! q)\!\!&=&\!\! \sum\limits_{j=0}^{v_{t}} \sum\limits_{k=0}^{n_{t} -v_{t}} \sum\limits_{i=1}^{m} \omega^{\prime}_{j, k, i}\\ &&\times\text{beta}(a_{t,i}\!\,+\,j\,+\,k,\! b_{t,i}\,+\,n_{t}\!\,-\,j\,-\,k ), \end{array} $$

(2)

with updated weights

$$\omega^{\prime}_{j, k, i} = \frac{ \frac{ \omega_{i} {q_{1}^{j}} (1-q_{2})^{v_{t}-j} (1-q_{1})^{k} q_{2}^{n_{t}-v_{t}-k} {\Gamma}(a_{t,i}+b_{t,i}) {\Gamma}(a_{t,i} +j+k) {\Gamma}(b_{t,i} +n_{t}-j -k)} { {\Gamma}(j+1) {\Gamma}(v_{t}-j+1) {\Gamma}(k+1) {\Gamma}(n_{t}-v_{t}-k+1) {\Gamma}(a_{t,i}){\Gamma}(b_{t,i}) {\Gamma}(a_{t,i} + b_{t,i} +n_{t})}} { \sum\limits_{r=0}^{v_{t}} \sum\limits_{s=0}^{n_{t} - v_{t}} \sum\limits_{i=1}^{m} \frac{ \omega_{i} {q_{1}^{r}} (1-q_{2})^{v_{t}-r} (1-q_{1})^{s} q_{2}^{n_{t} - v_{t} - s} {\Gamma}(a_{t,i} + b_{t,i}) {\Gamma}(a_{t,i} +r+s) {\Gamma}(b_{t,i} +n_{t}-r -s)}{\Gamma(r+1) {\Gamma}(v_{t}-r+1) {\Gamma}(s+1) {\Gamma}(n_{t}-v_{t}-s+1){\Gamma}(a_{t,i}) {\Gamma}(b_{t,i}) {\Gamma}(a_{t,i} + b_{t,i} +n_{t})}}. $$

Explicit expressions for the conditional mean and variance, μ _p|v and σ p|v2, are provided in Appendix A.2.2.

Remark 1

If $\tilde {p}_{t}$ follows a mixture of m ≥ 1 beta-densities and the information-collection technology is perfect (i.e., q ₁ = q ₂ = 1), then the distribution of sample information, $\tilde {v}_{t}$, is a mixture of m beta-binomial distributions with the same weights ω _i. Updating results in a posterior distribution that is a mixture of m beta-densities with pdf

$$\begin{array}{@{}rcl@{}} f_{p|v}(p | x_{t},\! n_{t},\! v_{t},\!q \!\!&=&\!\!(1,1))\\ \!\!&=&\!\! \sum\limits_{i=1}^{m} \omega{\prime}_{i} \text{beta}(a_{t,i}\,+\,v_{t} , b_{t,i}\,+\,n_{t}\,-\,v_{t} ), \end{array} $$

(3)

with updated weights

$$\omega^{\prime}_{i} = \frac{\omega_{i} \frac{\Gamma(a_{t,i}+b_{t,i}){\Gamma}(a_{t,i} +v_{t}){\Gamma}(b_{t,i}+n_{t}-v_{t})}{\Gamma(a_{t,i}){\Gamma}(b_{t,i}){\Gamma}(a_{t,i}+b_{t,i}+n_{t})}}{\sum\limits_{j=1}^{m} \omega_{j} \frac{\Gamma(a_{t,j}+b_{t,j}){\Gamma}(a_{t,j} +v_{t}){\Gamma}(b_{t,j}+n_{t}-v_{t})}{\Gamma(a_{t,j}){\Gamma}(b_{t,j}){\Gamma}(a_{t,j}+b_{t,j}+n_{t})}}, $$

for all i ∈{1,…,m}.

2.2 Approximate Bayesian inference

For practically relevant sample sizes n _t and an imperfect information-collection technology, the number of beta-densities in the posterior distribution can become very large, thus requiring approximation. The need for distributional approximations in decision models has been recognized by Smith who proposed moment matching to replace continuous distributions by appropriate discrete ones [63]. More recently, moment-matching methods have also been used in a Markovian setting, to approximate vector-autoregressions [64]. In our Markov dynamic programming setting, we apply moment matching to approximate the exact posterior distribution which is a mixture of beta-densities with a single beta-distribution. This greatly simplifies the belief propagation compared to dealing with mixtures of beta-densities which feature an increasingly large number of coefficients with each information-collection effort and ultimately an infinite-dimensional state space.

Thus, instead of carrying forward full distribution information about the posterior mixture of beta-densities caused by an imperfect information-collection technology, the policy-maker’s posterior belief about $\tilde {p}_{t}$ is approximated by a single beta-distribution with the same mean and variance as the exact posterior distribution. The policy-maker’s prior belief is represented by the distribution parameters x _t = (a _t,b _t) and the posterior belief incorporating any information collected at time t is represented by the updated parameters $\hat {x}_{t} = (\hat {a}_{t}, \hat {b}_{t})$. Using the mean and variance of the exact posterior distribution, μ _p|v and σ p|v2, the approximate posterior belief parameters are determined using the one-to-one relationship between the standard parameters of the beta-distribution and its mean and variance^{Footnote 1}. We let ψ(x _t,n _t,v _t,q) denote the function that generates the approximating parameters, with

$$\begin{array}{@{}rcl@{}} \hat{x}_{t} &=& \left[ \begin{array}{c} \hat{a}_{t} \\ \hat{b}_{t} \end{array} \right] \,=\, \psi(x_{t}, n_{t}, v_{t}, q)\\ &=& \left. \left(\frac{\mu_{p|v}(1\,-\,\mu_{p|v})}{\sigma^{2}_{p|v}} \,-\, 1 \right)\left[ \begin{array}{c} \mu_{p|v} \\ 1\,-\, \mu_{p|v} \end{array} \right]\right|_{(x_{t},n_{t},v_{t},q)}. \end{array} $$

In the case of a perfect information-collection technology, the preceding relations describe the policy-maker’s posterior beliefs exactly.

Mixtures of beta-distributions can be fitted to any continuous distribution on [0,1]. Thus, a single beta-distribution with the same mean and variance as a distribution formed by the mixture of beta-densities, will not always provide a satisfactory approximation. However, we focus on the special case where the time-t belief $\tilde {p}_{t}$ has been obtained via Bayesian updating from a single beta-prior. In this special case, approximating the mixture of beta-densities with a single beta-distribution with the same mean and variance maintains unimodality^{Footnote 2} and stationarity of the state space over time.

We assessed the approximation quality using simulation in the policy-relevant region for our application (Appendix A.3). We found that the maximum distance between the cumulative density function of the exact posterior distributions and that of the approximation with matching mean and variance were generally small (< 2%), but became large when the mean was approaching zero and the standard deviation was relatively large. The quality of the approximation was very good (< 0.5%) when the mean was greater than 2%. We deemed the approximation to be of sufficiently high quality for our numerical analysis because our initial conditions and predicted trajectory without information acquisition rely on the regions in which the approximation is good. Also, because of relatively high fixed costs associated with information acquisition, optimal sample sizes in our numerical analysis tended to be sufficiently large that information would likely only be collected once which reduces concerns about compounding the approximation error over successive information-collection efforts.

2.3 System dynamics

The belief state x _t, containing the parameters of the distribution of $\tilde {p}_{t}$, represents the policy-maker’s current beliefs about the uncertain parameter and follows a law of motion of the form

$$ x_{t+1} = \phi(\hat{x}_{t}) = \left[ \begin{array}{ll} z & 0\\ 1-z & 1 \end{array} \right] \hat{x}_{t} , $$

(4)

where z ∈ (0,1) is the decay rate. These dynamics imply a geometrically decreasing expected value, increasing coefficient of variation, and decreasing variance for $\mu (x_{0}) \leqslant \frac {1}{1+z}$ (Fig. 1). In the mean-variance space, the equivalent state dynamics become

$$\begin{array}{@{}rcl@{}} \mu(x_{t+1})&=& z \mu(\hat{x}_{t}) \ \ \ \text{and} \ \ \ \sigma^{2}(x_{t+1})\\ &=& \sigma^{2}(\hat{x}_{t}) \left(z + z (1-z) \left(\frac{\mu(\hat{x}_{t})}{1-\mu(\hat{x}_{t})} \right)\right). \end{array} $$

Derivations of these equations are presented in Appendix A.4. The features of these dynamics can represent a wide variety of settings in which the expectation of a parameter is geometrically decreasing over time (e.g., a health condition that is decreasing in prevalence over time; see Section 4). To model a situation where the expectation (and variance) of the uncertain parameter is increasing (e.g., obesity prevalence), the problem can be re-formulated as one in which a parameter of opposite definition is decreasing (e.g., prevalence of individuals who are not obese).

2.4 The policy-maker’s problem

Given a social discount factor δ ∈ (0,1), the policy-maker’s objective is to maximize the net present value of the stream of expected INMBs, given the initial belief x ₀ = (a ₀,b ₀) and admissible policy decisions $U\in {\mathcal {U}} = \left \{(u_{t})_{t\in {\mathbb N}} : u_{t}=(d_{t},n_{t})\in {\mathcal D}\times {\mathcal N} \right \}$. To achieve the objective, the policy-maker seeks to find the best of all possible policies π _t(⋅), t ≥ 0, with u _t = π _t(x _t) for all $x_{t}\in \mathbb {R}_{++}^{2}$, which at each time t maps the state space to admissible current-period actions u _t, so that the implemented path of actions U = (u ₀,u ₁,…) lies in the control-constraint set $\mathcal {U}$. The number of positive observations in the testing sample of n _t is a random variable $\tilde {v_{t}}(n_{t})$ with realization v _t ∈{0,…,n _t}. Based on the collected information the policy-maker updates his beliefs about $\tilde {p}_{t}$ in an (approximate) Bayesian manner using the function ψ(x _t,n _t,v _t,q). Because of the decreasing trend of the uncertain parameter (z < 1), it is never optimal to restart an optimally stopped program.^{Footnote 3} We consider stationary policies $\pi :\mathbb {R}_{++}^{2}\to {\mathcal D}\times {\mathcal N}$ to solve the optimal control problem

$$\begin{array}{@{}rcl@{}} && \max_{\pi(\cdot)} \quad \mathbb{E}\left[ {\sum}_{t=0}^{\infty} \delta^{t} \left.g(\tilde{p}_{t}, \pi(x_{t}))\right|x_{0}\right],\\ &&\text{subject to} \quad x_{t+1} = \phi(\psi(x_{t}, n_{t}, \tilde{v}_{t}(n_{t}), q)), \ \ \ x_{0} \ \ \ \text{given}, \\ && u_{t} = \pi(x_{t})\in{\mathcal D}\times{\mathcal N}. \end{array} $$

(5)

Provided the value function V (x) satisfies the Bellman equation,

$$\begin{array}{@{}rcl@{}} V(x) &=& \max\limits_{(d,n) \in \mathcal{D}\times{\mathcal N}} \{ d (\theta \mu(x) - \gamma ) - \frac{K(n)}{N}\\ &&\quad\quad\quad\quad\quad+ \delta \mathbb{E}[V(\phi(\psi(x, n, \tilde{v}(n),q)))\\ &&\quad\quad\quad\quad\quad\times| x, (d, n)] \}, \end{array} $$

(6)

for all admissible states $x\in \mathbb {R}_{++}^{2}$, the corresponding maximizer π ^∗(x) on the right-hand side defines an optimal policy.

Remark 2

To reflect the policy-maker’s ongoing concern for the health-intervention decision, the problem is formulated in an infinite-horizon setting. Given a time-invariant system, this implies that the optimal policy can be described as a mapping from states to actions, without explicit consideration of time. If more information about the system becomes available over time, for example, relating to the decay rate in the system dynamics (see Eq. 4), then it is possible for the policy-maker to re-solve the problem and update the policy accordingly.

3 Dynamic healthcare decisions

3.1 Policies without information acquisition

If information is prohibitively costly or practically infeasible to collect, Eq. 6 simplifies to

$$V_{\text{NoInfo}}(x) = \max_{d \in \{0,1\}} \{ d (\theta \mu(x) - \gamma ) + \delta V_{\text{NoInfo}}(\phi(x))\}, $$

for all $x\in \mathbb {R}_{++}^{2}$, as there is no Bayesian updating and therefore ψ reduces to an identity map. For all states x for which the optimal strategy is to not do the intervention, this action remains optimal in the future because of the decreasing trend of $\tilde {p}_{t}$. Indeed, since for z ∈ (0,1), μ(ϕ(x)) = z μ(x) < μ(x), we have that for all states where V (x) = 0, it is also the case that V (ϕ(x)) = 0. Hence, for $\mu (x) \leq \frac {\gamma }{\theta }$ it is optimal to stop the intervention. This defines a threshold policy of the form

$$ d^{*} = \left\{ \begin{array}{ll} 0, & \text{if } \mu(x_{t}) \leq \frac{\gamma}{\theta},\\ 1, & \text{otherwise,} \end{array} \right. $$

(7)

for all t ≥ 0. Restricting attention to the interesting case where $\mu (x_{0})\geq \frac {\gamma }{\theta }$ and using the fact that μ(x _t) = z ^t μ(x ₀), we can identify the optimal time T(x ₀) to stop the intervention, which is the first period in which the intervention has a nonpositive expected INMB (see Appendix A.5):

$$ T(x_{0}) = \left\lceil \frac{1}{\ln(z)} \ln \left(\frac{\gamma}{\theta \mu(x_{0})} \right) \right\rceil . $$

(8)

Hence, given any initial state x, the value of implementing the optimal stopping policy for t ∈{0,...,T(x) − 1} is

$$\begin{array}{@{}rcl@{}} V_{\text{NoInfo}}(x) &=& \sum\limits_{t=0}^{T(x)-1} \delta^{t}\left(\theta z^{t} \mu(x) - \gamma \right)\\ &=& \theta\mu(x) \left(\frac{1-(\delta z)^{T(x)}}{1-\delta z} \right) - \gamma \left(\frac{1-\delta^{T(x)}}{1-\delta} \right).\\ \end{array} $$

(9)

Proposition 2

When information is prohibitively costly or practically infeasible to collect, the optimal value function V _NoInfo(x _t) is non-decreasing and convex in μ(x _t) .

Proof

See Appendix A.6. □

Remark 3

The above result depends only on the decay in the mean of the uncertain parameter distribution and is otherwise distribution-free. In other words, it does not depend on the policy-maker’s beliefs other than that $\tilde {p}_{t}$ is expected to decrease over time.

3.2 Policies with information acquisition

When the policy-maker has the option to acquire information, the value function is determined by the Bellman equation (Eq. 6). Its properties in the no-information case (Proposition 2) carry over to the more general situation.

Proposition 3

The optimal value function V (x _t) is nondecreasing and convex in μ(x _t) , and nondecreasing in σ ²(x _t) .

Proof

See Appendix A.7. □

3.2.1 Special case: one-time information collection

Assume for now that information can be collected at most once. Given a one-time size- η experiment (with η ≥ 1) and, briefly, ignoring the cost of information collection K(η), the value with information exceeds the no-information value,

$$\begin{array}{@{}rcl@{}} \theta \mu(x) &-& \gamma + \delta \mathbb{E}[V_{\text{NoInfo}}(\phi(\psi(x,n_{0} = \eta, \tilde{v}(\eta),q)))]\\ &-& V_{\text{NoInfo}}(x) > 0 , \end{array} $$

as a consequence of Jensen’s inequality. This insight is also useful for the comparison of experiments. A higher confidence in the information, i.e., for a larger sample size and/or better test characteristics, produces a mean-preserving spread of the random-variable $\mu (\phi (\psi (x_{t}, n_{0}=\eta , \tilde {v}_{t}(\eta ), q)))$ in the original experiment, and thus, by the convexity of the (monotone) value function and second-order stochastic dominance, a larger value with information.

Because of the monotone system dynamics, the optimal time to collect information of sample size η, at cost κ(η), is obtained by finding a period k where information acquisition is preferred to waiting until the next period, k + 1. In other words, find the smallest k for which

$$V(x_{k}|n_{k}=\eta) \geq V(x_{k+1} = \phi(x_{k})|n_{k+1}=\eta), $$

or equivalently

$$\begin{array}{@{}rcl@{}} &&\theta z^{k}\mu(x_{0}) + \frac{\delta}{1-\delta z}\\ &&\times\left(\mathbb{E}\left[V_{\text{NoInfo}}(\phi(\psi(x_{k},n_{k}=\eta,\tilde{v}(\eta),q)))]\right.\right.\\ &&-\left.\left. \delta \mathbb{E}[V_{\text{NoInfo}}(\phi(\psi(\phi(x_{k}),n_{k+1}=\eta,\tilde{v}(\eta),q)))\right] \right) \\ &\geq& \frac{1-\delta}{1-\delta z} \left(\gamma+\kappa(\eta)\right). \end{array} $$

The positivity of the right-hand side of the last inequality indicates that information acquisition may, on certain trajectories, never be optimal. This is confirmed in our application in Section 4, where the stopping region and the region with information acquisition have a common boundary, transversal to expected state trajectories.

3.2.2 General case: information collection in any period

Based on Proposition 3, the intervention is desirable for greater μ(x _t) and greater σ(x _t); the latter increases the upside of the policy-maker’s asymmetric (convex) payoffs, as if holding a call option. The dynamics presented in Eq. 4, with decreasing expectation and decreasing variance, imply monotonicity of the intervention decision, $d_{t+1} \leqslant d_{t}$.

Corollary 1

Consider x t(1), x t(2)with μ(x t(1)) < μ(x t(2))and σ ²(x t(1)) = σ ²(x t(2)), then if it is optimal to do the intervention with μ(x t(1)), it is also optimal to do the intervention with μ(x t(2)).

Proof

See Appendix A.8. □

Corollary 2

Consider x t(1), x t(2)with μ(x t(1)) = μ(x t(2))and σ ²(x t(1)) < σ ²(x t(2)), then if it is optimal to do the intervention with σ(x t(1)), it is also optimal to do the intervention with σ(x t(2)).

Proof

See Appendix A.9. □

A direct consequence of Proposition 3 and Corollaries 1 and 2 is that an optimal policy, as a map from states to actions, features three regions (Fig. 2). We describe, in detail, the features of the optimal policy for the case of an optimal stopping problem (Fig. 2A). In region I, an optimal policy is ‘no intervention (and do not sample).’ In region II, an optimal policy is ‘do intervention and sample n _t individuals.’ In region III, an optimal policy is to ‘do intervention and do not sample.’

The boundary between regions I and III is $\frac {\gamma }{\theta }$ (Section 3.1). For $0 \leqslant \mu (x_{t}) \leqslant \frac {\gamma }{\theta }$, the policy-maker is indifferent between ‘no intervention (and do not sample)’ and ‘do intervention and sample n _t individuals’ when the rewards of the two regions are equal:

$$ 0 = \theta \mu(x_{t}) - \gamma -\kappa(n_{t}) + \delta \mathbb{E}[V(\phi(\psi(x_{t},n_{t}, \tilde{v}_{t},q)))]. $$

(10)

Focusing on the region $\frac {\gamma }{\theta } \leqslant \mu (x_{t}) \leqslant 1$, the policy-maker is indifferent between ‘do intervention and sample n _t individuals’ and ‘do intervention and do not sample’ when the rewards of the two regions are equal. Removing common terms from each side, this occurs when

$$ - \kappa(n_{t}) + \delta \mathbb{E}[V(\phi(\psi(x_{t},n_{t}, \tilde{v}_{t}, q)))] = \delta V(\phi(x_{t})). $$

(11)

For each σ ²(x _t), there can exist more than one μ(x _t) where $\frac {\gamma }{\theta } < \mu (x_{t}) \leqslant 1$ satisfying Eq. 11 because $V(\phi (\psi (x_{t},n_{t},\tilde {v}_{t},q)))$ is increasing, but neither concave or convex, in v _t. The existence of the section of region III between regions I and II (the location of point A) can be obtained using intuition. Consider two points, A and B, with the same standard deviation (Fig. 2). Compared to point B, if information were to be gathered at point A, the distribution of possible posterior states includes a higher proportion of states in region I (with a reward of 0) and a lower proportion of high-reward states (those with high mean and high standard deviation) and, therefore, information acquisition is less likely to yield a value exceeding its cost. Now consider two points, A and C, with the same mean. Compared to point C, if information were to be gathered at point A, the distribution of possible posterior states is narrower. In both of these cases, increased spread on the side of low mean has no impact on the expectation and increased spread into the high-reward states substantially increases expectation. Therefore, information acquisition is more likely to yield a value exceeding its cost for the state with higher standard deviation.

Proposition 4

For a fixed sample size η (so n _t ∈{0,η}for all t), misclassification in the information-collection technology decreases the value function and reduces the number of states for which information acquisition is optimal.

Proof

See Appendix A.10. □

This result is consistent with Blackwell’s result that a less informative signal cannot increase the value of a single-person decision problem [66].

4 Application

4.1 Background and motivation

Chronic HCV infection is a slowly progressing blood-borne disease that causes liver fibrosis, cirrhosis, and liver cancer. It is the principal cause of death from liver disease and the leading indication for liver transplantation in the United States (US) [67, 68]. Between 2.7 and 5.2 million Americans (1.1% to 2.1% of the adult population) are chronically infected with HCV [69, 70]. In the non-injection drug using US population, prevalence peaks in the 1945 to 1965 birth cohorts and decreases thereafter (Fig. 3). Approximately half of all chronically infected individuals are unaware of their disease status [71].

Recent model-based analyses concluded that one-time screening of individuals born between 1945 and 1965 is cost-effective [11,12,13,14,15] and the CDC and USPSTF recently released new guidance in support of one-time screening of these birth cohorts [16, 17]. Several studies indicate that screening individuals born later than 1965 is also likely to be cost-effective [13,14,15]. Since HCV prevalence is decreasing in birth year after the 1956 birth cohort (Fig. 3), there may be a time at which screening is no longer cost-effective. To improve the decision about the best time to stop screening, additional information about prevalence of the current and future cohorts may be desirable. However, standard approaches to finding the value of information do not usually include the option to delay the information acquisition.

Note that the population we model were predominantly infected decades ago [72, 73] and do not have ongoing risk factors for HCV re-infection. Many historically significant modes of disease incidence have been virtually eliminated including transmission by surgical or other hospital equipment prior to modern sterilization procedures and blood transfusion [73, 74]. Injection equipment sharing among people who actively use injection drugs (PWID) is currently the principal cause of HCV transmission [76]. Although a history of injection drug use is relatively common among individuals with chronic HCV infection (approximately 40% [71]), re-infection and disease transmission to others via injection drug use are not an ongoing risk for a large proportion of these individuals as three-quarters of HCV infected individuals with a self-reported history of injection drug use report last injecting greater than 5 years ago (median time since last injection = 20 years) [75]. Our model does not include PWID and so we do not consider the possibility of re-infection. PWID are a high-risk population and guidelines, separate from those otherwise discussed here, recommend routine annual HCV screening in this population [77].

We now apply the stochastic dynamic programming framework developed in Section 2 to the case of one-time HCV screening at a routine medical appointment at age 50 for successive birth cohorts. We consider screening at age 50 because one-time screening at this age had the lowest incremental cost-effectiveness ratio in an analysis of single birth cohort screening [14]. Waiting to perform a one-time screening in older individuals is less cost-effective because their disease may have progressed further and treatment is less effective in more severe disease states. One-time screening of younger individuals is less cost-effective because younger individuals are further away from the long-term consequences of HCV which screening and treatment hope to avoid. We transform the unbounded state space in terms of x _t = (a _t,b _t) to the compact policy-relevant space μ(x _t) and σ(x _t). Using value iteration implemented in R version 2.15.0 [78], we numerically determine an optimal HCV-screening and information-collection policy for US adults.

At each time, we consider the actions of ‘do not screen for HCV and do not collect information about HCV prevalence in the current cohort;’ ‘screen for HCV and collect sample information about HCV prevalence in the current cohort;’ ‘screen for HCV and do not collect information about HCV prevalence in the current cohort.’ We compare this optimal strategy to the policies identified by various alternative approaches: a slightly modified version of the new CDC and USPSTF recommendation; an optimal policy without information acquisition; and an optimal policy with (possibly immediate) information acquisition. A policy of HCV screening does not inherently provide additional information about HCV prevalence to policy-makers, because only positive test outcomes are reported to the CDC and the reason for the medical test is private health information (the test may have been performed for a reason other than routine screening at age 50). Estimating prevalence among asymptomatic individuals seeking routine preventive medical care therefore requires a study with random sampling of those individuals. The (quasi-)linearity of INMB _t for this example is established in Appendix A.11.1. Parameter values and ranges used in sensitivity analysis are presented in Table 2. Details of parameter estimation are presented in Appendix A.11.2 and A.11.3.

Table 2 Model parameter values and range used in sensitivity analysis

Full size table

4.2 Results

For the purposes of our analysis, we assume the current time to be the year 2010 and the initial cohort to be born in 1960.

4.2.1 Policies identified by alternative approaches

The expected value of the CDC and USPSTF recommendation was obtained by substituting T = 6 into Eq. 9. The sum of the discounted expected INMBs for screening 6 cohorts at age 50, until the 1965 birth cohort turns 50 years of age, is $399.1 million for men and $15.4 million for women (Table 3). The large difference between men and women is attributable to higher HCV prevalence and higher marginal INMB of early diagnosis and treatment in men.

Table 3 Comparison of optimal policies indicated by various analytic approaches for men with initial belief μ(x ₀) = 0.0310 and σ(x ₀) = 0.0035 and women with initial belief μ(x ₀) = 0.0135 and σ(x ₀) = 0.0019

Full size table

We identify the threshold prevalence value below which the HCV-screening program should be terminated and the best time to terminate the screening program, assuming no opportunity to collect information using Eqs. 7–8. In men, the program should be terminated when prevalence falls below 0.4%, which will occur in 18 years (95%CI: 16-19 years). In women, the program should be terminated when prevalence falls below 0.1%, which will occur in 3 years (95%CI: 0-5 years). The expected INMB of these policies is $566.5 million for men and $21.7 million for women (Table 3).

The traditional approach to value-of-information assessment in the health policy literature assumes immediate information collection [3, 4]. For men and women, we find the optimal sample sizes to be 910 and 4,930 individuals from the current cohort, respectively (Fig. 4). The expected INMB of immediate information followed by the optimal policy based on the information collected increases by $20,000 for men and $600,000 for women. Women have a greater value of immediate information because they are closer to the intervention stopping region threshold and, therefore, immediate information is more likely to result in a policy change.

4.2.2 Model results

Implementing the full model, we considered the possibility of collecting sample information at each decision period. For computational and illustrative reasons, we restricted the policy-maker’s choice to two sample sizes $\mathcal {N} \in \{0, \eta \}$. We considered several possible values for η (2000, 2500, 3000, ..., 8000) and we present the results for the sample size that maximized the value at the initial condition for each gender. We also performed analyses using multiple study sample size levels available at each period. We do not present these analyses, as they led to the same optimal policies indicating that our restriction to two sample sizes was not material for this application.

The optimal policy is characterized by the three main regions described in Section 3.2.2 (Fig. 5a). At low prevalence and relatively low uncertainty, it is optimal to not screen and not collect information. At high prevalence, it is optimal to screen and not collect information. At prevalence close to the $\frac {\gamma }{\theta }$ threshold and relatively high uncertainty, it is optimal to both screen and collect information.

For each state in the region where it is optimal to screen without information acquisition, we can identify the optimal next action and the time when it should occur (Fig. 5b). We subdivide this region by a solid line. Above the solid line, which is the region with higher uncertainty, it is optimal to screen without information acquisition for a specified number of periods and then to collect information. In the region with lower uncertainty, it is optimal to screen without information acquisition for a specified number of periods and then to stop screening without ever collecting information. The current prevalence estimates for men and women indicate that it is optimal to screen without information collection for 16 years and 1 year, respectively, and then to collect sample information to inform the next action. The expected INMBs of these policies are $567.9 million and $22.5 million for men and women, respectively (Table 3).

For each state, we also computed the marginal value of collecting a specific amount of information (Fig. 5c). The marginal value of information in the current period is near-zero for states in which collecting information in the future is optimal. Consistent with our expectations, in the ‘Screen and Collect Information’ region, the marginal value of information is greatest close to the $\frac {\gamma }{\theta }$-threshold and increases with uncertainty. In the ‘Screen and Do Not Collect Information’ region, the value of information is highest along the boundary that divides the region into points with trajectories leading to information collection and points with trajectories leading to ‘No Screening’ without information collection.

Sensitivity analysis identified that the general conclusions of our numerical analysis are robust to uncertainty in the inputs (details in Appendix A.11.4).

4.3 Discussion of application

Evaluating an HCV-screening policy over its entire lifecycle using a stochastic dynamic programming approach has led to several important policy-relevant insights. Our analysis indicates that recommendations by the CDC and USPSTF to screen individuals born between 1945 and 1965 at their next routine medical visit are conservative for men. Specifically, our analysis shows that, for men, screening should continue until at least the 1976 birth cohort turns 50 (in 2026), at which point 4,000 individuals should be sampled to inform about the continuation of the program. Screening men at least 10 years longer will enable early diagnosis in an estimated 50,500 additional individuals, thus preventing an expected 767 additional liver cancers and about 212 additional liver transplants. For women, we find that a large information-acquisition effort should take place when the 1961 birth cohort turns 50 (in 2011),^{Footnote 4} as it is likely not cost-effective to screen women, per guidance, to the 1965 cohort because of relatively low prevalence (Fig. 3) and slower disease progression in women [85]. Compared to the CDC and USPSTF recommendation, our model increases the expected INMB by $168.8 million in men and $7.1 million in women.

Our analysis has several limitations. First, we assume only the current cohort can be sampled to learn about subsequent cohorts, relying on the correlation between cohorts (as implied by the system dynamics). In practice, for our example, it is possible to sample the next cohort (49-year olds) directly. We chose this assumption because the individuals who make up the ‘next cohort’ are typically unknown (e.g., the next cohort of patients with a heart attack, the next cohort of pregnant women, or the next cohort of cancer patients). Second, we consider one-time screening at age 50 based on a cost effectiveness analysis of once-in-a-lifetime HCV screening [14]. However, this analysis (and, consequently, ours) assumed that the cohort being screened has not been previously screened. Our model does not identify the optimal age at which to perform one-time screening. Third, we assumed that the individuals who attend a preventive health exam and participate in recommended HCV screening are an unbiased sample from the cohort–that is, individuals are not more or less likely to attend their preventive health exam if they are HCV-positive. However, if individuals at higher-risk of HCV disproportionately self-select for general population screening, then we have underestimated the duration for which screening will be cost-effective. If individuals at lower-risk disproportionately self-select for screening (often called the “worried well”), then we have overestimated the duration for which screening will be cost-effective. Fourth, we focus on HCV screening policy in the non-injection drug using population only because they were the focus of the recent change in HCV screening policy. Finally, while uncertainty (and related information acquisition) with respect to model parameters other than prevalence can be treated in an analogous manner, the details are left for future work.

5 Conclusion

Our analysis shows that when parameters vary across intervention cohorts, it may be optimal to delay information acquisition. This is a significant improvement over the current paradigm which only considers one-time immediate information collection. More specifically, we provide a framework for optimal information acquisition, in terms of timing and precision of the acquired signal (sample size). Further, we incorporate misclassification from an imperfect information-collection technology into our framework, which is an important real-life complexity of information gathering that adds substantial analytical difficulty.

The common assumption that the per-person value-of-information remains constant for future cohorts may result in a significant error when estimating the population value of additional information. It may indicate immediate expensive information collection when, incorporating the system dynamics, the optimal action is to collect information in the future or never at all. When a parameter is evolving across intervention cohorts, ignoring the opportunity to wait and collect information in the future, when the information collected is more likely to result in action, is a missed opportunity for increased efficiency. As seen in our example, adding the option of delaying information acquisition until a time when the signal is more likely to justify a policy shift can increase the expected value compared to a policy of immediate information collection. The dynamic programming framework developed in this paper enables an accurate assessment of the marginal value of additional information and identifies an optimal information-acquisition policy.

In this work, we assumed that the dynamics are monotonically increasing or decreasing and that they are deterministic. In future work, we plan to consider the more realistic assumption of uncertainty in the dynamics. This would then enable learning about the evolution of the parameters, rather than just their current state. Furthermore, our model does not consider the possibility of intervening on a cohort at a different time in the course of their disease or lives (i.e., at an earlier or later age) or the possibility of the intervention modifying the population-level dynamics. Although true for our application, this latter assumption does not hold in general for an infectious disease. Including the additional benefits of reduced disease transmission from prevention and treatment interventions may generate more near-term benefits and may dramatically alter the value of the intervention over time.

With strained resources for health programs and population-health monitoring, this type of analysis may ensure an optimal implementation horizon for health programs together with guidance on when and how much information should be collected to inform health-program adjustments. Beyond health, many application areas face limited resources for investment and information acquisition, high-quality decision-relevant information is often difficult or expensive to collect, and population or environmental trends influence the preferences and behavior of customers across industries. Facing a dynamic consumer, competitive, or physical environment, the optimal timing of high-quality information acquisition may provide competitive advantage.

Notes

A beta-distribution with parameters (a,b) has mean $\mu = \frac {a}{a + b}$ and variance $\sigma ^{2}=\frac {a b}{(a + b)^{2} (a + b + 1)}$. Through direct substitution and rearrangement, it can be shown that a beta-distribution with mean μ and variance σ ² has parameters $a=\mu \left (\frac {\mu (1-\mu )}{\sigma ^{2}} -1 \right )$ and $b =(1-\mu ) \left (\frac {\mu (1-\mu )}{\sigma ^{2}} -1 \right )$.
The posterior distribution is a weighted sum of component beta-distributions; see Eq. 2. The weights of the exact posterior distribution are generated by the convolution of two binomial distributions; see Eq. 13. The convolution of two binomial distributions creates unimodal weights over the ordered set of component beta-distributions in the mixture (ordered in terms of increasing first parameter). For example, consider a prior x _t = (3, 7). Given a sample size n _t = 5, there are 6 possible true outcomes, by which we mean the potentially unobservable number of actual positive samples in the study. These true outcomes correspond to 6 possible unique beta-distributions with parameters (3, 12), (4, 11), (5, 10), (6, 9), (7, 8), and (8, 7) forming the components of the exact posterior distribution. Because the information technology is imperfect, we have a belief over these possible outcomes equal to the distribution of the actual number of positives given a specific number of observed positives. We can compute this distribution using the weights in Eq. 2, where j is the number of actual positives among the v _t observed positives, and k is the number of actual positives among the n _t − v _t observed negatives. The probability that the actual number of positives in the sample is W is determined by summing all the weights for which j + k = W. The unimodality of the weights over the ordered component distributions ensures the unimodality of the posterior distribution. To complete the numerical example, consider the case where there are 3 observed positives, v _t = 3, and q = (0.9, 0.85), which results in weights of 0.028, 0.141, 0.346, 0.414, 0.067, 0.004 over the component beta-distributions in the mixture.
Weber [65] uses global optimization to consider the general problem of switching between arbitrary streams of expected benefits allowing for multiple switches, which can be viewed as a deterministic equivalent of the multi-armed bandit problem.
Our initial cohort is individuals born in 1960. This result can be interpreted as a recommendation for immediate information collection.
The monotone-likelihood-ratio property is satisfied [89].

References

Gold MR, Siegel JE, Russell LB, Weinstein MC (1996) Cost-Effectiveness in Health and Medicine. Oxford University Press, Oxford
Google Scholar
Drummond MF, Sculpher MJ, Torrance GW (2005) Methods for the Economic Evaluation of Health Care Programs, 3rd edn. Oxford University Press, Oxford
Google Scholar
Ades AE, Lu G, Claxton KP (2004) Expected value of sample information calculations in medical decision modeling. Med Decis Making 24(2):207–227
Article Google Scholar
Claxton KP, Sculpher MJ (2006) Using value of information analysis to prioritise health research: Some lessons from recent UK experience. PharmacoEconomics 24(11):1055–1068
Article Google Scholar
Eckermann S, Karnon J, Willan AR (2010) The value of value of information. PharmacoEconomics 28 (9):699–709
Article Google Scholar
Philips Z, Claxton K, Palmer S (2008) The half-life of truth: what are appropriate time horizons for research decisions?. Med Decis Making 28(3):287–299
Article Google Scholar
Eckermann S, Willan AR (2008) The option value of delay in health technology assessment. Med Decis Making 28(3):300–305
Article Google Scholar
Juusola JL, Brandeau ML (2016) HIV treatment and prevention: a simple model to determine optimal investment. Med Decis Making 36(3):391–409
Article Google Scholar
Singer ME, Younossi ZM (2001) Cost effectiveness of screening for hepatitis C virus in asymptomatic, average-risk adults. Am J Med 111(8):614–621
Article Google Scholar
Chou R, Clark EC, Helfand M (2004) Screening for hepatitis C virus infection: a review of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med 140(6):465–479
Article Google Scholar
Rein DB, Smith BD, Wittenborn JS, Lesesne SB, Wagner LD, Roblin DW, Patel N, Ward JW, Weinbaum CM (2012) The cost-effectiveness of birth-cohort screening for hepatitis C antibody in US primary care settings. Ann Intern Med 156(4):263–270
Article Google Scholar
Coffin PO, Scott JD, Golden MR, Sullivan SD (2012) Cost-effectiveness and population outcomes of general population screening for hepatitis C. Clin Infect Dis 54(9):1259–1271
Article Google Scholar
McGarry LJ, Pawar VS, Panchmatia HR, Rubin JL, Davis GL, Younossi ZM, Capretta JC, O’Grady MJ, Weinstein MC (2012) Economic model of a birth cohort screening program for hepatitis C virus. Hepatology 55(5):1344–1355
Article Google Scholar
Liu S, Cipriano LE, Holodniy M, Goldhaber-Fiebert JD (2013) Cost-effectiveness analysis of risk-factor guided and birth-cohort screening for chronic hepatitis C infection in the United States. PLoS One 8(3):e58975
Article Google Scholar
Eckman MH, Talal AH, Gordon SC, Schiff E, Sherman KE (2013) Cost-effectiveness of screening for chronic hepatitis C infection in the United States. Clin Infect Dis 56(10):1382–1393
Article Google Scholar
Smith BD, Morgan RL, Beckett GA, Falck-Ytter Y, Holtzman D, Ward JW (2012) Hepatitis C virus testing of persons born during 1945–1965: Recommendations from the Centers for Disease Control and Prevention. Ann Intern Med 157(11):817–822
Article Google Scholar
Moyer VA, on behalf of the U.S. Preventive Services Task Force (2013) Screening for Hepatitis C Virus Infection in Adults: U.S. Preventive Services Task Force Recommendation Statement. Ann Intern Med 159(5):349–357
Article Google Scholar
Jensen R (1983) Innovation adoption and diffusion when there are competing innovations. J Econ Theory 29(1):161–171
Article Google Scholar
McCardle KF (1985) Information acquisition and the adoption of new technology. Manag Sci 31(11):1372–1389
Article Google Scholar
Smith JE, McCardle KF (2002) Structural properties of stochastic dynamic programs. Oper Res 50(5):796–809
Article Google Scholar
Ulu C, Smith JE (2009) Uncertainty, information acquisition, and technology adoption. Oper Res 57(3):740–752
Article Google Scholar
Rosenberg N (1982) Inside the black box: technology and economics. Cambridge University Press, Cambridge
Google Scholar
Bessen J (1999) Real options and the adoption of new technologies. Research on Innovation. http://www.researchoninnovation.org
Kornish LJ (2006) Technology choice and timing with positive network effects. Eur J Oper Res 173(1):268–282
Article Google Scholar
Chambers C, Kouvelis P (2003) Competition, learning and investment in new technology. IIE Trans 35(9):863–878
Article Google Scholar
Schaefer AJ, Bailey MD, Shechter SM, Roberts MS (2005) Chapter 23: Modeling medical treatment using Markov decision processes. In: Operations Research and Health Care: A handbook of methods and applications, Springer US, volume 70 of International Series in Operations Research and Management Science. pp. 593–612
Alagoz O, Hsu H, Schaefer AJ, Roberts MS (2010) Markov decision processes: a tool for sequential decision making under uncertainty. Med Decis Making 30(4):474–483
Article Google Scholar
Ahn JH, Hornberger JC (1996) Involving patients in the cadaveric kidney transplant allocation process: A decision-theoretic perspective. Manag Sci 42(5):629–641
Article Google Scholar
Magni P, Quaglini S, Marchetti M, Barosi G (2000) Deciding when to intervene: a Markov decision process approach. Int J Med Inform 60(3):237–253
Article Google Scholar
Hauskrecht M, Fraser H (2000) Planning treatment of ischemic heart disease with partially observable Markov decision processes. Artif Intell Med 18(3):221–244
Article Google Scholar
Alagoz O, Maillart LM, Schaefer AJ, Roberts MS (2004) The optimal timing of living–donor liver transplantation. Manag Sci 50(10):1420–1430
Article Google Scholar
Alagoz O, Maillart LM, Schaefer AJ, Roberts MS (2007) Choosing among cadaveric and living–donor livers. Manag Sci 53(11):1702–1715
Article Google Scholar
Shechter SM, Bailey MD, Schaefer AJ, Roberts MS (2008) The optimal time to initiate HIV therapy under ordered health states. Oper Res 56(1):20–33
Article Google Scholar
Shechter SM, Bailey MD, Schaefer AJ, Roberts MS (2008) A modeling framework for replacing medical therapies. IIE Trans 40(9):861–869
Article Google Scholar
Kırkızlar E, Faissol DM, Griffin PM, Swann JL (2010) Timing of testing and treatment for asymptomatic diseases. Math Biosci 226(1):28–37
Article Google Scholar
Kurt M, Denton B, Schaefer AJ, Shah N, Smith S (2011) The structure of optimal statin initiation policies for patients with type 2 diabetes. IIE Trans 1(1):49–65
Google Scholar
Mason JE, Denton BT, Shah ND, Smith SA (2014) Optimizing the simultaneous management of blood pressure and cholesterol for type 2 diabetes patients. Eur J Oper Res 233(3):727–738
Article Google Scholar
Zhang J, Denton BT, Balasubramanian H, Shah ND, Inman BA (2012) Optimization of prostate biopsy referral decisions. Manuf Serv Op 14(4):529–547
Article Google Scholar
Ayer T, Alagoz O, Stout NK, Burnside ES (2014) Designing a new breast cancer screening program considering adherence. Manag. Sci. Forthcoming
Erenay FS, Alagoz O, Said A (2014) Optimizing colonoscopy screening for colorectal cancer prevention and surveillance. Manuf Serv Op 16(3):381–400
Article Google Scholar
Patrick J, Puterman ML, Queyranne M (2008) Dynamic multipriority patient scheduling for a diagnostic resource. Oper Res 56(6):1507–1525
Article Google Scholar
Gocgun Y, Bresnahan BW, Ghate A, Gunn ML (2011) A Markov decision process approach to multi-category patient scheduling in a diagnostic facility. Artif Intell Med 53:73–81
Article Google Scholar
Patrick J (2012) A Markov decision model for determining optimal outpatient scheduling. Health Care Manag Sci 15:91–102
Article Google Scholar
Gocgun Y, Puterman ML (2014) Dynamic scheduling with due dates and time windows: an application to chemotherapy patient appointment booking. Health Care Manag Sci 17:60–76
Article Google Scholar
Gupta D, Wang L (2008) Revenue management for a primary-care clinic in the presence of patient choice. Oper Res 56(3):576–592
Article Google Scholar
Wang J, Fung RYK (2015) Adaptive dynamic programming algorithms for sequential appointment scheduling with patient preferences. Artif Intell Med 63:33–40
Article Google Scholar
Kornish LJ, Keeney RL (2008) Repeated commit-or-defer decisions with a deadline: the influenza vaccine composition. Oper Res 56(3):527–541
Article Google Scholar
Özaltın OY, Prokopyev OA, Schaefer AJ, Roberts MS (2011) Optimizing the societal benefits of the annual influenza vaccine: a stochastic programming approach. Oper Res 59(5):1131–1143
Article Google Scholar
Weinstein M, Zeckhauser R (1973) Critical ratios and efficient allocation. J Public Econ 2:147–157
Article Google Scholar
Culyer AJ (1989) The normative economics of health care finance and provision. Oxf Rev Econ Policy 5:34–58
Article Google Scholar
Stinnett AA, Paltiel AD (1996) Mathematical programming for the efficient allocation of health care resources. J Health Econ 15:641–653
Article Google Scholar
Williams I, McIver S, Moore D, Bryan S (2008) The use of economic evaluations in NHS decision-making: a review and empirical investigation. Health Technol Assess 12(7):1–175
Article Google Scholar
Raiffa H, Schlaifer R (1961) Applied Statistical Decision Theory. Harvard University Press, Cambridge
Google Scholar
Weinstein MC (1983) Cost-effective priorities for cancer prevention. Science 221(4605):17–23
Article Google Scholar
Hornberger JC, Brown BW, Halpern J (1995) Designing a cost-effective clinical trial. Stat Med 14 (20):2249–2259
Article Google Scholar
Claxton K, Posnett J (1996) An economic approach to clinical trial design and research priority-setting. Health Econ 5(6):513–524
Article Google Scholar
Claxton K (1999) The irrelevance of inference: a decision-making approach to the stochastic evaluation of health care technologies. J Health Econ 18(3):341–364
Article Google Scholar
Eckermann S, Willan AR (2008) Time and expected value of sample information wait for no patient. Value Health 11(3):522–526
Article Google Scholar
McKenna C, Claxton K (2011) Addressing adoption and research design decisions simultaneously: the role of value of sample information analysis. Med Decis Making 31(6):853–865
Article Google Scholar
Hall PS, Edlin R, Kharroubi S, Gregory W, McCabe C (2012) Expected net present value of sample information from burden to investment. Med Decis Making 32(3):E11—E21
Article Google Scholar
Fenwick E, Claxton K, Sculpher M (2008) The value of implementation and the value of information: combined and uneven development. Med Decis Making 28(1):21–32
Article Google Scholar
Willan AR, Eckermann S (2010) Optimal clinical trial design using value of information methods with imperfect implementation. Health Econ 19(5):549–561
Google Scholar
Smith JE (1993) Moment methods for decision analysis. Manag Sci 39(3):340–358
Article Google Scholar
Gospodinov N, Lkhagvasuren D (2014) A moment-matching method for approximating vector autoregressive processes by finite-state Markov chains. J Appl Econom 29(5):843–859
Article Google Scholar
Weber TA (2017) Optimal switching between cash-flow streams. Math. Method. Oper. Res. Forthcoming. https://doi.org/10.1007/s00186-017-0586-0
Article Google Scholar
Blackwell D (1951) Comparison of experiments. In: Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability. University of California Press. pp 93–102
Kim W (2002) The burden of hepatitis C in the United States. Hepatology 36(Suppl 1):S30—S34
Google Scholar
Ghany MG, Strader DB, Thomas DL, Seeff LB (2009) Diagnosis, management, and treatment of hepatitis C: an update. Hepatology 49(4):1335–1374
Article Google Scholar
Armstrong GL, Wasley A, Simard EP, McQuillan GM, Kuhnert WL, Alter MJ (2006) The prevalence of hepatitis C virus infection in the United States, 1999 through 2002. Ann Intern Med 144(10):705–714
Article Google Scholar
Chak E, Talal AH, Sherman KE, Schiff ER, Saab S (2011) Hepatitis C virus infection in USA: an estimate of true prevalence. Liver Int 31(8):1090–1101
Article Google Scholar
Denniston MM, Klevens RM, McQuillan GM, Jiles RB (2012) Awareness of infection, knowledge of hepatitis C, and medical follow-up among individuals testing positive for hepatitis C: National Health and Nutrition Examination Survey 2001-2008. Hepatology 55(6):1652–1661
Article Google Scholar
Armstrong GL (2007) Injection drug users in the United States, 1979-2002: an aging population. Arch Intern Med 167(2):166–173
Article Google Scholar
Armstrong GL, Alter MJ, McQuillan GM, Margolis HS (2000) The past incidence of hepatitis C virus infection: implications for the future burden of chronic liver disease in the United States. Hepatology 31(3):777–782
Article Google Scholar
Joy JB, McCloskey RM, Nguyen T, Liang RH, Khudyakov Y, Olmstead A, Krajden M, Ward JW, Harrigan PR, Montaner JS, Poon AF (2016) The spread of hepatitis C virus genotype 1a in North America: a retrospective phylogenetic study. Lancet Infect Dis 16(6):698–702
Article Google Scholar
Barker L Personal communication, August 2, 2016. Based on analyses using the Centers for Disease Control and Prevention, National Health and Nutrition Examination Survey (NHANES) (2005-2012)
Klevens RM, Hu DJ, Jiles R, Holmberg SD (2012) Evolving epidemiology of hepatitis C virus in the United States. Clin Infect Dis Suppl 55(1):S3—9
Google Scholar
American Association for the Study of Liver Diseases and the Infectious Diseases Society of America (AASLD-IDSA). HCV testing and linkage to care. Recommendations for testing, managing, and treating hepatitis C. Available at: http://www.hcvguidelines.org/full-report/hcv-testing-and-linkage-care. Accessed: February 23, 2017
R Development Core Team (2012) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org/
Google Scholar
US Census Bureau (2010) QT – P1: Age groups and sex: 2010. Available at: http://www.census.gov/2010census/
Mehrotra A, Zaslavsky AM, Ayanian JZ (2007) Preventive health examinations and preventive gynecological examinations in the United States. Arch Intern Med 167(17):1876
Article Google Scholar
Gretch DR (1997) Diagnostic tests for hepatitis C. Hepatology 26(Suppl 3):43S–47S
Article Google Scholar
Hyland C, Kearns S, Young I, Battistutta D, Morgan C (1992) Predictive markers for hepatitis C antibody ELISA specificity in Australian blood donors. Transfusion Med 2(3):207–213
Article Google Scholar
Center for Medicare and Medicaid Services (CMS). 2010. Medicare fee schedule. U.S. Department of Health and Human Services. http://www.cms.gov/home/medicare.asp
Weinstein MC, Skinner JA (2010) Comparative effectiveness and health care spending – implications for reform. New Engl J Med 362(5):460–465
Article Google Scholar
Thein HH, Yi Q, Dore GJ, Krahn MD (2008) Estimation of stage-specific fibrosis progression rates in chronic hepatitis C virus infection: A meta-analysis and meta-regression. Hepatology 48(2):418–431
Article Google Scholar
Centers for Disease Control and Prevention (CDC) (2006) National Health and Nutrition Examination Survey (NHANES): analytic and reporting guidelines. http://www.cdc.gov/nchs/nhanes/nhanes2003-2004/analytical_guidelines.htm. Accessed: August 27, 2012. Last updated: September 2006
Centers for Disease Control and Prevention (CDC) (2011) Analytic note regarding 2007-2010 survey design changes and combining data across other survey cycles. http://www.cdc.gov/nchs/nhanes/nhanes2003-2004/analytical_guidelines.htm. Accessed: August 27, 2012. Last updated: September 2011
Centers for Disease Control and Prevention (CDC) (2012) National Health and Nutrition Examination Survey (NHANES) (1999–2010). http://www.cdc.gov/nchs/nhanes.htm. Accessed: August 27, 2012
Milgrom PR (1981) Good news and bad news: Representation theorems and applications. Bell J Econ 12 (12):380–391
Article Google Scholar
US Bureau of Labor Statistics (2011) Consumer Price Index (CPI):1913–present. Division of consumer prices and price indexes. Available at: http://www.bls.gov/cpi/

Download references

Acknowledgments

The authors thank Laurie Barker, MSPH, Division of Viral Hepatitis, Centers for Disease Control and Prevention for assistance with NHANES analysis.

Author information

Authors and Affiliations

Ivey Business School, Western University, 1255 Western Road, London, ON, N6G 0N1, Canada
Lauren E. Cipriano
Ecole Polytechnique Fédérale de Lausanne, CDM-ODY 3.01, Station 5, CH-1015, Lausanne, Switzerland
Thomas A. Weber

Authors

Lauren E. Cipriano
View author publications
You can also search for this author in PubMed Google Scholar
Thomas A. Weber
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Lauren E. Cipriano.

Additional information

Open Access

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix

A glossary of symbols is provided in Table 4.

Table 4 Glossary of Symbols

Full size table

1.1 A.1 Proof of Proposition 1

The special case where $\tilde {p}_{t}$ is beta-distributed with parameters x _t = (a _t,b _t) and the information-collection technology is perfect, i.e., where q = (1, 1), is well known [53]. The sample information $\tilde {v_{t}}$, where $\tilde {v_{t}}$ is the number of observed positives of n _t samples, is beta-binomially distributed, and updating the prior with the sample information results in a beta-distributed posterior belief with parameters (a _t + v _t,b _t + n _t − v _t).

Consider the interesting case with a prior belief f _p(x _t) which is a mixture of m _t ≥ 1 beta-distributions where x _t = (x _t,1,x _t,2,...,x _t,m _t) and $x_{t,i} = (a_{t,i}, b_{t,i}) \in \mathbb {R}_{++}^{2}$ such that $ 1 \leqslant i \leqslant m$, and a set of positive weights ω _i such that ${\sum }_{i=1}^{m} \omega _{i} = 1$. The pdf is

$$f_{p}(p|x_{t}) = \sum\limits_{i=1}^{m} w_{i} \frac{\Gamma(a_{i} + b_{i})}{\Gamma(a_{i}) {\Gamma}(b_{i})} p^{a_{i} -1} (1-p)^{b_{i} -1} . $$

At time t the policy-maker chooses n _t > 0, indicating that they will collect n _t Bernoulli trials. The information-collection technology, with test sensitivity q ₁ and test specificity q ₂, is imperfect. The probability of observing a ‘positive’ signal from any single Bernoulli trial is $\tilde {p}_{t} q_{1} + (1-\tilde {p}_{t})(1-q_{2})$. Therefore,

$$\begin{array}{@{}rcl@{}} f_{v|p}(v_{t}|\tilde{p}_{t} \,=\, p, n_{t}, q) \!&=&\! {n_{t} \choose v_{t}} [p q_{1} \,+\, (1\,-\,p)(1\,-\,q_{2})]^{v_{t}}\\ &&\times [1 \,-\, p q_{1}\,-\, (1\,-\,p)(1\,-\,q_{2})]^{n_{t} - v_{t}}. \end{array} $$

The resulting distribution of sample information is effectively a weighted beta-binomial distribution, correcting for the additional uncertainty introduced by the imperfect information-collection technology:

$$\begin{array}{@{}rcl@{}} f_{v}(v_{t} | x_{t}, n_{t}, q) \!\!&=&\!\! {{\int}_{0}^{1}} f_{v|p}(v_{t}|\tilde{p}_{t} \,=\, p_{t}, n_{t}, q) f_{p}(\tilde{p}_{t}\,=\, p_{t} | x_{t}) dp \\ \!\!&=&\!\! {{\int}_{0}^{1}} {n_{t} \choose v_{t}} [p q_{1} \,+\, (1\,-\,p)(1\,-\,q_{2})]^{v_{t}} [1 \,-\, p q_{1} \,-\, (1\,-\,p)(1\,-\,q_{2})]^{n_{t} - v_{t}} f_{p}(\tilde{p}_{t}\,=\, p_{t} | x_{t}) dp \\ \!\!&=&\!\! {{\int}_{0}^{1}} {n_{t} \choose v_{t}} [p q_{1} \,+\, (1\,-\,p)(1\,-\,q_{2})]^{v_{t}} [p(1\,-\, q_{1}) \,+\, (1\,-\,p)q_{2}]^{n_{t} - v_{t}} \!\times\! \sum\limits_{i=1}^{m} \omega_{i} \frac{\Gamma(a_{t,i}\,+\,b_{t,i})}{\Gamma(a_{t,i}){\Gamma}(b_{t,i})} p^{a_{t,i} -1} (1\,-\,p)^{b_{t,i}-1} dp \\ \!\!&=&\!\! {{\int}_{0}^{1}} {n_{t} \choose v_{t}} \left(\sum\limits_{j=0}^{v_{t}} {v_{t}\choose j} p^{j} {q_{1}^{j}} (1\,-\,p)^{v_{t}-j} (1\,-\,q_{2})^{v_{t}-j} \right) \left(\sum\limits_{k=0}^{n_{t}-v_{t}} {n_{t}-v_{t}\choose k} p^{k} (1\,-\,q_{1})^{k} (1\,-\,p)^{n_{t}-v_{t}-k} q_{2}^{n_{t}-v_{t}-k} \right) \\ &&\!\!\times \sum\limits_{i=1}^{m} \omega_{i} \frac{\Gamma(a_{t,i}\,+\,b_{t,i})}{\Gamma(a_{t,i}){\Gamma}(b_{t,i})} p^{a_{t,i} -1} (1\,-\,p)^{b_{t,i}-1} dp \\ \!\!&=&\!\!\sum\limits_{j=0}^{v_{t}} \sum\limits_{k=0}^{n_{t}-v_{t}} {n_{t} \choose v_{t}} {v_{t}\choose j} {n_{t}-v_{t}\choose k} {q_{1}^{j}} (1\,-\,q_{2})^{v_{t}-j} (1\,-\,q_{1})^{k} q_{2}^{n_{t}-v_{t}-k} \!\times\!\sum\limits_{i=1}^{m} \omega_{i} \frac{\Gamma(a_{t,i}\,+\,b_{t,i})}{\Gamma(a_{t,i}){\Gamma}(b_{t,i})} {{\int}_{0}^{1}} p^{a_{t,i}+j+k -1} (1\,-\,p)^{b_{t,i}+n_{t}-j-k-1} dp \\ \!\!&=&\!\!\sum\limits_{j=0}^{v_{t}} \sum\limits_{k=0}^{n_{t}-v_{t}} {n_{t} \choose v_{t}} {v_{t}\choose j} {n_{t}-v_{t}\choose k} {q_{1}^{j}} (1\,-\,q_{2})^{v_{t}-j} (1\,-\,q_{1})^{k} q_{2}^{n_{t}-v_{t}-k} \!\times\!\sum\limits_{i=1}^{m} \omega_{i} \frac{\Gamma(a_{t,i}\,+\,b_{t,i})}{\Gamma(a_{t,i}){\Gamma}(b_{t,i})} \frac{\Gamma(a_{t,i}\,+\,j\,+\,k){\Gamma}(b_{t,i}\,+\,n_{t}\,-\,j\,-\,k)}{\Gamma(a_{t,i}\,+\,b_{t,i}\,+\,n_{t})} \\ \!\!&=&\!\! \sum\limits_{j=0}^{v_{t}} \sum\limits_{k=0}^{n_{t} -v_{t}} {q_{1}^{j}} (1\,-\,q_{2})^{v_{t}-j} (1\,-\,q_{1})^{k} q_{2}^{n_{t} - v_{t}-k}\!\times\!\sum\limits_{i=1}^{m} \omega_{i} \frac{\Gamma(n_{t}\,+\,1){\Gamma}(a_{t,i} \,+\, b_{t,i}) {\Gamma}(a_{t,i} \,+\,j\,+\,k) {\Gamma}(b_{t,i} \,+\,n_{t}\,-\,j \,-\,k)}{\Gamma(j\,+\,1) {\Gamma}(v_{t}\,-\,j\,+\,1) {\Gamma}(k\,+\,1) {\Gamma}(n_{t}\,-\,v_{t}\,-\,k\,+\,1){\Gamma}(a_{t,i}) {\Gamma}(b_{t,i}) {\Gamma}(a_{t,i} \,+\, b_{t,i} \,+\,n_{t})}. \\ \end{array} $$

(12)

Using Bayes’ Theorem, the posterior distribution is a mixture of beta-distributions with weights summing to one:

$$\begin{array}{@{}rcl@{}} f_{p|v}(p | x_{t}, n_{t}, v_{t}, q) \!\!&=&\!\! \frac{f_{v|p}(\tilde{v}_{t}\,=\,v_{t} | \tilde{p}, n_{t}, q) f_{p}(\tilde{p}_{t} \,=\, p_{t} | x_{t})}{f_{v}(\tilde{v}_{t}\,=\,v_{t}| x_{t}, n_{t}, q)} \\ \!\!&=&\!\! \frac{{n_{t} \choose v_{t}} [p q_{1} \,+\, (1\,-\,p)(1\,-\,q_{2})]^{v_{t}} [p(1\,-\, q_{1}) \,+\, (1\,-\,p)q_{2}]^{n_{t} - v_{t}} {\sum}_{i=1}^{m} \omega_{i} \frac{\Gamma(a_{t,i}+b_{t,i})}{\Gamma(a_{t,i}){\Gamma}(b_{t,i})} p^{a_{t,i} -1} (1\,-\,p)^{b_{t,i}-1}} {f_{v}(\tilde{v}_{t}\,=\,v_{t}| x_{t}, n_{t}, q)} \\ \!\!&=&\!\!\frac{1}{f_{v}(\tilde{v}_{t}\,=\,v_{t}| x_{t}, n_{t}, q)} {n_{t} \choose v_{t}} \left(\sum\limits_{j=0}^{v_{t}} {v_{t}\choose j} p^{j} {q_{1}^{j}} (1\,-\,p)^{v_{t}-j} (1\,-\,q_{2})^{v_{t}-j} \right) \\ & &\!\!\times \left(\sum\limits_{k=0}^{n_{t}-v_{t}} {n_{t}-v_{t}\choose k} p^{k} (1\,-\,q_{1})^{k} (1\,-\,p)^{n_{t}-v_{t}-k} q_{2}^{n_{t}-v_{t}-k} \right) \sum\limits_{i=1}^{m} \omega_{i} \frac{\Gamma(a_{t,i}\,+\,b_{t,i})}{\Gamma(a_{t,i}){\Gamma}(b_{t,i})} p^{a_{t,i} -1} (1\,-\,p)^{b_{t,i}-1} \\ \!\!&=&\!\!\frac{1}{f_{v}(\tilde{v}_{t}\,=\,v_{t}| x_{t}, n_{t}, q)} \sum\limits_{j=0}^{v_{t}} \sum\limits_{k=0}^{n_{t}-v_{t}} {n_{t} \choose v_{t}} {v_{t}\choose j} {n_{t}\,-\,v_{t}\choose k} {q_{1}^{j}} (1\,-\,q_{2})^{v_{t}-j} (1\,-\,q_{1})^{k} q_{2}^{n_{t}-v_{t}-k} \\ & &\!\!\times \sum\limits_{i=1}^{m} \omega_{i} \frac{\Gamma(a_{t,i}\,+\,b_{t,i})}{\Gamma(a_{t,i}){\Gamma}(b_{t,i})} p^{a_{t,i}+j+k -1} (1\,-\,p)^{b_{t,i}+n_{t}-j-k-1} \\ \!\!&=&\!\!\frac{1}{f_{v}(\tilde{v}_{t}\,=\,v_{t}| x_{t}, n_{t}, q)} \sum\limits_{j=0}^{v_{t}} \sum\limits_{k=0}^{n_{t}-v_{t}} \sum\limits_{i=1}^{m} {q_{1}^{j}} (1\,-\,q_{2})^{v_{t}-j} (1\,-\,q_{1})^{k} q_{2}^{n_{t}-v_{t}-k} \omega_{i} \\ & &\!\!\times \frac{\Gamma(n_{t}\,+\,1) {\Gamma}(a_{t,i}\,+\,b_{t,i}) {\Gamma}(a_{t,i} \,+\,j\,+\,k) {\Gamma}(b_{t,i} \,+\,n_{t}\,-\,j \,-\,k)} { {\Gamma}(j\,+\,1) {\Gamma}(v_{t}\,-\,j\,+\,1) {\Gamma}(k\,+\,1) {\Gamma}(n_{t}\,-\,v_{t}\,-\,k\,+\,1) {\Gamma}(a_{t,i}){\Gamma}(b_{t,i}) {\Gamma}(a_{t,i} \,+\, b_{t,i} \,+\,n_{t})} \!\times\!\text{beta}(a_{t,i}\,+\,j\,+\,k , b_{t,i}\,+\,n_{t}\,-\,j\,-\,k ) \\ \!\!&=&\!\! \sum\limits_{j=0}^{v_{t}} \sum\limits_{k=0}^{n_{t} -v_{t}} \sum\limits_{i=1}^{m} \omega^{\prime}_{j, k, i} \text{beta}(a_{t,i}\,+\,j\,+\,k , b_{t,i}\,+\,n_{t}\,-\,j\,-\,k ), \end{array} $$

(13)

with updated weights

$$\omega^{\prime}_{j, k, i} = \frac{ \frac{ \omega_{i} {q_{1}^{j}} (1-q_{2})^{v_{t}-j} (1-q_{1})^{k} q_{2}^{n_{t}-v_{t}-k} {\Gamma}(a_{t,i}+b_{t,i}) {\Gamma}(a_{t,i} +j+k) {\Gamma}(b_{t,i} +n_{t}-j -k)} { {\Gamma}(j+1) {\Gamma}(v_{t}-j+1) {\Gamma}(k+1) {\Gamma}(n_{t}-v_{t}-k+1) {\Gamma}(a_{t,i}){\Gamma}(b_{t,i}) {\Gamma}(a_{t,i} + b_{t,i} +n_{t})}} { \sum\limits_{r=0}^{v_{t}} \sum\limits_{s=0}^{n_{t} - v_{t}} \sum\limits_{i=1}^{m} \frac{ \omega_{i} {q_{1}^{r}} (1-q_{2})^{v_{t}-r} (1-q_{1})^{s} q_{2}^{n_{t} - v_{t} - s} {\Gamma}(a_{t,i} + b_{t,i}) {\Gamma}(a_{t,i} +r+s) {\Gamma}(b_{t,i} +n_{t}-r -s)}{\Gamma(r+1) {\Gamma}(v_{t}-r+1) {\Gamma}(s+1) {\Gamma}(n_{t}-v_{t}-s+1){\Gamma}(a_{t,i}) {\Gamma}(b_{t,i}) {\Gamma}(a_{t,i} + b_{t,i} +n_{t})}}. $$

The coefficients of each component of the mixture distribution sum to 1 and, therefore, the posterior distribution is a mixture of beta-distributions, which concludes our proof.

1.2 A.2 General properties of mixtures of beta-distributions

1.2.1 A.2.1 The number of component distributions in the posterior mixture distribution

Perfect information-collection technology (i.e., q ₁ = q ₂ = 1).

If $\tilde {p}_{t}$ is a mixture of m ≥ 1 beta-distributions with weights ω _i and the information-collection technology is perfect, the distribution of sample information is a mixture of beta-binomial distributions with weights ω _i. Updating results in a posterior distribution that is a mixture of m beta-distributions with parameters (a _t,i + v _t,b _t,i + n _t − v _t).

Consider the example where m = 2, x _t = ((3, 7), (7, 3)), ω = (0.8, 0.2), n _t = 10, and v _t = 4. Let $\hat {x}_{t}$ denote the posterior belief state. Then, $\hat {x}_{t}=((3+4,7+6), (7+4,3+6))=((7,13), (11,9))$ and ω ^′ = (0.8, 0.2).

Imperfect information-collection technology (i.e., $\min \{q_{1},$ q ₂} < 1).

If $\tilde {p}_{t}$ is a mixture of m ≥ 1 beta-distributions with weights ω _i and the information-collection technology is imperfect, the true posterior distribution presented in Eq. 13 is a mixture of beta-distributions. There are n _t + 1 possible outcomes of a study with sample size of n _t and so there are n _t + 1 possible values for j + k in Eq. 13. However, not all the posterior distributions created by updating each of the m prior distributions are necessarily unique. Therefore, the true posterior distribution is a mixture of between m + n _t and m × (n _t + 1) beta-distributions.

Consider the example where m = 2, x _t = ((19, 20), (20, 19)), n _t = 5. Again, let $\hat {x}_{t}$ denote the posterior belief state. Given the sample size of n _t = 5 there are 6 possible true outcomes, by which we mean the unobservable number of actual positive samples in the study, which correspond to n _t + m = 7 possible unique beta-distributions $\hat {x}_{t}=((19,25), (20,24),(21,23), (22,22),(23,21), (24,$ 20), (25, 19)) which each contribute to the posterior mixture distribution. When we observe a specific number of positives in the sample, the imperfect information-collection technology results in a distribution over the true number of positives in the sample and, therefore, weights on each component in the posterior mixture distribution (given by Eq. 13).

1.2.2 A.2.2 Mean and variance of mixtures of beta-distributions

The posterior distribution of $\tilde {p}_{t}$ given sample information v _t collected using an imperfect information-collection technology is a mixture of beta-distributions. In this section, we first derive equations for the mean and variance for a general mixture of beta-distributions (with simplified notation) to show their relationship to the mean, and more generally, to the parameters of the component distributions. Then, we do the appropriate substitutions to present the conditional mean and variance of the posterior distribution f _p|v.

Consider a distribution f _Y(y) which is a mixture of M beta-distributions where the i-th component of the mixture has weight w _i, parameters a _i and b _i, and mean μ _i:

$$f_{Y}(y) \,=\, \sum\limits_{i=1}^{m} w_{i} \frac{\Gamma(a_{i} \,+\, b_{i})}{\Gamma(a_{i}) {\Gamma}(b_{i})} y^{a_{i} -1} (1\,-\,y)^{b_{i} -1} \text{where }\! \sum\limits_{i=1}^{m} w_{i} \,=\, 1. $$

We show that the mean of a mixture of beta-distributions is the weighted mean of each mixture component:

$$\begin{array}{@{}rcl@{}} \mathbb{E}[ f_{Y}(y) ] \!&=&\! {{\int}_{0}^{1}} y f_{Y}(y) dy\\ \!&=&\! {{\int}_{0}^{1}} y \left(\sum\limits_{i=1}^{m} w_{i} \frac{\Gamma(a_{i} \,+\, b_{i})}{\Gamma(a_{i}) {\Gamma}(b_{i})} y^{a_{i} -1} (1\,-\,y)^{b_{i} -1} \right) dy \\ \!&= &\! \sum\limits_{i=1}^{m} w_{i} \frac{\Gamma(a_{i} + b_{i})}{\Gamma(a_{i}) {\Gamma}(b_{i})} {{\int}_{0}^{1}} y^{a_{i}} (1-y)^{b_{i} -1} dy\\ \!&=&\! \sum\limits_{i=1}^{m} w_{i} \frac{\Gamma(a_{i} + b_{i})}{\Gamma(a_{i}) {\Gamma}(b_{i})} \frac{\Gamma(a_{i}+1) {\Gamma}(b_{i})}{\Gamma(a_{i} + b_{i} + 1)} \\ \!&=&\! \sum\limits_{i=1}^{m} w_{i} \frac{a_{i}}{a_{i} + b_{i}} = \sum\limits_{i=1}^{m} w_{i} \mu_{i}. \end{array} $$

(14)

We also derive the variance of f _Y(y):

$$\begin{array}{@{}rcl@{}} \mathbb{V}[ f_{Y}(y)] \!\!&= &\!\! {{\int}_{0}^{1}} y^{2} f_{Y}(y) dy \,-\, (\mathbb{E}[ f_{Y}(y) ])^{2} \\ \!\!&=&\!\! {{\int}_{0}^{1}} y^{2} \left(\sum\limits_{i=1}^{m} w_{i} \frac{\Gamma(a_{i} \,+\, b_{i})}{\Gamma(a_{i}) {\Gamma}(b_{i})} y^{a_{i} -1} (1\,-\,y)^{b_{i} -1} \right)\\ && \times dy \,-\, \left(\sum\limits_{i=1}^{m} w_{i} \mu_{i} \right)^{2} \\ \!\!&= &\!\! \sum\limits_{i=1}^{m} w_{i} \frac{\Gamma(a_{i} \,+\, b_{i})}{\Gamma(a_{i}) {\Gamma}(b_{i})} {{\int}_{0}^{1}} y^{a_{i} +1} (1\,-\,y)^{b_{i} -1}\\ &&\times dy \,-\, \left(\sum\limits_{i=1}^{m} \omega_{i} \mu_{i} \right)^{2} \\ \!\!&= &\!\! \sum\limits_{i=1}^{m} w_{i} \frac{\Gamma(a_{i} \,+\, b_{i})}{\Gamma(a_{i}) {\Gamma}(b_{i})} \frac{\Gamma(a_{i}\,+\,2) {\Gamma}(b_{i})}{\Gamma(a_{i} \,+\, b_{i}\,+\,2)}\,-\, \left(\sum\limits_{i=1}^{m} \omega_{i} \mu_{i} \right)^{2} \\ \!\!&=&\!\! \sum\limits_{i=1}^{m} w_{i} \frac{(a_{i}\,+\,1)a_{i}} {(a_{i} \,+\, b_{i}\,+\,1)(a_{i} \,+\, b_{i})} \,-\, \left(\sum\limits_{i=1}^{m} w_{i} \mu_{i} \right)^{2}. \end{array} $$

(15)

Mean and variance of the posterior distribution f _{p|
v}

. The posterior distribution of $\tilde {p}_{t}$, f _p|v, is a mixture of beta-distribution (Proposition 1). Using Eqs. 14 and 15 with appropriate substitutions, we can identify:

$$\begin{array}{@{}rcl@{}} \mathbb{E}[f_{p|v}(p | \tilde{v}_{t}=v_{t})] &=& \sum\limits_{j=0}^{v_{t}} \sum\limits_{k=0}^{n_{t} -v_{t}} \sum\limits_{i=1}^{m} \omega^{\prime}_{j, k, i}\\ &&\times\frac{a_{t,i}+j+k}{a_{t,i} + b_{t,i}+n_{t}} \end{array} $$

(16)

and

$$\begin{array}{@{}rcl@{}} \mathbb{V}[f_{p|v}(p | \tilde{v}_{t}\,=\,v_{t})] \!&=&\! \sum\limits_{j=0}^{v_{t}} \sum\limits_{k=0}^{n_{t} -v_{t}} \sum\limits_{i=1}^{m} \omega^{\prime}_{j, k, i}\\ &&\times\frac{(a_{t,i}\,+\,j\,+\,k\,+\,1)(a_{t,i}\,+\,j\,+\,k)} {(a_{t,i} \,+\, b_{t,i}\,+\,n_{t}\,+\,1)(a_{t,i} \,+\, b_{t,i}\,+\,n_{t})} \\ && - \left(\mathbb{E}[f_{p|v}(p | \tilde{v}_{t}\,=\,v_{t})] \right)^{2} \end{array} $$

(17)

where ω _i is the prior weight on the i-th component of the prior distribution, and

$$\omega^{\prime}_{j, k, i} = \frac{ \omega_{i} {q_{1}^{j}} (1-q_{2})^{v_{t}-j} (1-q_{1})^{k} q_{2}^{n_{t}-v_{t}-k} \frac{\Gamma(a_{t,i}+b_{t,i}) {\Gamma}(a_{t,i} +j+k) {\Gamma}(b_{t,i} +n_{t}-j -k)} { {\Gamma}(j+1) {\Gamma}(v_{t}-j+1) {\Gamma}(k+1) {\Gamma}(n_{t}-v_{t}-k+1) {\Gamma}(a_{t,i}){\Gamma}(b_{t,i}) {\Gamma}(a_{t,i} + b_{t,i} +n_{t})}} { {\sum}_{r=0}^{v_{t}} {\sum}_{s=0}^{n_{t} -v_{t}} {\sum}_{i=1}^{m} \frac{ \omega_{i} {q_{1}^{r}} (1-q_{2})^{v_{t}-r} (1-q_{1})^{s} q_{2}^{n_{t} - v_{t}-s} {\Gamma}(a_{t,i} + b_{t,i}) {\Gamma}(a_{t,i} +r+s) {\Gamma}(b_{t,i} +n_{t}-r -s)}{\Gamma(r+1) {\Gamma}(v_{t}-r+1) {\Gamma}(s+1) {\Gamma}(n_{t}-v_{t}-s+1){\Gamma}(a_{t,i}) {\Gamma}(b_{t,i}) {\Gamma}(a_{t,i} + b_{t,i} +n_{t})}} . $$

1.3 A.3 Quality of the Posterior Distribution Approximation

We performed extensive numerical simulations to test the accuracy of the approximation. We generated exact posterior distributions under the following conditions:

Single beta-distribution priors with
$$\begin{array}{@{}rcl@{}} \mu(x) &\in& \{0.001, 0.0025, 0.005, 0.0075, 0.01, 0.02,\\ &&\quad 0.03, 0.04\} \text{ and }\\ \sigma(x) &\in& \{0.001, 0.002, ..., 0.006\}. \end{array} $$
Information-collection technologies with
$$\begin{array}{@{}rcl@{}} q_{1} \!&\in&\! \{0.7, 0.8, 0.9, 0.95, 0.97, 0.99, 1\} \text{ and }\\ q_{2} \!&\in&\! \{0.7, 0.8, 0.9, 0.95, 0.97, 0.99, 0.999, 0.9996, 1\}. \end{array} $$
Sample sizes n ∈{100, 500, 1000, 2500, 5000}.
Observations v at percentiles of
$$\begin{array}{@{}rcl@{}} \{0.00001, 0.0001, 0.001, 0.002, 0.0025, 0.005, ...,\\ 0.9925, 0.9975, 0.998, 0.999, 0.9999, 0.99999\}. \end{array} $$

In total, 757,781 exact distributions were generated, approximately 600,000 of which had means and variance in the policy-relevant region for our numerical analysis (μ(x) ∈ (0, 0.04) × σ(x) ∈ (0, 0.008)). We calculated the Kolmogorov-distance—the maximum distance between the cumulative distribution functions—between each exact beta-mixture posterior distribution and a single beta-distribution with the same mean and standard deviation.

The Kolmogorov-distance between the cumulative density function of the exact posterior distributions and that of the approximation with matching mean and variance was generally small (< 2%) (Fig. 6). The Kolmogorov distances only increased in magnitude for very small means or small means and large standard deviations. Kolmogorov distances above 2% typically appeared only in the stopping region of our numerical example or in the upper left-hand section of the information collection region with small means and high standard deviations (σ(x) > 0.006). Based on the numerical experiments, for the purposes of informing the practical policy decision, the approximate belief update using moment matching proves to be of reasonably high quality.

1.4 A.4 Derivation of Dynamics in Mean-Variance space

The state x _t = (a _t,b _t), which contains the parameters of the distribution of $\tilde {p}_{t}$ representing the policy-maker’s current beliefs, follows a law of motion of the form

$$x_{t+1} = \phi(\hat{x}_{t}) = \left[ \begin{array}{cc} z & 0 \\ 1-z & 1 \end{array} \right] \hat{x}_{t} , $$

where z ∈ (0, 1) is the decay rate and $\hat {x}_{t}=(\hat {a}_{t}, \hat {b}_{t}) =\psi (x_{t},n_{t},v_{t},q)$ is the Bayesian update of x _t given v _t positive observations out of a test of n _t individuals in the current cohort.

First, we see that

$$x_{t+1} = \left[ \begin{array}{cc} a_{t+1} \\ b_{t+1} \end{array} \right] = \left[ \begin{array}{cc} z \hat{a}_{t} \\ (1-z) \hat{a}_{t} + \hat{b}_{t} \end{array} \right] . $$

Derivation of μ(x _t+1) beginning with the expectation of a beta-distribution:

$$\begin{array}{@{}rcl@{}} \mu(x_{t+1}) &=& \frac{a_{t+1}} { a_{t+1} + b_{t+1}} = \frac{z \hat{a}_{t}}{z \hat{a}_{t} + (1-z) \hat{a}_{t} + \hat{b}_{t}}\\ &=& \frac{z \hat{a}_{t}}{ \hat{a}_{t} + \hat{b}_{t}} = z \mu(\hat{x}_{t}). \end{array} $$

Derivation of σ ²(x _t+1) beginning with the variance of a beta-distribution:

$$\begin{array}{@{}rcl@{}} \sigma^{2}(x_{t+1}) \!\!&= &\!\! \frac{a_{t+1} b_{t+1}} { (a_{t+1} \,+\, b_{t+1})^{2} (a_{t+1} \,+\, b_{t+1} \,+\, 1 )}\\ \!\!&= &\!\! \frac{(z \hat{a}_{t}) ((1\,-\,z) \hat{a}_{t} \,+\, \hat{b}_{t})} { (z \hat{a}_{t} \,+\, (1-z) \hat{a}_{t} \,+\, \hat{b}_{t} )^{2} (z \hat{a}_{t} \,+\, (1\,-\,z) \hat{a}_{t} \,+\, \hat{b}_{t} \,+\, 1 )}\\ \!\!&=&\!\! \frac{z (1\,-\,z) \hat{a}_{t}^{2} \,+\, z \hat{a}_{t} \hat{b}_{t}} { (\hat{a}_{t} \,+\, \hat{b}_{t} )^{2} (\hat{a}_{t} \,+\, \hat{b}_{t} \,+\, 1 )}\\ \!\!&= &\!\! z (1\,-\,z) \left(\frac{ \hat{a}_{t}^{2}} { (\hat{a}_{t} \,+\, \hat{b}_{t} )^{2} (\hat{a}_{t} \,+\, \hat{b}_{t} \,+\, 1 )} \!\times\! \frac{\hat{b}_{t}}{\hat{b}_{t}} \!\times\! \frac{\hat{a}_{t} \,+\, \hat{b}_{t}}{\hat{a}_{t} \,+\, \hat{b}_{t}}\right)\\ &&\!\!+ z \left(\frac{ \hat{a}_{t} \hat{b}_{t} } { (\hat{a}_{t} \,+\, \hat{b}_{t} )^{2} (\hat{a}_{t} \,+\, \hat{b}_{t} \,+\, 1 )} \right) \end{array} $$

$$\begin{array}{@{}rcl@{}} \quad\quad\quad~~~\!\!&= &\!\! z (1-z) \left(\frac{ \hat{a}_{t} \hat{b}_{t}} { (\hat{a}_{t} + \hat{b}_{t} )^{2} (\hat{a}_{t} + \hat{b}_{t} + 1 )}\right.\\ &&\quad\quad\quad\quad\left.\times \frac{\hat{a}_{t}}{\hat{a}_{t} + \hat{b}_{t}} \times \frac{\hat{a}_{t} + \hat{b}_{t}}{\hat{b}_{t}} \right)\\ &&+ z \sigma^{2}(x_{t}) \\ \!\!&= &\!\! z (1-z) \sigma^{2}(\hat{x}_{t}) \left(\frac{\mu(\hat{x}_{t})}{1-\mu(\hat{x}_{t})} \right) + z \sigma^{2}(\hat{x}_{t}) \\ \!\!&= &\!\! \sigma^{2}(\hat{x}_{t}) \left(z + z (1-z) \left(\frac{\mu(\hat{x}_{t})}{1-\mu(\hat{x}_{t})} \right)\right). \end{array} $$

1.5 A.5 Derivation of Eq. 8

For $\mu (x_{0})\geq \frac {\gamma }{\theta }$, given Eq. 1 and using the fact that μ(x _t) = z ^t μ(x ₀), we can identify the optimal time to stop the intervention, T(x ₀), which is the first period in which the intervention has a nonpositive INMB. Specifically, we seek the value of t such that E[g(x _t)] = 0.

$$\begin{array}{@{}rcl@{}} E\left[g(\tilde{p}_{t})\right] = \theta \mu(x_{t}) - \gamma & =& \theta z^{t} \mu(x_{0}) - \gamma = 0 \\ z^{t} & =& \frac{\gamma}{\theta \mu(x_{0})} \\ t & =& \frac{1}{\ln(z)} \ln \left(\frac{\gamma}{\theta \mu(x_{0})} \right) \end{array} $$

Since decisions can only be made at discrete time intervals, we identify the optimal time to stop the intervention, T(x ₀), as the first integer period in which the intervention has a nonpositive INMB

$$T(x_{0}) = \left\lceil \frac{1}{\ln(z)} \ln \left(\frac{\gamma}{\theta \mu(x_{0})} \right) \right\rceil . $$

1.6 A.6 Proof of Proposition 2

Rewriting Eq. 9 as

$$V_{\text{NoInfo}}(x_{0}) = \sum\limits_{t=0}^{\infty}\delta^{t}(\theta z^{t} \mu(x_{0}) - \gamma ) \mathbf{1}_{\{t\leq T(x_{0})\}}, $$

where T(x ₀) is the optimal time to stop the intervention, the claim follows from the fact that each term is a nondecreasing, convex function. Specifically for each t, δ ^t(𝜃 z ^t μ(x ₀) − γ) is linear in μ(x ₀) with 𝜃 δ ^t z ^t ≥ 0 and T(x ₀) nondecreasing in μ(x ₀). Noting that when t = T(x ₀) = t(x ₀), (𝜃 z ^t μ(x ₀) − γ) = 0, we can conclude that δ ^t(𝜃 z ^t μ(x ₀) − γ)1 _{{t ≤ T(x} ₀)} is a continuous, nondecreasing convex function.

1.7 A.7 Proof of Proposition 3

We apply Proposition 5 in Smith and McCardle [20], which states that if (a) the current-period reward function satisfies the structural property (such as convexity and monotonicity in μ(x _t)) for each action, and (b) the state transitions satisfy a stochastic version of the structural property for each action, then the value function satisfies the structural property in a finite-horizon setting. In our setting, which is in principle infinite-horizon up to the stopping time, if the structural property is satisfied in the final period, just before the optimal stopping time, then the previous-period value function is obtained via maximization over functions that each satisfy the structural property. Thus, if the structural property is preserved by maximization, then the previous-period value function also satisfies the structural property. For the proof we assume a perfect detection technology to simplify exposition. We extend to the general case at the end of the proof.

First, condition (a) is satisfied because the current-period expected reward, $\mathbb {E}[g(\tilde {p}_{t}, u_{t})|x_{t}]$, is nondecreasing in μ(x _t) and σ ²(x _t) and (at least weakly) convex in μ(x _t) for any action $u_{t}\in \mathcal {D} \times \mathcal {N}$.

Second, the Bayesian update ψ preserves stochastic dominance of the beta-distributed prior, in the sense that if one prior is stochastically dominated by another prior, the corresponding posteriors will exhibit the same dominance relationship, under first-order stochastic dominance (FOSD) and second-order stochastic dominance (SOSD).^{Footnote 5} In particular, an increase of the mean μ(x _t) will result in an increase of the posterior mean, and an increase of the variance σ ²(x _t) in an increase of the posterior variance. To see this for SOSD, consider a mean preserving spread, so x t(1), x t(2) with μ(x t(1)) = μ(x t(2)) and σ ²(x t(1)) < σ ²(x t(2)). Then σ ²(x t(1)) < σ ²(x t(2)) implies a t(1) + b t(1) > a t(2) + b t(2), and $\mathbb {V}[\tilde {s}_{t}| x_{t}^{(1)}, n_{t}] \leqslant \mathbb {V}[\tilde {s}_{t}| x_{t}^{(2)}, n_{t}]$. The conditional variance of the next-period belief given n _t = η ≥ 0 samples is $\mathbb {V}[x_{t+1}|x_{t}, \tilde {s}_{t}, n_{t}=\eta ] = \frac {z(a_{t}+\tilde {s}_{t})(a_{t}+b_{t}+\eta -z(a_{t} + \tilde {s}_{t}))}{(a_{t} + b_{t} + \eta )^{2}(a_{t}+b_{t}+\eta +1)}$. The variance of the next-period belief is obtained using the law of total variance, so $\mathbb {V}[x_{t+1}|x_{t}^{(1)}, n_{t}=\eta ] <\mathbb {V}[x_{t+1}|x_{t}^{(2)}, n_{t}=\eta ]$; hence, ϕ ∘ ψ is increasing in σ ². Thus, the Bayesian update is increasing in μ(x _t) and σ ²(x _t), and the same holds true for its beta-approximation ψ introduced in Section 2.2. The state-transition function ϕ is linear (time-invariant) in μ(x _t) and linear (time-variant) in σ ²(x _t), so that the Bayesian-updated state-transition function ϕ ∘ ψ is increasing in (μ(x _t),σ ²(x _t)).

Finally, the convexity in μ(x _t) survives the maximization in the Bellman equation given that the objective function is supermodular in (u _t,μ(x _t)). Since the sum of nondecreasing convex functions is nondecreasing and convex, we only need a terminal condition to satisfy the backwards-induction approach presented by Smith and McCardle [20]. Note that since 0 < z < 1, $\lim _{t \to \infty } \mu (\phi (\psi (x_{t}, n_{t}, \tilde {v}_{t}, q))) = 0$ and $\lim _{t \to \infty } \sigma ^{2}(\phi (\psi (x_{t}, n_{t}, \tilde {v}_{t}, q))) = 0$. Therefore, there exists a time $T <\infty $ for which, given any initial state, an optimal policy is to stop the intervention, i.e., u _T = (0, 0), and V (x _T) = 0. Through the mean- and, ultimately, variance-reducing dynamics, with or without the variance-reducing acquisition of information, any initial state eventually approaches a ‘termination’ state over time. Since the reward of this state is zero, which is nondecreasing and convex, we conclude that V (x _t) is nondecreasing and convex in μ(x _t); by a similar argument it is also increasing in σ ²(x _t).

This proof relies mainly on the stochastic-dominance ordering, and it therefore directly extends to the case with misclassification. Condition (a) continues to be satisfied, since it relates only to the current period and is not influenced by information collection. Second, as before, Bayesian updating preserves stochastic dominance of the beta-mixture prior, in the sense that if one prior stochastically dominates another prior, the corresponding posteriors conserve the dominance ordering, for FOSD and SOSD. Finally, imperfect information collection does not affect the supermodularity of the objective function in (u _t,μ(x _t)), so convexity in μ(x _t) survives the maximization in the Bellman equation. This allows for backward induction starting with the ‘stop intervention’-region at zero reward, as described above.

1.8 A.8 Proof of Corollary 1

For the case where no information is available, this corollary was already shown to be true with the derivation of a threshold policy in Section 3.1. For the general case, with or without information collection in the current or future periods, we rely on the properties of the value function demonstrated in Proposition 3: if μ(x t(1)) < μ(x t(2)) and σ(x t(1)) = σ(x t(2)), then $V(x_{t}^{(1)}) \leqslant V(x_{t}^{(2)})$. This directly implies that, if it is optimal to do the intervention with μ(x t(1)), then it is also optimal to do the intervention at μ(x t(2)). Furthermore, if it is not optimal to do the intervention with μ(x t(2)) then, because $V(x_{t}^{(1)}) \leqslant V(x_{t}^{(2)})$, it is also not optimal to do the intervention at μ(x t(1)).

1.9 A.9 Proof of Corollary 2

For the case where no information is available, the optimal policy does not depend on σ(x _t) (see the threshold policy in Section 3.1). For the general case, with or without information collection in the current or future periods, we rely on the properties of the value function demonstrated in Proposition 3: if μ(x t(1)) = μ(x t(2)) and σ(x t(1)) < σ(x t(2)), then $V(x_{t}^{(1)}) \leqslant V(x_{t}^{(2)})$. This directly implies, if it is optimal to do the intervention with σ(x t(1)), then it is also optimal to do the intervention at σ(x t(2)). Furthermore, if it is not optimal to do the intervention with σ(x t(2)) then, because $V(x_{t}^{(1)}) \leqslant V(x_{t}^{(2)})$, it is also not optimal to do the intervention at σ(x t(1)).

1.10 A.10 Proof of Proposition 4

Misclassification results in a posterior distribution which has greater variance than would occur with the same sample size and a perfect detection technology. When evaluating the expected value with information collection, greater variance in the posterior implies an expected value over a larger number of states where the optimal next action is to not do the intervention and, therefore, have a value of 0.

Given a perfect detection technology, smaller sample sizes will have greater variance in the posterior distribution than larger sample sizes. Therefore, misclassification can be thought of as an effective sample size reduction or, equivalently, an increase in cost for each full unit of information.

An increase in cost or a decrease in the expected value of the next state given information that was collected this period, decrease the value of the information-collection alternative. Therefore, there are fewer states for which information collection is the optimal action, i.e., the action providing the greatest expected value. Misclassification has no effect on the option not to do the intervention, and has a limited effect on the immediate option not to collect information this period (it would influence this option only when the optimal action of a subsequent state is to collect information).

1.11 A.11 Supplemental methods and results for HCV screening example

1.11.1 A.11.1 Development of linear INMB

A schematic of the HCV screening decision problem for a single cohort is presented in Fig. 7.

We denote λ as the willingness-to-pay threshold, q ₁ and q ₂ as the test sensitivity and specificity, C _S > 0 as the cost of the screening test, $B_{S} \leqslant 0$ as the quality-of-life loss from the screening test, $C_{FP} \geqslant 0$ as the cost of correcting a false-positive test result, $B_{FP} \leqslant 0$ as the quality-of-life loss from a false-positive test result, and we denote the lifetime discounted costs and benefits of the true-positive, false-negative, and true-negative screening outcomes, C ₁, C ₂, C ₃, and B ₁, B ₂, B ₃, respectively.

The net monetary benefit (NB) of the decision not to screen cohort t is

$$ \text{NB}_{\text{NoScreening}} \,=\, \lambda \left(\tilde{p}_{t} B_{2} \,+\, (1\,-\,\tilde{p}_{t}) B_{3} \right) - \left(\tilde{p}_{t} C_{2} \,+\, (1\!\,-\,\!\tilde{p}_{t}) C_{3} \right). $$

(18)

The net monetary benefit (NB) of the decision not to screen cohort t is

$$\begin{array}{@{}rcl@{}} \text{NB}_{\text{Screening}} \!\!&= &\!\! \lambda \left(\tilde{p}_{t} q_{1} B_{1} \,+\, \tilde{p}_{t} (1\,-\,q_{1}) B_{2} \,+\, (1\,-\,\tilde{p}_{t})\right.\\ &&\quad\left.\!\times\! (1\,-\,q_{2}) (B_{3}\,+\,B_{FP}) \,+\, (1\,-\,\tilde{p}_{t}) q_{2} B_{3} \,+\, B_{S} \right)\\ &&\,-\,\left(\tilde{p}_{t} q_{1} C_{1} \,+\, \tilde{p}_{t} (1\,-\,q_{1}) C_{2} \,+\, (1\,-\,\tilde{p}_{t})\right.\\ &&\quad\!\times\!\! \left.(1\,-\,q_{2}) (C_{3}\,+\,C_{FP}) \,+\, (1\,-\,\!\tilde{p}_{t}) q_{2} C_{3} \,+\, C_{S} \right). \\ \end{array} $$

(19)

The INMB of screening compared to the alternative of not screening is computed as the difference between Eqs. 19 and 18:

$$\begin{array}{@{}rcl@{}} \text{INMB}_{\text{Screening}} \!&= &\! \text{NB}_{\text{Screening}} - \text{NB}_{\text{NoScreening}} \\ \!&= &\! \lambda \left(\tilde{p}_{t} q_{1} B_{1} + \tilde{p}_{t} (1-q_{1}) B_{2} + (1-\tilde{p}_{t})\right. \\ &&\quad\left.\times (1-q_{2}) (B_{3}+B_{FP}) + (1-\tilde{p}_{t})\right. \\ &&\quad\left.\times q_{2} B_{3} + B_{S} \right)\\&&- \left(\tilde{p}_{t} q_{1} C_{1} + \tilde{p}_{t} (1-q_{1}) C_{2} + (1-\tilde{p}_{t})\right. \\ &&\quad\left.\times (1-q_{2}) (C_{3}+C_{FP}) + (1-\tilde{p}_{t})\right. \\ &&\quad\left.\times q_{2} C_{3} + C_{S} \right) \\ && - \left(\lambda \left(\tilde{p}_{t} B_{2} + (1-\tilde{p}_{t}) B_{3} \right)\right. \\ &&\quad\left.- \left(\tilde{p}_{t} C_{2} + (1-\tilde{p}_{t}) C_{3} \right) \right)\\ \!&= &\! \lambda \tilde{p}_{t} \left[ q_{1} B_{1} + (1-q_{1}) B_{2} -(1-q_{2})\right. \\ &&\quad\left.\times (B_{3}+B_{FP}) - q_{2} B_{3} -B_{2} + B_{3} \right] \\ && + \lambda \left[ (1-q_{2}) (B_{3}+B_{FP}) + q_{2} B_{3}\right. \\ &&\quad\quad\left.+ B_{S} - B_{3} \right] \\ && - \tilde{p}_{t} \left[ q_{1} C_{1} + (1-q_{1}) C_{2} - (1-q_{2})\right. \\ &&\quad\quad\left.\times (C_{3}+C_{FP}) - q_{2} C_{3} - C_{2} + C_{3} \right] \\ && - (1\,-\,q_{2}) (C_{3}\,+\,C_{FP}) - q_{2} C_{3} - C_{S} + C_{3} \\ \!&= &\! \tilde{p}_{t} \left(q_{1} \left[ \lambda (B_{1} - B_{2} ) - (C_{1} - C_{2}) \right]\right. \\ &&\quad~\left.\times - (1-q_{2}) \left[ \lambda B_{FP} - C_{FP} \right] \right) \\ && + \lambda B_{S} - C_{S} + (1-q_{2}) (\lambda B_{FP} - C_{FP}) \end{array} $$

With terms collected, the INMB of screening at age 50 compared to not screening at time t in a cohort with HCV prevalence $\tilde {p}_{t}$ can be written $\text {INMB}_{t}=\theta \tilde {p}_{t} - \gamma $. The marginal INMB of early diagnosis and treatment for an individual with HCV, 𝜃, and the fixed INMB of screening, γ, are

$$ \theta= q_{1} \left[ \lambda (B_{1} - B_{2} ) - (C_{1} - C_{2}) \right] - (1-q_{2}) \left[ \lambda B_{FP} - C_{FP} \right] , $$

(20)

and

$$ \gamma=C_{S} - \lambda B_{S} - (1-q_{2}) (\lambda B_{FP} - C_{FP}). $$

(21)

1.11.2 A.11.2 Parameter estimation

Consistent with the recommendations of the US Panel on Cost-Effectiveness in Health and Medicine, we adopted a societal perspective, considered costs and benefits over a lifetime horizon for each cohort, and discounted future costs and health benefits at 3% annually [1]. We measured costs in 2010 US dollars and adjusted for inflation using the Consumer Price Index when appropriate [90]. Benefits are measured in quality-adjusted life-years (QALYs). We assumed a mid-range value for society’s maximum willingness to pay of $75,000 per QALY gained [84].

Estimating the lifetime costs and benefits of each HCV screening outcome for cohorts of asymptomatic 50-year old men and women requires a detailed natural history model of HCV. We used the model by Liu et al. [14] to estimate the lifetime costs and benefits of each HCV screening outcome.

We assumed that the cohort size at each period, the number of people who attend a preventive health exam at age 50, is constant over time, N, since there is less than 10% variation from the average population size across cohorts currently aged between 25 and 55 years of age [79]. At the beginning of each period t, the policy-maker simultaneously decides whether to screen the current cohort for HCV and whether to conduct a study of sample size n _t to better estimate the current prevalence of HCV. Information arrives at the end of the current period and is used, together with the prevalence dynamics, to inform the screening decision at t + 1 for the next cohort. We assumed that the cost of sample information, K(n _t), is affine in the sample size with a fixed cost of $50,000 and variable cost of $100 [83].

We used the National Health and Nutrition Examination Survey (NHANES) to estimate birth-cohort-specific HCV prevalence, HCV-prevalence dynamics, and the proportion of individuals currently unaware of their infection status. Ultimately, we estimated the HCV prevalence for our initial cohorts, men and women born in 1960 who are currently unaware of their infection status, to be 3.1% (95% CI: 2.4-3.8%) and 1.4% (95% CI: 1.0-1.7%), respectively. Restricting the analysis to individuals born between 1956 and 1980 (n = 12,607), we identified the rate of prevalence decay to be 0.893 (95% CI: 0.871-0.915) using logistic regression, controlling for race and gender. We present the detailed methods and results of this primary analysis below.

1.11.3 A.11.3 National health and nutrition examination survey (NHANES) analysis

Overview

The National Center for Health Statistics periodically conducts NHANES to compile representative statistics on the health of the US population [88]. Our analysis includes data collected from 1999 through 2010. Participants were chosen according to a stratified multistage algorithm to produce a representative sample of the civilian, non-institutionalized population of all 50 states and the District of Columbia. Only participants at least age 6 years old were eligible for HCV testing because of low blood sample volume in younger children. Birth years for individuals younger than 85 years, for survey years 1999-2006, and for individuals younger than 80 years old, for survey years 2008-2010, were estimated using their age in months and the 6-month window of the survey. Age in months is not provided for older individuals. Beginning with the 2001/2002 survey, participants testing HCV-positive were informed in writing of their test results, and four months later, they received a follow-up telephone questionnaire.

Statistical Analysis

All statistical analyses were performed in SAS version 9.3 (SAS Institute, Cary, North Carolina) according to National Center for Health Statistics guidelines [86, 87]. We accounted for the complex survey design using the appropriate study design variables, sampling weights, and by using SAS Survey procedures. Logistic regression analysis was used to identify the rate of HCV-prevalence decay over successive birth cohorts for individuals born between 1956 and 1980. Finally, using the follow up survey in HCV-positive individuals, we estimated the proportion of HCV-positive individuals who were unaware of their infection status prior to participation in NHANES.

Results

Of 51,587 participants of at least age 6 years surveyed between 1999 and 2010, 45,153 gave a blood sample suitable for HCV-antibody testing (final response rate for testing, 87.5%). Restricting analysis to individuals born between years 1956 and 1980 (n=12,607), we identified the rate of prevalence decay to be 0.893 (95% CI: 0.871-0.915) using logistic regression controlling for race and gender (Table 5). Using the regression, the predicted HCV prevalence for men and women born in 1960 are 4.7% (95% CI: 3.8-5.7%) and 2.9% (95% CI: 2.3-3.6%), respectively.

Table 5 Logistic regression predicting HCV prevalence for 1956-1980 birth cohorts using NHANES 1999–2010 (n = 12,607)

Full size table

Since the 2001/02 survey, 500 subjects were identified as HCV-positive and contacted for follow-up which included asking if they were previously aware of their HCV-infection status. The response rate to the follow-up questionnaire was 206 (41%). Using logistic regression, we estimated the proportion of men and women who were unaware of their HCV-infection status prior to participating in the NHANES study to be 55% (95% CI: 46-65%) and 39% (95% CI: 30-39%), respectively (Table 6). Because of the small sample size, we did not stratify analysis by birth year; race was excluded from the final regression model because it was not a significant predictor of prior infection-status awareness.

Table 6 Logistic regression predicting the proportion of HCV-positive individuals unaware of their infection status using NHANES 1999–2010 (n = 206)

Full size table

To compute the HCV prevalence among those who are currently unaware of their infection status, we also needed an estimate of the proportion of individuals who are aware of their HCV-negative status, which is unknown. We assumed it to be 15%, consistent with Liu et al. [14]. Using the logistic regression model to predict birth-cohort-specific HCV prevalence and adjusting for the number of individuals who are unaware of their infection status, we estimated the HCV prevalence for our initial cohorts, men and women born in 1960 who are currently unaware of their infection status, is 3.1% (95% CI: 2.4-3.8%) and 1.4% (95% CI: 1.0-1.7%), respectively.

1.11.4 A.11.4 Sensitivity analysis

We performed sensitivity analysis to evaluate the robustness of the optimal policy to uncertainty in model inputs (Table 7). We identified that the general conclusions of our analysis are robust to the uncertainty inputs. Specifically, for women, we find that the optimal time to collect information ranges from immediately to 7 years. For men, we find that the optimal time to collect information ranges from 11-21 years with the exception of scenarios in which we considered a low value of z. A very low value of z implies the prevalence of HCV is rapidly decreasing across birth cohorts. If this is the case, it is optimal to collect additional information immediately.

Table 7 Comparison of optimal policies indicated by various analytic approaches

Full size table

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Cite this article

Cipriano, L.E., Weber, T.A. Population-level intervention and information collection in dynamic healthcare policy. Health Care Manag Sci 21, 604–631 (2018). https://doi.org/10.1007/s10729-017-9415-5

Download citation

Received: 23 February 2017
Accepted: 10 August 2017
Published: 08 September 2017
Issue Date: December 2018
DOI: https://doi.org/10.1007/s10729-017-9415-5

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Population-level intervention and information collection in dynamic healthcare policy

Abstract

Similar content being viewed by others

A Multi-Fidelity Rollout Algorithm for Dynamic Resource Allocation in Population Disease Management

Optimizing patient treatment decisions in an era of rapid technological advances: the case of hepatitis C treatment

Markov modeling in hepatitis B screening and linkage to care

Explore related subjects

1 Introduction

1.1 Related literature and contribution

Technology adoption

Stochastic dynamic programs in healthcare

Health economics and value of information in healthcare

Contribution

2 The model

2.1 The information-acquisition problem

Proposition 1

Proof

Remark 1

2.2 Approximate Bayesian inference

2.3 System dynamics

2.4 The policy-maker’s problem

Remark 2

3 Dynamic healthcare decisions

3.1 Policies without information acquisition

Proposition 2

Proof

Remark 3

3.2 Policies with information acquisition

Proposition 3

Proof

3.2.1 Special case: one-time information collection

3.2.2 General case: information collection in any period

Corollary 1

Proof

Corollary 2

Proof

Proposition 4

Proof

4 Application

4.1 Background and motivation

4.2 Results

4.2.1 Policies identified by alternative approaches

4.2.2 Model results

4.3 Discussion of application

5 Conclusion

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Additional information

Open Access

Appendix

Appendix

1.1 A.1 Proof of Proposition 1

1.2 A.2 General properties of mixtures of beta-distributions

1.2.1 A.2.1 The number of component distributions in the posterior mixture distribution

Perfect information-collection technology (i.e., q 1 = q 2 = 1).

Imperfect information-collection technology (i.e., \(\min \{q_{1},\) q 2} < 1).

1.2.2 A.2.2 Mean and variance of mixtures of beta-distributions

Mean and variance of the posterior distribution f p| v

1.3 A.3 Quality of the Posterior Distribution Approximation

1.4 A.4 Derivation of Dynamics in Mean-Variance space

1.5 A.5 Derivation of Eq. 8

1.6 A.6 Proof of Proposition 2

1.7 A.7 Proof of Proposition 3

1.8 A.8 Proof of Corollary 1

1.9 A.9 Proof of Corollary 2

1.10 A.10 Proof of Proposition 4

1.11 A.11 Supplemental methods and results for HCV screening example

1.11.1 A.11.1 Development of linear INMB

1.11.2 A.11.2 Parameter estimation

1.11.3 A.11.3 National health and nutrition examination survey (NHANES) analysis

Overview

Statistical Analysis

Results

1.11.4 A.11.4 Sensitivity analysis

Rights and permissions

About this article

Perfect information-collection technology (i.e., q ₁ = q ₂ = 1).

Imperfect information-collection technology (i.e., \(\min \{q_{1},\) q ₂} < 1).

Mean and variance of the posterior distribution f _{p|
v}