Abstract
Identifying the optimal screening strategies for breast cancer, the second leading cause of female cancer deaths in the US, is a major societal problem that has generated much controversy. The optimal screening strategies depend significantly on the sensitivity and specificity of the screening modality used. While the current state-of-the-art screening technology is mammography, its sensitivity or specificity may increase over time, or mammography may be replaced by another technology such as tomosynthesis in the near future. The purpose of this study is to identify the optimal use of the next generation of breast cancer screening modalities, whose sensitivity and specificity in clinical practice are either as yet unknown or keep improving over time. In contrast to the prior literature, which focuses on finding the optimal screening policy for given sensitivity and specificity values, we take an inverse optimization approach and focus on finding the range of sensitivity and specificity values for which a given screening policy is optimal. To replicate breast cancer progression in the US population under various screening policies, we develop a parametric Partially Observable Markov Chain (POMC) model, which accounts for unobservable and age-specific disease progression, age-specific mortality, and the possibility of detecting cancer without a screening exam (either via self-detection or a clinical breast exam). We then formulate a nonlinear program (NLP) to identify the range of sensitivity and specificity values for which a particular screening policy is optimal. We show that this NLP is nonconvex for some parameter values, and hence difficult to solve. We prove several structural properties of the model, and by exploiting these properties, we propose a complete solution algorithm for this problem. We use real data in our numerical analysis and show that with the current technology, biennial breast cancer screening is slightly better than annual screening for the average-risk population. We also find that an improvement only in sensitivity (but not in specificity) will not change the current optimal policy. Furthermore, we characterize the potential quality-adjusted life years (QALYs) lost due to suboptimal practice, and show that biennial screening is more robust than annual screening in the sense that it results in fewer QALYs lost when a suboptimal screening policy is chosen. Given that multicenter clinical trials may be prohibitively expensive and lengthy, our findings may be especially valuable to policymakers in deciding on the optimal use of an emerging breast cancer screening modality, and in adopting a new technology in different settings.
Acknowledgements
The author thanks Jagpreet Chhatwal, Qiushi Chen, Chelsea C. White III, Jeff Pavelka, Anthony Bonifonte, Sait Tunc, and the anonymous reviewers for their suggestions and insights, which have improved this manuscript.
Appendix: Proofs of the Analytical Results
Proof of Proposition 1
We prove this by induction. Basis: \(V_{\pi^{i}_{T}}(b,x,y) = \sum_{s \in S^{PO}}b(s) r_{T}(s)\). Now, we will present \(V_{\pi^{i}_{t+1}}(\tau[b,a,o],x,y)\) in terms of the α-vectors. Substituting the value of τ[b,a,o] from (2) into \(V_{\pi ^{i}_{t+1}}(\tau[b,a,o],x,y) =\sum_{s' \in S^{PO}}\tau[b,a,o](s')\alpha _{\pi^{i}_{t+1}}(s',x,y)\), we get
In (17), \(\sum_{s \in S^{PO}}b(s)K_{t}^{a}(o|s)\) does not depend on s′, hence we can move it out of the summation. Also, changing the order of summation, we obtain the following:
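As a hedged sketch of these two steps: if the belief update in (2) takes the standard form \(\tau[b,a,o](s') = \sum_{s \in S^{PO}} b(s)K_{t}^{a}(o|s)P_{t}^{(a,o)}(s'|s) / \sum_{s \in S^{PO}} b(s)K_{t}^{a}(o|s)\) (an assumption here, chosen to match the denominator named above and the \(P_{t}^{(a,o)}(s'|s)\) notation used later in this proof), then the substitution and the subsequent rearrangement would read as follows. The tags (17) and (18) are inferred from the surrounding text.

```latex
\begin{align}
V_{\pi^{i}_{t+1}}(\tau[b,a,o],x,y)
  &= \sum_{s' \in S^{PO}}
     \frac{\sum_{s \in S^{PO}} b(s)\, K_{t}^{a}(o|s)\, P_{t}^{(a,o)}(s'|s)}
          {\sum_{s \in S^{PO}} b(s)\, K_{t}^{a}(o|s)}\,
     \alpha_{\pi^{i}_{t+1}}(s',x,y) \tag{17} \\
  &= \frac{\sum_{s \in S^{PO}} b(s) \sum_{s' \in S^{PO}}
           K_{t}^{a}(o|s)\, P_{t}^{(a,o)}(s'|s)\,
           \alpha_{\pi^{i}_{t+1}}(s',x,y)}
          {\sum_{s \in S^{PO}} b(s)\, K_{t}^{a}(o|s)} \tag{18}
\end{align}
```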
Now, we substitute the value of \(V_{\pi^{i}_{t+1}}(\tau[b,a,o],x,y) \) to obtain \(V_{\pi^{i}_{t}}(b,x,y)\) in terms of α-vectors.
Case 1: \(d^{i}_{t}=E\)
From (4), we know that
where (19) follows from replacing x with \(K_{t}^{E}(E-|0)\) and y with \(K_{t}^{E}(E+|s)\) when s∈{1,2}, and the fact that \(\mathbb{K}_{t}^{E}\) is a stochastic matrix.
Substituting the values of \(V_{\pi^{i}_{t+1}}(\tau[b,E,E-],x,y)\) and \(V_{\pi^{i}_{t+1}}(\tau[b,E,E+],x,y)\) from (18) we obtain
where (21) follows from changing the order of summation, rearranging the terms, and canceling the identical terms in the numerator and denominator; (22) follows from replacing \(K_{t}^{E}(E-|0)\) with x, \(K_{t}^{E}(E+|s)\) when s∈{1,2} with y, and the fact that \(\mathbb{K}_{t}^{E}\) is a stochastic matrix; and (23) follows because \(P_{t}^{(E,E-)}(s'|0) =P_{t}^{(E,E+)}(s'|0)\) for all t≤T by Assumption 3.
Case 2: \(d^{i}_{t}=W\)
Similar to Case 1, substituting the value of \(V_{\pi^{i}_{t+1}}(\tau [b,W,o],x,y)\) from (18) into (4), we obtain
where (24) follows from changing the order of summation and rearranging the terms, (25) follows from canceling the terms in the numerator and denominator, and (26) follows from simple algebra. □
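The linearity in b that Proposition 1 establishes can be checked numerically on a small example. The sketch below is not the paper's calibrated model: the three-state chain, every number in it, and the helper names (`K`, `P`, `reward`, `alpha_vector`, `value_by_belief`) are illustrative assumptions. It computes the value of a fixed policy twice, once by explicit belief updates and once as \(\sum_{s} b(s)\alpha(s,x,y)\), and compares the two.

```python
# Toy 3-state model. Every number below is an illustrative assumption.
S = [0, 1, 2]                          # 0 = cancer-free, 1 = early, 2 = advanced
OBS = {"E": ["E-", "E+"], "W": ["W-", "W+"]}
SELF_DETECT = [0.0, 0.1, 0.3]          # detection probability without an exam
LUMP = [0.0, 10.0, 5.0]                # lump-sum QALYs once cancer is detected
REWARD = [1.0, 0.9, 0.6]               # one-period QALYs while undetected
STAY = [[0.95, 0.04, 0.00],            # undetected progression; rows sum to
        [0.00, 0.60, 0.30],            # less than 1, the rest is mortality
        [0.00, 0.00, 0.80]]
R_TERMINAL = [1.0, 0.8, 0.5]
FP_DISUTILITY = 0.05                   # QALY loss from a false positive


def K(a, o, s, x, y):
    """Observation probability K^a(o|s); x = specificity, y = sensitivity."""
    if a == "E":
        p_pos = (1.0 - x) if s == 0 else y
    else:
        p_pos = SELF_DETECT[s]
    return p_pos if o.endswith("+") else 1.0 - p_pos


def P(a, o, s, s2):
    """Transition probability P^{(a,o)}(s2|s); detected cancer leaves the model."""
    if s > 0 and o.endswith("+"):
        return 0.0
    return STAY[s][s2]


def reward(s, a, o):
    """One-period reward r(s, a, o)."""
    if s > 0 and o.endswith("+"):
        return LUMP[s]
    if s == 0 and o == "E+":
        return REWARD[0] - FP_DISUTILITY
    return REWARD[s]


def alpha_vector(policy, x, y):
    """Backward recursion for the alpha-vector of a fixed action sequence."""
    alpha = list(R_TERMINAL)
    for a in reversed(policy):
        alpha = [sum(K(a, o, s, x, y)
                     * (reward(s, a, o) + sum(P(a, o, s, s2) * alpha[s2] for s2 in S))
                     for o in OBS[a])
                 for s in S]
    return alpha


def value_by_belief(policy, b, x, y):
    """Policy value via explicit belief updates. Mass that leaves the model
    (death or treatment) is dropped, so tau may sum to less than 1."""
    if not policy:
        return sum(b[s] * R_TERMINAL[s] for s in S)
    a, rest = policy[0], policy[1:]
    total = 0.0
    for o in OBS[a]:
        p_o = sum(b[s] * K(a, o, s, x, y) for s in S)
        if p_o == 0.0:
            continue
        total += sum(b[s] * K(a, o, s, x, y) * reward(s, a, o) for s in S)
        tau = [sum(b[s] * K(a, o, s, x, y) * P(a, o, s, s2) for s in S) / p_o
               for s2 in S]
        total += p_o * value_by_belief(rest, tau, x, y)
    return total


policy = ["E", "W", "E", "W"]
b = [0.90, 0.07, 0.03]
alpha = alpha_vector(policy, x=0.9, y=0.8)
v_alpha = sum(b[s] * alpha[s] for s in S)
v_belief = value_by_belief(policy, b, x=0.9, y=0.8)
print(abs(v_alpha - v_belief) < 1e-9)
```

The agreement holds for any fixed action sequence because the normalizing term of the belief update cancels against the observation probability, which is the cancellation used in steps (21) and (25).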
Proof of Lemma 1
We prove the weak-inequality case; the proof of the strict inequality in Part (a) is very similar, the only difference being the basis of the induction. By Proposition 1, \(V_{\pi^{i}_{t}}(b,x_{2},y_{2}) \geq V_{\pi ^{i}_{t}}(b,x_{1},y_{1})\) if \(\sum_{s \in S^{PO}}b(s)\alpha_{\pi ^{i}_{t}}(s,x_{2},y_{2}) \geq\sum_{s \in S^{PO}}b(s)\alpha_{\pi ^{i}_{t}}(s,x_{1},y_{1})\). Therefore, it is sufficient to show that \(\alpha _{\pi^{i}_{t}}(s,x_{2},y_{2})\geq\alpha_{\pi^{i}_{t}}(s,x_{1},y_{1})\) for all \(s \in S^{PO}\) whenever \(x_{2} \geq x_{1}\) and \(y_{2} \geq y_{1}\). We prove this by induction as follows. Basis: \(\alpha_{\pi^{i}_{T}}(s,x_{2},y_{2})= r_{T}(s) \geq\alpha_{\pi^{i}_{T}}(s,x_{1},y_{1}) = r_{T}(s)\). Suppose the assertion holds for \(\alpha_{\pi^{i}_{t+1}}\); we need to show that it then holds for \(\alpha_{\pi^{i}_{t}}\). Note that when \(d^{i}_{t} = W\), this follows directly from Proposition 1 because of the induction hypothesis. When \(d^{i}_{t} = E\) and s=0, the assertion can be proven as follows:
where (28) follows from the induction hypothesis and from the assumption that \(r_{t}(0,E,E-) - r_{t}(0,E,E+) \geq 0\).
On the other hand, when \(d^{i}_{t} = E\) and s∈{1,2}, the assertion can be proven as follows:
where (29) follows from rearranging the terms, (30) follows from the induction hypothesis, and (31) follows from (9). □
Proof of Lemma 2
This proof follows directly from the proof of Lemma 1, where we show that \(\alpha_{\pi^{i}_{t}}(s,x,y)\) is nondecreasing in x and y for all \(s \in S^{PO}\), \(t \leq T\), and \(\pi^{i}_{t} \in\varPi _{t}\). □
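The monotonicity of the α-vectors in x and y can be illustrated with a grid check on a toy model. In the sketch below, all parameter values and the names `alpha_vector` and `is_nondecreasing` are assumptions made for illustration. The numbers are chosen so that a false positive costs QALYs in state 0 and so that detecting cancer is worth at least as much as missing it, which are the kinds of conditions the proof of Lemma 1 relies on.

```python
import itertools

# Toy 3-state model. Every number below is an illustrative assumption.
S = [0, 1, 2]                          # 0 = cancer-free, 1 = early, 2 = advanced
SELF_DETECT = [0.0, 0.1, 0.3]          # detection probability without an exam
LUMP = [0.0, 10.0, 5.0]                # lump-sum QALYs once cancer is detected
REWARD = [1.0, 0.9, 0.6]               # one-period QALYs while undetected
STAY = [[0.95, 0.04, 0.00],            # undetected progression; rows sum to
        [0.00, 0.60, 0.30],            # less than 1, the rest is mortality
        [0.00, 0.00, 0.80]]
R_TERMINAL = [1.0, 0.8, 0.5]
FP_DISUTILITY = 0.05                   # QALY loss from a false positive


def alpha_vector(policy, x, y):
    """Alpha-vector of a fixed sequence of actions 'E' (exam) or 'W' (wait).
    x is the specificity and y the sensitivity of the exam."""
    alpha = list(R_TERMINAL)
    for action in reversed(policy):
        # Value of remaining undetected for one more period.
        cont = [REWARD[s] + sum(STAY[s][s2] * alpha[s2] for s2 in S) for s in S]
        if action == "E":
            new = [cont[0] - (1.0 - x) * FP_DISUTILITY]
            new += [y * LUMP[s] + (1.0 - y) * cont[s] for s in (1, 2)]
        else:
            new = [cont[0]]
            new += [SELF_DETECT[s] * LUMP[s] + (1.0 - SELF_DETECT[s]) * cont[s]
                    for s in (1, 2)]
        alpha = new
    return alpha


def is_nondecreasing(policy, steps=5, tol=1e-12):
    """Check on a grid that every alpha component is nondecreasing in x and y."""
    grid = [i / (steps - 1) for i in range(steps)]
    for i in range(steps):
        for j in range(steps):
            here = alpha_vector(policy, grid[i], grid[j])
            neighbours = []
            if i + 1 < steps:
                neighbours.append(alpha_vector(policy, grid[i + 1], grid[j]))
            if j + 1 < steps:
                neighbours.append(alpha_vector(policy, grid[i], grid[j + 1]))
            for nxt in neighbours:
                if any(nxt[s] < here[s] - tol for s in S):
                    return False
    return True


all_monotone = all(is_nondecreasing(p) for p in itertools.product("EW", repeat=3))
print(all_monotone)
```

A grid check of this kind is only a sanity test for a particular parameter set; it does not replace the induction argument.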
Proof of Proposition 2
We would like to show that the feasible region of the NLP given in (5) is nonconvex in y, i.e. \(V_{\pi^{1}_{t}}(b,x,y) \leq V_{\pi ^{2}_{t}}(b,x,y)\) for all \(y \in\{\underline{y}, \overline{y}\}\), but ∃ \(\dot{y} \in[\underline{y},\overline{y}]\) such that \(V_{\pi ^{1}_{t}}(b,x,\dot{y}) \geq V_{\pi^{2}_{t}}(b,x,\dot{y})\). Let \(\pi ^{i}_{t+1}\) be any fixed policy (i.e., does not depend on b) in \(\varPi_{t+1}\), and \(\varPi_{t}=\{\pi^{1}_{t}, \pi^{2}_{t}\}\) where \(\pi^{1}_{t}=\{E, \pi^{i}_{t+1}\}\), and \(\pi^{2}_{t}=\{W, \pi^{i}_{t+1}\}\). That is, \(\pi ^{1}_{t}\) and \(\pi^{2}_{t}\) are identical except for the action taken at time t. From Proposition 1, we know that \(V_{\pi ^{i}_{t}}(b,x,y) =\sum_{s \in S^{PO}}b(s)\alpha_{\pi^{i}_{t}}(s,x,y)\), hence it is sufficient to show that \(\sum_{s \in S^{PO}}b(s) (\alpha _{\pi^{1}_{t}}(s,x,y)- \alpha_{\pi^{2}_{t}}(s,x,y) )\) is negative for all \(y \in\{\underline{y}, \overline{y}\}\) but positive for some \(\dot {y} \in[\underline{y}, \overline{y}]\). Let \(\alpha_{\pi ^{i}_{t}}(x,y)=\big[\alpha_{\pi^{i}_{t}}(s,x,y)\big]\). From Proposition 1, we have
where (32) and (33) follow from Assumptions 2, 3, and the fact that both \(\pi^{1}_{t}\) and \(\pi ^{2}_{t}\) are fixed policies and do not depend on b, and (34) follows from Assumption 2 and Definition 1. Note that \(du_{D} \geq 0\), \((x-1)du_{D} \leq 0\) because \(0 \leq x \leq 1\), and \(\ell_{t}(1,x,y)\) is nonincreasing in y by Lemma 2. Therefore, the values of \(du_{D}\), \(y\ell_{t}(1,x,y)\), and \(y\ell_{t}(2,x,y)\) could be such that
is negative for all \(y \in\{\underline{y}, \overline{y}\}\) but positive for some \(\dot{y} \in[\underline{y}, \overline{y}]\), which makes the problem nonconvex. □
Proof of Lemma 3
We prove this by induction. For the basis step, it is easy to see that \(\alpha_{\pi^{i}_{T}}(s,\overline{x},y)-\alpha_{\pi ^{i}_{T}}(s,\underline{x},y)=0\), as \(\alpha_{\pi^{i}_{T}}(s,x,y)=r_{T}(s)\) for any s, x, and y. Assume that the assertion holds for decision epoch t+1. At time t, there are two possible cases: the decision is either W or E.
Case 1: \(d^{i}_{t}=W\)
where (35) follows by canceling out \(\sum_{o \in \varTheta_{W}}K_{t}^{W}(o|s)r_{t}(s,W,o)\) and from Assumption 4, and (36) follows from the induction hypothesis.
Case 2: \(d^{i}_{t}=E\)
where (37) follows from Assumption 4 and simple algebra, and (38) follows from the induction hypothesis. □
Proof of Proposition 3
Let \(0 \leq\underline{x} \leq\overline{x} \leq1\). We want to show that:
Note that by Lemma 3 for any \(\pi^{i}_{t} \in\varPi_{t}\), b∈B, and \(\overline{x}, \underline{x} \in[0,1]\):
Then, we can rewrite (39) as follows:
Now, we will express \(\alpha_{\pi^{i}_{t}}(0,\overline{x},y) - \alpha _{\pi^{i}_{t}}(0,\underline{x},y)\) for a given policy \(\pi^{i}_{t}=\{ d^{i}_{t}, d^{i}_{t+1}, \ldots, d^{i}_{T}\}\) in terms of the input parameters (i.e., rewards, transition probabilities, and observation probabilities).
Case 1: \(d^{i}_{t}=E\)
From Proposition 1, we have
where (42) follows from rearranging the terms and (43) follows from Lemma 3.
Case 2: \(d^{i}_{t}=W\)
Again, from Proposition 1, we have
where (44) follows from the assumption that \(P_{t}^{(W,W-)}(s'|0)= P_{t}^{(W,W+)}(s'|0)= P_{t}^{(E,E+)}(s'|0)\) and algebraic simplification and (45) follows from Lemma 3.
Then, combining the results of Case 1 and Case 2, we get
Expanding this recursive equation, we obtain the following:
Then,
where (47) follows from (41), (48) follows from the recursive expansion, and (49) follows from the fact that \(\overline{x} \geq\underline{x}\), Assumption 1, and Definition 2. □
Proof of Corollary 1
Let \(\pi^{*}_{t}\) be the optimal policy at (x,y) and \(\overline{x} \geq x \geq0\). Let \(\pi^{i}_{t}\) be an arbitrary policy from the policy set \(\varPi_{t}\) such that \(\pi^{i}_{t}\) is less aggressive than \(\pi^{*}_{t}\). Assume that \(\pi^{i}_{t}\) is uniquely optimal at \((\overline{x},y)\). Then \(V_{\pi^{i}_{t}}(b,\overline{x},y) > V_{\pi^{*}_{t}}(b,\overline{x},y)\). However, from Proposition 3, we have
where (50) follows because \(\pi^{*}_{t}\) is the optimal policy at (x,y). Then, \(V_{\pi^{i}_{t}}(b,\overline{x},y) \leq V_{\pi^{*}_{t}}(b,\overline{x},y)\), which contradicts our assumption and completes the proof. □
Proof of Theorem 1
Let \(\varPi_{t}=\{ \pi^{i}_{t} | i \in I=\underline{I} \cup\overline{I}\}\) such that \(\underline{I}\) indexes policies that are less aggressive than \(\pi^{*}_{t}\) and \(\overline{I}\) indexes policies that are more aggressive than \(\pi^{*}_{t}\). Also, let \(0 \leq x_{1} \leq x_{2} \leq 1\) such that \(V_{\pi^{*}_{t}}(b,x_{1},y) \geq V_{\pi^{i}_{t}}(b,x_{1},y)\) and \(V_{\pi ^{*}_{t}}(b,x_{2},y) \geq V_{\pi^{i}_{t}}(b,x_{2},y)\) for all \(\pi^{i}_{t} \in \varPi_{t}\). Then for any x such that \(x_{1} \leq x \leq x_{2}\), we want to show that \(V_{\pi^{*}_{t}}(b,x,y) \geq V_{\pi^{i}_{t}}(b,x,y)\) for all \(\pi ^{i}_{t} \in\varPi_{t}\). Suppose we arbitrarily select a policy \(\pi^{i}_{t} \neq\pi^{*}_{t}\) from \(\varPi_{t}\). If \(i \in\underline{I}\) (i.e., \(\pi^{i}_{t}\) is less aggressive than \(\pi^{*}_{t}\)), then from Proposition 3, we have
where (51) follows because \(\pi^{*}_{t}\) is feasible at \((x_{1},y)\).
On the other hand, if \(i \in\overline{I}\) (i.e., \(\pi^{i}_{t}\) is more aggressive than \(\pi^{*}_{t}\)), then again from Proposition 3, we have
which again follows because \(\pi^{*}_{t}\) is feasible at \((x_{2},y)\). Then, \(V_{\pi^{*}_{t}}(b,{x},y) - V_{\pi^{i}_{t}}(b,{x},y) \geq0\) for any \(x_{1} \leq x \leq x_{2}\) and \(\pi^{i}_{t} \in\varPi_{t}\), which completes the proof. □
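Theorem 1 implies that, for a fixed belief b and fixed y, the set of x values at which \(\pi^{*}_{t}\) is optimal is an interval. One way this structure could be used numerically is to locate each endpoint by bisection, starting from a point where the policy is known to be optimal. The sketch below is not the paper's solution algorithm: the function name `optimality_interval` and the oracle `is_optimal`, which would compare \(V_{\pi^{*}_{t}}\) against every other policy at a given x, are hypothetical.

```python
def optimality_interval(is_optimal, x0, tol=1e-6):
    """Approximate the interval of x in [0, 1] on which is_optimal(x) holds.

    Assumes (as Theorem 1 suggests) that the optimal set is an interval and
    that is_optimal(x0) is True. is_optimal is a user-supplied oracle.
    """
    if not is_optimal(x0):
        raise ValueError("x0 must be a point where the policy is optimal")

    def boundary(inside, outside):
        # Invariant: is_optimal(inside) is True and is_optimal(outside) is False.
        while abs(outside - inside) > tol:
            mid = 0.5 * (inside + outside)
            if is_optimal(mid):
                inside = mid
            else:
                outside = mid
        return inside

    lo = 0.0 if is_optimal(0.0) else boundary(x0, 0.0)
    hi = 1.0 if is_optimal(1.0) else boundary(x0, 1.0)
    return lo, hi


# Usage with a stand-in oracle whose optimal set is [0.3, 0.7].
lo, hi = optimality_interval(lambda x: 0.3 <= x <= 0.7, x0=0.5)
print(round(lo, 4), round(hi, 4))
```

Without the interval property, bisection could skip over disconnected optimal regions, so a full grid scan would be needed instead.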
Cite this article
Ayer, T. Inverse optimization for assessing emerging technologies in breast cancer screening. Ann Oper Res 230, 57–85 (2015). https://doi.org/10.1007/s10479-013-1520-3