Key Points for Decision Makers

Tomosynthesis added to synthetic mammography improves technical parameters of breast cancer screening.

Tomosynthesis added to synthetic mammography is cost effective in breast cancer screening in the Brazilian Supplementary Health System.

Tomosynthesis cost effectiveness varies according to specific local features, limiting generalizability.

1 Background

Breast cancer has become a significant public health problem. In 2020, it was estimated that approximately 2.26 million new cases of breast cancer were diagnosed, and 685,000 people worldwide died of breast cancer. Demographic and epidemiological projections show a continuing increase in breast cancer burden. According to estimates, there will be 3.19 million new cases of breast cancer by 2040 [1, 2]. Furthermore, the global economic cost of breast cancer from 2020 to 2050 has been estimated to be $2.0 trillion (in international dollars at constant 2017 prices). Across cancer types, breast cancer (7.7%) represents the third largest global economic cost of cancers [3].

Results from a study showed that the incidence of early-stage breast cancer in women who are regularly screened was significantly higher compared with those who are not, suggesting the benefit of early detection with the implementation of regular screening [4]. In addition, digital mammography (DM) screening and early detection have been associated with a reduction of up to 25% in the relative risk of death in the first decade after diagnosis among women of 40–70 years [5]. However, there are some controversies about DM screening related to overdiagnosis and overtreatment, which are considered the main issues, with estimates of overdiagnosis ranging between 15 and 30% [6]. Moreover, false-positive screening results may lead to unnecessary biopsies and increased anxiety, which can decrease the effectiveness and acceptability of screening programs [6].

DM is known as the gold standard for breast cancer screening, including in Brazil [7]; nonetheless, a key issue is its low sensitivity, which reduces the effectiveness of screening [8]. It is well defined in the literature that mammographic sensitivity decreases with increasing breast density [9]. Furthermore, greater breast density is an additional risk factor for developing breast cancer [10], and both false-positive and false-negative interpretations are more likely with dense breasts [11].

Currently, digital breast tomosynthesis (DBT) is gaining prominence because it improves cancer detection, especially in dense breast tissue. The ASTOUND-2 trial assessed women with DM-negative dense breasts, and the results showed an incremental cancer detection rate of 2.83 per 1000 screens [12]. A meta-analysis showed that DBT + DM had an incremental cancer detection rate of 2.4 cancers per 1000 screens in biennial screening practice, with only a minor increase in recall rates compared with DM alone [13]. In addition, synthesized two-dimensional (s2D) mammograms reconstructed from DBT were introduced, showing sensitivity and specificity similar to those of DM [14]. This has made it feasible to use DBT + s2D as a stand-alone screening modality rather than DBT combined with DM [30].

The cost effectiveness of DBT has been evaluated and debated across several countries since 2016 [15,16,17]. Despite different characteristics and settings, the results suggest that DBT is cost effective. Given its superior detection rate compared with standard mammography, DBT could reduce long-term costs by detecting more subtle disease in time, leading to lower treatment costs than those for advanced disease missed on standard mammography [18]. DBT reduces false-positive exams [19] and would be considered cost effective owing to the low positive predictive value of screening with DM alone [20].

DBT is more likely to be a cost-effective alternative to mammography in women with dense breasts; whether it could be cost effective in a general population depends heavily on DBT costs [21]. TMIST is a large randomized trial comparing standard digital mammography with tomosynthesis mammography. Its primary hypothesis is that tomosynthesis will reduce the incidence of advanced breast cancer compared with standard mammography. It will also examine how breast density mediates the detection of advanced cancer with the two modalities, and results are expected in 2025 [22]. Furthermore, studies advise that the generalizability of results could depend on factors that vary among countries, such as recall rates, program sensitivity and specificity, treatment cost, and willingness-to-pay (WTP) threshold [23,24,25].

Therefore, due to uncertainties regarding the clinical potential of DBT and the lack of economic evaluation to show the cost effectiveness of DBT in the Brazilian supplementary health system (health insurance companies) perspective, the first objective was to perform a meta-analysis to present estimates for recall and detection rates of DBT + s2D compared with DM alone. Second, by using the estimates from the performed meta-analysis in a hybrid economic model (decision tree plus Markov model), we simulated and compared the number of recalls, false positives, cancer cases detected at screening or as interval cancer, costs at different stages of follow-up, deaths, and life years gained to estimate the cost effectiveness of switching from DM to DBT + s2D in biennial breast screening of women aged 40–69 years with scattered areas of fibroglandular breast density and heterogeneous dense breasts [American College of Radiology Breast Imaging Reporting and Data System (ACR BI-RADS) breast density patterns B and C] [26].

2 Methods

2.1 Study Design

We estimated the cost effectiveness of DBT + s2D compared with DM alone in women aged 40–69 years with scattered areas of fibroglandular breast density and heterogeneous dense breasts (BI-RADS B and C) in Brazil. Although a health economic analysis plan was not previously published, the main aspects of the analysis were summarized according to the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) statement to increase the transparency of the proposed study [7].

2.2 Study Population

The target population was women aged 40–69 years eligible for breast cancer screening according to the Brazilian supplementary health system guideline [27] who met the following criteria: having scattered areas of fibroglandular breast density and heterogeneous dense breasts (BI-RADS B and C), undergoing biennial breast screening, and receiving DBT + s2D in the intervention arm versus DM in the control arm.

2.3 Study Perspective

From the Brazilian supplementary health system perspective, the economic model estimated the relative cost effectiveness of the DBT + s2D compared with DM alone, adhering to the Brazilian Ministry of Health’s guidelines [28]. Therefore, only healthcare costs (direct medical costs) incurred by the provider were included.

2.4 Intervention and Comparator

DBT is an advanced form of mammography that produces 3D images of the breast taken from different angles using a low-dose x-ray system. A reconstruction algorithm then processes the series of projections to estimate the 3D appearance of the breast, which can be viewed in successive slices [29]. DBT has the potential to partly overcome tissue superposition, thus improving the detection of breast lesions and minimizing the masking effects of DM [30]. The s2D images are created from raw DBT data, which reduces the concern related to increased radiation dose from combined DBT and DM screening [31].

DM is a technique that produces a 2D image and has been used to detect breast cancer in an early stage. However, some lesions can be obscured by the superposition of dense tissue and lead to either false positive or false negative results [30].

Patients were assigned to either the intervention group (DBT + s2D) or the comparator group (DM alone). In both groups, asymptomatic women aged 40–69 years were screened biennially for breast cancer in accordance with Brazilian supplementary health system guidelines [27, 32, 33]. Cost effectiveness was expressed as the incremental cost-effectiveness ratio (ICER). We used Monte Carlo simulation to calculate 95% confidence intervals (CIs) for our estimates of the difference in mean cost and quality-adjusted life years (QALYs) between the groups.

The life years gained (LYG) were adjusted for utility values identified through a manual search of the literature. The utility values were selected according to the description of the population evaluated and the respective health states of the model, i.e., women who underwent screening for breast cancer and women diagnosed with breast cancer undergoing treatment. All utility values were derived from a preference-based instrument for assessing health-related quality of life (EQ-5D). When specific data for the Brazilian population were not available, data from other countries were used. For the input values regarding screening metrics comparing DBT + s2D versus DM, data from trials in Germany, Norway, Italy, and the USA were used [31, 34,35,36,37,38,39]. For the annual probabilities of death for interval cancer and ductal cancer in situ (DCIS), data from the Norwegian Cancer Registry were used [34]. For the probabilities of progression to metastasis, data from a Canadian simulation were used [40].

For the first cycle of the model, it was considered that the disutility would be lower for patients who received false positive imaging (DM/DBT + s2D) results. False negative imaging screening is defined as cancer discovered between the routine screening intervals. For the other cycles, utility values were considered for interval cancer and for detected cancer [41, 42].

2.5 Time Horizon

The economic evaluation estimated the cost and LYG over a lifetime horizon (30 years), encompassing years of screening eligibility and mortality from breast cancer and other causes.

2.6 Decision Model

To estimate the short- and long-term effects of both assessed screening strategies (DBT + s2D versus DM alone), a hybrid economic model (decision tree plus Markov model; supplemental material) was built in Microsoft Office Excel® (Microsoft Corporation, Redmond, WA, USA) in line with the current clinical pathway for screening breast cancer following the Brazilian Ministry of Health guidelines (Fig. 1) [27, 28, 32, 33].

Fig. 1 Structure of lifetime economic model (decision tree and Markov model)

In summary, eligible patients can be screened either with DBT + s2D or DM. Patients assigned to the DBT + s2D intervention may follow three different branches according to what is observed in the screening result [positive (BI-RADS 4 and 5), suspect (BI-RADS 0), or negative/benign (BI-RADS 1, 2, and 3)]. In the first branch (BI-RADS 4 and 5), patients might be referred for biopsy and histopathological analysis for diagnostic confirmation when the screening test identifies a result that is suspicious or highly suspicious of malignancy. Biopsy results will either confirm the cancer diagnosis or classify the screening findings as a false positive (assuming no cancer or benign). We assume the biopsy is the gold standard for diagnosing breast cancer, with 100% sensitivity and specificity. Furthermore, we assumed that false negative results would be identified before the next screening test, characterizing them as interval cancers, and we considered the probabilities of invasive cancer or ductal cancer in situ (DCIS) for patients with biopsy-confirmed cancer. From this point, patients enter the Markov model with localized/regional cancer (TNM staging 0 to 3) or cancer with distant metastasis (TNM 4). Over the annual cycles, patients with localized/regional cancer can remain in this state of health, progress to the advanced stage by developing distant metastatic disease, or even progress to death. Patients diagnosed with metastatic cancer will remain in this health state until they progress to death.

Patients in the second branch (BI-RADS 0) are indicated for recall when additional evaluation is needed. We assumed that all recalled patients would undergo breast ultrasound scans, following the Brazilian Ministry of Health guidelines [27, 32, 33, 43]. Based on the breast ultrasound scan result, patients might or might not be referred to perform a biopsy for diagnostic confirmation. From this point, patients follow the same pathway as those in the previous branch.

Furthermore, patients assigned to the DBT + s2D group may have a negative or benign screening result (BI-RADS 1, 2, or 3). In this case, there are two possibilities: (1) the patient does not have cancer (true negative result) or has a benign result, and in either case continues to be routinely screened, or (2) the screening result is a false negative. Patients assigned to the DM group follow the same structure described for the DBT + s2D branch, with different transition probabilities and costs throughout the model.
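The Markov part of the structure described above can be illustrated with a small cohort trace. This is only a sketch: the three states follow the text (localized/regional cancer, metastatic cancer, death), but every transition probability below is a hypothetical placeholder rather than a model input, and the names `STATES`, `TRANSITIONS`, and `run_cohort` are introduced here for illustration.

```python
# States follow the model description: localized/regional, metastatic, dead.
STATES = ("localized", "metastatic", "dead")

# Hypothetical annual transition probabilities (placeholders, NOT the study's inputs).
# Each row sums to 1; metastatic patients stay metastatic until death.
TRANSITIONS = {
    "localized":  {"localized": 0.95, "metastatic": 0.03, "dead": 0.02},
    "metastatic": {"localized": 0.00, "metastatic": 0.80, "dead": 0.20},
    "dead":       {"localized": 0.00, "metastatic": 0.00, "dead": 1.00},
}


def run_cohort(start, cycles):
    """Return the cohort's state distribution at the start and after each annual cycle."""
    trace = [dict(start)]
    for _ in range(cycles):
        current = trace[-1]
        trace.append({
            target: sum(current[source] * TRANSITIONS[source][target] for source in STATES)
            for target in STATES
        })
    return trace


# Whole cohort starts with localized/regional cancer; 30 annual cycles as in the time horizon.
trace = run_cohort({"localized": 1.0, "metastatic": 0.0, "dead": 0.0}, cycles=30)

# Undiscounted life years: share of the cohort alive at the end of each cycle
# (no half-cycle correction, a simplification assumed here).
life_years = sum(step["localized"] + step["metastatic"] for step in trace[1:])
print(f"Alive after 30 cycles: {1 - trace[-1]['dead']:.3f}; life years: {life_years:.2f}")
```

In a full model, each arm (DBT + s2D and DM alone) would feed a different starting distribution into the same trace, since the decision tree determines how many patients enter each state.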

2.7 Model Input Parameters

Well-established parameters exist for recall and biopsy rates in the mammographic screening program for breast cancer in Brazil [43]. Nevertheless, these parameters cover all women, and none are available for each BI-RADS breast density pattern. Therefore, the probabilities of patients in both groups (DBT + s2D and DM alone) undergoing a biopsy or being recalled were taken from the To-Be Trial, whose outcomes were reported by breast density according to the Volpara Density Grade (VDG) [34]. VDG outcomes are equivalent to the BI-RADS classification, which categorizes breast density into four levels, from A to D [44]. We calculated the probability of detecting cancer on the basis of the number of biopsies performed as described in the To-Be Trial [34]. We also performed a meta-analysis to estimate the proportion of invasive cancer using data from six studies performed alongside randomized clinical trials (RCTs) and population-based screening programs [28, 31, 35,36,37,38,39].

Patients diagnosed with invasive cancer entered the Markov model and started in one of the health states according to their stage of cancer, defined by the tumor, lymph node, and metastasis (TNM) staging system (TNM 1, TNM 2, TNM 3, or TNM 4); these data were extracted from the To-Be Trial (Table 1) [34]. In every annual cycle, patients can stay in the same health state, progress to the next health state (distant metastasis), or die. The 5-year survival rates among Brazilian women with breast cancer were 98.7%, 93.3%, 86.2%, and 40.8% for TNM 1, 2, 3, and 4, respectively [45]. Probabilities of patients progressing from localized cancer (TNM 1–3) to metastatic cancer (TNM 4) were extracted from a Canadian study, assuming the probability of progression from local recurrence to distant recurrence [40].
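A Markov model with annual cycles needs per-cycle probabilities, whereas the survival figures above are reported over 5 years. One common conversion assumes a constant hazard over the period. The sketch below applies it to the reported 5-year survival rates; the constant-hazard assumption and the function name are ours, since the text does not state how the annual probabilities were derived.

```python
def annual_death_probability(survival: float, years: float) -> float:
    """Annual death probability implied by `survival` over `years`.

    Assumes a constant hazard over the period (an assumption made for this
    sketch, not a method stated in the text).
    """
    if not 0.0 < survival <= 1.0 or years <= 0:
        raise ValueError("survival must be in (0, 1] and years must be positive")
    return 1.0 - survival ** (1.0 / years)


# 5-year survival by TNM stage, as reported in the text [45].
five_year_survival = {"TNM 1": 0.987, "TNM 2": 0.933, "TNM 3": 0.862, "TNM 4": 0.408}

annual = {stage: annual_death_probability(s, 5) for stage, s in five_year_survival.items()}
for stage, probability in annual.items():
    print(f"{stage}: annual death probability {probability:.4f}")
```

The same function could be applied to a 10-year survival figure by passing `years=10`.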

Table 1 Point estimates, probability distributions, and source of parameter estimates used in the lifetime economic model

Due to a lack of data on the Brazilian population, we assumed that the 10-year survival probabilities for patients with DCIS and interval cancer were 94.2% and 82.5%, respectively, according to the Norwegian Cancer Registry [24]. For women who attended the screening program and did not have cancer, we assumed that they would progress to death using the general mortality probabilities for women according to the Brazilian Institute of Geography and Statistics life-expectancy table [46]. Transition probabilities for the economic model (decision tree and Markov model) were sourced from the literature, and Table 1 presents a list of parameters used in the lifetime model.

2.8 Resource Use and Costs

DM costs € 101.30 and DBT costs € 334.86 according to the 2022 Brazilian Hierarchical Classification of Medical Procedures, which is published by the Brazilian Medical Association (AMB) and updated every 2 years [47]. The price difference between DM and DBT reflects device acquisition, data storage costs, and radiologists' fees for image interpretation, which takes much more time with DBT. All costs were based on the Brazilian supplementary health service (BSHS) price list [48]. Unit costs were obtained from the literature [48,49,50] and applied to the recorded resource use associated with the screening program and with treating breast cancer according to disease stage, including biopsy, histopathological test, breast ultrasound, drugs, radiotherapy, surgery, exams, and follow-up appointments. Treatment costs were calculated according to the recommendations of the Brazilian guideline for the diagnosis and treatment of breast cancer [51,52,53,54], assuming that, of all breast cancers, 15–20% are triple negative (basal like), 10–20% are HER2 overexpressed, 20–30% are luminal B, and 40–60% are luminal A [55]. We estimated the proportion of patients receiving each treatment on the basis of the Brazilian Society of Clinical Oncology Guideline [51,52,53,54]. We assumed a patient with a body surface area of 1.8 m² (160 cm and 70 kg) [56] to calculate the chemotherapy dosage and costs. All estimated costs were converted from Brazilian real (BRL or R$) to euros (€) using the average exchange rate in 2023 [57] and are expressed in 2023 prices. Costs and health benefits (QALYs and life years gained) were discounted at 5% in line with national guidelines [28]. An overview of values and cost measures is shown in Table 1. Further details on the micro-costing of each stage of the disease are presented in the Supplementary Material (Table S1).
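To make the 5% discounting concrete, the following sketch discounts a stream of yearly amounts to present value. The 5% rate and the DBT screen cost come from the text; the 10-year biennial stream and the convention that year 0 is undiscounted are assumptions for illustration only.

```python
def present_value(yearly_values, rate=0.05):
    """Discount a stream of yearly amounts to year 0.

    yearly_values[t] is the amount accrued in year t. Year 0 is left
    undiscounted, a timing convention assumed for this sketch.
    """
    return sum(value / (1.0 + rate) ** year for year, value in enumerate(yearly_values))


# Hypothetical stream: one DBT screen (EUR 334.86, from the text) every 2 years for 10 years.
screen_costs = [334.86 if year % 2 == 0 else 0.0 for year in range(10)]

print(f"Undiscounted total: {sum(screen_costs):.2f}")
print(f"Discounted at 5%:   {present_value(screen_costs):.2f}")
```

The same function applies to QALYs or life years, since costs and health benefits are discounted at the same rate here.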

2.9 Health Outcomes—Clinical Effectiveness

We performed a systematic review with meta-analysis following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) recommendations [59] to identify and compare the analytical validity and clinical utility of using DBT + s2D compared with DM alone in cancer screening in women aged 40–69 years with scattered areas of fibroglandular breast density and heterogeneous dense breasts.

A broad literature search was undertaken using multiple electronic databases: Medline (Ovid), Embase, and Cochrane Library. The search strategy combined terms related to digital tomosynthesis technology, synthesized mammography, digital mammography, and breast cancer. All searches were performed from inception to June 2022. Further details on the search strategies used in the electronic databases are presented in the Supplemental Material (Table S2). Search results from the different databases were imported and merged into Rayyan, and duplicates were removed automatically or deleted manually [60]. The screening process and the critical appraisal were done by one reviewer and checked by another to minimize selection bias [61]. The eligibility criteria considered systematic reviews with or without meta-analysis, randomized clinical trials (RCTs), and comparative, prospective, or retrospective observational studies. Analytical validity was assessed through the accuracy, sensitivity, specificity, and safety analysis of DBT + s2D, while clinical utility was assessed through cancer detection rates, recall, biopsy, interval cancer, and mortality in women with scattered areas of fibroglandular breast density and heterogeneous dense breasts (BI-RADS B and C). We excluded studies conducted on patients in the postdiagnosis context, patients recalled due to breast injury, or patients previously identified as high risk. We prepared tables summarizing the main characteristics of the included studies and participants, together with a narrative description of the main results with descriptive statistics (absolute and relative frequency). Clinical utility outcomes (e.g., cancer detection rate and recall rate) from included studies stratified according to the BI-RADS classification were meta-analyzed using the random effects model.

We used the QUADAS-2 tool to assess the risk of bias and the methodological quality of individual studies [62]. The overall quality of evidence was assessed following the Grading of Recommendations Assessment, Development and Evaluation (GRADE) recommendations [63].

2.10 Assumptions

In formulating the key assumptions, we made conservative estimates to avoid favoring any intervention. We assumed that all recalled patients would undergo a breast ultrasound scan before any biopsy, following the Brazilian Ministry of Health guidelines [43]. Furthermore, we assumed that patients with a false negative result would be identified before the next screening cycle, characterizing them as having interval cancer. Differences between DBT + s2D and DM alone were pooled as risk ratios using the Mantel–Haenszel method with random effects. Forest plots were used to display study-specific data on improvement in cancer detection rates and invasive cancer detection rates and on reduction in recall rates and biopsy rates. As studies assessed different populations, we used random-effects models to allow for both within-study sampling variability and heterogeneity between studies when calculating pooled estimates. We used the Higgins I² statistic to assess study heterogeneity, with I² > 50% indicating the presence of heterogeneity [64]. When substantial heterogeneity in diagnostic accuracy was observed between studies, we investigated a threshold effect by visual assessment of coupled forest plots of sensitivity and specificity, and by a Spearman correlation coefficient between sensitivity and false positive rate (correlation coefficient > 0.6 indicated a threshold effect) [65]. We also visually assessed the differences between the 95% confidence region and the 95% prediction region in the HSROC curve to examine the presence of heterogeneity between studies [66]. Analyses were carried out using Stata version 14.2 (StataCorp.) [67].
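The pooling and heterogeneity steps described above can be sketched in a few lines. Note the substitution: this example uses inverse-variance weights with the DerSimonian–Laird between-study variance, which is a common random-effects estimator but not necessarily the exact routine run in Stata, and it is not the Mantel–Haenszel weighting itself. All study counts are hypothetical, and the function names are introduced here for illustration.

```python
import math


def log_risk_ratio(events_new, n_new, events_ref, n_ref):
    """Log risk ratio and its variance for one study (no zero-cell correction)."""
    log_rr = math.log((events_new / n_new) / (events_ref / n_ref))
    variance = 1 / events_new - 1 / n_new + 1 / events_ref - 1 / n_ref
    return log_rr, variance


def pool_random_effects(studies):
    """DerSimonian-Laird random-effects pooled risk ratio with Cochran's Q and I^2."""
    effects = [log_risk_ratio(*study) for study in studies]
    y = [effect for effect, _ in effects]
    v = [variance for _, variance in effects]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / variance for variance in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1

    # Higgins I^2: share of total variation attributable to heterogeneity (percent).
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Between-study variance tau^2 (DerSimonian-Laird), truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (variance + tau2) for variance in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "rr": math.exp(pooled),
        "ci95": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts: (cancers with DBT + s2D, women screened, cancers with DM, women screened).
studies = [(90, 10000, 60, 10000), (45, 8000, 40, 8000), (120, 15000, 70, 15000)]
result = pool_random_effects(studies)
print(f"Pooled RR {result['rr']:.2f}, 95% CI {result['ci95'][0]:.2f} to {result['ci95'][1]:.2f}, "
      f"I2 = {result['i2']:.0f}%")
```

Under the rule stated above, an I² value above 50% from such a calculation would be read as indicating heterogeneity.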

2.11 Sensitivity Analysis

Uncertainty around the parameter estimates used in our model was fully characterized and propagated through to the model results by conducting probabilistic and deterministic sensitivity analyses (PSA/DSA). PSA was done by defining parameter values using distributions rather than point estimates. The model was then run 1000 times with a value randomly drawn from the assigned probability distribution. This produced a distribution of model outputs that was represented visually on the cost-effectiveness plane. Cost-effectiveness acceptability curves (CEACs) were used to represent the probability that an intervention would be cost effective compared with the control group at a range of willingness-to-pay (WTP) thresholds. DSA was carried out considering minimum and maximum values of the parameters using the 95% CI when the data were available or varying by ± 25%.
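The PSA loop and the CEAC can be sketched with a deliberately tiny model. Everything here is a placeholder: the toy model covers a single screening round only, the distributions and their parameters are hypothetical (the study's distributions are listed in Table 1), and the output does not reproduce the study's results. Only the two screen costs and the 1000 draws are taken from the text.

```python
import random


def draw_parameters(rng):
    """One random draw of inputs; all distribution parameters are hypothetical placeholders."""
    return {
        "recall_dm": rng.betavariate(40, 960),      # recall probability with DM
        "recall_dbt": rng.betavariate(32, 968),     # recall probability with DBT + s2D
        "cost_recall": rng.gammavariate(100, 1.5),  # cost of one recall work-up (EUR)
        "cost_dm": 101.30,                          # screen costs quoted in the text
        "cost_dbt": 334.86,
        "qaly_gain": rng.gauss(0.01, 0.005),        # incremental QALYs of DBT + s2D
    }


def incremental(params):
    """Toy one-round model: returns (incremental cost, incremental QALYs) of DBT + s2D."""
    cost_dm = params["cost_dm"] + params["recall_dm"] * params["cost_recall"]
    cost_dbt = params["cost_dbt"] + params["recall_dbt"] * params["cost_recall"]
    return cost_dbt - cost_dm, params["qaly_gain"]


def run_psa(n=1000, seed=1):
    """Run the model n times, each with a fresh random draw of the inputs."""
    rng = random.Random(seed)
    return [incremental(draw_parameters(rng)) for _ in range(n)]


def ceac(draws, thresholds):
    """Share of draws with positive incremental net monetary benefit at each WTP threshold."""
    return {
        wtp: sum(wtp * d_qaly - d_cost > 0 for d_cost, d_qaly in draws) / len(draws)
        for wtp in thresholds
    }


draws = run_psa()
for wtp, probability in ceac(draws, [0, 7200, 50000]).items():
    print(f"WTP {wtp}: probability cost effective = {probability:.2f}")
```

Plotting the `(d_qaly, d_cost)` pairs gives the cost-effectiveness plane, and plotting the `ceac` output against the thresholds gives the acceptability curve.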

The WTP threshold for the public health system in Brazil is one gross domestic product (GDP) per capita (€ 7200.00) per QALY [68]. The supplementary (private) health system is currently discussing a specific threshold, but there is no agreement yet. We therefore used the public health system WTP threshold as a conservative assumption.

3 Results

3.1 Meta-analysis

A total of 18 of the 39 studies read in full were included: 2 randomized clinical trials, with the remainder being prospective or retrospective observational studies. Compared with DM, the relative risk (RR) with DBT + s2D was 1.35 for the cancer detection rate (DCR; p < 0.001), 1.48 for the invasive cancer detection rate (DICR; p < 0.001), 0.81 for the recall rate (p = 0.028), and 0.89 for the biopsy rate (BR; p = 0.303).

3.2 Cost Effectiveness for the Base Case

The base-case analysis found a total cost of €2094.54 for DBT + s2D compared with €3048.57 for DM alone (Table 2). Over a lifetime horizon, the QALYs in the intervention group were 18.9185, compared with 13.7196 in the control group. This equates to a cost saving of € 954.02 and an incremental 5.1989 QALYs associated with DBT + s2D. Therefore, DBT + s2D is a dominant strategy because it is shown to be more effective and less costly compared with DM alone for screening women aged 40–69 years with scattered areas of fibroglandular breast density and heterogeneous dense breasts (ACR-BI-RADS B and C).
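The dominance conclusion can be checked directly from the reported totals. The sketch below uses the base-case costs and QALYs quoted above and the € 7200 per QALY threshold from the Methods; the function names and the three verdict labels are introduced here for illustration. Because the inputs are rounded, the computed saving may differ from the reported figure by a cent.

```python
def compare_strategies(cost_new, qaly_new, cost_ref, qaly_ref):
    """Classify a new strategy against a reference and compute the ICER when it is meaningful."""
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_cost < 0 and d_qaly > 0:
        verdict = "dominant"    # cheaper and more effective: no ICER is reported
    elif d_cost > 0 and d_qaly < 0:
        verdict = "dominated"   # more expensive and less effective
    else:
        verdict = "trade-off"   # ICER must be compared with the WTP threshold
    icer = d_cost / d_qaly if verdict == "trade-off" and d_qaly != 0 else None
    return {"d_cost": d_cost, "d_qaly": d_qaly, "verdict": verdict, "icer": icer}


def net_monetary_benefit(d_cost, d_qaly, wtp):
    """Incremental net monetary benefit; positive values favor the new strategy."""
    return wtp * d_qaly - d_cost


# Base-case totals reported in the text: DBT + s2D versus DM alone.
base = compare_strategies(2094.54, 18.9185, 3048.57, 13.7196)
nmb = net_monetary_benefit(base["d_cost"], base["d_qaly"], wtp=7200.0)

print(f"Incremental cost: {base['d_cost']:.2f} EUR, incremental QALYs: {base['d_qaly']:.4f}")
print(f"Verdict: {base['verdict']}; net monetary benefit at 7200 EUR/QALY: {nmb:.2f} EUR")
```

For a dominant strategy the net monetary benefit is positive at every non-negative threshold, which is why no ICER needs to be reported in that case.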

Table 2 Cost-effectiveness analysis results comparing DBT + s2D versus DM alone

3.3 Sensitivity Analyses

The cost-effectiveness plane shows the results of running the model 1000 times and recording the difference in cost and effectiveness between the DBT + s2D and DM alone (Fig. 2). Using 1000 Monte Carlo simulations, PSA has shown that DBT + s2D is dominant over DM alone to improve the cancer diagnosis in women aged 40–69 years with scattered areas of fibroglandular breast density and heterogeneous dense breasts (ACR-BI-RADS B and C). Cost-effectiveness data points are observed in the southeast quadrant of the plane (representing the scenario of “less costly and more effective,” that is, a dominant strategy) (Fig. 2).

Fig. 2 Scatter plot of incremental cost-effectiveness ratio of DBT + s2D compared with DM alone. DBT + s2D digital breast tomosynthesis and synthetic mammography 2D, DM digital mammography

The CEAC shows the probability of DBT + s2D being cost effective for different levels of willingness-to-pay thresholds, compared with DM alone (Supplementary Material, Fig. S1). The CEAC shows that, at a willingness-to-pay threshold of € 7200 per QALY, the DBT + s2D has a 100% probability of being cost effective compared with DM alone.

The DSA demonstrated that the parameter that introduces the greatest uncertainty into the results is the cancer detection rate by mammography, which can lead to a cost saving of € 139.82 and 5.1850 QALYs gained; therefore, DBT continues to be dominant. The variations for minimum and maximum values of all other parameters confirmed the dominance of DBT + s2D compared with DM (Supplementary Material, Fig. S2).

3.4 Clinical Effectiveness Results

Figure S3 (Supplemental Material) shows the flow diagram of the process of identifying the studies included in the meta-analysis and used to inform the effectiveness parameters in our economic analysis. In summary, we retrieved 521 papers in the database searches, of which 39 were read in full. After all exclusions, we selected 18 studies (Table S3—Characteristics of included studies) that met all the inclusion criteria and were critically appraised using the QUADAS-2 tool (Supplemental Material—Table S4). Forest plots showing pooled data for outcomes related to cancer detection rate, invasive cancer detection rate, recall rate reduction, and biopsy rate reduction are presented in Supplementary Material Figs. S4–S7, respectively.

Results comparing differences in both DBT + s2D versus DM alone groups in terms of biopsies and recalls performed, interval cancers diagnosed, and false negative and true positive results are shown in Table 3. DBT + s2D avoids recalls, biopsies, and false positives and increases true positives. Furthermore, patients assigned to DBT + s2D groups have lower interval cancer rates due to the ability of DBT + s2D to detect more patients with early stage breast cancer.

Table 3 Clinical effectiveness comparing DBT + s2D versus DM alone screening programs

4 Discussion

To our knowledge, this is the first economic evaluation comparing procedures, treatment, long-term effectiveness, and cost effectiveness of DBT + s2D versus DM alone for breast cancer screening from the Brazilian supplementary health system perspective. Our results indicate that for women aged 40–69 years with scattered areas of fibroglandular breast density and heterogeneous dense breasts (BI-RADS B and C), biennial screening with DBT + s2D compared with DM alone meets the standard criteria to be considered a cost-effective use of resources in a Brazilian supplementary health service setting. Women with breast patterns BI-RADS A and D after the first mammogram should continue DM screening (current standard screening), while those with patterns BI-RADS B and C should be screened with DBT + s2D.

Our results agree with other studies that evaluated the use of DBT in biennial breast cancer screening in women with dense breasts [21, 24]. Due to uncertainties in the study’s assumptions related to input parameters, test accuracy, and participation rate in the screening program, we performed extensive probabilistic sensitivity analyses. In the absence of national data, we used published studies to perform a meta-analysis and determine the effectiveness of DBT + s2D for breast cancer screening. The probabilistic sensitivity analysis confirmed the robustness of the results. Our results are consistent with the study by Moger et al. [24], in which the probabilistic sensitivity analysis showed that DBT was cost effective in over 50% of the simulations at all WTP levels per QALY, and in 80% of the simulations at levels greater than €22,000. The biennial mammographic coverage for the target female population in the BSHS is 58.1% [69]. The money saved with DBT + s2D screening could be used to increase cancer screening participation rates and insurance incentives.

The benefits of DBT are most striking in women with dense breasts; however, unlike conventional screening mammography, no prospective studies show that DBT screening reduces breast cancer mortality, given that it is a much newer modality. Hard data can take 15–20 years to acquire, so researchers must turn to economic models based on shorter-term end points to determine the viability of newer technologies that appear to advance patient outcomes. Establishing the cost effectiveness of a new imaging technique encourages wider adoption years before sufficient scientific evidence can be collected and analyzed [70]. Combining such an analysis of benefits and costs with probabilistic and deterministic sensitivity analyses allows for an even more accurate assessment of whether the proposed technology is worth the cost [71].

There is some good evidence in the literature that breast cancer screening by DBT can be cost effective in relation to DM [15,16,17,18,19,20,21,22,23,24]. Cost effectiveness is sensitive to the population evaluated and to local economic parameters. Unlike other models that evaluated DBT cost effectiveness in relation to DM, our model focused on BI-RADS B and C. In agreement with other studies [15,16,17,18,19,20,21,22,23,24], we identified that the adoption of DBT + s2D could lead to cost savings for the Brazilian health insurance company's budget. Furthermore, our results show that the use of DBT + s2D as a screening modality reduces the incidence of recalls after undetermined findings, improves the detection of invasive cancers, and allows earlier cancer detection, resulting in improved patient throughput and more efficient resource utilization. Raghu et al. [72] described that the use of DBT significantly improved diagnostic accuracy and confidence, increasing the proportion of results classified as normal and decreasing the rate of results categorized as probably benign and their associated costs of follow-up. Lourenco et al. [73] found that DBT streamlines the patient pathway for diagnosis by reducing additional imaging examinations: 57.2% of women screened with DM alone had both additional mammographic views and ultrasonography, and only 43.3% of women screened using DBT required additional imaging. Meta-analysis results that assessed women with dense breasts showed that the sensitivity of DBT or DBT plus DM is higher (84–90%) than DM alone (69–86%) [74].

Women with screen-detected or interval breast cancer reported better quality of life compared with women with symptomatic cancer [42]. The cost-effectiveness of adding DBT to DM screening depends critically on the ability of DBT to improve the specificity of DM—a screening intervention with low positive predictive values and potential for overdiagnosis [31]. Incremental costs per QALY gained for DBT screening may differ according to different countries and should be specifically analyzed.

Our study has some limitations. First, the DBT sensitivity and specificity estimates were based on international studies that evaluated patients and healthcare systems with different characteristics. Indeed, quite a few countries have data from trials comparing DBT + s2D versus DM in population-based breast cancer screening. To provide accurate estimates for the Brazilian setting, Brazilian data on the use of DBT + s2D in breast cancer screening would be needed. Although the Brazilian Breast Cancer Screening Program has good reliability regarding Brazilian mortality data for invasive breast cancer [45], it is opportunistic and lacks reliable data on mortality from interval cancer and DCIS. For these utility values, data from the Norwegian Breast Cancer Registry were used because the registry is based on biennial mammography, similar to the Brazilian Breast Cancer Program, is a national registry of good methodological quality, and covers a comparable population: life expectancy for women is 80.5 and 84 years in Brazil and Norway, respectively [46, 75]. No data on the probabilities of progression to metastasis are available from Brazil, so data from a Canadian simulation were used [40]. The Canadian Breast Cancer Screening Program is also biennial, like the Brazilian program, and its population is also comparable: life expectancy for women is 80.5 and 84.1 years in Brazil and Canada, respectively [46, 75].

Still, our inputs remain fair and adhere to the Brazilian Ministry of Health guidelines for the diagnosis, management, and treatment of breast cancer [32, 33]. Our meta-analysis also has some limitations. First, most of the included studies were retrospective observational studies, and some prospective studies could be classified as retrospective reader studies because the images were collected prospectively but later evaluated in a reader study. Second, the reasons for performing DBT and the clinical workflow may differ according to the location and setting where each study was performed; however, analysis of these subgroups, including imaging anomalies, different settings, and symptoms, was not possible because the primary studies lacked the necessary data. Third, substantial heterogeneity in results was observed. To identify factors causing heterogeneity, we examined the threshold effect between sensitivity and specificity using a coupled forest plot and the Spearman correlation coefficient and performed sensitivity analyses. The inclusion of more prospective studies with larger study populations may help validate the present conclusions with less heterogeneity. Fourth, the lack of blinding and the time interval between the index test and the reference standard were the main factors that affected the quality assessment. To ensure comparability and minimize bias from confounding factors, we included only primary studies with a comparative design and the same reference standard. However, the lack of study protocols with detailed descriptions may influence the results. Future studies should provide a clear description of each item required by the QUADAS-2 tool.
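As an illustration of the threshold-effect check mentioned above, the following minimal Python sketch computes a Spearman rank correlation between per-study sensitivity and false-positive rate (1 − specificity). This pairing is a common convention in diagnostic meta-analysis and is assumed here; it is not necessarily the exact procedure used in this study. The function names and the study values are hypothetical and serve only to show the calculation. A strong positive correlation would suggest that differing positivity thresholds contribute to heterogeneity.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-study estimates (NOT data from the included studies).
sensitivity = [0.84, 0.88, 0.79, 0.90, 0.86]
specificity = [0.93, 0.89, 0.95, 0.85, 0.91]
false_positive_rate = [1 - s for s in specificity]

rho = spearman(sensitivity, false_positive_rate)
print(f"Spearman rho (sensitivity vs. 1 - specificity): {rho:.2f}")
```

Because the coefficient is rank based, applying a monotonic transformation such as the logit to both quantities would leave it unchanged.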

Furthermore, this is a Markov state-transition model that simulates a hypothetical cohort of women entering the model at an average age of 40 years. Patient-level microsimulations by age were not created because of limitations in the available data from the literature and for Brazilian women; therefore, the rates of cancer density, recall, and false positives were not adjusted for age.
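To make the distinction between a cohort model and a microsimulation concrete, the sketch below runs a simple Markov cohort trace in Python. The three states, the transition probabilities, and the function name are hypothetical and much simpler than the model used in this study. The point is only that every member of the cohort shares the same per-cycle probabilities, which is why age-specific rates cannot be represented without additional data.

```python
from typing import Dict, List

# Hypothetical per-cycle transition probabilities (illustrative only,
# NOT the values or the state structure used in the study).
TRANSITIONS: Dict[str, Dict[str, float]] = {
    "no_cancer": {"no_cancer": 0.97, "cancer": 0.02, "dead": 0.01},
    "cancer": {"no_cancer": 0.00, "cancer": 0.90, "dead": 0.10},
    "dead": {"no_cancer": 0.00, "cancer": 0.00, "dead": 1.00},
}


def run_cohort(
    start: Dict[str, float],
    transitions: Dict[str, Dict[str, float]],
    cycles: int,
) -> List[Dict[str, float]]:
    """Return the cohort distribution over states at cycle 0..cycles."""
    trace = [dict(start)]
    for _ in range(cycles):
        current = trace[-1]
        following = {state: 0.0 for state in current}
        # Move each state's share of the cohort along its outgoing transitions.
        for source, share in current.items():
            for target, probability in transitions[source].items():
                following[target] += share * probability
        trace.append(following)
    return trace


# The whole cohort starts cancer free; 30 cycles are simulated.
trace = run_cohort(
    {"no_cancer": 1.0, "cancer": 0.0, "dead": 0.0}, TRANSITIONS, cycles=30
)
for cycle in (0, 1, 30):
    shares = ", ".join(f"{s}={p:.3f}" for s, p in trace[cycle].items())
    print(f"cycle {cycle}: {shares}")
```

An age-adjusted variant would replace the single transition table with one table per cycle or age band, which requires the age-stratified inputs that were not available.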

We considered that screening-related harms, such as false-positive mammographic screening results, would affect quality of life in the first cycle as a result of increased short-term anxiety and unnecessary costs for the healthcare system. However, overdiagnosis of potentially harmful cancers was not considered. We focused on the health benefits of breast cancer screening in terms of QALYs.

This cost-effectiveness study will help policymakers in the Brazilian supplementary health system to judge whether adopting DBT + s2D will lead to a cost-effective use of their resources. Even under less favorable assumptions, in which the parameters with the highest level of uncertainty were varied, DBT + s2D remained a cost-effective alternative. Furthermore, together with other studies published abroad, this study shows that, based on the established accuracy parameters, DBT + s2D is more effective and cost saving. Policymakers and practitioners should consider the possible cost benefits of introducing DBT + s2D as an alternative to current practice and support further studies to strengthen the generalizability of the current findings.

5 Conclusion

Our cost-effectiveness analysis suggests that DBT + s2D for breast cancer screening of women aged 40–69 years with scattered areas of fibroglandular density or heterogeneously dense breasts (ACR BI-RADS B and C) is potentially cost saving compared with DM alone, reducing costs for the Brazilian supplementary health system by €954.02 per patient with an incremental 5.1989 QALYs, supporting DBT + s2D adoption. Even under less favorable assumptions, in which the parameters with the highest level of uncertainty were varied, DBT + s2D remained a cost-effective alternative. Furthermore, DBT + s2D increased cancer detection and invasive cancer detection rates and decreased recall and biopsy rates. The Brazilian National Health Agency should consider adopting DBT + s2D in breast cancer screening for women with ACR BI-RADS B and C breast patterns.
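The decision logic behind the term "cost saving" can be sketched as follows. The incremental cost (−954.02) and incremental QALYs (5.1989) are the values reported above; the function name and the willingness-to-pay threshold are hypothetical and are included only to show how the other quadrants of the cost-effectiveness plane would be handled. When a strategy is both less costly and more effective, it is dominant and no threshold is needed.

```python
def classify_strategy(
    delta_cost: float, delta_qaly: float, wtp_threshold: float
) -> str:
    """Classify a new strategy against its comparator.

    delta_cost and delta_qaly are new minus comparator; wtp_threshold is the
    willingness to pay per QALY (a hypothetical input in this sketch).
    """
    # No more costly and no less effective: (weakly) dominant.
    if delta_cost <= 0 and delta_qaly >= 0:
        return "dominant"
    # No less costly and no more effective: (weakly) dominated.
    if delta_cost >= 0 and delta_qaly <= 0:
        return "dominated"
    # Remaining cases have costs and effects moving in the same direction,
    # so the incremental cost-effectiveness ratio is well defined.
    icer = delta_cost / delta_qaly
    if delta_qaly > 0:
        # More costly and more effective: accept if the ICER is affordable.
        return "cost effective" if icer <= wtp_threshold else "not cost effective"
    # Less costly and less effective: accept if the savings per QALY lost
    # are at least the threshold.
    return "cost effective" if icer >= wtp_threshold else "not cost effective"


# Values reported in this study; the threshold is an illustrative assumption.
result = classify_strategy(
    delta_cost=-954.02, delta_qaly=5.1989, wtp_threshold=10_000.0
)
print(result)
```

With the reported values, the first branch applies, so the classification does not depend on the assumed threshold.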