Impact of real-life use of artificial intelligence as support for human reading in a population-based breast cancer screening program with mammography and tomosynthesis

Elías-Cabot, Esperanza; Romero-Martín, Sara; Raya-Povedano, José Luis; Brehl, A.-K.; Álvarez-Benito, Marina

doi:10.1007/s00330-023-10426-4

Impact of real-life use of artificial intelligence as support for human reading in a population-based breast cancer screening program with mammography and tomosynthesis

Breast
Open access
Published: 17 November 2023

(2023)
Cite this article

Download PDF

You have full access to this open access article

European Radiology Aims and scope Submit manuscript

Impact of real-life use of artificial intelligence as support for human reading in a population-based breast cancer screening program with mammography and tomosynthesis

Download PDF

Esperanza Elías-Cabot ORCID: orcid.org/0000-0002-0085-4060^1,2,3,
Sara Romero-Martín^1,2,3,
José Luis Raya-Povedano^1,2,3,
A.-K. Brehl⁴ &
…
Marina Álvarez-Benito^1,2,3

1857 Accesses
19 Altmetric
2 Mentions
Explore all metrics

Abstract

Objectives

To evaluate the impact of using an artificial intelligence (AI) system as support for human double reading in a real-life scenario of a breast cancer screening program with digital mammography (DM) or digital breast tomosynthesis (DBT).

Material and methods

We analyzed the performance of double reading screening with mammography and tomosynthesis after implementarion of AI as decision support. The study group consisted of a consecutive cohort of 1 year screening between March 2021 and March 2022 where double reading was performed with concurrent AI support that automatically detects and highlights lesions suspicious of breast cancer in mammography and tomosynthesis. Screening performance was measured as cancer detection rate (CDR), recall rate (RR), and positive predictive value (PPV) of recalls. Performance in the study group was compared using a McNemar test to a control group that included a screening cohort of the same size, recorded just prior to the implementation of AI.

Results

A total of 11,998 women (mean age 57.59 years ± 5.8 [sd]) were included in the study group (5049 DM and 6949 DBT). Comparing global results (including DM and DBT) of double reading with vs. without AI support, we observed an increase in CDR, PPV, and RR by 3.2/‰ (5.8 vs. 9; p < 0.001), 4% (10.6 vs. 14.6; p < 0.001), and 0.7% (5.4 vs. 6.1; p < 0.001) respectively.

Conclusion

AI used as support for human double reading in a real-life breast cancer screening program with DM and DBT increases CDR and PPV of the recalled women.

Clinical relevance statement

Artificial intelligence as support for human double reading improves accuracy in a real-life breast cancer screening program both in digital mammography and digital breast tomosynthesis.

Key Points

• AI systems based on deep learning technology offer potential for improving breast cancer screening programs.

• Using artificial intelligence as support for reading improves radiologists’ performance in breast cancer screening programs with mammography or tomosynthesis.

• Artificial intelligence used concurrently with human reading in clinical screening practice increases breast cancer detection rate and positive predictive value of the recalled women.

Multi-modality radiomics model predicts axillary lymph node metastasis of breast cancer using MRI and mammography

Article 10 February 2024

Inter- and intra-observer variability of qualitative visual breast-composition assessment in mammography among Japanese physicians: a first multi-institutional observer performance study in Japan

Article 15 April 2024

Machine learning and deep learning approach for medical image analysis: diagnosis to detection

Article 24 December 2022

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Breast screening with mammography is considered the most effective method to decrease breast cancer mortality as it increases the chance for early detection [1, 2]. However, up to 24% of breast cancers are missed despite regular screening [3]. The high volume of mammograms to be read in screening programs, most of which will be normal, and the presentations of cancer as small subtle lesions, can reduce the performance of radiologists, increasing the risk of false negatives [4, 5].

Traditional computer-aided detection (CAD) systems, in the context of breast cancer screening, have not been shown to improve diagnostic accuracy [6] and have been associated with a significant increase in false positives and recall rates [7].

Recently, the development of new artificial intelligence (AI) systems based on “Deep learning” technology has improved traditional CAD algorithms and shows great potential in the field of breast imaging [8]. By providing large amounts of training data to a deep learning algorithm, deep learning inherently defines the optimal features that are necessary to detect suspicious areas. This is an advantage of neural network–based algorithms towards machine learning (i.e., random forest) [9, 10], where feature selection is not performed by the network itself but has been previously defined. Deep learning neural network–based algorithms allow for predictive analytics and improve their performance depending on the size of training data. In this way, these AI systems not only identify suspicious findings in the image, but also establishes an overall exam score indicating the overall cancer risk. This ability to stratify digital mammograms and tomosynthesis has been analyzed in retrospective simulated studies to reduce reading workload up to 70% without a negative impact in cancer detection [11,12,13,14,15]. In a recently published study conducted prospectively in a real-world screening setting, AI-assisted reading was associated with a significant reduction in screen-reading workload (44%), and similar cancer detection rate [16].

Deep learning–based AI systems implemented as a stand-alone solution have been shown to yield performance levels similar to the performance of a radiologist, with an AUC ranging from 0.7 to 0.96 [17,18,19,20,21].

Furthermore, AI systems can be used for concurrent clinical decision support during reading of DM or DBT exams. Many studies have shown that radiologists improved their cancer detection accuracy when using an AI system as concurrent reading support [22,23,24,25,26,27], while decreasing their false positive and recall rates [24, 25]. However, none of these studies have been carried out in real-world screening programs and have used cancer-enriched datasets. Such laboratory conditions might affect the reader’s behavior and are not fully transferable to clinical screening practices.

In this study, we evaluate a consecutive cohort of screening mammograms and tomosynthesis that were double read with concurrent AI support, in a real screening environment, assessing the impact it has had on detection and recall rates.

To the best of our knowledge, there are no studies that investigate this effect in a large and consecutive series, in the real setting of a population-based screening program.

Material and methods

This study has been carried out in a single institution, the Córdoba Breast Cancer Screening Program (CBCSP), and did not receive any private or public funding. The authors who were in control of the data and information submitted for publication were not employees of or consultants for Screenpoint Medical (E.E.C., S.R.M., J.L.R.P. and M.A.B.). The requirement for informed consent was waived by the hospital's institutional review board.

The CBCSP is part of the Andalusian screening program in Spain. All women aged 50–69 are invited to participate with a biennial mammogram or tomosynthesis depending on resource availability.

In the CBCSP, all exams are independently double read, without consensus or arbitration (women are recalled if any reader decides to recall). The readings are performed according to Breast Imaging Reporting and Data System (BI-RADS) terminology. Studies classified as BI-RADS 1 and 2 are not recalled. Studies classified as BI-RADS 3, 4, and 5 are recalled and sent to the reference hospital. Women recalled are subjected to diagnostic workup (special mammography projections, tomosynthesis, contrast mammography, ultrasound, or MRI). Suspected findings are confirmed by percutaneous biopsy.

In March 2021, a commercially available breast cancer detection AI software was implemented in the CBCSP to assist the screening radiologists with mammography and tomosynthesis readings.

This study evaluates screening performance in the consecutive screening cohort 1 year after implementation of AI, the study group, and compares to a control group of the same size just prior to the implementation of AI..

Study group (with AI support)

The study group consisted of women (age range, 50–69) who were consecutively screened with digital mammography (DM) or digital breast tomosynthesis (DBT), between 15 March 2021 and 14 March 2022, using a Selenia device (Hologic) for DM and a Selenia Dimensions device (Hologic) for DBT, respectively. We excluded recalled women whose diagnostic workup results were not accessible (Fig. 1).

During this period, all exams were double read in the screening program, by a team of ten radiologists dedicated to breast imaging (2 to 20 years of experience in breast imaging at the time of the study), concurrently using an AI system for cancer detection support. The radiologists read the DM and DBT exams while viewing the AI system findings. The final decision on whether or not to recall was made by the radiologist.

AI support system

The AI system used in this study was the commercially available AI system Transpara®, (version 1.7.0, ScreenPoint Medical) which has received Conformité Européenne (CE) mark approval and was cleared by the U.S Food and Drug Administration. This software has been investigated in other publications [11, 12, 14, 17, 18, 22, 23, 27].

The AI system is based on deep convolutional neural networks and has been trained, validated, and tested on more than 1 million mammograms, to detect suspicious soft tissue lesions and calcifications on digital mammograms and tomosynthesis across machines from different vendors. These lesions are marked with a score from 1 to 100 according to the probability of malignancy. Based on the lesion detected with the highest score, the system categorizes each exam into three categories representing the probability and risk that a visible cancer is present on the image: low, intermediate, and elevated.

Control group (without AI support)

In order to compare the performance indicators, a control group from the same CBCSP was collected. This group consisted of consecutive women (age range, 50–69) who were screened with DM or DBT examinations performed with the same two devices, 1 year prior to AI implementation, between 15 March 2020 and 14 March 2021. We excluded recalled women whose diagnostic work-up results were not accessible (Fig. 1). These exams were double read during that period in the screening program by the same team of radiologists as the study group but without artificial intelligence support.

From the available collected cohort, the final selection for the control group was done by statistical matching to ensure balance between control and study group regarding the control variables (Fig. 1).

Reference standard

Age, type of exam (DM or DBT), breast density, radiological findings, recall indications, biopsy procedures, and histopathologic results were retrieved from the medical records.

We considered cancer those women recalled with histologically confirmed malignancy. And we considered non-cancer those women not recalled and women who were recalled with a benign biopsy result or who were classified as BIRADS 1, 2, or 3 and not submitted to biopsy after the diagnostic work-up performed in the reference hospital.

Statistical analysis

The first step in the statistical analysis was the implementation of a matching method to ensure balance between control and study group regarding the control variables: modality (DM or DBT), age, and breast density. Nearest Neighbor Matching with Mahalanobis distance was selected as a matching method, and the number of matches was established based on the best performance in terms of balance between the two groups. The last step was the diagnosis of the quality of matches through the standardized mean difference (SMD) of the distance measure as well as the control variables, setting the threshold in 0.1 [28]. The region of common support (that is, overlapping between distributions) was evaluated through density plots and barplots (for continuous and discrete variables, respectively).

The performance of screening in the balanced study and control groups was evaluated and compared in terms of cancer detection rate (CDR), recall rate (RR), and positive predictive value of recalls (PPV). The analysis was done globally and separately for DM and DBT. McNemar test was applied for estimating differences between groups.

All the analyses were carried out with R free software [29]. The packages Machlt [30] and Matching [31] were used for conducting matching analysis, and Imtest [32] and sandwich [33, 34] packages to make inference for estimated logistic regression coefficients.

Results

Participant and examination characteristics

From the available 12,011 women in the study group, 11,998 screening examinations from 11,998 women (mean age, 57.59 years ± 5.8 [standard deviation]) were included (99.8%) (5049 digital mammography exams and 6949 digital breast tomosynthesis exams). Thirteen examinations were excluded because the diagnostic workup results were not accessible.

In the control group from the available 16,555 women, 11,998 screening examinations from 11,998 women (mean age 57.94 years ± 5.58 [standard deviation]) were selected after the implementation of a matching method (5045 digital mammography exams and 6953 digital breast tomosynthesis exams) (Fig 1).

Near Neighbor Matching with Mahalanobis distance and a 1:1 ratio without replacement was selected. The standardized mean differences (SMD < 0.1) showed an optimal balance.

The characteristics of the study and control groups are included in Tables 1 and 2.

Table 1 Characteristics of study group and control group

Full size table

Table 2 Summary of cancer characteristics

Full size table

Screening performance with AI system support (study group)

From the total DM and DBT exams, 108 cancers were diagnosed. Cancer detection rate (CDR), recall rate (RR), and positive predictive value of recalled women (PPV1) were 9.0‰ (95% CI: 8.2, 9.7), 6.1% (95% CI: 5.9, 6.3), and 14.6% (95% CI: 13.5, 15.7) respectively (Table 3).

Table 3 Comparison of readings with AI system support and readings without AI system in terms of cancer detection rate, recall rate, and positive predictive value

Full size table

From the DM exams, CDR was 8.1‰ (95% CI: 7.0, 9.1), RR was 6.6% (95% CI: 6.3, 6.9), and PPV1 was 12.2% (95% CI: 10.7, 13.7) and from the DBT exams CDR, RR, and PPV1 were 9.6‰ (95% CI: 8.6, 10.6), 5.7% (95% CI: 5.5, 6.0), and 16.7% (95% CI: 15.2, 18.3) respectively (Table 3).

Illustrative examples of examinations in the study group categorized as elevated risk where the AI system–assisted radiologists in the detection of cancer are shown in Figs. 2 and 3.

Screening performance without AI system support (control group)

From the total 11,998 DM and DBT exams, 70 cancers were diagnosed. Cancer detection rate (CDR), recall rate (RR), and positive predictive value of recalls (PPV1) were 5.8‰ (95% CI: 5.2, 6.4), 5.4% (95% CI: 5.3, 5.6), and 10.6% (95% CI: 9.6, 11.6) respectively (Table 3).

From the DM exams, CDR was 5.7‰ (95% CI: 4.8, 6.6), RR was 6.2% (95% CI: 5.9, 6.5), and PPV of recalled women was 9.2% (95% CI: 7.9, 10.6) and from the DBT exams CDR, RR, and PPV of recalls were 5.8‰ (95% CI: 5.1, 6.6), 4.9% (95% CI: 4.7, 5.1), and 11.9% (95% CI: 10.4, 13.3) respectively (Table 3).

Comparison between study and control group

When comparing the global results of reading in the study group with AI system support to the control group without AI, an increase in CDR of 3.2‰ (95% CI: 0–9, 5.4; p < 0.001), an increase in PPV of recalls of 4% (95% CI: 0.3, 7.7; p < 0.001), and also an increase in RR of 0.7% (95% CI: 0.05, 1.3; p < 0.001) was observed (Table 3).

Similar results were obtained in the independent assessment of DM and DBT, although the differences were slightly higher for DBT: an increase in CDR, RR, and PPV1 in the study group was observed compared to the control group of 3.8‰ (p < 0.05), 0.8 (p < 0.05), and 4.8% (p = 0.07) respectively for DBT, and 2.4‰ (p = 0.05), 0.4 (p = 0.37), and 3% (p = 0.25) respectively for DM (Table 3).

AI standalone performance

The total of the 11,998 included exams in the study group were classified as follows: low, 7917 (65.9%); intermediate, 3730 (31%); and elevated, 351 (2.9%). The 108 studies with cancers were categorized as follows: low, 1 (0.9%); intermediate, 32 (29.6%); elevated, 75 (69.4%) (Fig. 4).

Globally, the probability of cancer in the low- and intermediate-risk categories was less than 2% (0.01 and 0.8% respectively), while in the high-risk category the probability of cancer was 21.3% (Fig. 4).

The AI system correctly marked 101 cancers (93.5%) with the highest score.

Discussion

In this study, we evaluate the impact in performance of a double reading screening program after the implementation of an AI system for concurrent decision support for the radiologists. The implementation of an AI system for concurrent decision support for reading DM and DBT screening exams increased the breast cancer detection rate (CDR) by 3.2‰ (9.0‰ vs 5.8‰; p < 0.001) and the positive predictive value (PPV) of the recalled women by 4% (14.6% vs 10.6%; p < 0.001) with an acceptable increase in recall rate (RR) by 0.7% (6.1% vs 5.4%; p < 0.001). AI-based performance benefits with respect to CDR and PPV were present when reading DM and DBT studies with AI support. Yet, effects were more pronounced for DBT (CDR + 3.8; PPV of recalled women +4.8; and recall rate + 0.8 in the study group).

Several retrospective studies have shown that radiologists improved their cancer detection performance when using an AI system as a concurrent reading aid. So, reading DM tests with the help of AI was associated with an increase in AUC of between 0.02 and 0.07, as well as an increase in sensitivity and specificity [23, 24, 26]. Similar effects have been reported for DBT. Simultaneous AI support in reading screening tests for DBT resulted in an increase in AUC between 0.03 and 0.057, with no change in specificity [22, 25, 27]. However, all these studies were conducted on small cancer-enriched datasets rather than population-based screening data and the transferability of the results to real-world screening settings might be limited. Our study represents the real-life impact of AI as support for human double reading, in a screening program. Regarding recalls, in our study, we found a slight increase in recall rate (+0.8%) that was accompanied by an increase in the PPV of recalled women by 4.8%, indicating that AI did not increase the number of unnecessary recalls but rather enabled radiologists to recall relevant cases. In the Conant et al study, AI support for reading DBT exams resulted in a reduction in recall rate for non-cancers by 7.2% [25].

The current study tests the ability of the AI system to stratify screening exams based on cancer risk. Overall, 66% of screening exams were classified as low risk and there was only one cancer that was falsely allocated by AI to the low risk category. Less than 3% of all cases were classified as high risk and yet, practically 70% of all cancers were among those cases flagged as high risk. These results are similar to those obtained by other authors in several previous retrospective simulated studies which demonstrated that AI-based stratification screening workflows could achieve workload reduction by up to 70% by identifying a significant volume of low-risk studies that could be partially or fully removed from human reading [11, 14,15,16, 35, 36]. In addition to providing important information to the radiologist to make a well-informed decision, the AI system’s ability to stratify studies opens the possibility for new strategies in screening programs. Such AI-based screening workflows would reduce the time dedicated to low-risk studies and would allow readers to focus on higher-risk studies that require more attention. Likewise, Dembrower et al [35] demonstrated in a retrospective simulation that the sensitivity of the screening program could be increased if a more intensive screening is carried out for the 1–5% most suspicious exams.

This study has several limitations. First, it has been carried out in a single institution, using two devices from the same vendor and analyzed by only one AI system. Second, the number of false negatives and interval cancers is still unknown since next round cancers and interval cancers remain to be monitored in the future.

In conclusion, this study shows that AI as a concurrent reading aid for human double reading in a real-word screening scenario with DM or DBT significantly increases cancer detection rate and positive predictive value of the recalled women.

Abbreviations

AI:: Artificial intelligence
CDR:: Cancer detection rate
DBT:: Digital breast tomosynthesis
DM:: Digital mammography
NPV:: Negative predictive value
PPV:: Positive predictive value
RR:: Recall rate
SD:: Standard deviation
SMD:: Standardized mean differences

References

Paci E (2012) Summary of the evidence of breast cancer service screening outcomes in Europe and first estimate of the benefit and harm balance sheet. J Med Screen 19:5–13
Article PubMed Google Scholar
Smith RA, Andrews KS, Brooks D et al (2018) Cancer screening in the United States, 2018: a review of current American Cancer Society guidelines and current issues in cancer screening. CA Cancer J Clin 68:297–316
Article PubMed Google Scholar
Hovda T, Hoff SR, Larsen M et al (2022) True and missed interval cancer in organized mammographic screening: a retrospective review study of diagnostic and prior screening mammograms. Acad Radiol 29:S180–S191
Article PubMed Google Scholar
Rodriguez-Ruiz A, Lång K, Gubern-Merida A et al (2019) Can we reduce the workload of mammographic screening by automatic identification of normal exams with artificial intelligence? A feasibility study. Eur Radiol 29(9):4825–4832
Article PubMed PubMed Central Google Scholar
Lotter W, Diab AR, Haslam B et al (2021) Robust breast cancer detection in mammography and digital breast tomosynthesis using an annotation-efficient deep learning approach. Nat Med 27(2):244–249
Article CAS PubMed PubMed Central Google Scholar
Lehman CD, Wellman RD, Buist DSM et al (2015) Diagnostic accuracy of digital screening mammography with and without computer-aided detection. JAMA Intern Med 175(11):1828–37
Article PubMed PubMed Central Google Scholar
Fenton JJ, Taplin SH, Carney PA et al (2007) Influence of computer-aided detection on performance of screening mammography. N Engl J Med 356:1399–1409
Article CAS PubMed PubMed Central Google Scholar
Tagliafico AS, Piana M, Schenone D et al (2020) Overview of radiomics in breast cancer diagnosis and prognostication. Breast 49:74–80
Article PubMed Google Scholar
Comes MC, La Forgia D, Didonna V et al (2021) Early prediction of breast cancer recurrence for patients treated with neoadjuvant chemotherapy: a transfer learning approach on DCE-MRIs. Cancers 13(10):2298
Article PubMed PubMed Central Google Scholar
Petrillo A, Fusco R, Di Bernardo E et al (2022) Prediction of breast cancer histological outcome by radiomics and artificial intelligence analysis in contrast-enhanced mammography. Cancers 14:2132
Article CAS PubMed PubMed Central Google Scholar
Raya-Povedano JL, Romero-Martín S, Elías-Cabot E, Gubern-Mérida A, Rodríguez-Ruiz A, Álvarez-Benito M (2021) AI-based strategies to reduce workload in breast cancer screening with mammography and tomosynthesis: a retrospective evaluation. Radiology 300(1):57–65
Article PubMed Google Scholar
Lång K, Dustler M, Dahlblom V, Åkesson A, Andersson I, Zackrisson S (2021) Identifying normal mammograms in a large screening population using artificial intelligence. Eur Radiol 31(3):1687–1692
Article PubMed Google Scholar
Yala A, Schuster T, Miles R, Barzilay R, Lehman C (2019) A deep learning model to triage screening mammograms: a simulation study. Radiology 293(1):38–46
Article PubMed Google Scholar
Lauritzen AD, Rodríguez-Ruiz A, von Euler-Chelpin MC et al (2022) An artificial intelligence–based mammography screening protocol for breast cancer: outcome and radiologist workload. Radiology 304(1):41–49
Article PubMed Google Scholar
Shoshan Y, Bakalo R, Gilboa-Solomon F et al (2022) Artificial intelligence for reducing workload in breast cancer screening with digital breast tomosynthesis. Radiology 303(1):69–77
Article PubMed Google Scholar
Lang K, Josefsson V, Larsson AM et al (2023) Artificial intelligence-supported screen reading versus standard double reading in the Mammography Screening with Artificial Intelligence trial (MASAI): a clinical safety analysis of a randomised, controlled, non-inferiority single-blinded, screening accuracy study. Lancet Oncol 24:936–944
Article PubMed Google Scholar
Romero-Martín S, Elías-Cabot E, Raya-Povedano JL, Gubern-Mérida A, Rodríguez-Ruiz A, Álvarez-Benito M (2022) Stand-alone use of artificial intelligence for digital mammography and digital breast tomosynthesis screening: a retrospective evaluation. Radiology 302(3):535–542
Article PubMed Google Scholar
Rodriguez-Ruiz A, Lång K, Gubern-Mérida A et al (2019) Stand-alone artificial intelligence for breast cancer detection in mammography: comparison with 101 radiologists. J Natl Cancer Inst 111(9):916–922
Article PubMed PubMed Central Google Scholar
Sasaki M, Tozaki M, Rodríguez-Ruiz A et al (2020) Artificial intelligence for breast cancer detection in mammography: experience of use of the ScreenPoint Medical Transpara system in 310 Japanese women. Breast Cancer 27(4):642–651
Article PubMed Google Scholar
Salim M, Wåhlin E, Dembrower K et al (2020) External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol 6(10):1581–1588
Article PubMed PubMed Central Google Scholar
McKinney SM, Sieniek M, Godbole V et al (2020) International evaluation of an AI system for breast cancer screening. Nature 577(7788):89–94
Article CAS PubMed Google Scholar
Van Winkel SL, Rodríguez-Ruiz A, Appelman L et al (2021) Impact of artificial intelligence support on accuracy and reading time in breast tomosynthesis image interpretation: a multi-reader multi-case study. Eur Radiol 31(11):8682–8691
Article PubMed PubMed Central Google Scholar
Rodríguez-Ruiz A, Krupinski E, Mordang JJ et al (2019) Detection of breast cancer with mammography: effect of an artificial intelligence support system. Radiology 290(2):305–314
Article PubMed Google Scholar
Pacilè S, Lopez J, Chone P et al (2020) Improving breast cancer detection accuracy of mammography with the concurrent use of an artificial intelligence tool. Radiol Artif Intell 2(6):e190208
Article PubMed PubMed Central Google Scholar
Conant EF, Toledano AY, Periaswamy S et al (2019) Improving accuracy and efficiency with concurrent use of artificial intelligence for digital breast tomosynthesis. Radiol Artif Intell 1(4):e180096
Article PubMed PubMed Central Google Scholar
Kim H-E, Kim HH, Han B-K et al (2020) Changes in cancer detection and false-positive recall in mammography using artificial intelligence: a retrospective, multireader study. Lancet Digit Health 2:e138–e148
Article PubMed Google Scholar
Pinto MC, Rodriguez-Ruiz A, Pedersen K et al (2021) Impact of artificial intelligence decision support using deep learning on breast cancer screening interpretation with single-view wide-angle digital breast tomosynthesis. Radiology 300(3):529–536
Article PubMed Google Scholar
Zhang Z, Kim HJ, Lonjon G, Zhu Y (2019) Balance diagnostics after propensity score matching. Ann Transl Med 7:16–16
Article CAS PubMed PubMed Central Google Scholar
R Core Team (2020) R a language and environment for statistical computing : R Foundation for Statistical Computing. Vienna, Austria. URL https://www.R-project.org/. Accessed Dec 2022
Ho DE, Imai K, King G, Stuart EA (2011) MatchIt : nonparametric preprocessing for parametric causal inference. J Stat Softw 42(8):1–28
Article Google Scholar
Sekhon JS (2011) Multivariate and propensity score matching software with automated balance optimization: the matching package for R. J Stat Softw 42(7):1–52
Article Google Scholar
Zeileis A, Hothorn T (2002) Diagnostic checking in regression relationships. R-news 2:7–10
Google Scholar
Berger S, Graham N, Zeileis A (2017) Various versatile variances: an object-oriented implementation of clustered covariances in R. Working Papers. Faculty of Economics and Statistics, Universität Innsbruck, pp 2017–12
Zeileis A (2006) Object-oriented computation of sandwich estimators. J Stat Softw 16(9):1–16
Article Google Scholar
Dembrower K, Wåhlin E, Liu Y et al (2020) Effect of artificial intelligence-based triaging of breast cancer screening mammograms on cancer detection and radiologist workload: a retrospective simulation study. Lancet Digit Health 2(9):e468–e474
Article PubMed Google Scholar
Larsen M, Aglen CF, Lee CI et al (2022) Artificial intelligence evaluation of 122 969 mammography examinations from a population-based screening program. Radiology 303(3):502–511
Article PubMed Google Scholar

Download references

Funding

The authors state that this work has not received any funding.

Author information

Authors and Affiliations

Maimónides Biomedical Research Institute of Córdoba (IMIBIC), Córdoba, Spain
Esperanza Elías-Cabot, Sara Romero-Martín, José Luis Raya-Povedano & Marina Álvarez-Benito
Breast Cancer Unit, Department of Diagnostic Radiology, Reina Sofía University Hospital, Menéndez Pidal Avenue s/n, 14004, Córdoba, Spain
Esperanza Elías-Cabot, Sara Romero-Martín, José Luis Raya-Povedano & Marina Álvarez-Benito
University of Córdoba, Córdoba, Spain
Esperanza Elías-Cabot, Sara Romero-Martín, José Luis Raya-Povedano & Marina Álvarez-Benito
ScreenPoint Medical BV, Toernooiveld 300, 6525 EC, Nijmegen, The Netherlands
A.-K. Brehl

Authors

Esperanza Elías-Cabot
View author publications
You can also search for this author in PubMed Google Scholar
Sara Romero-Martín
View author publications
You can also search for this author in PubMed Google Scholar
José Luis Raya-Povedano
View author publications
You can also search for this author in PubMed Google Scholar
A.-K. Brehl
View author publications
You can also search for this author in PubMed Google Scholar
Marina Álvarez-Benito
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Esperanza Elías-Cabot.

Ethics declarations

Guarantor

The scientific guarantor of this publication is Marina Álvarez Benito.

Conflict of interest

One of the authors of this manuscript (AKB) declare relationships with the following companies: ScreenPoint Medical.

The authors (EEC, SRM, JLRP and MAB) who were not employees of or consultants for ScreenPoint Medical had control of the data and information submitted for publication at all times.

Statistics and biometry

María Pata (Biostatech) kindly provided statistical advice for this manuscript.

Informed consent

Written informed consent was waived by the Reina Sofía University Hospital’s Institutional Review Board.

Ethical approval

Reina Sofía University Hospital´s Institutional Review Board approval was obtained.

Study subjects or cohorts overlap

No

Methodology

• retrospective

• diagnostic or prognostic study

• performed at one institution

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Elías-Cabot, E., Romero-Martín, S., Raya-Povedano, J.L. et al. Impact of real-life use of artificial intelligence as support for human reading in a population-based breast cancer screening program with mammography and tomosynthesis. Eur Radiol (2023). https://doi.org/10.1007/s00330-023-10426-4

Download citation

Received: 11 August 2023
Revised: 11 August 2023
Accepted: 01 October 2023
Published: 17 November 2023
DOI: https://doi.org/10.1007/s00330-023-10426-4

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Impact of real-life use of artificial intelligence as support for human reading in a population-based breast cancer screening program with mammography and tomosynthesis

Abstract

Objectives

Material and methods

Results

Conclusion

Clinical relevance statement

Key Points

Similar content being viewed by others

Multi-modality radiomics model predicts axillary lymph node metastasis of breast cancer using MRI and mammography

Inter- and intra-observer variability of qualitative visual breast-composition assessment in mammography among Japanese physicians: a first multi-institutional observer performance study in Japan

Machine learning and deep learning approach for medical image analysis: diagnosis to detection

Introduction

Material and methods

Study group (with AI support)

AI support system

Control group (without AI support)

Reference standard

Statistical analysis

Results

Participant and examination characteristics

Screening performance with AI system support (study group)

Screening performance without AI system support (control group)

Comparison between study and control group

AI standalone performance

Discussion

Abbreviations

References

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Guarantor

Conflict of interest

Statistics and biometry

Informed consent

Ethical approval

Study subjects or cohorts overlap

Methodology

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation