Introduction

Breast cancer is the most common cancer in women worldwide1. Over half of all cases (53.0%) occur in less developed regions of the world1,2. There is a trade-off between benefit and harm of breast cancer screening1,3,4,5,6,7,8,9. False-positive results lead to anxiety and unnecessary, often invasive diagnostic procedures. Breast cancer screening can often over-diagnose disease and lead to unwarranted treatment. The accuracy of screening services may vary from one population to another, implying that a single screening procedure may not be universally effective.

The Chinese government launched a national breast cancer screening program in 2009 but has not yet established a reliable program suited to national and regional population characteristics. Although mammography has been shown to reduce breast cancer mortality, its accuracy is lower in women with dense breast tissue, and in young women in particular. In addition, mammography equipment is not affordable in rural areas of China. Furthermore, Asian women characteristically have denser breasts than women from other ethnic groups10,11. Irrespective of ethnic origin, about 60% of women in their 40s are estimated to have dense breasts12,13,14. Given this, ultrasonography could offer a low-cost way to improve sensitivity and detection rates of early cancers in women with dense breasts7. It is imperative to establish a breast cancer screening program in China that is cost-effective and improves screening benefits. It is therefore crucial to conduct an evidence-based study of the effectiveness of ultrasound techniques for detecting breast cancer in the female population of China.

Although handheld ultrasound (HHUS) can be used to screen the whole breast, the technique is time-consuming and is less likely to produce standardized images because of the mobility of breast tissue and the high degree of operator dependence. This has become a major obstacle to the acceptance of ultrasound as a breast cancer screening technology in poor areas of developing and middle-income countries. Breast ultrasonography can now be performed with automated breast ultrasound systems (ABUS), which cover all the breast tissue in a reproducible manner18,19,20,21. The anticipated advantage of ABUS is the decoupling of image acquisition from reading, which makes breast ultrasonography easier to implement in a screening setting. ABUS reduces operator dependence and can obtain a series of reproducible, standardized breast ultrasound images22,23,24,25,26,27,28.

To obtain reliable clinical epidemiological evidence that US can become a primary screening tool, long-term, large randomized controlled trials are needed. Such studies are time-consuming, and the risk of negative results is high. We therefore first designed a non-inferiority diagnostic efficacy study that takes less time and requires fewer samples. Its purpose is to provide initial evidence for future large randomized cohort studies. We designed this multicenter hospital-based study to determine the efficacy of ABUS. Our goal was to compare ABUS, mammography (MG) and HHUS and to determine whether ABUS could be a suitable method for breast cancer screening.

Methods

Study protocol

This study was a hospital-based, multicenter, non-inferiority clinical trial that compared the diagnostic performance of ABUS, HHUS and mammography (MG) in Chinese women. The main procedures of the study are shown in Fig. 1. All women attending the outpatient clinic were invited to participate. Eligible participants who signed the informed consent form were enrolled, and trained health workers conducted a face-to-face questionnaire interview covering socio-demographic information and potential breast cancer risk factors. All enrolled participants then underwent HHUS and ABUS examinations in succession. Women aged 40 to 69 years additionally received MG; younger women did not, because of the radiation exposure. The study was approved by the Institutional Review Board of Cancer Institute, Chinese Academy of Medical Sciences (IRB approval No.15-061/988) and the Institutional Review Boards of all participating hospitals. Informed consent was obtained from all participants for this prospective analysis. All procedures strictly followed clinical routines and guidelines.

Figure 1

Flowchart showing patient selection and study design.

Inclusion criteria

Inclusion criteria were: (1) female patients 30–69 years old; (2) women who visited a doctor for breast cancer examination; and (3) no visible signs of breast cancer. Exclusion criteria were: (1) women who were pregnant, breastfeeding or planning to become pregnant; (2) a history of lumpectomy, contralateral mastectomy or breast augmentation; (3) surgical or percutaneous biopsy in the last 12 months; (4) diagnosis of or treatment for cancer in the last 12 months. In each hospital, at least 300 subjects were scanned following a similar routine and workflow (Fig. 1).

Qualification of research center and staff

Five hospitals took part in this study: Sun Yat-sen University Cancer Hospital; Chinese Academy of Medical Sciences Cancer Hospital; Tianjin Medical University Affiliated Tumor Hospital; Hangzhou First People's Hospital; and Shanghai Jiaotong University Affiliated Xinhua Hospital.

The system under evaluation in this study was the Invenia 3D-Automated Breast Ultrasound System (ABUS), manufactured by GE Healthcare (Sunnyvale, CA, USA). ABUS is a computer-based system for evaluating the complete breast. For each evaluation, each breast was imaged in three views, lateral (LAT), anteroposterior (AP) and medial (MED), with an automated 6 to 14 MHz linear array transducer attached to a rigid compression plate (covering a volume of 15.4 × 17.0 × 5.0 cm). Each view comprised up to about 300 2D images, which were reconstructed in the coronal plane from the skin to the chest wall. The standardized review process uses a patented, thick-slice coronal plane for quick navigation through the breast, as well as a “survey mode,” which is similar to cine and allows the radiologist to interpret many images rapidly. The acquisition time for each view was approximately 60 s, giving about 3–4 min per breast.

HHUS was performed in the supine position by experienced radiologists. The devices used to conduct HHUS included the GE LOGIQ9 (GE Medical Systems, Milwaukee, WI, USA), the Aixplorer system (Supersonic Imagine, Aix en Provence, France), the iU22 Ultrasound System (Philips Medical Systems, Bothell, WA, USA) and the s2000 (Siemens Medical Solutions, Mountain View, CA, USA).

The devices used to perform mammography and obtain mammographic images included the GE Senographe DS (GE Medical Systems, Milwaukee, WI, USA), the Hologic Selenia (Hologic, Bedford, MA, USA) and the Fujifilm FDR MS-2500 (Fujifilm Corp, Tokyo, Japan). HHUS was performed by 10 radiologists, all with ≥ 5 years of experience in US examination and diagnosis. ABUS diagnosis was made by 5 radiologists with at least 5 years’ experience of HHUS. MG diagnosis was performed by another 10 radiologists, all with at least 5 years of experience in MG diagnosis. MRI diagnosis was performed by a further 5 radiologists, all with ≥ 5 years’ experience in breast MRI diagnosis. The radiologist performing and interpreting the US images and the different radiologist interpreting the MG were not permitted to know the results of the other screening examination until their interpretations had been recorded, although prior breast imaging (if any) was available together with risk factor and biopsy/surgical history.

BI-RADS assessment results for HHUS, ABUS and MG were sorted into six categories: 0 = incomplete, 1 = normal, 2 = benign, 3 = probably benign, 4 = suspicious, 5 = highly suggestive of malignancy. The highest BI-RADS category among HHUS, ABUS and MG served as the reference for referral. For BI-RADS category 3, a magnetic resonance imaging (MRI) examination was provided to distinguish true-negative from false-negative results. For BI-RADS categories 1 and 2, there was no referral, except that 10% of these women were randomly selected for MRI examination. For women classified as category 4 or 5, either core aspiration biopsy or surgical biopsy was performed, followed by a pathological diagnosis. MRI examinations were also assessed with BI-RADS: women with MRI BI-RADS category 1 to 3 were considered negative; otherwise, the woman underwent biopsy to obtain pathological information.
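The referral pathway described in this paragraph can be sketched as a small decision function. This is only one illustrative reading of the text: the function names `referral` and `after_mri`, the string labels, and the `sampled_for_mri` flag (standing in for the 10% random selection) are assumptions introduced here, not part of the study protocol. The text does not say how category 0 (incomplete) was handled, so the sketch rejects it.

```python
from typing import Optional


def referral(hhus: int, abus: int, mg: Optional[int] = None,
             sampled_for_mri: bool = False) -> str:
    """Next step implied by the highest BI-RADS category across modalities.

    mg is None for women aged 30-39, who did not receive mammography.
    sampled_for_mri marks a woman drawn in the 10% random MRI sample.
    """
    categories = [c for c in (hhus, abus, mg) if c is not None]
    if any(c not in (1, 2, 3, 4, 5) for c in categories):
        # Category 0 (incomplete) is not covered by the described pathway.
        raise ValueError("only BI-RADS categories 1-5 are handled here")
    highest = max(categories)
    if highest >= 4:
        return "biopsy"
    if highest == 3:
        return "MRI"
    return "MRI" if sampled_for_mri else "no referral"


def after_mri(mri_category: int) -> str:
    """MRI BI-RADS 1-3 counts as negative; anything else goes to biopsy."""
    return "negative" if mri_category in (1, 2, 3) else "biopsy"


print(referral(hhus=2, abus=4, mg=3))   # highest category is 4
print(referral(hhus=3, abus=2))         # younger woman, no MG
print(after_mri(3))
```

Taking the maximum over the available modalities means a single suspicious reading is enough to trigger biopsy, which matches the "highest category" rule stated above.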

Statistical analysis

Sensitivities and specificities of the three methods (ABUS, HHUS and MG) were compared using non-inferiority Z tests (non-inferiority margin 0.06) based on the score method (35–37), in which the variance was calculated by the restricted maximum likelihood estimation proposed by Nam (37). When the non-inferiority test yielded P < 0.025 and the test method had the higher sensitivity or specificity estimate, a further superiority test (superiority margin 0) was conducted using McNemar's test. The sensitivity, specificity, false-positive rate (FPR) and accuracy (AC) were calculated. Cancer size and number distributions were recorded.
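The two-step testing logic can be illustrated with a short sketch for paired data. Note the substitution: the study used a score test with Nam's restricted maximum likelihood variance, whereas the code below uses a simpler Wald standard error for the paired difference, so its P values will not match the published ones. The function names and the discordant-pair counts are hypothetical; the counts were chosen only so that their difference (−12) equals 381 − 393, the difference in cancers detected by ABUS and HHUS.

```python
from math import comb, erfc, sqrt


def noninferiority_p(b: int, c: int, n: int, margin: float = 0.06) -> float:
    """One-sided P value for H0: p_test - p_ref <= -margin (paired data).

    b = pairs positive on the test method only
    c = pairs positive on the reference method only
    n = total number of pairs
    Uses a Wald standard error, NOT the RMLE score variance of the study.
    """
    diff = (b - c) / n
    se = sqrt((b + c) - (b - c) ** 2 / n) / n
    if se == 0:
        raise ValueError("Wald statistic undefined for these counts")
    z = (diff + margin) / se
    return 0.5 * erfc(z / sqrt(2))  # upper-tail standard normal probability


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar P value based on the discordant pairs."""
    m = b + c
    if m == 0:
        return 1.0
    tail = sum(comb(m, k) for k in range(min(b, c) + 1)) / 2 ** m
    return min(1.0, 2 * tail)


# Hypothetical discordant counts among 415 cancers (not study data).
b, c, n = 10, 22, 415
p_ni = noninferiority_p(b, c, n)
print(f"non-inferiority P = {p_ni:.4f}")
if p_ni < 0.025 and b > c:
    # Superiority is tested only when the test method's estimate is higher.
    print(f"superiority P = {mcnemar_exact_p(b, c):.4f}")
else:
    print("superiority test not performed")
```

With these made-up counts the test method has the lower estimate, so the superiority step is skipped, which is the situation the paper reports as "not available".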

Sample size determination

The sample size was determined based on the need for a sufficient number of women with breast cancer to adequately evaluate the performance of ABUS, HHUS, and MG. According to the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) distribution, the sample was allocated across BI-RADS categories as follows: 40% in BI-RADS 1–2, 35% in BI-RADS 3, 12.5% in BI-RADS 4 and 12.5% in BI-RADS 5. All women were invited for breast examination with MG, HHUS and ABUS from February 2016 to March 2017. For BI-RADS category 3, a magnetic resonance imaging (MRI) examination was provided to distinguish true-negative from false-negative results.

Results

From February 2016 to March 2017, a total of 2844 women consented to participate in our study, of whom 1947 were eligible and completed the scanning examinations. Breast density was reassessed by the radiologist and classified using BI-RADS density category 1 (“almost entirely fat”), category 2 (“scattered fibroglandular densities”), category 3 (“heterogeneously dense”) or category 4 (“extremely dense”). Radiologists classified 24.31% of women as BI-RADS breast density types 1–2 and 75.69% as types 3–4 (Table 1). For analyses, categories 1–2 were defined as “low-density breasts” and categories 3–4 as “high-density breasts.”

Table 1 Patient demographic and clinical characteristics at enrollment.

The mean age of participants was 45.40 ± 9.77 years; 680 women were 30–39 years old and 1,267 were 40–69 years old (Table 1). Overall, the study included 786 BI-RADS category 1–2 lesions (40.37%), 543 BI-RADS category 3 lesions (27.89%), 338 BI-RADS category 4 lesions (17.36%) and 280 BI-RADS category 5 lesions (14.38%).
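As a quick arithmetic check, the reported percentages can be recomputed from the counts and set beside the planned allocation from the sample size section. The dictionary names are illustrative only; the numbers are those given in the text.

```python
# Planned shares (sample size section) and observed counts (this section).
planned_share = {"1-2": 0.40, "3": 0.35, "4": 0.125, "5": 0.125}
observed_count = {"1-2": 786, "3": 543, "4": 338, "5": 280}

total = sum(observed_count.values())
observed_pct = {
    category: round(100 * count / total, 2)
    for category, count in observed_count.items()
}

print(f"total subjects: {total}")
for category, pct in observed_pct.items():
    planned = 100 * planned_share[category]
    print(f"BI-RADS {category}: observed {pct}% vs planned {planned}%")
```

The counts sum to the 1947 eligible women, and the comparison shows fewer category 3 and more category 4 and 5 lesions than planned.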

Cancer detection

In the 30–39 age group (680 subjects), HHUS detected 79 cases of breast cancer (11.62%), including 70 cases of invasive carcinoma (10.29%) and 9 cases of non-invasive carcinoma (1.32%). ABUS detected 75 cases of breast cancer (11.03%), including 65 cases of invasive carcinoma (9.56%) and 10 cases of non-invasive carcinoma (1.47%) (Table 2). In this age group, the mean diameter of cancers detected by HHUS was 22.74 ± 11.05 mm, and that of cancers detected by ABUS was 19.78 ± 10.83 mm (Table 3).

Table 2 Cancer Detection of HHUS, ABUS and MG.
Table 3 Cancer size distribution.

In the 40–69 age group (1,267 subjects) (Table 2), HHUS detected a total of 314 breast cancers (24.78%), of which 286 (22.57%) were invasive and 28 (2.21%) were non-invasive. HHUS detected 197 cases of invasive carcinoma (15.55%) in the high-density subgroup and 89 cases (7.02%) in the low-density subgroup. ABUS detected 306 cases of breast cancer (24.15%), including 276 cases of invasive carcinoma (21.79%) and 30 cases of non-invasive carcinoma (2.37%). ABUS detected 190 cases of invasive carcinoma (15.00%) in the high-density subgroup and 86 cases (6.79%) in the low-density subgroup. MG detected 283 cases of breast cancer (22.34%), including 258 cases of invasive carcinoma (20.37%) and 25 cases of non-invasive carcinoma (1.97%). MG detected 172 invasive cancers (13.58%) in the high-density subgroup and 86 invasive cancers (6.79%) in the low-density subgroup. In the 40–69 age group, the mean diameter of cancers detected by HHUS was 22.82 ± 10.59 mm, by ABUS 20.04 ± 10.12 mm, and by MG 21.87 ± 8.70 mm (Table 3).

In total (1947 subjects) (Table 3), 415 breast cancers (21.31%) were confirmed by final pathology, of which 363 (18.64%) were invasive and 52 (2.67%) were non-invasive. HHUS detected 393 cases of breast cancer (20.18%), including 356 cases of invasive carcinoma (18.28%) and 37 cases of non-invasive carcinoma (1.90%). ABUS detected 381 breast cancers (19.57%), of which 341 were invasive (17.51%) and 40 were non-invasive (2.05%). MG, which was performed only in the 40–69 age group, detected 283 cases of cancer (14.54% of all 1947 subjects), including 258 cases of invasive carcinoma (13.25%) and 25 cases of non-invasive carcinoma (1.28%). Tumor diameters from the 30–39 and 40–69 age groups were pooled: the mean diameter of cancers detected by HHUS was 22.80 ± 10.68 mm, and that of cancers detected by ABUS was 19.99 ± 10.26 mm.

Non-inferiority and superiority analysis of sensitivity

ABUS vs. HHUS

In the 30–39 age group (Table 4), ABUS sensitivity was 87.21% and HHUS sensitivity was 91.86%. The non-inferiority Z test gave P = 0.325, which does not reach the 0.025 threshold, so non-inferiority of ABUS was not demonstrated in this age group. Because HHUS sensitivity was higher than ABUS sensitivity, the superiority test of ABUS vs. HHUS was not performed.

Table 4 Non-inferiority and superiority analysis results in 40–69 years age group.

In the 40–69 age group (Table 4), the non-inferiority Z test showed that ABUS sensitivity (93.01%) was non-inferior to HHUS sensitivity (95.44%), with P = 0.014. Because HHUS sensitivity was the higher of the two, the superiority test was not performed.

For all participants (Table 5), the non-inferiority Z test comparing ABUS sensitivity (91.81%) with HHUS sensitivity (94.70%) gave P = 0.015. It can therefore be inferred that the overall sensitivity of ABUS is not inferior to that of HHUS. The superiority test for all participants was likewise not performed.
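The overall sensitivities quoted here can be reproduced from the counts in the Cancer detection section, which also makes clear how sensitivity differs from the detection rate: the two use different denominators. The sketch assumes that every cancer counted as detected is one of the 415 pathology-confirmed cancers, which the text does not state explicitly; the variable and function names are illustrative.

```python
TOTAL_SUBJECTS = 1947     # all eligible women
CONFIRMED_CANCERS = 415   # cancers confirmed by final pathology
detected = {"HHUS": 393, "ABUS": 381}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to two decimals, as reported in the paper."""
    return round(100 * numerator / denominator, 2)


summary = {
    method: {
        # Share of all examined women in whom a cancer was found.
        "detection_rate": pct(count, TOTAL_SUBJECTS),
        # Share of confirmed cancers that the method found.
        "sensitivity": pct(count, CONFIRMED_CANCERS),
    }
    for method, count in detected.items()
}

for method, values in summary.items():
    print(method, values)
```

Under that assumption the ratios agree with the reported figures (20.18% and 19.57% detection rate; 94.70% and 91.81% sensitivity).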

Table 5 General non-inferiority and superiority study between HHUS and ABUS.

ABUS/HHUS vs. MG

In the 40–69 age group (Table 4), non-inferiority Z tests showed that ABUS sensitivity (93.01%) was non-inferior to MG sensitivity (86.02%) with P < 0.001, and that HHUS sensitivity (95.44%) was non-inferior to MG sensitivity (86.02%) with P < 0.001. The superiority test showed that the sensitivities of both ABUS and HHUS were superior to that of MG, with P < 0.001.

In the high-density breast subgroup (Table 4), non-inferiority Z tests showed that ABUS sensitivity (92.54%) was non-inferior to MG sensitivity (83.77%) with P < 0.001, and that HHUS sensitivity (95.61%) was non-inferior to MG sensitivity (83.77%) with P < 0.001. McNemar's superiority test showed that ABUS sensitivity was superior to MG sensitivity (P = 0.002) and that HHUS sensitivity was superior to MG sensitivity (P < 0.001).

In the low-density breast subgroup (Table 4), non-inferiority Z tests showed that ABUS sensitivity (94.06%) was non-inferior to MG sensitivity (91.09%) with P = 0.008, and that HHUS sensitivity (95.05%) was non-inferior to MG sensitivity (91.09%) with P < 0.001. McNemar's superiority test showed that ABUS sensitivity was not superior to MG sensitivity (P = 0.183) and that HHUS sensitivity was not superior to MG sensitivity (P = 0.079).

Non-inferiority and superiority analysis of specificity

ABUS vs. HHUS

In the 30–39 age group (Table 4), the non-inferiority Z test showed that ABUS specificity (92.93%) was non-inferior to HHUS specificity (89.06%) with P < 0.001. McNemar's superiority test showed that ABUS specificity was superior to HHUS specificity (P < 0.001).

In the 40–69 age group (Table 4), the non-inferiority Z test showed that ABUS specificity (92.86%) was non-inferior to HHUS specificity (89.55%) with P < 0.001. The superiority test showed that ABUS specificity was superior to HHUS specificity (P < 0.001).

For all participants (Table 5), ABUS specificity (92.89%) was non-inferior to HHUS specificity (89.36%) with P < 0.001. The superiority test showed that the specificity of ABUS was superior to that of HHUS, with P < 0.001.
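The four summary measures named in the statistical analysis (sensitivity, specificity, false-positive rate and accuracy) all follow from a 2 × 2 table. The sketch below applies the standard definitions to the all-participants comparison. One caveat: the true-negative counts (1423 for ABUS, 1369 for HHUS) do not appear in the text. They are back-calculated here from the reported specificities and the 1532 women without confirmed cancer (1947 − 415), so they should be read as assumptions rather than study data. The function name is illustrative.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard 2x2 diagnostic measures, as percentages with two decimals."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": round(100 * tp / (tp + fn), 2),
        "specificity": round(100 * tn / (tn + fp), 2),
        "fpr": round(100 * fp / (tn + fp), 2),
        "accuracy": round(100 * (tp + tn) / total, 2),
    }


CANCERS = 415                # confirmed by pathology (from the text)
NON_CANCERS = 1947 - 415     # 1532 women without confirmed cancer

# True positives are from the text; true negatives are ASSUMED values
# back-calculated from the reported specificities.
abus = diagnostic_metrics(tp=381, fn=CANCERS - 381,
                          fp=NON_CANCERS - 1423, tn=1423)
hhus = diagnostic_metrics(tp=393, fn=CANCERS - 393,
                          fp=NON_CANCERS - 1369, tn=1369)

print("ABUS", abus)
print("HHUS", hhus)
```

With these assumed counts the computed specificity and accuracy agree with the percentages reported in the paper for all subjects, which suggests the published figures are internally consistent.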

ABUS/HHUS vs. MG

In the 40–69 age group (Table 4), non-inferiority Z tests showed that ABUS specificity (92.86%) was non-inferior to MG specificity (91.68%) with P < 0.001, and that HHUS specificity (89.55%) was non-inferior to MG specificity (91.68%) with P < 0.001. The superiority test showed that ABUS was not superior to MG (P = 0.114). The superiority test of HHUS vs. MG was not performed.

In the high-density breast subgroup (Table 4), non-inferiority Z tests showed that ABUS specificity (92.34%) was non-inferior to MG specificity (90.56%) with P < 0.001, and that HHUS specificity (88.78%) was non-inferior to MG specificity (90.56%) with P < 0.001. McNemar's superiority test showed that ABUS specificity was not superior to MG specificity (P = 0.061). The superiority test of HHUS vs. MG was not performed.

In the low-density breast subgroup (Table 4), non-inferiority Z tests showed that ABUS specificity (94.69%) was non-inferior to MG specificity (95.65%) with P = 0.007, and that HHUS specificity (92.27%) was non-inferior to MG specificity (95.65%) with P < 0.001. McNemar's superiority test was not performed.

Discussion

This multicenter study demonstrated that ultrasound (ABUS or HHUS) is superior to mammography in patients with dense breasts and performs as well as mammography in patients with low-density breasts. Furthermore, we found that the sensitivity of both ABUS and HHUS is superior to that of MG, and that ABUS specificity (92.34%) was non-inferior to MG specificity (90.56%) in the high-density subgroup. These findings suggest that US, at least in symptomatic populations, is more effective at detecting breast cancer than MG.

ABUS has been shown to achieve the same diagnostic accuracy as HHUS18,28,29. In the study by Su Kyung Jeh, the diagnostic performance of ABUS was higher than that of HHUS with respect to specificity and accuracy29. Chang et al.25 reported that both ABUS and HHUS had high sensitivity (both 100%) and high specificity (95.0% and 85.0%, respectively) for 69 lesions. In addition, ABUS had a higher diagnostic accuracy (97.1%) than HHUS (91.4%) for breast masses. The authors concluded that ABUS is a promising modality in breast imaging. In our study, ABUS achieved higher accuracy than HHUS (ABUS 92.66% vs. HHUS 90.50% in all subjects; ABUS 92.90% vs. HHUS 91.08% in the 40–69 age group). In addition, ABUS had the highest specificity of the three modalities (ABUS 92.89% vs. HHUS 89.36% in all subjects; ABUS 92.86% vs. MG 91.68% vs. HHUS 89.55% in the 40–69 age group). This may be because ABUS can display more coronal plane-related information, such as mass margins, shape, spiculations and distortion associated with tissue retraction (Fig. 2). Meanwhile, breast cancer detection rates were higher with HHUS and ABUS than with MG (20.18% for HHUS and 19.57% for ABUS vs. 14.54% for MG). Most of the breast cancers detected were invasive. The reason may be that most participants were actually symptomatic and had tumors with a mean diameter close to 20 mm.

Figure 2

A 32-year-old woman with ductal carcinoma of the left breast. (A) Handheld ultrasound (HHUS) shows a hypoechoic lesion. (B–D) Automated breast ultrasound (ABUS) shows the heterogeneous hypoechoic lesion in the medial (B), anterior–posterior (C) and lateral (D) positions.

In this study, the sensitivity of ABUS was lower than that of HHUS, which may be related to compression of the mammary gland tissue during ABUS operation, resulting in unclear display of some lesions. However, according to the non-inferiority analysis, the sensitivity of ABUS is not inferior to that of HHUS. Meanwhile, the specificity of ABUS is superior to that of HHUS in both the 30–39 and the 40–69 age groups. This may be due to the additional information from the coronal plane, which helps to differentiate benign from malignant breast lesions18,30,31,32. In terms of diagnostic efficacy, therefore, ABUS and HHUS each have their own strengths and weaknesses in sensitivity and specificity. Moreover, non-inferiority tests also demonstrated that US (ABUS/HHUS) specificity is non-inferior to that of MG. It must also be noted that the ABUS diagnosis was actually made by radiologists with considerable HHUS experience. In addition, given the ABUS user interface and the habitual workflow, operators generally read the conventional 2-D information first, before interpreting the coronal plane. In routine ABUS operation, therefore, the diagnostic decision may rest largely on the information in the conventional 2-D sections. Thus, the comparison of diagnostic efficacy between ABUS and HHUS may reflect, to a great extent, a comparison between two experienced HHUS radiologists rather than between two different machines. However, the greater strength of ABUS lies in its standardized cross-sections and its potential for tele-consultation, which allows experienced radiologists to serve areas with less expertise.

The distinction between efficacy as measured in experimental studies and the effectiveness of a mass population intervention is a crucial one for public health decision-making. A limitation of this study is therefore that all conclusions come from a symptomatic population, which limits the extension of the evidence to asymptomatic populations. Randomized controlled validation studies in asymptomatic populations are still needed.

In summary, the sensitivity of ABUS/HHUS is superior to that of MG. The specificity of ABUS/HHUS is non-inferior to that of MG. Therefore, given the affordability, feasibility and good performance of ultrasound, ABUS provides a standardized and reproducible imaging device that can be used for breast cancer detection. Our study suggests that large-scale, multicenter randomized controlled studies are warranted to confirm the benefits of breast cancer screening by ABUS.