Key Points

Whole genome sequencing is a reliable method for clinical human epidermal growth factor receptor 2 status assessment.

The ploidy-corrected copy number value is the most accurate biomarker for genetic testing with high concordance to gold standard immunohistochemistry and fluorescence in situ hybridization.

Short-read whole genome sequencing for human epidermal growth factor receptor 2 assessment is consistent across different platforms and wet lab protocols.

1 Introduction

Human epidermal growth factor receptor 2 (HER2) is an important biomarker for targeted therapy in breast cancer (BC). Patients with an overexpression of the receptor were considered the worst prognosis group before HER2 inhibitors were introduced into clinical practice [1]. Currently, the first and second generation of these drugs slow down disease progression, improving the outcomes in HER2-positive subgroups of BCs. Therefore, it is crucial to pinpoint the HER2-overexpression status accurately and precisely [2].

The molecular mechanism of HER2 overexpression is, in most cases, amplification of a 17q12 chromosome region containing the HER2 coding ERBB2 gene. The reference method for the assessment of ERBB2 amplification is immunohistochemistry (IHC) coupled with fluorescence in situ hybridization (FISH) [3]. Currently, diagnostic companies and medical services are beginning to offer novel next-generation sequencing (NGS) assays, detecting dozens of actionable biomarkers in a single test. They are trying to incorporate the ERBB2 copy number (ERBB2 CN) into their portfolio as well. Unfortunately, ERBB2 amplification status cannot be easily determined by establishing a simple threshold for negative and positive values, as the genomic context of chromosome 17 copy number and tumor ploidy are interrelated with ERBB2 CN [4]. First, duplication or triplication of the whole chromosome set (polyploidy) or just a subset of chromosomes (aneuploidy) is a common feature of BC [5, 6]. However, changes in ploidy may not be associated with an overexpression of the ERBB2 gene, as average global transcript levels remain unchanged. On the contrary, Newcombe et al. noted a decrease in HER2 expression in recurrent polyploid BC cells [7]. Second, the isolated deletion or duplication events of chromosome 17 may influence the ERBB2 transcription [8, 9]. The gain of an additional copy of chromosome 17, called polysomy, is correlated with tumor ploidy and is considered its surrogate in the FISH test, but discrepancies between these parameters are in part the reason for inaccuracy in ERBB2 amplification detection [10].

As it is not feasible to determine ploidy in conventional FISH, the ratio between ERBB2 CN and chromosome 17 centromeric probe (CEP17) CN serves as a diagnostic criterion in dual-probe assays, recommended by the official American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) clinical practice guidelines for diagnostics of HER2 in patients with BC [3].

In contrast, whole genome sequencing (WGS) is capable of acquiring absolute ERBB2 CN, CEP17 CN, and mean ploidy of tumor cells simultaneously. Moreover, WGS can confirm the presence of the neoplastic cell in the sample, providing quality control of the material for analyses [11]. As WGS is mainly based on polymerase chain reaction-free methodology, it preserves the original proportions of DNA fragments, in contrast to enrichment or PCR-based NGS panels, which may distort the original proportions of DNA fragments and skew the quantification [12].

The purpose of this study was to determine the feasibility of accurately distinguishing between HER2-positive and HER2-negative cases of BC based on matched tumor-normal WGS. To date, there have been only a few studies evaluating the clinical utility of NGS testing of ERBB2 gene status, including the WGS method [4, 11, 13–15]. Some of them directly address the clinical need to verify the relevance of their findings for patient management, reporting the overall concordance between IHC/FISH and NGS at about a 90% level.

Our study operates on the large population-based cohort of 876 BCs from publicly available databases and additional external secondary data from 551 patients, supplied with the final clinical HER2 status based on ASCO/CAP guidelines and targeted treatment information, which serves to validate metastatic sample status. We analyzed the whole cohort of patients, aiming to establish the criteria for WGS ERBB2 status assessment as close to the gold standard as possible, optimized for both sensitivity and precision with a bias-free machine learning approach. We also provide the proof of concept that genomic data, acquired on different platforms with different chemistry, yield sufficiently uniform results for molecular diagnostics of ERBB2 amplification by WGS.

2 Materials and Methods

2.1 Sample Choice

Matched tumor-normal genomes from 876 patients with BC sequenced within three large Genomic Consortia (119, International Cancer Genome Consortium; 70, The Cancer Genome Atlas; 688, Hartwig Medical Foundation [HMF]) were downloaded from controlled-access databases after meeting formal criteria [13, 16–18]. The samples were sequenced using low-PCR-amplification or PCR-free library preparation protocols and paired-end 100–150 base-pair Illumina reads with a 350–550 base-pair insert size (for details, see Table 1 of the Electronic Supplementary Material [ESM]). For analyses of primary tumor samples, we included the datasets with clinical HER2 status described as positive or negative, according to ASCO/CAP guidelines 2007–18 (depending on the year the original study was conducted, see Table 1 of the ESM). For metastatic/advanced tumor samples from the HMF database, metadata on HER2 status were available only for primary tumors; the IHC/FISH status for the sequenced sample from the second biopsy was not provided. Because of the high rate of conversion from HER2-negative to HER2-positive status (and vice versa) during cancer evolution [4, 11], in metastatic cancers we also took the patients’ treatment metadata into consideration and discarded all samples for which treatment history (pre-biopsy and post-biopsy) was discordant with the initial HER2 status (e.g., if trastuzumab was included in any line of treatment even though HER2 status was reported negative). For details on discarded samples, see the ESM.
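The exclusion rule for metastatic samples can be illustrated with a minimal sketch. The record layout, the field names, and the drug list below are hypothetical and are not taken from the HMF metadata schema; only the direction given as an example in the text (HER2 reported negative, anti-HER2 drug administered) is implemented.

```python
# Illustrative sketch only: field names and the drug list are assumptions,
# not the actual HMF metadata schema.
ANTI_HER2_DRUGS = {"trastuzumab", "pertuzumab", "lapatinib"}  # assumed, non-exhaustive


def is_discordant(her2_status, treatments):
    """Return True when the treatment history contradicts the reported HER2 status.

    Implements only the example from the text: HER2 reported negative,
    yet an anti-HER2 drug appears in some line of treatment.
    """
    received_anti_her2 = any(drug.lower() in ANTI_HER2_DRUGS for drug in treatments)
    return her2_status == "negative" and received_anti_her2


def keep_concordant(samples):
    """Drop samples whose treatment history is discordant with HER2 status."""
    return [
        s for s in samples
        if not is_discordant(s["her2_status"], s["treatments"])
    ]


# Made-up records for demonstration.
samples = [
    {"id": "S1", "her2_status": "negative", "treatments": ["paclitaxel"]},
    {"id": "S2", "her2_status": "negative", "treatments": ["Trastuzumab", "docetaxel"]},
    {"id": "S3", "her2_status": "positive", "treatments": ["trastuzumab"]},
]
kept = keep_concordant(samples)
print([s["id"] for s in kept])
```

In this toy input, sample S2 is discarded because trastuzumab was given despite a negative HER2 label.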

Additional genomic data from an external pipeline were also used for validation purposes. Secondary data, derived from 560 BC genomes, were previously published by Nik-Zainal et al. [13]. From these data, we have extracted the complete information about clinical HER2 status, ploidy, purity, ERBB2 CN, and CEP17 CN of 551 patients. These data are also available in the ESM.

As there were no new tissue/DNA/RNA samples processed, the written consent of each subject is in possession of data providers. The primary data were collected in accordance with the standards set by the Declaration of Helsinki and the highest data security standards of ISO 27001.

2.2 Whole-Genome Data Processing

The files downloaded from HMF, The Cancer Genome Atlas, and the International Cancer Genome Consortium were analyzed using publicly available, open-source software embedded within an in-house pipeline (Fig. 1) implemented using Ruffus [19]. The analysis started with FASTQ file extraction from the BAM/CRAM files using the Broad Institute’s Picard tools [20]. Tumor samples with coverage exceeding 75× were downsampled with Seqtk version 1.3-r106 [21] to approximately 60× mean coverage. Next, all reads were trimmed using cutadapt version 2.10 [22] and mapped to the GRCh37 genome using Sanger’s Cancerit CGPMAP pipeline version 3.0.0 [23]. Samples with uniquely mapped read coverage below 20× for either tumor or normal genomes were excluded from the analysis [24, 25]. Mean tumor sample coverage across all datasets after downsampling was 48×, and the mean coverage of the reference blood/EBV-transformed lymphocyte samples was 36× (detailed data are provided in Table 2 and Fig. 1 of the ESM).
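The coverage rules in this paragraph reduce to simple arithmetic, sketched below. The function names are hypothetical and this is not the pipeline code; it only shows how a read-retention fraction and the 20× exclusion rule could be derived from the thresholds stated above.

```python
def downsample_fraction(coverage, trigger=75.0, target=60.0):
    """Fraction of reads to keep so that a tumor above 75x lands near 60x.

    Returns 1.0 (keep everything) when the coverage does not exceed the trigger.
    """
    return target / coverage if coverage > trigger else 1.0


def passes_coverage_qc(tumor_cov, normal_cov, minimum=20.0):
    """Both tumor and normal must reach the minimum uniquely mapped coverage."""
    return tumor_cov >= minimum and normal_cov >= minimum


# Made-up coverage values for demonstration.
print(downsample_fraction(120.0))      # a 120x tumor keeps half of its reads
print(downsample_fraction(70.0))       # below the trigger: no downsampling
print(passes_coverage_qc(48.0, 18.0))  # normal below 20x: excluded
```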

Fig. 1

Summary of the in-house pipeline used for data extraction and processing

Variant calling was performed using Sanger’s Cancerit CGPWGS pipeline version 2.0.1 [23], and specifically copy number variants, purity, and ploidy were identified with ascatNgs [26]. Identified variants were annotated using Ensembl VEP version 102 [27].

The external validation data were processed by the Wellcome Sanger Institute as described in the original study, with a key step of copy number variant calling performed with ascatNgs [26]. The samples with incomplete/missing/inconsistent clinical data, failed processing, or a low depth of coverage were discarded (see Fig. 2 and the ESM).

Fig. 2

Samples and data qualified for the study

2.3 Analyzed Parameters and Method Validation

In the study, we used clinical data on HER2 status according to ASCO/CAP recommendations. In the case of HMF metastatic/advanced tumors, pathomorphological evaluation data were available only at the point of diagnosis and were not supplied for actual WGS biopsies. HER2 status may shift in metastatic cancers, and without the information about the latest HER2 assessment, samples from HMF could have been wrongly labeled. To tackle this problem, we have also evaluated the metadata of the presence of targeted treatment with HER2 inhibitors in these samples and excluded all discrepant instances, in which therapy of metastatic cancer was not in compliance with initial HER2 status.

Based on ASCAT copy number alteration calling, copy numbers were extracted for ERBB2 (NC_000017.10:37844167_37886679) and for the uniquely mapped 8250 bp sequence adjacent to CEP17 (NC_000017.10:22236000_22244250), along with ploidy and purity estimates for all tumor samples. The data were used to create three features for HER2 status assessment: absolute ERBB2 CN, ERBB2 CN-n (ploidy-adjusted ERBB2 CN), and ERBB2 CN/CEP17 CN ratio. Based on these features, a machine learning-based classifier was constructed, which determined the best approach for HER2 status discrimination. Six hundred and fourteen samples from the datasets were used as a training set (discovery cohort); the remaining 264 samples served as a validation hold-out set for the classifier and were not analyzed a priori. An additional external dataset processed by the Wellcome Sanger Institute consisted of 551 samples, for which the same coordinates were used for extracting CEP17 CN and ERBB2 CN.
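A minimal sketch of how the three features could be derived from one copy number call is given below. The exact formula for the ploidy adjustment is not stated in this section; dividing ERBB2 CN by the mean tumor ploidy is an assumption made for illustration only, and the class and key names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CopyNumberCall:
    """Per-sample values as they might be read from a copy number caller's output."""
    erbb2_cn: float   # absolute ERBB2 copy number
    cep17_cn: float   # copy number of the CEP17-adjacent region
    ploidy: float     # mean tumor ploidy


def her2_features(call):
    """Build the three candidate features for HER2 status assessment."""
    if call.ploidy <= 0 or call.cep17_cn <= 0:
        raise ValueError("ploidy and CEP17 CN must be positive")
    return {
        "erbb2_cn": call.erbb2_cn,
        # ASSUMPTION: ploidy adjustment taken as a simple ratio to mean ploidy.
        "erbb2_cn_ploidy_corrected": call.erbb2_cn / call.ploidy,
        "erbb2_cep17_ratio": call.erbb2_cn / call.cep17_cn,
    }


# Made-up example: a tetraploid tumor with 12 ERBB2 copies and 3 CEP17 copies.
features = her2_features(CopyNumberCall(erbb2_cn=12.0, cep17_cn=3.0, ploidy=4.0))
print(features)
```

The example shows why the features can disagree: the same 12 copies give a ratio of 4.0 against CEP17 but 3.0 against mean ploidy.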

A decision tree-based classifier was chosen after comparing the effectiveness of logistic regression, random forest, and decision-tree models. After tuning the hyperparameters of each classifier, all three approaches achieve nearly identical performance, with the most robust decision tree model performing the best on average (for more details, see the ESM).

For the decision-tree-based modeling, the discovery cohort was randomly split into a training (75%) and a test set (25%). As the number of samples in IHC/FISH HER2-positive and HER2-negative groups was unbalanced (there were almost eight times fewer HER2-positive samples than HER2-negative samples), we added class weights (8:1) for compensation. After constructing the model, we measured its performance on 264 samples from the validation set. We used accuracy, precision, and recall along with the F1 score. Cohen’s Kappa score was estimated to evaluate the non-randomness of classification.
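For reference, the performance measures named above can all be computed from the four cells of a binary confusion matrix. The sketch below uses standard textbook definitions; the counts in the example are invented for illustration and are not study data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, F1, and Cohen's kappa from a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Agreement expected by chance, from the marginal totals of each class.
    p_chance = (((tp + fp) / n) * ((tp + fn) / n)
                + ((fn + tn) / n) * ((fp + tn) / n))
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


# Invented counts: 8 true positives, 2 false positives, 2 false negatives, 88 true negatives.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
print(metrics)
```

With a strongly imbalanced class distribution, accuracy alone is flattering (0.96 here), whereas Cohen's kappa discounts the agreement expected by chance (about 0.78 for the same counts).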

To show how each of the three features influences the classifier’s performance alone, we have established the same parameters independently for each of them as well and compared all the approaches with random data classification methods (Fig. 3). To further test the validity of our results, we decided to evaluate whether differences in tumor purity, heterogeneity of ploidy, or differences in mean depth of coverage had any deteriorative effects on the correctness of the results. For these experiments, we divided the samples into two near-equinumerous groups for each comparison and evaluated the differences in the tests’ performance.

Fig. 3

Training set cross-validation accuracy comparison between three features used to determine human epidermal growth factor receptor 2 (HER2) amplification status in whole genome sequencing (WGS) data. Because of a class imbalance, the plot also includes a reference classifier assigning all samples to the majority class (DummyMostFrequent), which serves as a simple baseline

As the simplest model, with one feature and a predetermined threshold, was optimal, it was then used instead of ML to establish HER2 status for the test set and the external data. For analytical validation, we have determined the overall predictive value, positive predictive value (PPV), and negative predictive value (NPV) with confidence intervals (CIs) separately for the test set as well as for an external Sanger dataset of 551 patients (Table 1).

Table 1 Analytical validation of the whole genome sequencing ploidy-corrected ERBB2 CN on internal hold-out and the external dataset

3 Results

In the analyzed dataset, 159 patients (18%) were categorized as triple-negative BC; among HER2-negative patients, ER+/HER2− cases accounted for 599 (88%). One hundred and ten samples (13%) were identified by clinical testing as HER2 positive, among them 74 ER+/HER2+ (8%) and 36 ER−/HER2+ (4%). For eight patients, ER status was unavailable.

HER2 positivity was slightly underrepresented in favor of triple-negative BC in comparison with statistics for the Caucasian population (18%), which may be an accidental or sampling bias related to the Genomic Consortia’s sample collection process, or an effect of discarding datasets with incomplete clinical data. The decision-tree machine learning approach has demonstrated the best discrimination between HER2-positive and HER2-negative cases based on a single-parameter, ploidy-corrected ERBB2 CN with a threshold of 2.265 (Fig. 3). The decision tree algorithm was evaluated in a three-fold cross-validation repeated ten times to estimate the mean value and standard deviation for each metric. The results were as follows: accuracy = 96.7% (± 0.87%), precision = 86% (± 5%), recall = 89% (± 6%), Cohen’s Kappa = 85% (± 3.7%), and F1 = 87% (± 3%). A high value of Cohen’s Kappa strongly indicates that our model classifies samples in a non-random manner.
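A single-feature decision tree of depth one is a decision stump: it searches for the one cut-off that best separates the two classes. The sketch below illustrates such a search with class-weighted Gini impurity and candidate cut-offs placed midway between adjacent distinct values, as CART-style implementations commonly do. It is an illustration under these assumptions, not the code used in the study, and the toy values are invented.

```python
def weighted_gini(pos_weight_sum, neg_weight_sum):
    """Gini impurity of a node, using summed class weights instead of counts."""
    total = pos_weight_sum + neg_weight_sum
    if total == 0:
        return 0.0
    p = pos_weight_sum / total
    return 2 * p * (1 - p)


def best_threshold(values, labels, pos_weight=8.0, neg_weight=1.0):
    """Return the cut-off minimizing weighted Gini impurity, or None if no split exists."""
    pairs = sorted(zip(values, labels))
    weights = [pos_weight if y else neg_weight for _, y in pairs]
    total_pos = sum(w for w, (_, y) in zip(weights, pairs) if y)
    total_neg = sum(w for w, (_, y) in zip(weights, pairs) if not y)

    best_impurity, best_cut = float("inf"), None
    left_pos = left_neg = 0.0
    for i in range(len(pairs) - 1):
        value, label = pairs[i]
        if label:
            left_pos += weights[i]
        else:
            left_neg += weights[i]
        next_value = pairs[i + 1][0]
        if next_value == value:
            continue  # cannot split between identical feature values
        right_pos, right_neg = total_pos - left_pos, total_neg - left_neg
        left_w, right_w = left_pos + left_neg, right_pos + right_neg
        impurity = (left_w * weighted_gini(left_pos, left_neg)
                    + right_w * weighted_gini(right_pos, right_neg)) / (left_w + right_w)
        if impurity < best_impurity:
            best_impurity, best_cut = impurity, (value + next_value) / 2
    return best_cut


# Invented toy data: ploidy-corrected ERBB2 CN with HER2 labels (1 = positive).
toy_values = [0.8, 1.0, 1.2, 1.5, 3.0, 4.0]
toy_labels = [0, 0, 0, 0, 1, 1]
cut = best_threshold(toy_values, toy_labels)
print(cut)
```

On this perfectly separable toy input, the chosen cut-off is the midpoint between the largest negative and the smallest positive value.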

The learning curve displayed no further improvement with sample numbers exceeding 150 instances; therefore, we believe the results display the best reflection of the biological phenomenon of HER2 amplification we could extract from genomic data. Moreover, a principal component analysis of the dataset (Fig. 4) has shown a very good and robust separation of data into two groups, representing differences in HER2 status.

Fig. 4

Principal component analysis of the dataset with six features: purity, ploidy, ERBB2 CN, CEP17 CN, ERBB2 CN/CEP17 CN ratio, and ploidy-corrected ERBB2 CN

As data distribution across depths of coverage, tumor purities, and ploidies was not normal (Tables 2–3 and Figs. 1–2 of the ESM), we decided to compare the accuracy distributions for these parameters with the Wilcoxon signed-rank test. The evaluation of results across data coverages has shown no significant differences (p > 0.05) between groups.

The comparison of low vs high purity also has not yielded significant differences (p > 0.05). However, there is a significant decrease in the mean accuracy of the test from 0.97 to 0.94, dependent on an increased tumor ploidy above two (p = 5.1 × 10⁻⁶) (Fig. 5).

Fig. 5

Wilcoxon test comparison of means between distributions of accuracies in: A high vs low coverage data (threshold 49×), B high vs low ploidy data (threshold 3), and C high vs low purity data (threshold 0.6)

As the best classification of samples was achieved by a single feature approach, the final classifier was reduced to a single feature. The use of the ML approach was therefore not necessary for further analyses of HER2 status in the analytical validation step, thus a simple threshold was used instead. The analytical validation of the ploidy-corrected ERBB2_CN method gave the diagnostic sensitivity of 91.18% (95% CI 76.32–98.14) and specificity of 98.69% (95% CI 96.22–99.73) for the hold-out dataset. In the external dataset, sensitivity was 89.86% (95% CI 80.21–95.82) and specificity was 96.06% (95% CI 93.91–97.61). For details, see Table 1.
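The method used to compute the confidence intervals is not named in this section. As an illustration, the exact (Clopper-Pearson) binomial interval is sketched below in pure Python; its use here is an assumption. The counts in the example (31 of 34) are likewise inferred from the reported hold-out sensitivity of 91.18% and are not stated in the text.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _bisect(func, target, increasing):
    """Solve func(p) = target on [0, 1] for a monotone function by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha/2 (increasing in p).
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    # Upper bound: the p at which P(X <= k) equals alpha/2 (decreasing in p).
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


# ASSUMPTION: 31 detected out of 34 HER2-positive hold-out samples (31/34 = 91.18%).
low, high = clopper_pearson(31, 34)
print(f"sensitivity = {31 / 34:.2%}, 95% CI {low:.2%} to {high:.2%}")
```

Under these assumptions the interval agrees with the 76.32–98.14 range reported for the hold-out sensitivity, which suggests, but does not establish, that an exact binomial method was used.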

4 Discussion

Decreasing NGS prices and increasing availability of this technology in medical practice have encouraged the transition from conventional cytogenetic and molecular methods to NGS in contemporary oncology. However, the evidence on the reliability of NGS techniques in the clinical use for copy number detection is still very limited. As the HER2 protein is one of the most significant biomarkers for BC diagnostics, targeted treatment response prediction, and prognostics, there were several attempts to show the applicability of NGS techniques in this indication.

The largest analytical validation study was conducted by Memorial Sloan Kettering on their proprietary MSK-IMPACT Assay [4]. This hybrid-capture-based panel NGS test was analyzed in 213 BC samples and evaluated in a clinical setting on a further 599 samples. The cut-off for a positive result was established based solely on ERBB2 CN, adjusted to the background and normal signal of diploid genomes (defined as a ‘fold change’ of 1.5). The group reported 95% specificity and 100% sensitivity on > 10% of tumor content, with IHC/FISH evaluated by the newest 2018 guidelines, and a dual-probe FISH assay [4]. In 2020, a continuation of the study explored the borderline cases with excellent concordance [15]. Several other studies have also proven the clinical value of panel NGS for HER2 testing in BC and other solid tumors, with the same strategy of fold change determination, using either Illumina [1–3, 8] or Ion Torrent short-read methodology [9]. Another approach to define the ERBB2 amplification by panel NGS was using a cut-off of 2 standard deviations from the median depth of coverage across on-target data in a pool of samples [28, 29]. The question remains how universal this strategy is, depending on the size of the panel used by different groups. The ‘resolution’ determined by the number of on-target sites and the presence of additional single nucleotide variant (SNV) targeting probes scattered across the genome (SNV ‘backbone’) greatly influences the ability to properly call the copy number alterations and normalize the data. The choice of the ‘panel of normals’ for data normalization may also influence the output, as different populations vary in inborn copy number variants. Hence, panel NGS strategies for normalizing any copy number variation in cancer should be cautiously validated for different methodologies, pipelines, and populations.

As new long-read sequencing technologies gain popularity, new methods of copy number detection in cancer are emerging. Nattestad et al. have described the high efficiency of PacBio SMRT technology in ERBB2 amplification detection in cell lines [30]. The main advantage of using this technique is the ability to explain the structure of a particular rearrangement. However useful, this feature may not be crucial from the diagnostic point of view. Moreover, the technology was not tested in patient samples yet, especially in degraded formalin-fixed paraffin embedded (FFPE) material [30].

There are also attempts to use cell-free/plasma DNA as starting material for copy number determination in NGS [31], and approaches to use targeted RNA sequencing on tumor tissue derived from FFPE [32]. The drawbacks of these methods are the quantity and/or quality of initial material—plasma DNA is lacking stability and FFPE blocks present uneven degradation of RNA. In terms of routine diagnostics, these methods are still too inaccurate and hard to standardize to embrace them at present.

However, data on clinical WGS utility for HER2 status assessment are scarce. There have only been two small clinical validation studies with direct comparison to the orthogonal methods. The first, released by Hartwig Medical Foundation, was a part of a WGS pan-cancer validation study. The ERBB2 status was evaluated on only 16 samples with the overall concordance of 93%. The HMF group compared ploidy and chromosome 17 CN with absolute CN of ERBB2 but did not draw any conclusions because of the small sample size [11]. The second, performed by King’s College Hospital in London, included 145 BC samples with only 27 positives for amplified HER2. With the four discrepant samples, the sensitivity in the UK cohort was 88% and specificity was 98% [28]. The method itself has a major advantage over panel NGS as it may be universally used for different biomarkers in different cancers by using adequate bioinformatics pipelines. Moreover, library preparation is straightforward and reproducible, as is variant calling. The vast amounts of information extracted from the WGS analysis may serve as both a foundation for basic cancer science and a universal pan-cancer diagnostic tool. The main drawback of the method is the necessity to use fresh tissue, preserving large quantities (> 1 µg) of high-molecular-weight DNA. Second, accurate variant calling in WGS also requires a germline reference, a non-cancerous patient material, e.g., DNA isolated from a peripheral blood sample. Last, the analysis of big data generated from WGS experiments is challenging in terms of computational power and storage.

We attempted to systematically determine the criteria for WGS of ERBB2 CN in matched-normal tumor samples. Our strategy was to gather publicly available BC datasets with reliable clinical metadata and analyze them uniformly with a minimal 20× depth of coverage. We chose this threshold based on the ascatNgs documentation, which recommends at least t for accurate CN alteration calling (https://github.com/cancerit/ascatNgs).

Our machine learning approach, based on the decision tree classifier, agnostically chose the optimal approach of HER2 status assessment to be ploidy-corrected ERBB2 CN over ERBB2 CN/CEP17 CN ratio and absolute ERBB2 CN. To measure the test’s reliability, we used Cohen’s kappa coefficient. The high value of 85% rules out the possibility of the data agreement occurring by chance. To further test the superiority of the ploidy-corrected approach, we compared the results for the three features in a separate external dataset from the Wellcome Sanger Institute, which confirmed our observations: the accuracy and precision were 0.952813 and 0.765432, vs 0.724138 and 0.306977, vs 0.945554 and 0.724138 for absolute ERBB2 CN, ERBB2 CN/CEP17 CN ratio, and ploidy-corrected ERBB2 CN, respectively (for further details, see the project’s GitHub repository).

This is, to the best of our knowledge, the first and the largest study of its type utilizing machine learning approaches to evaluate diagnostic criteria of a genomic test. In the context of traditional HER2 status determination, first proof-of-concept ML-based solutions for robust FISH and IHC assessment are already tested in a clinical set-up [33, 34].

The ML approach is an emerging field of medicine, improving the efficiency of pathomorphological assessment [35], radiology [36], and clinical chemistry [37]. In the field of BC diagnostics, genomics and transcriptomics ML is being applied to distinguish between intrinsic BC subtypes with different prognoses [38], identify new potential biomarkers, or repurpose the existing biomarkers. These strategies perform well in the scientific environment but may only be used in the clinical setting after well-planned validation, showing concordance and stability of the tests. The potential limitation of using ML-based classifiers in medicine lies in the complexity and explainability of the results in the clinical set-up. This complexity led to the formulation of strict European Union and US Food and Drug Administration regulations for ML-based medical devices, which require adherence to a set of tightly supervised rules. Nonetheless, ML classifiers are now widely used in medical genomics in biomarker discovery and refinement, providing unbiased solutions that can then be evaluated in the traditional validation process.

Our results prove that WGS may be a reliable method for HER2 diagnostics. We hypothesize that it could be implemented as a stand-alone test or in combination with IHC instead of FISH or other NGS-based methods in routine clinical practice. With a diagnostic sensitivity of 91.18% and a specificity of 98.69%, determined on unselected and heterogeneous groups of patients and validated on additional external data, we conclude that the technology is mature enough for prospective, multicenter, analytical, and clinical validation.

Our results do not deviate relevantly from those reported by other groups focusing on ERBB2 NGS testing; however, the diagnostic sensitivity is still not optimal. It can be improved by optimizing the threshold value, which might increase the sensitivity by compromising the specificity. In this study, we aimed for a reduction in the false-positive error rate; however, balancing between type I and type II errors should be left to the decisions of oncologists and regulatory bodies after careful assessment of clinical consequences of both strategies.

The limitation of the study in the context of sensitivity is also the large CI, as there were few HER2+ samples in the test set. However, the external validation results strongly support the high sensitivity of WGS.

Several other factors might contribute to a slightly lower analytical sensitivity of our solution in comparison with panel NGS sensitivity reported by other groups. We suspect that heterogeneous evaluation of IHC/FISH results, made on the basis of different issues of ASCO/CAP guidelines, may have contributed to the discrepancy in the results. Additionally, there was some heterogeneity in the WGS raw data, acquired on different equipment by different genomic consortia. There were differences in the tumor sample collection, DNA extraction, and library preparation methods. All these pre-analytical and analytical factors must have contributed to the greater variation in HER2 results than in the single-facility method with uniform IHC/FISH evaluation methodology and a single laboratory protocol for sample management. Even so, the WGS method exhibits robustness and effectiveness, which is a great advantage, allowing for a low-cost external, even worldwide, quality-control assessment program to be carried out in the near future.

Other factors lowering the sensitivity are changes in HER2 status, which could have occurred in metastatic tumors from the HMF dataset. In these instances, we could not directly evaluate the correctness of IHC/FISH data because they came from the primary biopsy, not the biopsy corresponding with the sample used for WGS. The shift between IHC-positive and IHC-negative status is reported in up to 11.5% of HER2-negative cancers (conversion to HER2 positive) and in 37% of those initially positive (conversion to HER2 negative in the presence of selective pressure of trastuzumab) [4, 11].

The metastatic nature of HMF samples may have also contributed to lower than population numbers of HER2-positive cases in our cohort, as they tend to have better outcomes and might have been recruited less often to the study. The algorithm of patients’ qualification to first-line targeted therapy in all genomic consortia mentioned above could also have been somehow related to this under-representation.

Some of the discrepancies may have come from tumor subclonality, which is a common serious diagnostic issue. The signal from a small proportion of HER2 amplified cells may be below the resolution of WGS at 30–60× depth of coverage [13].

The spatial intra-tumor heterogeneity may have also contributed to false-negative/positive results when there were differences in sampling locations between tissue collected for FFPE blocks and WGS (e.g., different distant metastases are sampled). In addition, overexpression of HER2 is not always based on ERBB2 amplification, as an estimated 3–10% of non-amplified tumors exhibit high overexpression [39]. Even though there is currently no known genomic background of this phenomenon, WGS could potentially detect alterations in HER2 regulatory pathways, leading to overexpression, which could further improve WGS diagnostic power.

We must not forget that our results are compared to the BC diagnostic gold standard (IHC and FISH) and some issues of these tests may influence our results. There will always be a ‘diagnostic gray zone’ in which the prediction of response is not optimal, even though ASCO/CAP struggles to refine and evaluate their guidelines systematically [3]. Our classification tends to gradually worsen when ERBB2 CN decreases below 6 copies, especially in tumors with higher ploidies. We cannot provide the data on the patient outcomes in these cases, but some authors have observed that polysomic tumors are not a homogeneous group and that they display different responses to trastuzumab and different clinicopathological features [40, 41]. As classical methods do not distinguish between polysomy and polyploidy, it might be reasonable to evaluate whether these two entities differ in terms of their properties and anti-HER2 drug response.

To tackle this issue, a prospective WGS study with an up-to-date pathomorphological HER2 evaluation according to the newest ASCO/CAP guidelines should be performed. It might not only evaluate the single biomarker concordance but would analyze the treatment response and try to seek new meaningful correlations with a vast number of genomic alterations, e.g., ploidy or complexity of chromosome 17 rearrangements.

The current knowledge on ERBB2 genetics is still based on methods established in the 20th century. We believe new potent tools such as ML and WGS present a vast array of solutions and opportunities to help improve the diagnostics and treatment of HER2-positive BC, and allow us to move into the 21st century.

5 Conclusions

We provide evidence that the ERBB2 status can be reliably determined by WGS methodology, which may be included in a comprehensive test for BC diagnostics. A tumor purity of 20% and a 20× depth of coverage are sufficient to ensure good quality of genomic data in most instances. Given the good concordance of WGS with routinely used methods, we suggest that assessment by WGS may be an alternative to other NGS-based methods as well as FISH-based diagnostic tools. Hence, it should be subjected to evaluation by ASCO/CAP in the future updates of the HER2 testing recommendations. In our work, we have also proven that short-read WGS technology bears great potential for establishing a harmonized global quality assessment program for ERBB2 detection, as the outputs of heterogeneous data gathered from four different genomic consortia show a high degree of concordance between methodologies and pipelines.