Introduction

Human epidermal growth factor receptor 2 (HER2) is an oncogene that is overexpressed in approximately 10–30 % of gastric cancers [1–3]. Trastuzumab (Herceptin®), a humanized monoclonal antibody against HER2, was originally developed for the treatment of metastatic breast cancers [4]. The ToGA study [5] demonstrated that trastuzumab significantly improves overall survival compared with chemotherapy alone in advanced HER2-positive gastric and gastroesophageal junction cancers. In the ToGA study [5], HER2-positive gastric cancer was defined as overexpression of HER2 protein assessed by immunohistochemistry (IHC) and/or gene amplification by fluorescence in situ hybridization (FISH). The immunohistochemical scoring system (IHC score 0, 1+ to 3+) used in breast cancer was employed to evaluate overexpression of HER2 in gastric cancer. However, because of biological differences between breast and gastric cancer, such as an increased frequency of tumor heterogeneity and a basolateral vs. circumferential membrane staining pattern, the ASCO/College of American Pathologists (CAP) HER2 IHC scoring criteria were modified specifically for gastric and esophagogastric junction cancers [1, 5]. More importantly, exploratory subgroup analyses of the ToGA study revealed that, among HER2 FISH-positive cases, high-level HER2 expression (IHC 3+ or 2+) was a favorable predictive marker for trastuzumab treatment. These data suggest that assessing both HER2 protein overexpression and gene amplification status may be useful for predicting the efficacy of trastuzumab therapy in gastric cancer. Several new molecularly targeted drugs against HER2 protein are currently being tested in vivo as well as in clinical studies [6–8], further highlighting the importance of accurate HER2 status assessment.

Compared to breast cancer, gastric cancer shows higher rates of intratumoral heterogeneity of HER2 protein overexpression [1]. Although HER2 gene amplification status is also thought to be heterogeneous in gastric cancers [9], there are only a few studies of HER2 genotypic heterogeneity in gastric cancer [10, 11] and its clinical significance has not yet been determined. In the ToGA study, 22.4 % of FISH-positive gastric cancers showed only weak or no protein expression [5]. It is therefore important to establish the clinical significance of the correlation between HER2 protein overexpression and gene amplification.

The gene–protein assay (GPA) is a newly established technique which allows both IHC and brightfield dual-color in situ hybridization (DISH) to be performed on a single slide, thereby enabling pathologists to examine both protein overexpression and gene amplification simultaneously at the single-cell level. The utility of GPA technology has been demonstrated in breast cancer, especially in equivocal cases or cases showing intratumoral heterogeneity [12].

This study examined the diagnostic accuracy of GPA technology for evaluating HER2 status in gastric cancer, comparing GPA results with single IHC and DISH HER2 assays. We also analyzed intratumoral phenotypic and genotypic HER2 heterogeneity in over 800 gastric cancer cases examined by GPA.

Materials and methods

Cases and tissue microarray

Tissue microarray (TMA) construction has previously been described by Aizawa et al. [13]. Briefly, formalin-fixed paraffin-embedded specimens from 1006 consecutive patients with gastric cancer who underwent surgical resection at the National Cancer Center Hospital East, Chiba, Japan between January 2003 and July 2007 were selected to construct the TMAs. For each clinical case, a representative section was selected and two tissue cores (each 2.0 mm in diameter) were obtained from different tumor areas. Serial 4-μm sections were prepared and used for hematoxylin and eosin, IHC, DISH, and GPA staining. Clinicopathological parameters were obtained from the medical records. The study protocol was approved by the Institutional Review Board of the National Cancer Center, Japan.

HER2 immunohistochemistry, dual-color in situ hybridization, and gene–protein assay

HER2 IHC, HER2 and chromosome 17 centromere (CEN17) DISH, and HER2 GPA assays were performed as previously described by Nitta et al. [12]. Briefly, for HER2 IHC, HER2 protein expression was detected using the PATHWAY HER-2/neu rabbit monoclonal antibody (clone 4B5; Ventana Medical Systems, Inc., Tucson, AZ, USA) and the iVIEW DAB detection kit (Ventana) on a BenchMark XT automated slide staining system (Ventana). For HER2 DISH, HER2 gene and CEN17 targets were visualized with the ultraView SISH DNP detection kit (Ventana) and the ultraView Red ISH DIG detection kit (Ventana), respectively, after hybridizing with the INFORM HER2 dual ISH DNA probe cocktail (Ventana). For HER2 GPA, the HER2 IHC protocol was followed by the HER2 and CEN17 DISH protocol, in which HybReady (a hybridization buffer, Ventana) was replaced with HybClear (Ventana). HybClear contains naphthol phosphate as a blocker. All tissue sections were counterstained with hematoxylin II (Ventana) and bluing reagent (Ventana). Air-dried glass slides were coverslipped using the Tissue-Tek Film automated coverslipper (Sakura Finetek Japan, Tokyo, Japan). Only one optimized protocol each for HER2 DISH and HER2 GPA was performed for all gastric cancer TMA slides.

Evaluation of HER2 status

The ToGA study scoring system for surgically resected gastric cancer tissue was used to evaluate HER2 protein overexpression on the IHC and GPA slides [5]. The manufacturer’s instructions were followed to evaluate the HER2 gene amplification statuses of DISH slides. Briefly, the HER2/CEN17 ratio was determined by counting HER2 gene signals (black dots) and CEN17 signals (red dots) in 20 representative tumor cell nuclei. When this ratio was between 1.8 and 2.2, in situ hybridization (ISH) signals in an additional 20 nuclei were counted, and the HER2/CEN17 ratio was calculated for a total of 40 nuclei. HER2 gene status was reported as non-amplified if HER2/CEN17 <2.0 or amplified if HER2/CEN17 ≥2.0. The GPA slides were evaluated using the same IHC and DISH scoring criteria described above. Cases with a HER2 IHC score of 3+ and/or HER2 gene amplification were defined as HER2 positive in accordance with the criteria used in the ToGA study [5]. In comparison, we also analyzed HER2 status based on European criteria [14], where HER2 IHC scores of 3+ or IHC scores of 2+ with HER2 gene amplification are defined as HER2 positive.
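The scoring rules above can be sketched in a few lines of Python. This is only an illustration of the stated cutoffs, not the software or workflow used in the study. The function names are hypothetical, the ratio is assumed to be total HER2 signals divided by total CEN17 signals over the counted nuclei, and the 1.8–2.2 recount range is assumed to include its endpoints.

```python
AMPLIFICATION_CUTOFF = 2.0    # HER2/CEN17 >= 2.0 is reported as amplified
RECOUNT_RANGE = (1.8, 2.2)    # assumed inclusive; triggers counting 20 more nuclei


def her2_cen17_ratio(her2_signals, cen17_signals):
    """Total HER2 signals divided by total CEN17 signals (assumed definition)."""
    total_cen17 = sum(cen17_signals)
    if total_cen17 == 0:
        raise ValueError("no CEN17 signals counted; core is not evaluable")
    return sum(her2_signals) / total_cen17


def needs_recount(ratio):
    """True when the 20-nucleus ratio falls in the borderline range."""
    return RECOUNT_RANGE[0] <= ratio <= RECOUNT_RANGE[1]


def is_amplified(ratio):
    """Amplified if HER2/CEN17 >= 2.0, otherwise non-amplified."""
    return ratio >= AMPLIFICATION_CUTOFF


def her2_positive_toga(ihc_score, amplified):
    """ToGA criteria: IHC 3+ and/or HER2 gene amplification."""
    return ihc_score == 3 or amplified


def her2_positive_european(ihc_score, amplified):
    """European criteria: IHC 3+, or IHC 2+ with HER2 gene amplification."""
    return ihc_score == 3 or (ihc_score == 2 and amplified)
```

For example, a core scored IHC 0 with gene amplification is HER2 positive under the ToGA criteria but HER2 negative under the European criteria, which is why the two definitions yield different numbers of positive cases.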

All tissue cores stained for HER2 IHC, DISH, and GPA were evaluated by YN and TK. Cores stained with each method were evaluated independently of the results from the other staining methods.

Intratumoral heterogeneity of HER2 protein overexpression and HER2 gene amplification

In this study, intratumoral heterogeneity of HER2 protein expression (phenotypic heterogeneity) was defined as different IHC scores on two separate tissue cores. It should be noted that cases with an inter-core discrepancy of IHC 1+ and 0 were not considered phenotypically heterogeneous because both IHC 1+ and 0 are clinically considered negative. Intratumoral heterogeneity of HER2 gene amplification (genotypic heterogeneity) was defined as different gene amplification statuses (positive vs. negative) of two tissue cores. In addition, intratumoral heterogeneity of HER2 protein overexpression in a single TMA core (intra-core phenotypic heterogeneity) was defined as different IHC scores within a single core, with <50 % of tumor cells representing the highest IHC score. Intra-core genotypic heterogeneity was not assessed.
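As a minimal sketch, the three definitions above can be written as simple predicates. The function names and the input representation (integer IHC scores 0 to 3, boolean amplification status, and a mapping from IHC score to the fraction of tumor cells in a core) are assumptions made for illustration only.

```python
def phenotypic_heterogeneity(ihc_core_a, ihc_core_b):
    """Inter-core phenotypic heterogeneity: the two cores have different IHC
    scores, except that a 0 vs. 1+ discrepancy is not counted."""
    if ihc_core_a == ihc_core_b:
        return False
    # IHC 0 and 1+ are both clinically negative, so at least one core
    # must be 2+ or 3+ for the difference to count.
    return max(ihc_core_a, ihc_core_b) >= 2


def genotypic_heterogeneity(amplified_core_a, amplified_core_b):
    """Inter-core genotypic heterogeneity: amplification status differs."""
    return amplified_core_a != amplified_core_b


def intra_core_phenotypic_heterogeneity(fraction_by_score):
    """Intra-core heterogeneity: more than one IHC score within a core and
    <50 % of tumor cells showing the highest score.

    fraction_by_score maps an IHC score (0-3) to the fraction of tumor
    cells with that score (hypothetical input format).
    """
    present = {s: f for s, f in fraction_by_score.items() if f > 0}
    if len(present) < 2:
        return False
    return present[max(present)] < 0.5
```

Under these definitions, a case with cores scored 3+ and 0 is phenotypically heterogeneous, whereas a case with cores scored 1+ and 0 is not.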

Statistical analyses

Kappa coefficients for assay agreement were calculated for each analysis. The clinical characteristics between the two groups were compared using the chi-square test for noncontinuous variables and the t test for continuous variables. All p values reported are two-sided, and p < 0.05 is considered statistically significant. All analyses were performed using the IBM SPSS Statistics 21 software package (SPSS Inc., Tokyo, Japan).
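For readers unfamiliar with the statistic, the following sketch shows how Cohen's kappa is computed for a 2×2 agreement table. It illustrates the formula only and is not the SPSS procedure used in the study; the function name is hypothetical. The example counts are inferred from the case-level comparison reported in the Results (96 cases positive by both approaches, 2 positive by GPA only, none positive by the single assays only, and 777 negative by both).

```python
def cohen_kappa_2x2(both_pos, only_a_pos, only_b_pos, both_neg):
    """Cohen's kappa for two binary assays (A and B) on the same cases."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    observed = (both_pos + both_neg) / n          # observed agreement
    a_pos = both_pos + only_a_pos                 # positives by assay A
    b_pos = both_pos + only_b_pos                 # positives by assay B
    # Agreement expected by chance from the marginal totals.
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# A = single IHC/DISH, B = GPA, ToGA criteria (counts inferred from the text).
kappa_toga = cohen_kappa_2x2(both_pos=96, only_a_pos=0, only_b_pos=2, both_neg=777)
print(round(kappa_toga, 2))
```

With these counts the observed agreement is 873/875, and the resulting kappa rounds to the 0.99 reported for the case-level comparison.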

Results

Among the 1006 clinical cases (2012 tissue cores), 1980 tissue cores were confirmed to have sufficient tumor cells and were eligible for IHC and DISH analyses. HER2 gene amplification status could not be evaluated in 194 cases by DISH because of inadequate staining levels, including weak/absent CEN17 and/or HER2 signals in internal positive control cells or tumor nuclei. No modifications of the HER2 DISH protocol were made to accommodate these cases, and they were excluded. A total of 875 cases (1750 cores) were confirmed to have evaluable tumor cell areas in both cores for both IHC and DISH and were eligible for GPA and further analysis. Characteristics of these 875 cases are listed in Table 1S of the Electronic supplementary material (ESM).

Concordance of the HER2 statuses obtained from the IHC, DISH, and GPA methods

Serial sections from each TMA block were prepared and stained for HER2 IHC, DISH, and GPA (Fig. 1). The results of the comparison between HER2 IHC scores obtained by single IHC and the GPA scores are shown in Table 1. 1736 cores presented the same IHC score from the single IHC and GPA assays. The remaining 14 cores showed only single score differences. The concordance rate between these two methods was 99.2 % (1736/1750 cores). The κ value between the IHC score and the GPA IHC score was 0.97.

Fig. 1

Immunohistochemical (a, d, g, j), dual-color in situ hybridization (b, e, h, k), and gene–protein assay (c, f, i, l) staining examples from tissue microarray samples. a–c HER2 immunohistochemistry (IHC) 0 case without gene amplification. d–f IHC 2+ case without gene amplification. g–i IHC 2+ case with gene amplification. j–l IHC 3+ case with gene amplification

Table 1 Concordance between HER2 IHC score and GPA IHC score for 1750 cores from 875 cases

HER2 DISH and GPA concordance results for gene amplification are shown in Table 2. HER2 gene amplification was observed in 167 out of 1750 cores (9.5 %) by DISH and in 163 out of 1750 cores by GPA (9.3 %). There were four cores in which gene amplification could be detected by GPA but not by DISH. The HER2/CEN17 ratio for all four discordant cores was between 1.8 and 2.2. In addition, there were 8 cores in which the gene copy number could not be counted on the GPA-stained slide because the CEN17 signals (red dots) were obscured by strong 3,3′-diaminobenzidine (DAB) staining for HER2 protein. Since these 8 cores were all IHC score 3+, gene amplification status did not influence the final HER2 status. The concordance rate between DISH and GPA for HER2 gene amplification was 99.3 % (1738/1750 cores). The κ value between DISH and GPA DISH results was 0.99.

Table 2 Concordance of HER2 status (amplified/non-amplified) between DISH and GPA DISH for 1750 cores from 875 cases

Finally, HER2 status as defined by the ToGA study [5] was compared between IHC/DISH and GPA (Table 3). Upon examining single HER2 IHC and DISH assays, 96 cases were found to be HER2 positive (51 IHC score 3+, 45 IHC 0, 1+ and 2+/gene amplified) while 98 cases were HER2 positive by GPA. Two cases were positive only by GPA. These cases were scored IHC 0 and non-amplified by single IHC/DISH. However, they were scored as IHC 0 and amplified by GPA. The concordance rate between the two methods was 99.8 % (873/875 cases). The agreement between IHC/DISH and GPA was excellent (κ value of 0.99). GPA detected all HER2-positive cases evaluated by single IHC/DISH. Moreover, two additional cases were identified as HER2 positive using GPA. In addition, according to the European scoring criteria described by Albarello et al. [14], 84 cases were HER2 positive using the single assays (51 IHC score 3+, 33 IHC 2+/gene amplified), while 83 cases were HER2 positive by GPA (Table 2S of the ESM). Only one case was negative by GPA. This case was scored IHC 2+ and amplified by single IHC and DISH assays, and IHC 1+ and amplified by GPA. The concordance rate between these two methods was 99.9 % (874/875 cases). The agreement between IHC/DISH and GPA was excellent (κ of 0.99).

Table 3 Concordance of HER2 final status (positive/negative) between single IHC/DISH and GPA on 875 cases

Correlation between HER2 IHC score and gene amplification status

875 cases were analyzed to compare IHC scores and gene amplification statuses obtained by single IHC and DISH assays. As shown in Table 3S of the ESM, all 51 IHC 3+ cases had gene amplification, whereas only 33 out of 76 (43.4 %) IHC 2+ cases had gene amplification. There were 12 cases with IHC scores of 0 or 1+ and positive HER2 gene amplification by IHC/DISH.

HER2 heterogeneity in protein expression and gene amplification

The association between intratumoral phenotypic heterogeneity and genotypic heterogeneity is shown in Table 4S of the ESM. There were 764 cases (87.3 %) with the same IHC scores for the two cores, while 111 cases (12.7 %) presented different IHC scores (Table 4). After excluding 49 cases with IHC scores of 0 and 1+ (see “Materials and methods”), 62 cases (7.1 %) were assessed for phenotypic heterogeneity. In 875 cases, HER2 protein expression of intensity ≥2+ in at least one core was observed in 127 cases. Of these, 76 were IHC 2+ and 51 were IHC 3+. Phenotypic heterogeneity was more frequently observed in IHC 2+ cases (47/74; 63.5 %) than in IHC 3+ cases (15/53; 28.3 %).

Table 4 Concordance of GPA IHC scores for the two cores in 875 cases

HER2 gene amplification was observed in 93 out of 875 cases. Of these, 25 showed HER2 gene amplification in only one of the two cores and were therefore considered cases with genotypic heterogeneity (Table 5). Among these 25 cases with genotypic heterogeneity, 15 showed phenotypic heterogeneity, 3 showed protein overexpression in both cores, and 7 were IHC negative (0/1+) in both cores.

Table 5 Concordance of HER2 gene amplification by GPA for two cores in 875 cases

Finally, 71 cases had either phenotypic or genotypic heterogeneity or both (Fig. 2a–d) after excluding one case in which amplification status by GPA could not be evaluated (Table 5S of the ESM). Among 14 cases with IHC 3+ as the highest score, 7 cases showed IHC 2+ in the other core (IHC 3+/2+) and all possessed homogeneous gene amplification, while the remaining 7 cases were IHC 3+/0 and showed gene amplification only in IHC 3+ cores. In contrast, among 50 cases with IHC 2+ as the highest score (IHC2+/1+ or 0), 4 (8.0 %) and 11 (22.0 %) cases showed homogeneous and heterogeneous gene amplification, respectively.

Fig. 2

Two TMA cores (A and B) obtained from the same lesion demonstrate intratumoral phenotypic and genotypic heterogeneity (a–d). a, c Core A was immunohistochemistry (IHC) 3+ with gene amplification. b, d Core B was IHC 0 without gene amplification (c, d ×60). Intra-core phenotypic heterogeneity. e Heterogeneity of HER2 protein overexpression within one TMA core (e–g). f, g Areas with different IHC scores were observed at the cell-to-cell level. IHC 3+ and IHC 1+/0 area (f) and IHC-negative area (g). Homogeneous gene amplification was observed in spite of heterogeneous protein overexpression (f, g ×60)

As a final HER2 status assessment, among 98 HER2 GPA-positive cases based on the ToGA study IHC and gene amplification criteria, there were 25 cases (26.9 %) showing discrepant statuses of the two tissue cores (Table 6S of the ESM).

Intra-core heterogeneity

200 cores with an IHC score of 2+ or 3+ were evaluated for intra-core heterogeneity of HER2 protein expression (intra-core phenotypic heterogeneity) (Fig. 2e–g). 69 out of 109 cores with IHC 2+ (63.3 %) showed intra-core phenotypic heterogeneity, compared to only 9 out of 91 cores (9.9 %) with IHC3+. In 62 cases with phenotypic heterogeneity of the two cores, intra-core heterogeneity was observed in 44 cases (71.0 %).

HER2 heterogeneity and other clinicopathological factors

Clinicopathological characteristics of cases with or without phenotypic heterogeneity and genotypic heterogeneity are shown in Tables 7S and 8S of the ESM. In the cases with phenotypic heterogeneity, there were no significant differences in any clinicopathologic characteristics (age, gender, histology, tumor location, macroscopic type, TNM stage). However, phenotypic and genotypic heterogeneity were more frequently observed in early-stage cancers (Tables 7S, 8S of the ESM), suggesting that gastric cancer possesses heterogeneous characteristics early in tumor development. This is consistent with data suggesting that HER2-positive tumors may not have a growth advantage over HER2-negative tumors.

Discussion

This study demonstrated that: (1) HER2 test results obtained from GPA show good concordance with single IHC and DISH assays in gastric cancer; (2) there are high frequencies of phenotypic and genotypic HER2 intratumoral heterogeneity in gastric cancer; and (3) HER2 genetic and phenotypic heterogeneity is more frequently observed in the early stages of gastric cancer development.

Tubbs et al. [15] reported a dual HER2 protein and HER2 gene assay for breast cancer in 2004. Two subsequent studies further demonstrated the feasibility of this assay in breast cancer [16, 17]. Hirschmann et al. [18] reported the simultaneous analysis of HER2 gene and HER2 protein on a single slide in a small study with 25 gastric cancers, in which the same antibody and DISH probes from the current study were used but without naphthol phosphate. Recently, Nitta et al. [12] described the diagnostic utility of the HER2 GPA technology in breast cancer, especially in equivocal cases or cases showing intratumoral heterogeneity of HER2. The present study is the first to evaluate the concordance of the HER2 statuses obtained using conventional methods (single IHC and DISH assays) and GPA in a large number of gastric cancer cases. The agreement rates are similar to those seen in breast cancer, with Nitta et al. [12] reporting an overall percent agreement of 97.8–99.5 % for IHC and 96.0–97.7 % for DISH. We conclude that GPA is equivalent to single IHC and DISH in evaluations of HER2 protein expression and gene amplification status in gastric cancer.

In this study, there were 194 cases that could not be evaluated for HER2 gene amplification because of inadequate staining in tumor or internal control cell nuclei. While the exact reason for the ISH staining failure could not be identified, possible reasons may be pre-analytical variation, such as in fixation duration or time after the preparation of the paraffin blocks.

It should also be noted that the CEN17 signals could not be assessed in tumor cell nuclei in eight cores with high HER2 gene amplification because strong DAB staining obscured the CEN17 signals. Although Hirschmann et al. [18] also expressed concern about this problem, false-negative results are unlikely since the final HER2 status can be determined by the IHC score regardless of HER2 gene amplification status. Regarding the four cores in our study in which gene amplification could be detected by GPA but not by DISH, it is our assumption that under the guidance of IHC staining, tumor cells with HER2 amplification are more precisely selected for gene copy number evaluation.

The CAP issued a supplemental guideline in 2009 to define breast cancer tumors that are “genetically heterogeneous.” They defined these as tumors with at least 5 % but fewer than 50 % of nuclei with a HER2/CEN17 ratio of >2.2 [19]. The 2013 ASCO/CAP HER2 guideline update [20] referred to ISH heterogeneity and recommended a standardized method for ISH interpretation that included scanning the entire slide prior to counting and/or using an IHC HER2 test to define areas of potential amplification. In gastric cancer, there are no guidelines for tumor heterogeneity assessment, and the clinical significance of this finding has not yet been determined. Compared to breast cancer, gastric cancer shows a higher frequency of heterogeneity in HER2 expression [1] and HER2 gene amplification [9]. Yang et al. [21] reported a 79.3 % rate of heterogeneous HER2 protein expression by IHC in gastric cancer, while HER2 genetic heterogeneity was found in 44.0 % of cases. Our results are consistent with that study, with phenotypic heterogeneity observed more frequently than genotypic heterogeneity. Kim et al. [22] evaluated the proportion of positively stained tumor areas in relation to HER2 scores in gastric cancer. They found that heterogeneity was more prevalent in IHC 2+ cases, with 90.9 % of IHC 3+ cases but only 40.9 % of IHC 2+ cases staining more than 50 % of the tumor area. In our study, phenotypic heterogeneity was observed in 63.5 % of IHC 2+ cases, in contrast to 28.3 % in IHC 3+ cases, consistent with the previous report.

In the ToGA study [5], about 22 % of HER2-positive cases showed gene amplification without protein overexpression (FISH+/IHC 0 or 1+). In this study, 12 cases of IHC 0 or 1+ with gene amplification were identified. The biological nature and clinical outcomes associated with this patient population are yet to be determined. Simultaneous analysis of HER2 protein expression and gene amplification at the single-cell level by GPA may be applicable to further investigation in this area. In the current study, all IHC 3+ cases had HER2 gene amplification, whereas only 43.4 % of IHC 2+ cases had gene amplification. These results are consistent with the previous report by Kim et al. [22] and others [1, 2]. In addition, it should be noted that focal areas consisting of a few IHC-positive tumor cells with HER2 gene amplification were observed. GPA may contribute to the accurate evaluation of HER2 status in such cases.

Lee et al. [10] studied the clinical significance of tumor heterogeneity, finding that intratumoral HER2 heterogeneity in gastric cancer was significantly associated with longer disease-free survival. They reported that cases with diffuse or mixed Lauren histological subtype tended to have heterogeneous rather than homogeneous HER2 expression. In this study, we also showed that poorly differentiated tumors tend to have a high rate of heterogeneity for HER2 expression or amplification. These observations may represent useful information for pathologists, since they could be used to predict heterogeneous HER2 status (expression or amplification) based on routine histological examination. Moreover, they reported that the frequency of tumor heterogeneity was comparable for early and advanced stages, suggesting that tumors acquire a certain degree of diversity early in their development. Since HER2-positive gastric cancers are reported to show comparable clinical behavior [9, 23–25] to that of HER2-negative gastric cancers, HER2-positive tumor cells may not have a growth/survival advantage over those that are HER2 negative. Regarding gastric carcinogenesis, the overexpression of the mutated p53 gene is a major genetic event [26]. Kataoka et al. [27] reported a strong correlation between p53 overexpression and HER2 positivity. The possibility of an association of p53 overexpression with HER2 status and heterogeneity needs to be checked in further studies.

Conclusions

HER2-positive gastric cancers demonstrate different HER2 protein expression and gene amplification statuses within the same lesion. It may be important to evaluate both phenotypic and genotypic heterogeneity to gain a deeper understanding and improve the prediction of clinical outcome in gastric cancer patients treated with trastuzumab and similar targeted therapies. The newly established GPA technology described here may be useful for establishing biomarkers for other molecularly targeted drugs.