Introduction

Gastric cancer constitutes one of the major health care burdens worldwide, with 934,000 new cases diagnosed and 700,000 deaths annually, especially in Eastern Europe, South America, and Asia [1–3]. It is the second leading cause of cancer mortality in the world, mainly because most patients present with locally advanced or metastatic disease at diagnosis. Patients with metastatic gastric cancer have a median survival time of only 8–10 months and a 5-year survival rate of only 7 % [4]. Fluoropyrimidine-based and platinum-based combination chemotherapy regimens are the current adjuvant treatment for patients with advanced gastric cancer [4, 5]. Combination chemotherapy is considered beneficial in comparison to best supportive care because the weighted median average survival of patients receiving chemotherapy increases by approximately 6 months, with a significant overall hazard ratio of 0.39 [5].

The HER2 protein (HER2/neu, ErbB-2) is a member of the epidermal growth factor receptor family. It is a transmembrane tyrosine kinase receptor involved in tumor cell proliferation, apoptosis, adhesion, migration, and differentiation [6]. After its importance in breast cancer was recognized, its contribution to gastric cancer was identified [7, 8]. Initial studies detected HER2 overexpression in 9–38 % of gastric cancers and established its prognostic significance [6–8]. More recently, in 2012, the Trastuzumab for Gastric Cancer (ToGA) study demonstrated that the monoclonal anti-HER2 antibody trastuzumab, in combination with chemotherapy (capecitabine and cisplatin or fluorouracil and cisplatin), significantly prolonged survival compared with chemotherapy alone, without incremental toxicity [9]. The survival benefit was greater in tumors with either an HER2 immunohistochemistry (IHC) score of 2+ and fluorescence in situ hybridization (FISH) positivity [HER2:chromosome enumeration probe 17 (CEP17) ratio of ≥2] or an IHC score of 3+ than in the initially defined cohort of tumors that were IHC 3+ or FISH-positive (median survival, 16.0 vs. 13.8 months).

HER2 expression is more heterogeneous in gastric cancers than in breast cancers [10, 11]. The HER2 IHC scoring system for gastric cancer has two diagnostic criteria that are not included in the HER2 IHC scoring system for breast cancer [10]. First, incomplete lateral or basolateral membranous HER2 immunoreactivity in gastric cancer cells, presumably resulting from residual secretory function in the luminal side of tumor cells, is indicative of HER2 positivity. Second, to prevent underestimation, there is no cut-off threshold for the percentage of tumor cell reactivity in biopsy specimens.

In the ToGA trial, only 23 % of patients underwent gastrectomy [9]. Therefore, a significant proportion of enrolled patients were evaluated solely on the basis of biopsy specimens. Clinicians can choose biopsy or excision specimens for HER2 testing of patients with metachronous metastases. However, only a few studies have addressed the uniformity (or lack thereof) of HER2 IHC results obtained from endoscopic biopsies versus surgically excised material, and the relevant clinicopathological factors predicting different results in biopsy and excision specimens have not been investigated [11–14]. To address this issue, we conducted HER2 testing of paired endoscopic biopsy and surgical excision specimens from 180 patients with gastric cancer. We followed the current scoring system and identified the pertinent clinicopathological factors.

Materials and methods

Study cohort

We retrospectively examined 342 consecutive surgical specimens from esophagogastric and gastric tumors. The specimens were collected from January 2012 to September 2014 and were stored in the archive of the Department of Anatomical Pathology at Linkou Chang Gung Memorial Hospital in Taiwan. Paired biopsy and excision specimens were identified in 183 cases. After excluding 3 intraoperative biopsy specimens, the study cohort consisted of 180 paired specimens. The sex and age of each patient and the location and size of each tumor were retrieved from electronic charts. Two experienced pathologists (Dr. S.-C. Huang and T.-C. Chen) separately reviewed the slides and reached a consensus on the pathological findings for the biopsy and excision specimens, including tumor percentage, Lauren histotype [15], tumor differentiation, invasion depth, and pathological stage. Tumor differentiation was defined as well, moderate, or poor depending on whether neoplastic glandular formation covered >90, 50–90, or <50 %, respectively, of the tumor area [16]. The tumor fragment ratio was defined as the number of tumor fragments divided by the total number of tissue fragments in a biopsy specimen. This study was approved by the Institutional Review Board of our hospital.

IHC

HER2 IHC was performed on whole sections of formalin-fixed, paraffin-embedded tissue blocks of both biopsy and surgical specimens. The primary antibody used for IHC was a monoclonal HER2 antibody (A485, 1:200; Dako, Carpinteria, CA, USA). Antigen retrieval, antibody incubation, and chromogen counterstaining were performed in an automated immunostainer (BOND-MAX™, Leica Biosystems) with concurrent use of optimal positive and negative controls. Immunoreactivity was assessed independently by two expert pathologists (Dr. S.-C. Huang and T.-C. Chen), and a consensus was obtained using a multiheaded microscope. HER2 IHC was scored on the basis of immunoreactivity in membranes according to Hofmann’s scoring system [10]. Positive staining was defined as complete or basolateral membranous reactivity. For biopsy specimens, the scoring was as follows: 0, no reactivity in any tumor cell; 1+, barely perceptible reactivity in at least 1 cluster of ≥5 tumor cells; 2+, weak to moderate reactivity in at least 1 cluster of ≥5 tumor cells; and 3+, strong reactivity in at least 1 cluster of ≥5 tumor cells. For excision specimens, the scoring was as follows: 0, no reactivity or membranous staining in <10 % of the tumor cells; 1+, barely perceptible reactivity in ≥10 % of tumor cells; 2+, weak to moderate reactivity in ≥10 % of tumor cells; and 3+, strong reactivity in ≥10 % of tumor cells. Specimens with IHC 0 or 1+ scores were considered HER2-negative. Specimens with IHC 2+ scores were considered equivocal for HER2 and were subjected to HER2 FISH analysis. Specimens with IHC 3+ scores were considered HER2-positive. Intratumoral heterogeneity of HER2 reactivity was evaluated in specimens with IHC scores of 2+ or 3+ and was considered present if reactivity was heterogeneous and occurred in less than one-third of tumor cells [13].
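The scoring and classification rules above can be summarized in a short Python sketch. This is an illustration only, not software used in the study. The function names, the intensity labels, and the `cluster_of_5` flag are hypothetical, and reactivity that fails the specimen-specific extent criterion is assumed to score 0, a case the description of biopsy scoring does not address explicitly.

```python
from typing import Optional

# Hypothetical labels for the staining intensities named in the text.
INTENSITY_TO_SCORE = {
    "none": 0,
    "barely_perceptible": 1,
    "weak_to_moderate": 2,
    "strong": 3,
}


def ihc_score(intensity: str, specimen: str,
              percent_reactive: float = 0.0,
              cluster_of_5: bool = False) -> int:
    """Return an IHC score (0-3) using the specimen-specific extent rule."""
    if intensity not in INTENSITY_TO_SCORE:
        raise ValueError(f"unknown intensity: {intensity}")
    if specimen == "biopsy":
        # Biopsy: at least one cluster of >=5 reactive tumor cells, no percentage cut-off.
        extent_met = cluster_of_5
    elif specimen == "excision":
        # Excision: reactivity in >=10 % of tumor cells.
        extent_met = percent_reactive >= 10.0
    else:
        raise ValueError(f"unknown specimen type: {specimen}")
    # Assumption: reactivity that fails the extent rule is scored 0.
    return INTENSITY_TO_SCORE[intensity] if extent_met else 0


def her2_status(score: int, fish_amplified: Optional[bool] = None) -> str:
    """Map an IHC score (plus FISH for 2+ cases) to a HER2 status."""
    if score in (0, 1):
        return "negative"
    if score == 3:
        return "positive"
    if score == 2:
        if fish_amplified is None:
            return "equivocal"  # FISH still required
        return "positive" if fish_amplified else "negative"
    raise ValueError(f"invalid IHC score: {score}")
```

For example, strong reactivity in 5 % of tumor cells scores 3+ in a biopsy specimen if it forms a cluster of at least 5 cells, but scores 0 in an excision specimen because it falls below the 10 % cut-off.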

Fluorescence in situ hybridization

Specimens in which HER2 IHC reactivity was equivocal were tested blindly by Mr. S.-E. Lee using HER2 FISH. Four-micron-thick tissue sections were deparaffinized, treated with 1 M NaSCN at 80 °C for 20 min and with pepsin (0.05 % in 0.2 N HCl) at 37 °C for 4 min, and dehydrated in graded alcohol solutions. We used fluorescein isothiocyanate-labeled p17H8 (CEP17, accession number M13882) and the orange-fluorescence-uridine-5′-triphosphate-labeled bacterial artificial chromosome clone RP11-94L15 (HER2 gene, accession number AC079199) for in situ hybridization. After co-denaturation at 80 °C for 5 min, probes and samples were hybridized at 37 °C for 16 h in Thermobrite (Abbott Molecular). After hybridization, samples were washed with 2× saline–sodium citrate/0.3 % NP-40 at 72 °C for 2 min and counterstained with 4′,6-diamidino-2-phenylindole (Vector Labs). HER2/CEP17 ratios were determined using a fluorescence microscope (BX61, Olympus). An HER2/CEP17 ratio of less than 2 signified no amplification; an HER2/CEP17 ratio of more than 2 signified amplification [17]. Borderline ratios (1.8–2.2) were decided by counting 20 additional tumor nuclei.
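As a rough illustration of the ratio-based decision rule, the following Python sketch classifies a specimen from per-nucleus signal counts. It rests on two assumptions that are not stated in the methods: the ratio is taken as total HER2 signals divided by total CEP17 signals over the counted nuclei, and a ratio of exactly 2 after recounting is treated as amplified, in line with the ≥2 criterion quoted for the ToGA trial. The function names are hypothetical.

```python
from typing import Sequence


def her2_cep17_ratio(her2_signals: Sequence[int],
                     cep17_signals: Sequence[int]) -> float:
    """HER2/CEP17 ratio over the counted nuclei (assumed: total over total)."""
    total_cep17 = sum(cep17_signals)
    if total_cep17 == 0:
        raise ValueError("no CEP17 signals counted")
    return sum(her2_signals) / total_cep17


def classify_ratio(ratio: float, recounted: bool = False) -> str:
    """Apply the cut-off of 2, flagging borderline ratios for a recount."""
    if not recounted and 1.8 <= ratio <= 2.2:
        # Borderline: the protocol counts 20 additional tumor nuclei.
        return "borderline"
    # Assumption: a ratio of exactly 2 counts as amplified.
    return "amplified" if ratio >= 2.0 else "not amplified"
```

A first-pass ratio of 2.1 would return "borderline"; the ratio recomputed after the additional nuclei are counted is then passed with `recounted=True` to obtain a final call.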

Statistical analysis

Statistical analysis was conducted using SPSS software (version 20; IBM, New York, NY, USA). The agreement of HER2 IHC and FISH results was assessed via the kappa coefficient. Associations between clinicopathological characteristics and HER2 discordance were evaluated by using independent t, Pearson χ², or Fisher exact tests according to the variable. A multivariate logistic regression model was applied to the variables with a p value of less than 0.05 in the univariate analysis. Two-sided p values were calculated, and p < 0.05 was considered significant in all statistical analyses.

Results

Patient characteristics and pathological findings

We collected 180 paired biopsy and excision specimens from patients with esophagogastric and gastric cancers. This group of patients consisted of 110 men (61.1 %) and 70 women (38.9 %) with a mean age of 64.2 years (range 28–96 years). The study cohort included 162 cases (90 %) of gastric cancer, 18 cases (10 %) of esophagogastric junction cancer, and 8 cases (4.4 %) of stump cancer. After excluding stump cancer, the tumors occurred in the cardia or fundus (24, 13.3 %), body (55, 30.6 %), or antrum (69, 38.3 %), or extended across two or more regions (24, 13.3 %). In the 178 cases for which data were available, the most common trans-axial site was the lesser curvature (71, 39.4 %), followed by semi-annular or annular involvement (39, 21.7 %), the greater curvature (33, 18.3 %), the anterior wall (20, 11.1 %), and the posterior wall (15, 8.3 %). Forty-six cases (25.6 %) were early cancers, and 18 cases (10 %) had metastatic disease at initial diagnosis.

The average number of tissue fragments in biopsy specimens was 6.3 (1–14), and 75 biopsy specimens (41.5 %) had more than 6 tissue fragments. The mean number of tumor fragments in each specimen was 4.3 (1–10), with a mean tumor fragment ratio of 69.7 % in each specimen (5–100 %). The mean percentage of tumor was 55.6 % (5–100 %). Intestinal type adenocarcinoma (91, 50.6 %) was the most common diagnosis of the excision specimens, and 105 (58.3 %) excision specimens were classified as poorly differentiated.

Comparison of HER2 test results in paired biopsy and excision specimens

The HER2 IHC scores of paired biopsy and excision specimens are compared in Table 1. When divided into 4 scores, HER2 scores differed in 90 pairs (50 %), and the kappa coefficient was 0.264 (p < 0.001). When IHC scores of 0 and 1+ were combined, the kappa coefficient increased to 0.339 (p < 0.001).

Table 1 HER2 immunohistochemical scores in paired biopsy and excision specimens

Fourteen of 41 (34.1 %) biopsy specimens with IHC scores of 2+ or 3+ exhibited a heterogeneous staining pattern. FISH revealed HER2 amplification in 4 of 31 biopsy specimens with 2+ IHC scores. The overall percentage of HER2-positive biopsy specimens was 7.8 %. Twenty-two of 51 (43.1 %) excision specimens with IHC scores of 2+ or 3+ displayed intratumoral heterogeneity. FISH confirmed HER2 amplification in 5 of 42 excision specimens with 2+ IHC scores. The overall percentage of HER2-positive excision specimens was also 7.8 %. When the status of all equivocal specimens was resolved, the kappa coefficient reached 0.690 (p < 0.001) (Table 2). Ultimately, there were 8 paired specimens (4.4 %) with HER2 discordance.
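The final kappa coefficient can be checked against the counts reported in this paragraph. The Python sketch below is an illustrative recalculation, not the SPSS analysis itself. The 2 × 2 cell counts are inferred rather than copied from Table 2: 14 HER2-positive biopsy specimens and 14 HER2-positive excision specimens (7.8 % of 180 each) with 8 discordant pairs imply 10 concordant positive pairs, 4 pairs positive only on biopsy, 4 pairs positive only on excision, and 162 concordant negative pairs.

```python
from typing import Sequence


def cohen_kappa(table: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa for a square agreement table (rows: rater A, columns: rater B)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: proportion of pairs on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of marginal proportions, summed over categories.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


# Inferred counts (rows: biopsy positive/negative; columns: excision positive/negative).
her2_table = [[10, 4],
              [4, 162]]
kappa = cohen_kappa(her2_table)
print(f"{kappa:.3f}")  # 0.690
```

The observed agreement is 172/180 (95.6 %), yet kappa is only 0.690 because most of that agreement comes from the large HER2-negative majority, which chance alone would largely reproduce.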

Table 2 HER2 test results in paired biopsy and excision specimens

Clinicopathological features of discordant specimens

HER2 discordance in the 8 paired specimens was unrelated to the age and sex of the patient, the location and size of the tumor, invasion depth, Lauren classification, or tumor differentiation (see Table S1 of the Electronic supplementary material, ESM). Although not statistically significant, paired specimens of esophagogastric tumors had a higher likelihood of discordance (11.1 vs. 3.7 %, p = 0.184), and no discordance was observed in paired specimens of diffuse-type adenocarcinoma. The number of biopsy tissue fragments and the tumor fragment ratio were not related to HER2 discordance, although the biopsy specimen in most discordant pairs had a tumor fragment ratio of >0.5 (p = 0.682 when dividing specimens into two groups with ratios of ≤0.5 or >0.5). Tumor heterogeneity in biopsy and excision specimens was unrelated to HER2 discordance (p = 0.673 and p = 0.688, respectively). Discordant tumor differentiation was correlated significantly with discordant HER2 results (p = 0.01). Although the tumor stage was initially significant in univariate analysis, the logistic regression model showed that it was not a related factor (p = 0.097). In contrast, discordant tumor differentiation maintained its significance (p = 0.009, OR 7.87, 95 % CI 1.66–37.38).
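The comparison of esophagogastric junction and gastric tumors illustrates how a Fisher exact test applies to such small counts. The Python sketch below is a recalculation under stated assumptions, not the original SPSS output. The cell counts are inferred from the reported percentages: 2 discordant pairs among 18 esophagogastric junction tumors (11.1 %) and 6 among 162 gastric tumors (3.7 %). The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Inferred counts: rows are junction vs. gastric tumors; columns are discordant vs. concordant.
p_value = fisher_exact_two_sided(2, 16, 6, 156)
print(f"{p_value:.3f}")
```

With these inferred counts the sketch gives a p value of approximately 0.18, consistent with the reported p = 0.184.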

A detailed analysis of the 8 discordant paired specimens suggested that discordance might result from differences in the IHC and FISH results (Table 3). IHC discrepancies accounted for 5 discordant cases, and FISH discrepancies accounted for 3 discordant cases (Fig. 1). Case no. 114 with IHC discordance was reconfirmed using FISH.

Table 3 Summary of 8 cases with discordant HER2 results in paired biopsy and excision specimens
Fig. 1 Illustration of HER2 discordance in case nos. 6 and 114 (IHC immunohistochemistry, FISH fluorescence in situ hybridization)

Discussion

The percentage of HER2-positive specimens in this study is higher than that in our previous study (7.8 vs. 6.1 %, respectively) [18]. This may reflect different study populations and methodologies. Our current study differs from our previous study in terms of distribution of tumor location, histological type, and tumor differentiation. In our previous study, HER2 IHC was performed in tissue microarrays rather than whole-tissue sections, and this procedural difference may account, at least in part, for the different percentages of HER2-positive cells in the two studies [19, 20].

After new consensus recommendations for HER2 scoring for gastric cancer were proposed in 2008 [10], only a few studies addressed the reliability of endoscopic biopsy specimens in comparison with surgical excision specimens for HER2 testing [11–14, 21, 22]. In the study of Lee et al. [11], 31 of 54 paired biopsy and gastrectomy specimens (57.4 %) had similar HER2 IHC scores on the basis of 4 categories, and 40 (74.1 %) had similar HER2 IHC scores on the basis of 3 categories. When silver in situ hybridization (SISH) was used as the gold standard, 7 discordant paired specimens (13 %) were identified, with false negativity in either the biopsy or gastrectomy specimen. Four additional studies demonstrated overall concordance rates ranging from 89 to 96 % for paired specimens [12–14, 21]. FISH or SISH produced a higher percentage of concordant results for paired specimens (>90 %) than did IHC (approximately 80 %). The positive and negative predictive values of biopsy specimens were approximately 70 and 90 %, respectively [13].

The kappa coefficient, a measure of interrater reliability, examines the agreement between two raters in the assignment of categories or categorical variables [23, 24]. It is an important tool for assessing how well different diagnostic systems are implemented. Our results show that Hofmann’s HER2 scoring system results in substantial agreement in the HER2 status of paired specimens when two categories (HER2-positive and HER2-negative) are used. However, the system was not very precise, and there was a high percentage of discordant initial HER2 IHC scores. Compared with excision specimens, the biopsy specimens may overestimate or underestimate HER2 status, and discrepancies may occur not only at the IHC level but also at the FISH level. Discordant HER2 results for paired specimens have also been reported for primary and metastatic carcinoma [25–27].

Intratumoral heterogeneity of HER2 expression is a common feature of gastric cancer and occurs in 30–50 % of HER2-positive tumors [7, 11, 26]. In our study cohort, intratumoral heterogeneity was observed in approximately 40 % of specimens with IHC scores of 2+ or 3+. However, intratumoral heterogeneity of HER2 expression was not significantly associated with HER2 discordance. Biopsy adequacy, including tumor tissue fragment number and tumor percentage in the biopsy specimen, also had no significant impact on HER2 discordance. This suggests that the intrinsic sampling error of the biopsy procedure may be unrelated to the number of tissue fragments. A possible explanation of HER2 discordance in paired specimens is sampling error, which is unavoidable in heterogeneous populations of tumor cells. Warneke et al. [20] used tissue microarrays generated from embedded tumor tissue as biopsies. The tissue microarrays resulted in an HER2 false-negative rate of 24 % and an HER2 false-positive rate of 3 %. In these cases, HER2-positive tumor cells usually comprised only 10–20 % of the entire tumor area in a background of complete absence of HER2 expression, imparting a “black-and-white” expression pattern.

Our present study further suggests that sampling error may be predicted by discordant tumor differentiation in paired biopsy and excision specimens, perhaps because of the high correlation between HER2 expression and intestinal histologic type, especially in well and moderately differentiated tumors. In the ToGA trial, HER2 positivity differed significantly among histologic types (intestinal, 34 %; diffuse, 6 %; and mixed, 20 %) [6]. Kim et al. [19] found that an HER2 score of 2+ or 3+ was eight times more likely in well to moderately differentiated gastric tumors than in poorly differentiated tumors. A recent meta-analysis of 15 studies involving 5,290 patients also supported this notion and reported odds ratios for HER2 overexpression of 3.14 and 6.2 for tumor differentiation (differentiated vs. poorly differentiated) and Lauren’s classification (intestinal vs. diffuse), respectively [28]. As tumor differentiation is assessed according to the proportion of gland formation in adenocarcinoma, this morphological parameter partly determines the probability of HER2 expression. Moreover, we observed no HER2 discordance in diffuse-type adenocarcinoma, which reinforces this notion.

In breast cancer, the incidence of HER2 overexpression is higher in ductal carcinoma in situ than in infiltrating ductal carcinoma and is lowest in invasive lobular carcinoma [29]. HER2 expression is strongly linked to epithelial differentiation in Wilms tumors [30], uterine carcinosarcoma [31, 32], and biphasic synovial sarcoma [33, 34]. These findings also support the association of membranous expression of HER2 protein with glandular formation. Although the underlying mechanism remains elusive, HER2 protein has been shown to interact with Erbin (Erbb2 interacting protein) and plakophilin-4 to form desmosomes at the basolateral membrane of epithelial cells, which are critical for maintaining cellular polarity and integrity [35, 36]. However, there is no evidence that HER2 overexpression influences cell integrity or vice versa.

The patients in the ToGA trial received randomized treatment regardless of the type of specimen tested for HER2 overexpression. Thus, discordant HER2 results in paired biopsy and excision specimens suggest that some patients may not have received the best treatment (e.g., patients with undetected HER2 overexpression did not receive trastuzumab). In inoperable cases, the only tissue available for HER2 testing is endoscopic or laparoscopic biopsy tissue. A significant number of patients develop metachronous metastasis or unresectable disease after undergoing gastrectomy at an initial tumor stage. In this clinical condition, HER2 testing should be performed on both biopsy and gastrectomy specimens, especially when discordant tumor differentiation exists. In the current study, discordant paired specimens represented nearly one-third of HER2-positive specimens (either biopsy or excision). HER2 testing of all available specimens is also advocated by other investigators [14, 20, 26, 27]. Until future research determines the survival outcomes of patients with HER2 positivity in biopsy or excision specimens, the above strategy should be used in clinical practice if feasible.

In conclusion, among paired endoscopic biopsy and surgical excision specimens from gastric tumors, we found that 7.8 % were HER2-positive and 4.4 % showed HER2 discordance. Hofmann’s HER2 scoring system resulted in substantial agreement in HER2 status in paired specimens despite discrepancies in IHC scores. HER2 discordance may stem from sampling error in tumors with heterogeneous HER2 expression. Sampling error may be reduced by testing all available specimens, especially when discordant tumor differentiation in paired specimens is evident. However, studies with large cohorts of paired biopsy and excision specimens are needed to clarify the significance of discordant HER2 results.