Introduction

Small cell lung cancer (SCLC) is a heterogeneous malignancy with genetic and phenotypic disparity [1]. Patients with SCLC who appear similar in clinical features can follow quite different courses because of their distinct biological underpinnings. Studies have reported that the biology of a tumor depends largely on intratumor heterogeneity (ITH) [2]. ITH, the coexistence within one tumor of cell populations that differ in genetic and phenotypic characteristics, has been reported to contribute to cancer metastasis, recurrence, and drug resistance [3,4,5].

Although immunotherapy has brought some benefit to patients with SCLC, drug resistance and relapse remain challenging issues. Recent studies have found that the tumor immune microenvironment is involved in these processes [6, 7], and an increasing number of studies treat the tumor microenvironment as a target for drug treatment. Besides strategies such as anti-PD-1/PD-L1, CD8+ infiltrating T cells have also been reported as potential targets for cancer treatment. These therapeutic strategies warrant an in-depth understanding of ITH.

The exploration of ITH requires genomic analysis of large numbers of cells, which has been hindered by the lack of biopsy specimens in SCLC. In addition, the association of ITH with immunological features and the impact of ITH on prognosis have not previously been explored in SCLC. We therefore characterized ITH in a cohort of SCLC patients on an unprecedented scale. We explored the relationship between ITH and immunological features in SCLC, as well as the impact of ITH on overall survival. Moreover, we analyzed the driver genes whose SNVs and CNVs differed significantly between ITH-low and ITH-high SCLC. Our study will help elucidate the biology of SCLC and improve clinical treatment options for patients with SCLC.

Materials and methods

Sample collection, processing, and genomic DNA extraction

Data from patients with small cell lung cancer (SCLC) histologically confirmed by biopsy at Shandong Cancer Hospital and Institute (Jinan, Shandong, China) from September 2017 to December 2019 were assessed. Each diagnosis was confirmed by two experienced pathologists. The samples underwent strict quality inspection, and contaminated samples and those with insufficient DNA were removed. The final cohort comprised 178 patients. We defined overall survival (OS) as the period from diagnosis to death or the last observation date. The last follow-up was on 26 November 2020. Patients who were alive at the last follow-up were censored. Clinicopathological data were retrieved from the patients’ medical records. This study was approved by the Ethics Committee of Shandong Cancer Hospital and Institute. Informed consent was obtained from each patient involved in the study. Tumor tissues were formalin-fixed and paraffin-embedded (FFPE). The corresponding blood samples served as controls. Genomic DNA was extracted from each FFPE sample with the GeneRead DNA FFPE Kit (Qiagen, Beverly, MA, USA) and from each blood sample with the DNA Blood Midi/Mini kit (Qiagen).

DNA library construction and whole-exome sequencing (WES)

Genomic DNA was enzymatically digested into fragments of 200 bp (5X WGS Fragmentation Mix, Qiagen). T-adapters were added to both ends after end repair and A-tailing. For library construction, the purified DNA was amplified by ligation-mediated PCR, and the final sequencing libraries were then generated using the 96 rxn xGen Exome Research Panel v1.0 (Integrated DNA Technologies, Coralville, IA, USA) according to the manufacturer’s instructions. Paired-end multiplexed samples were sequenced on the Illumina NovaSeq 6000 System (San Diego, CA, USA). Sequencing depth was 200× per tissue sample.

Sequence alignment and variant detection

Fastp was used to preprocess the raw data and trim adaptor sequences [8]. Subsequently, the Burrows-Wheeler Aligner (BWA, v0.7.15) was used to align the clean reads in FASTQ format to the human reference genome (hg19/GRCh37) [9]. SAMtools [10] and Picard (2.12.1) (http://picard.sourceforge.net/) were employed to sort the mapped BAM files and process PCR duplicates. The final BAM files, which were used to compute sequencing coverage and depth, were generated by local realignment and base quality recalibration with the Genome Analysis Toolkit (GATK, v3.8) [11]. MuTect was used to identify somatic single nucleotide variations (SNVs) [12], and GATK Somatic Indel Detector was used to detect somatic small insertions and deletions (InDels). Variants were annotated with the ANNOVAR software [13] against multiple databases, including variant description (HGVS, http://varnomen.hgvs.org/), population frequency databases (1000 Genomes Project, http://browser.1000genomes.org; dbSNP, https://www.ncbi.nlm.nih.gov/snp/; ExAC, http://exac.broadinstitute.org), variant functional prediction databases (PolyPhen-2, http://genetics.bwh.harvard.edu/pph2/; SIFT, http://www.blocks.fhcrc.org/sift/SIFT.html), and phenotype or disease databases (OMIM, http://www.omim.org; COSMIC, https://cancer.sanger.ac.uk/cosmic/; ClinVar, http://www.ncbi.nlm.nih.gov/clinvar). After annotation, nonsynonymous SNVs were retained for further analysis if their variant allele frequency was ≥ 3%, or ≥ 1% for cancer hotspots from disease databases. TMB was defined as the total number of nonsynonymous SNVs/Indels in the coding area of a tumor genome based on whole-exome sequencing (WES). Significant driver genes were identified by combining two methods, MutSigCV and dNdScv, as described in previous studies [14, 15], with a false discovery rate (FDR) < 10%.
We identified copy number variations (CNVs) using the Genomic Identification of Significant Targets in Cancer (GISTIC) 2.0 algorithm [16]. At the chromosomal-arm level, significant amplifications and deletions were screened with FDR < 10% for further analysis. At the focal level, significant amplifications were screened with FDR < 5% and G-score > 0.3, and significant deletions were screened with FDR < 5% and G-score < −0.2. Focal CNV gene analysis of individual samples was performed on paired tumor–normal WES data using GATK with default parameters. CNV amplification was defined as a tumor-to-normal copy number ratio > 1.3 with amplified exons > 70%, while CNV deletion was defined as a tumor-to-normal copy number ratio < 0.7 with deleted exons > 70%. Somatic mutational signatures were analyzed de novo with the SomaticSignatures R package (v2.20.0) [17] using non-negative matrix factorization (NMF).
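The per-gene CNV thresholds stated above can be expressed as a small rule. The following is a minimal sketch, not the authors' pipeline: the function name and arguments are hypothetical, and it assumes that the 70% exon criterion refers to the fraction of a gene's exons altered in the same direction as the call.

```python
def classify_gene_cnv(copy_ratio, fraction_exons_altered,
                      amp_ratio=1.3, del_ratio=0.7, min_exon_fraction=0.70):
    """Label a gene as 'amplification', 'deletion', or 'neutral'.

    copy_ratio: tumor vs normal copy number ratio for the gene.
    fraction_exons_altered: fraction (0-1) of the gene's exons showing the
        alteration (assumption: same direction as the call).
    Thresholds follow the text: ratio > 1.3 or < 0.7, and exons > 70%.
    """
    if fraction_exons_altered > min_exon_fraction:
        if copy_ratio > amp_ratio:
            return "amplification"
        if copy_ratio < del_ratio:
            return "deletion"
    return "neutral"


# Toy values for illustration only (not study data).
for ratio, frac in [(1.5, 0.8), (0.5, 0.9), (1.5, 0.5)]:
    print(ratio, frac, classify_gene_cnv(ratio, frac))
```

Both comparisons are strict, mirroring the inequalities in the text, so a gene with a ratio of exactly 1.3 or 0.7 is treated as neutral in this sketch.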

Immunohistochemical staining

Immunohistochemical staining was conducted using the Enhance Labelled Polymer System (ELPS, DAKO, Denmark). Briefly, tumor specimen sections were incubated with an antibody against programmed cell death-ligand 1 (anti-PD-L1) (13684, 1:100 dilution, CST, USA) and an antibody against CD8 (85336, 1:100 dilution, CST, USA) at 4 °C overnight, and then washed three times with phosphate buffered saline (PBS) for 5 min each. The slides were then incubated with the corresponding secondary antibodies at 37 °C for 30 min and washed again with PBS. They were developed with 3,3′-diaminobenzidine (DAB) and then rinsed with distilled water. Next, the sections were dehydrated, cleared, and mounted with neutral gum. Images of the stained tissues were captured with a Digital Pathology Slide Scanner (KF-PRO-120, KF-BIO, China).

The assessment of PD-L1 expression and determination of its cutoff

For the evaluation of PD-L1 expression, we defined the tumor proportion score (TPS) as the number of PD-L1-staining tumor cells divided by the total number of viable tumor cells, multiplied by 100. We defined the combined positive score (CPS) as the number of PD-L1-staining cells (tumor cells, lymphocytes, and macrophages) divided by the total number of viable tumor cells, multiplied by 100 [18]. Tonsil tissue stained for PD-L1 was used as a control to ensure the eligibility of the enrolled specimens. Qualified staining required strong PD-L1 positivity in the tonsillar crypt epithelium and negative PD-L1 staining in lymphocytes (mantle zone and germinal center B cells) and superficial epithelial cells. The cutoff TPS/CPS value for PD-L1 positivity was 1.0%.
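The two scores defined above share a denominator and differ only in which stained cells are counted. A minimal sketch of the arithmetic follows; the function names are hypothetical, the cell counts are toy values, and treating a score equal to the cutoff as positive is an assumption, since the text gives only the cutoff value.

```python
def tumor_proportion_score(pdl1_tumor_cells, viable_tumor_cells):
    """TPS = PD-L1-staining tumor cells / viable tumor cells x 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    return 100.0 * pdl1_tumor_cells / viable_tumor_cells


def combined_positive_score(pdl1_staining_cells, viable_tumor_cells):
    """CPS = all PD-L1-staining cells / viable tumor cells x 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    return 100.0 * pdl1_staining_cells / viable_tumor_cells


def is_pdl1_positive(score, cutoff=1.0):
    """Assumption: a score at or above the cutoff counts as positive."""
    return score >= cutoff


# Toy counts for illustration only (not study data).
tps = tumor_proportion_score(pdl1_tumor_cells=5, viable_tumor_cells=200)
cps = combined_positive_score(pdl1_staining_cells=12, viable_tumor_cells=200)
print(tps, is_pdl1_positive(tps))
print(cps, is_pdl1_positive(cps))
```

Because the CPS numerator includes stained cells beyond tumor cells, a sample's CPS is never lower than its TPS under these definitions.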

The assessment of CD8+ T cell infiltration, TMB as well as TNB

We observed the distribution of CD8+ T cells in the tumor stroma at 200× magnification. If the cells were evenly distributed, they were counted in three randomly chosen areas (0.1 mm² per area). If they were unevenly distributed, areas were selected in proportion to the share of regions with different CD8+ T cell densities (0.1 mm² per area). CD8+ T cell infiltration was calculated as CD8+ T cell count/0.1 mm² × 10, namely CD8+ T cell count/mm². For each sample, we defined TMB as the total number of nonsynonymous SNVs/Indels in the coding area of a tumor genome based on WES. TNB was defined as the number of all putative neoantigens per megabase of genome. Absence of CD8+ T cell staining was defined as negative, and any staining as positive.
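The density conversion described above (count per 0.1 mm² field, scaled to count per mm²) can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the counts are toy values, and averaging the counts across the chosen fields before scaling is an assumption, since the text does not say how the three fields are combined.

```python
def cd8_density_per_mm2(field_counts, field_area_mm2=0.1):
    """Convert CD8+ T cell counts per field into cells per mm^2.

    field_counts: counts from the selected fields (e.g. three fields).
    Assumption: the counts are averaged across fields before scaling.
    """
    if not field_counts:
        raise ValueError("at least one field count is required")
    mean_count = sum(field_counts) / len(field_counts)
    return mean_count / field_area_mm2


def cd8_status(density):
    """No CD8+ staining is negative; any staining is positive."""
    return "negative" if density == 0 else "positive"


# Toy counts for illustration only (not study data).
density = cd8_density_per_mm2([12, 15, 9])
print(round(density, 1), cd8_status(density))
```

Dividing by the 0.1 mm² field area is the same operation as the "× 10" scaling in the text.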

Intratumoral heterogeneity (ITH) analysis

An ITH index at the level of genetic alterations was introduced to evaluate the ITH level in tumor tissues from patients with SCLC. The cancer cell fraction (CCF) of each mutation was inferred with PyClone (v0.13.0). Clonal status was determined from the confidence interval (CI) of the CCF: a mutation was labeled subclonal if the upper bound of the 95% CI of its CCF was < 1, and clonal otherwise [19, 20]. We counted the clonal mutations (Nmain) in the clonal cluster (Cmain) and the subclonal mutations (Nsub) in the subclonal cluster (Csub). The ITH index was defined as the proportion of mutations in the subclonal mutation cluster relative to the total number of mutations (the sum of clonal and subclonal mutations): ITH = Nsub/(Nmain + Nsub) [21].
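The index defined above reduces to a simple proportion once each mutation has a clonal or subclonal label. The following minimal sketch assumes the upper bound of the 95% CI of the CCF is already available for every mutation; how those bounds are derived from PyClone output is not shown, and the function name and input values are hypothetical.

```python
def ith_index(ccf_upper_bounds, clonal_threshold=1.0):
    """Compute ITH = Nsub / (Nmain + Nsub) for one tumor.

    ccf_upper_bounds: upper bound of the 95% CI of the CCF, one value
        per mutation (assumed to be precomputed).
    A mutation is subclonal if its upper bound is < 1, clonal otherwise.
    """
    n_sub = sum(1 for ub in ccf_upper_bounds if ub < clonal_threshold)
    n_main = len(ccf_upper_bounds) - n_sub
    total = n_main + n_sub
    if total == 0:
        raise ValueError("at least one mutation is required")
    return n_sub / total


# Toy upper bounds for illustration only (not study data):
# three clonal mutations and two subclonal mutations.
print(ith_index([1.0, 1.0, 0.6, 0.95, 1.0]))
```

An index of 0 means every mutation was called clonal, and an index of 1 means every mutation was called subclonal.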

Statistical analysis

The correlations between ITH and subclonal percentage, between ITH and TMB, and between ITH and TNB were evaluated by Pearson’s correlation test. Student’s t-test was used to compare two groups. Differences in clinical variables according to ITH level were analyzed using the Chi-squared test. Factors associated with CD8+ T cell infiltration in patients with SCLC were evaluated by multivariate logistic regression. Survival curves were generated using the Kaplan–Meier method and compared using the log-rank test. Prognostic factors for OS in patients with SCLC were evaluated using univariate and multivariate Cox proportional hazards regression models. Statistical analysis was performed using the SPSS 22.0 software package. A P value less than 0.05 was considered statistically significant.
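For readers who want to see what the Pearson coefficient measures, the following minimal sketch computes it from first principles. It is an illustration only: the analyses in this study were run in SPSS, the function name is hypothetical, and the paired values are toy numbers rather than study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy paired values for illustration only (not study data).
ith_values = [0.05, 0.10, 0.22, 0.31, 0.40]
subclonal_pct = [4.0, 12.0, 20.0, 33.0, 38.0]
r = pearson_r(ith_values, subclonal_pct)
print(round(r, 3), round(r * r, 3))  # r and its square
```

The square of the coefficient is the proportion of variance in one variable explained by a linear fit on the other.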

Results

Clinicopathological characteristics of the 178 patients with SCLC

We analyzed the clinicopathological characteristics of the 178 patients with SCLC. Most patients (61.8%) were younger than 65 years. Males accounted for 71.9% of the cohort and females for 28.1%. A total of 86 (48.3%) patients had limited-stage disease and 92 (51.7%) had extensive-stage disease. With regard to treatment, all patients underwent chemotherapy. The majority (114 patients) also underwent radiotherapy, whereas 64 patients did not. There were 109 smokers and 69 non-smokers. Drinkers and non-drinkers accounted for 39.9% and 59.9% of patients, respectively, with 1.1% of unknown drinking status. Details of all the SCLC patients are available in Supplementary Table S1 of our previous publication [22].

The association of ITH with subclonal percentage

To examine how subclonal mutations relate to ITH and to compare the clonal and subclonal mutation burden, we performed mutation and clonality analysis on all SCLC patients. The counts of subclonal and clonal mutations for each patient are shown in Fig. 1A. Across these SCLC samples, those with more subclonal mutations tended to carry more subclonal driver mutations (Fig. 1B). ITH was positively correlated with subclonal percentage (R = 0.41; P = 1.7e−08) (Fig. 1C).

Fig. 1

Counts of clonal and subclonal mutation and their driver mutations as well as the association between ITH and percentage subclonal in SCLC. A Counts of clonal mutation (yellow) and subclonal mutation (green) of each patient with SCLC. B Counts of clonal driver mutation (yellow) and subclonal driver mutation (green) of each patient with SCLC. Proportion of clonal mutation (yellow) and subclonal mutation (green) of each patient. C The association between ITH and subclonal percentage

The association between ITH and immunological markers

We next assessed immune-related markers, including tumor mutational burden (TMB), tumor neoantigen burden (TNB), programmed cell death-ligand 1 tumor proportion score (PD-L1 TPS), programmed cell death-ligand 1 combined positive score (PD-L1 CPS), and CD8+ T cell infiltration, and analyzed their associations with ITH. We did not detect a significant association between ITH and TMB (R² = 0.0101, P = 0.1821) (Fig. 2A). Similarly, no significant association was observed between ITH and TNB (R² = 0.0198, P = 0.0612) (Fig. 2B). PD-L1 TPS-negative patients had numerically higher ITH than PD-L1 TPS-positive patients, although the difference was not statistically significant (P = 0.0610) (Fig. 2C). No significant difference in ITH was found between PD-L1 CPS-negative and PD-L1 CPS-positive patients (P = 0.6347) (Fig. 2D). Interestingly, ITH was higher in patients negative for CD8+ T cell infiltration, indicating a negative correlation between CD8+ T cell infiltration and ITH (P = 0.0220) (Fig. 2E). To further validate the association between ITH and CD8+ T cell infiltration, we performed logistic regression analysis. As expected, multivariate analysis showed that ITH-high patients were more likely to have less CD8+ T cell infiltration (OR, 0.254; 95% CI 0.077–0.834; P = 0.024) after adjustment for clinicopathological variables including stage, age, sex, smoking, and family history (Table 1).

Fig. 2

Associations between immune-related markers and ITH in SCLC. A Association of TMB with ITH among patients with SCLC. B Association of TNB with ITH among patients with SCLC. C Difference in ITH between PD-L1 (TPS)-negative and PD-L1 (TPS)-positive patients with SCLC. D Difference in ITH between PD-L1 (CPS)-negative and PD-L1 (CPS)-positive patients with SCLC. E Difference in ITH between CD8+ T cell-negative and CD8+ T cell-positive patients with SCLC. Scale bar, 100 μm, as shown in C–E

Table 1 Logistic regression analysis of factors for CD8+ T cell infiltration in patients with SCLC (N = 178)

Determination of the optimal cutoff of ITH and the impact of ITH on OS

X-tile was used to determine the optimal ITH cutoff for stratifying OS [23]. The optimal cutoff of ITH was 0.2 (Fig. 3A). We then compared clinical variables between ITH-low and ITH-high SCLC patients. No significant differences in stage, age, sex, smoking, or family history were found between the ITH-low and ITH-high cohorts (P > 0.05 for all) (Table 2). We next examined the effect of ITH on survival. Importantly, patients with low ITH had significantly better OS than those with high ITH (P = 0.0049) (Fig. 3B). Multivariate Cox regression analysis further showed that ITH was an independent prognostic factor for OS after adjustment for clinicopathological variables including stage, age, sex, smoking, and family history (HR, 2.044; 95% CI 1.190–3.512; P = 0.010) (Table 3).

Fig. 3

The role of ITH on OS and the significantly different genes and CNV landscape between ITH-low and ITH-high in SCLC. A X-tile was adopted to assess the optimized cutoff of ITH. B OS of ITH-low and ITH-high for patients with SCLC. C The genetic mutation variation between ITH-low and ITH-high cohort among patients with SCLC. D The top ten significantly different CNVs between ITH-low and ITH-high cohort among patients with SCLC

Table 2 The clinical variables’ difference between patients with ITH-high and ITH-low
Table 3 Cox regression analyses of prognostic factors for OS in patients with SCLC (N = 178)

The top significantly different genes with SNV and CNV between ITH-low and ITH-high patients

We further analyzed the differences in genetic mutations and CNV landscape between ITH-low and ITH-high SCLC patients. The significantly different genes between ITH-low and ITH-high SCLC were analyzed using MutSigCV, dNdScv, and Paper. The significantly different driver genes were MED1, DET1, ELF4, SARS, PGC, ARHGEF38, RNF145, RPE, and C8orf44 (Fig. 3C). The top ten genes with significantly different CNVs between ITH-low and ITH-high SCLC were TBC1D3, MAZ, AES, TRIM49C, TRIM49, TSPY1, CBX4, UBL4A, HES1, and C8orf82 (Fig. 3D).

Discussion

The treatment of SCLC remains a clinical challenge despite numerous efforts against this recalcitrant disease. Although a few patients may benefit from different cancer treatments, notable problems such as drug resistance and relapse have emerged [24, 25]. Recent studies have found that tumor-infiltrating cells in the tumor microenvironment are the main factors responsible for drug resistance and dampened responses to treatment [26, 27]. However, little is known about the underlying mechanisms. ITH has been reported to account for differences within tumors that share the same clinical and pathological characteristics. Relapse and resistance in SCLC have also been attributed to ITH [28, 29]. The association between ITH and immunological markers remains an important unaddressed issue. Therefore, in the current study, we evaluated ITH and its relationship with important immunological markers in SCLC.

It should be noted that PyClone was used to quantify ITH in the present study. Many other algorithms, such as ABSOLUTE, EXPANDS, and MATH, are also available for quantifying ITH [30]. Applying all of these algorithms to our SCLC cohort might strengthen the ITH estimates, but it would also complicate the procedure and lengthen the analysis. We therefore chose PyClone as our approach to evaluate ITH. PyClone assesses clonal populations by inferring accurate clusters of mutations that co-occur in individual cells [31, 32]. In addition, we refined the definition of subclonal status from CCF < 1 to the upper bound of the 95% CI of CCF < 1, and of clonal status from CCF ≥ 1 to the upper bound of the 95% CI of CCF ≥ 1, which allows a more precise determination of the clonal composition of the tumors and therefore a more convincing ITH index [19, 20].

In the current study, no association was detected between ITH and either TMB or TNB. This may seem counterintuitive, since SCLC with elevated ITH would be expected to carry more somatic mutations and neoantigens. One possible explanation is that SCLC with elevated ITH may have more neoantigens, while the elevated neoantigens may in turn coordinate with immune organs to alter the ITH level as feedback. This could account for the absence of a correlation between ITH and TMB or TNB.

Recent studies have reported that decreased immune cell infiltration is related to tumor progression, whereas inflamed tumor infiltration is linked with inhibited progression [33]. In our cohort, ITH was numerically higher in PD-L1 (TPS)-negative SCLC, which suggests weaker immune evasion in high-ITH SCLC. Meanwhile, we found an inverse relationship between CD8+ T cells and ITH, consistent with McDonald’s study, which reported a correlation between high heterogeneity and less CD8+ T cell infiltration in breast cancer patients [34]. Both studies demonstrated an inverse correlation between ITH and CD8+ T cell infiltration despite the distinct tumor types. In our study, high ITH was also associated with worse survival. We assume this may be due to less CD8+ T cell infiltration and the consequently weaker activation of immune responses, since a high number of CD8+ cells was found to predict favorable survival for patients with SCLC [35]. Our results may support the view that ITH is modulated by the selection of tumor-infiltrating cells, thereby affecting survival [2].

We observed an association between low ITH and improved OS, but the underlying mechanism in terms of genetic mutation remains unclear. We speculate that factors other than the possible impact of CD8+ T cell infiltration may also affect survival. Our analysis revealed driver genes and CNVs that differed significantly between ITH-low and ITH-high SCLC. Among these genes, MED1 was found to play a pivotal role in the maintenance of peripheral CD8+ T cells [36]. In addition, ELF4 directly activates KLF4 downstream of T cell receptor (TCR) signaling, leading to cell cycle arrest in CD8+ T cells [37]. These findings suggest that both genes may be involved in ITH through the regulation of CD8+ T cell infiltration. These results indicate that aberrant alterations in these genes could be closely associated with ITH. They may be useful in the development of prognostic markers and could possibly serve as therapeutic targets in SCLC.

Our study has some limitations. First, we focused only on the association between CD8+ T cells and ITH. It would be better to also evaluate other T lymphocytes, such as CD4+ T lymphocytes and regulatory T cells, considering their vital role as regulators of stage in SCLC [38]. Second, we only demonstrated the differences in genetic mutations between the ITH-low and ITH-high cohorts among patients with SCLC. It would be interesting to test the expression of these genes in tumor tissues from the 178 SCLC patients and to validate their relationship with ITH and CD8+ T cell infiltration. The role of these genes in the regulation of CD8+ T cells and ITH warrants further study and validation.

Conclusion

In conclusion, our study depicts the landscape of ITH in a cohort of SCLC patients on an unprecedented scale. We are the first to demonstrate the negative relationship between CD8+ T cell infiltration and ITH in SCLC. We have also shown that ITH is an independent prognostic factor for OS in patients with SCLC. Furthermore, the genes identified as differing between ITH-low and ITH-high SCLC might hold promise for therapeutic strategies that address the challenges of ITH. Our study offers a novel perspective on the role of ITH in SCLC immunity and prognosis. It also points toward more precise and individualized therapeutic strategies in which ITH is translated into clinical practice.