Abstract
N6-methyladenosine (m6A) modification of gene transcripts plays critical roles in cancer. Here we report transcriptomic m6A profiling in 98 tissue samples from 65 individuals with pancreatic ductal adenocarcinoma (PDAC). We identify 17,996 m6A peaks with 195 hyper-methylated and 93 hypo-methylated in PDAC compared with adjacent normal tissues. The differential m6A modifications distinguish two PDAC subtypes with different prognosis outcomes. The formation of the two subtypes is driven by a newly identified m6A regulator CSTF2 that co-transcriptionally regulates m6A installation through slowing the RNA Pol II elongation rate during gene transcription. We find that most of the CSTF2-regulated m6As have positive effects on the RNA level of host genes, and CSTF2-regulated m6As are mainly recognized by IGF2BP2, an m6A reader that stabilizes mRNAs. These results provide a promising PDAC subtyping strategy and potential therapeutic targets for precision medicine of PDAC.
Introduction
Pancreatic ductal adenocarcinoma (PDAC), the fourth leading cause of cancer-related death in the world1, is often diagnosed at an advanced stage. Improvement in PDAC outcomes lags behind that of many other malignancies because of the lack of effective approaches for early diagnosis and treatment and the difficulty therapeutic agents have in accessing tumor sites2,3. Chemotherapy is still the main treatment strategy for most advanced PDAC, although it benefits only a subset of patients4. Therefore, there is an urgent need to develop superior markers and therapeutic targets based on a better understanding of the biology of PDAC.
Molecular subtyping has been used to guide clinical treatment in many cancer types, such as breast cancer and colon cancer, but has yet to be effective in PDAC5. Genome-wide association studies and whole-genome sequencing studies on PDAC have provided many potential molecular biomarkers for PDAC subtyping6,7. Based on transcriptomic data, several studies have classified PDAC into distinct molecular subtypes8,9,10. Law et al.11 have classified PDAC into four subtypes with distinct microenvironments based on proteomic analysis. Although none of these classifications has been implemented in clinical practice, these studies suggest that molecular subtyping could be promising for guiding clinical PDAC treatment.
The RNA modifications are a new epigenetics layer of posttranscriptional regulation of genes. N6-adenosine methylation (m6A), as one of the most prevalent RNA modifications, plays an important role in a variety of biological processes, such as cell fate determination12,13,14, circadian clock regulation15, adipogenesis16, cell cycle arrest, and apoptosis17. Accumulating evidence has suggested that the aberrant RNA m6A modifications are important events in human cancer development and progression18,19,20,21. Recently, we have demonstrated that m6A modifications stimulated by cigarette smoke can promote excessive miR-25-3p maturation, which enhances pancreatic cancer progression18. We hypothesized that m6A in RNA might hold great promise as molecular markers for PDAC subtyping.
In the present study, we have performed transcriptome-wide m6A-sequencing on ribosomal RNA (rRNA)-depleted RNAs of 98 pancreatic tissue samples from 65 individuals with PDAC. We have identified m6A profiles in PDAC that are distinct from those of the adjacent normal tissues, and we define PDAC subtypes based on these profiles. Further study uncovered CSTF2 as an m6A deposition mediator that drives the formation of two PDAC subtypes. Furthermore, we found that the CSTF2-regulated m6A methylation program is recognized mostly by IGF2BP2, an m6A reader that stabilizes mRNAs, promoting oncogenic pathways, suggesting that the CSTF2-associated PDAC m6A subtyping can serve as a promising therapeutic strategy.
Results
Transcriptome-wide m6A mapping in PDAC
We performed m6A-sequencing (m6A-seq) on rRNA-depleted RNAs of 98 pancreatic samples from 65 individuals, including 33 pairs of PDAC and corresponding normal tissue and another 32 PDAC samples (Supplementary Table 1 and Supplementary Data 1), and identified 26,684 m6A peaks by using MACS222 and MeTPeak23. After removing 462 (1.7%) peaks at the “A” of the transcription start site (TSS) and BCA motifs, which may be N6,2’-O-dimethyladenosine (m6Am) that can also be captured by the m6A antibody24, and those m6As that were not detected in at least 5 different samples, 17,996 m6A peaks were used in further analysis (Fig. 1a and Supplementary Data 2). Among these m6A peaks, 15,708 (87.3%) have been recorded in the RMBASE25 with many transcripts that are well-known to be m6A-modified (Supplementary Fig. 1a). Moreover, the identified m6A sites were enriched in the classical GGACH motif (Fig. 1b) and the regions near the start- and stop-codons (Fig. 1c). These results are in line with previous findings26,27.
These 17,996 m6A peaks were mainly located in genetic regions coding for messenger RNAs (mRNAs, 95.9%) (Supplementary Fig. 1b), consistent with results from polyA+ RNA m6A-seq28. However, because we employed rRNA-depleted RNA m6A-seq, 24.5% of the identified m6A peaks were located in intron regions. These intronic peaks showed a proportion of GGACH motifs comparable to that of the m6A-modified exon regions (Supplementary Fig. 1c), and the m6A-modified intron regions were usually close to the splicing sites (Supplementary Fig. 1d), implying that intron regions can also be modified by m6A, which may alter mRNA splicing as suggested previously16,29.
Distinguishing two PDAC subtypes by differential m6A modifications
Among the 17,996 m6A sites, 195 were hypermethylated while 93 were hypo-methylated in 33 tumors compared with those in 33 paired normal tissues (Supplementary Data 3). Most of these differentially methylated m6As (265/288) were validated in an independent dataset by comparing 32 unpaired PDACs with the 33 normal pancreatic tissues (Fig. 1d). Permutation analysis of tumor and normal sample labels (1000 times) yielded an average of 17 differential m6A sites, far fewer than the 288 differential m6A sites observed (Supplementary Fig. 1e), indicating that the identified aberrant m6A sites are not random. Of the randomly selected aberrant m6A sites, 96.6% (28/29) were validated by MeRIP-qPCR (Supplementary Table 2), supporting the reliability of our m6A-seq data (r = 0.867, P < 0.0001; Fig. 1e). Moreover, by using the RADAR program30, a recently developed analytical tool for detecting differentially methylated loci in MeRIP-seq data, we found that most differentially methylated m6As (175/288) were also identified by RADAR, indicating the high confidence of our results.
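The label-permutation control described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: it assumes that labels are swapped within each tumor/normal pair, and the function names permutation_null_counts and count_consistently_higher are hypothetical. The counting function is a toy stand-in for the full differential-methylation test, which would be rerun on each permuted dataset.

```python
import random


def permutation_null_counts(tumor, normal, count_differential,
                            n_permutations=1000, seed=0):
    """Count 'differential' sites after randomly swapping tumor/normal labels.

    tumor, normal: lists of per-patient profiles (one list of m6A levels each),
    in matching order. Assumption: labels are swapped within each pair.
    count_differential: caller-supplied function (tumor, normal) -> int.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_permutations):
        perm_tumor, perm_normal = [], []
        for t, n in zip(tumor, normal):
            if rng.random() < 0.5:  # swap the labels of this pair
                t, n = n, t
            perm_tumor.append(t)
            perm_normal.append(n)
        counts.append(count_differential(perm_tumor, perm_normal))
    return counts


def count_consistently_higher(tumor, normal):
    """Toy stand-in for the real test: sites higher in tumor in every pair."""
    n_sites = len(tumor[0])
    return sum(
        all(t[s] > n[s] for t, n in zip(tumor, normal))
        for s in range(n_sites)
    )
```

Comparing the mean of the returned counts with the count obtained from the true labels gives the kind of contrast reported above (an average of 17 sites under permutation versus 288 observed).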
Most of the differential m6A sites are within mRNAs and are enriched around the stop-codons and in 3’UTR regions (Supplementary Fig. 1f). The 288 dysregulated m6A sites were enriched in genes related to cancer pathways such as cell cycle and epithelial-mesenchymal transition (Fig. 1f). For example, previously reported oncogenes such as CENPF31,32, WNT7B33,34,35 and NTSR136,37 were found to be hyper-m6A methylated in tumor versus adjacent normal tissues (Fig. 1g, h).
Unsupervised consensus clustering of the PDAC patients according to these differential m6A peaks further characterized two PDAC subtypes (designated as S1 and S2, respectively, Fig. 2a). The S2 PDAC showed an m6A pattern that was distinct from the S1 PDAC (Fig. 2b), whereas the adjacent normal tissues of the two PDAC subtypes showed no difference (Supplementary Fig. 2a), suggesting that the subtype patterns are tumor-specific. Moreover, the differentially methylated m6As between the S1 and S2 PDAC samples showed no difference between S1 PDAC samples and adjacent normal tissue samples (Fig. 2b), and had a large overlap with the differentially methylated m6As between adjacent normal and tumor tissues (Fig. 2c), indicating an S2 PDAC-specific m6A dysregulation. The m6As of the genes in cancer pathways such as cell cycle and epithelial-mesenchymal transition were hypermethylated in S2 PDAC samples compared to S1 PDAC samples (Supplementary Fig. 2b). For instance, differentially methylated m6As in genes such as CENPF, WNT7B and NTSR1 between tumor and normal tissues were hypermethylated in S2 PDAC samples compared to S1 PDAC samples (Supplementary Fig. 2c, d).
We analyzed the correlations of the two subtypes with known clinical factors such as sex, age, smoking status, drinking status, tumor stage, differentiation, vascular invasion, and lymph node metastasis. The results were all negative except for neural invasion (Fig. 2d). The stromal content is not significantly different between the two subtypes in tumor tissues used for m6A sequencing (Fig. 2d and Supplementary Table 1), indicating that these subtype patterns are PDAC intrinsic features. We further examined whether the m6A subtypes of PDAC are correlated with previously reported transcriptional subtypes8,9,10. We found that the frequencies of Bailey’s squamous subtype and Collisson’s classical subtype were significantly higher in S2 PDAC than in S1 PDAC (Fig. 2d).
Survival analysis revealed that the S2 PDAC had a median progression-free survival (PFS) time and overall survival (OS) time that were significantly shorter than those of the S1 PDAC (6.6 versus 11.2 months, log-rank P < 0.0001 and 11.4 versus 13.3 months, log-rank P = 0.041, respectively; Fig. 2e), with the HRs being 4.28 (95% CI = 1.53−11.96) and 3.31 (95% CI = 1.12−8.78), respectively, adjusted for clinical features, mutations of KRAS/TP53 and transcriptional subtypes reported previously8,10. However, we did not find a significant association between survival and known transcriptional subtypes (Supplementary Fig. 2e), suggesting that our m6A subtypes are independent of other transcriptional subtypes. Interestingly, we found that the T-cell and B-cell markers were comparatively lower in the S2 subtype than in the S1 subtype (Supplementary Fig. 2f), suggesting a different immune phenotype between the two subtypes.
CSTF2 drives the PDAC m6A subtype formation
We next explored the mechanism underlying the formation of PDAC subtypes. First, we applied random forest analysis and Spearman correlation analysis to examine the correlation of hypermethylated m6As in the S2 PDAC with known m6A writers and erasers and with RNA-binding proteins (RBPs) whose binding sites, supported by CLIP sequencing data in the POSTAR2 database38, overlapped with the m6A peaks (Supplementary Fig. 3a). We found that Cleavage Stimulation Factor 2 (CSTF2) RNA levels were most significantly correlated with the levels of hypermethylated m6A sites in S2 PDAC (Fig. 3a). Both CSTF2 RNA and protein levels were significantly higher in PDAC than in adjacent normal tissues (Supplementary Fig. 3b−d) and in the S2 PDAC than in the S1 PDAC (Fig. 3b and Supplementary Fig. 3e), while another two candidates, U2AF2 and CAPRIN1, showed little difference in RNA levels between the S2 and S1 PDAC (Supplementary Fig. 3f). PDAC cell lines (PANC-1 and SW1990) with moderate expression levels of CSTF2 were chosen for subsequent experiments (Supplementary Fig. 3g). We found that when CSTF2 was knocked down in PANC-1 and SW1990 cells, methylation levels were substantially decreased in 86% (14,342/16,628) and 88% (11,544/13,051) of differential m6A sites, respectively (Fig. 3c, d and Supplementary Fig. 3h). The effect of CSTF2 on m6A was further verified by m6A-LC-MS (Supplementary Fig. 3i) and m6A-ELISA (Supplementary Fig. 3j), while knockdown of U2AF2 and CAPRIN1 had only a minute effect on the global m6A level (Supplementary Fig. 3k−m). Moreover, when CSTF2 was ectopically overexpressed in the same cell lines, 8804 and 8554 m6A sites were hypermethylated (Fig. 3e, f), of which 72.8% (6411/8804) and 61.7% (5275/8554), respectively, overlapped with the m6As hypo-methylated upon CSTF2 knockdown in the two examined cell types (Fig. 3g). Significantly dysregulated m6As (Fig. 3h) upon CSTF2 knockdown were further verified by MeRIP-qPCR (Supplementary Fig. 3n); these changes could be rescued by forced expression of CSTF2 (Supplementary Fig. 3o, p), but were not affected by U2AF2 or CAPRIN1 knockdown (Supplementary Fig. 3q). Moreover, 64.9% (122/188) of the m6As hypermethylated in the S2 PDAC were hypo-methylated in cells with CSTF2 knockdown (Fig. 3i). Together, these results suggest that CSTF2 may regulate mRNA m6A formation in PDAC.
CSTF2 promotes the malignant phenotypes of PDAC cells
We then explored the effects of CSTF2 on malignant phenotypes of PDAC cells. In vitro experiments showed that the knockdown of CSTF2 substantially suppressed the proliferation, colony formation, cell cycle progression, migration, and invasion of PDAC cells (Fig. 4a−d, Supplementary Fig. 4a−c). By using mouse subcutaneous xenograft models, we also found that CSTF2 overexpression significantly enhanced, whereas CSTF2 silencing markedly suppressed, the growth rates of PDAC tumors (Fig. 4e). Furthermore, forced expression of CSTF2 promoted lung metastasis of PDAC cells while CSTF2 knockdown showed opposite effects (Fig. 4f). Additionally, the inhibition of malignant phenotypes induced by CSTF2 knockdown could be rescued by forced expression of CSTF2, implying an on-target effect of CSTF2 knockdown (Supplementary Fig. 4d−f). Notably, the malignant phenotypes promoted by forced expression of CSTF2 could be partially alleviated by knockdown of CENPF, WNT7B, or NTSR1 (Supplementary Fig. 4g, h), implying that CSTF2 may function via modulating m6A of specific genes.
CSTF2 mediates m6A deposition by retarding elongation
We next investigated how CSTF2 mediated m6A deposition. We found that neither the expression nor the subcellular localization of the known m6A writers or erasers was affected by CSTF2 knockdown in PDAC cells (Supplementary Fig. 5a−c). The intact methyltransferase complex was not affected by CSTF2 knockdown in PDAC cells (Supplementary Fig. 5d). Depletion of CSTF2 had a relatively small effect on the global APA profile (Supplementary Fig. 5e), consistent with previous studies reporting that CSTF2T plays a redundant role in regulating APA: CSTF2T can be upregulated upon CSTF2 knockdown (Supplementary Fig. 5f), and only co-depletion of CSTF2 and CSTF2T leads to obvious APA changes39,40. Moreover, genes with significant APA changes showed little overlap with hypo-methylated genes upon CSTF2 knockdown (38/7426). The results above indicate that the phenotypes observed upon CSTF2 knockdown were unlikely to be mediated through APA.
Our CLIP sequencing data showed that the CSTF2 RNA binding sites are well overlapped with m6A sites in RNA (Fig. 5a and Supplementary Fig. 5g−j), consistent with reported public CLIP sequencing data (Fig. 5b). Previous studies reported that CSTF2 can directly interact with RNA polymerase II (RNA Pol II)41,42 that is known to recruit the m6A methyltransferase complex (MTC) co-transcriptionally43, suggesting that CSTF2 might affect m6A deposition through MTC and RNA Pol II. We thus performed CUT&Tag sequencing of CSTF2 and RNA Pol II, and the results showed a good overlap of genomic binding positions of CSTF2 and RNA Pol II (Fig. 5c). We observed that m6A peaks with Pol II occupancy showed a significantly greater reduction in m6A levels upon CSTF2 knockdown than those peaks without Pol II occupancy (Fig. 5d). Furthermore, we found that genes exhibiting substantial changes in Pol II occupancy also displayed a greater reduction in m6A levels upon CSTF2 knockdown (Fig. 5e). These results suggest that m6A sites whose formation is more reliant on Pol II may be particularly vulnerable to the effects of CSTF2 knockdown. Moreover, we found that CSTF2-binding sites in DNA were co-localized with CSTF2 binding sites and m6A sites in RNA, and the co-localization was associated with RNA Pol II (Fig. 5f, g). These results implied that RNA Pol II may indeed play a role in mediating m6A depositions regulated by CSTF2.
Previous study reported that CSTF2 may function as a rate-limiting factor in the elongation of RNA Pol II44, and a recent study showed that prolonged elongation rate may help RNA Pol II to recruit m6A writer METTL343. We further performed CUT&Tag sequencing of Pol II and Pol II-Ser2P upon CSTF2 knockdown. Significant decreases in RNA Pol II and Pol II-Ser2P density were observed in those genes with hypomethylated-m6A upon CSTF2 knockdown, while the RNA Pol II density of CSTF2 non-targets was not affected (Fig. 5h and Supplementary Fig. 5k). We also observed a slight increase in H3K79me2 and H3K36me3, but no significant changes in the enrichment of Pol II-ser5P (Supplementary Fig. 5k). This was illustrated by the representative genomic tracks of CSTF2 targets such as CENPF, WNT7B and NTSR1 genes (Fig. 5i and Supplementary Fig. 5l). These findings are in line with the fact that faster elongation leads to lower RNA Pol II density in the gene body45. Moreover, using 5,6-dichlorobenzimidazole 1-beta-D-ribofuranoside (DRB) in combination with global nuclear run-on followed by sequencing (GRO-seq) (Supplementary Fig. 5m), we found that knockdown of CSTF2 moderately increased the elongation rate of target RNAs (Supplementary Fig. 5n), which was validated by elongation rate experiments subsequently (Supplementary Fig. 6a−c). We also found that CSTF2 knockdown facilitated but ectopic overexpression of CSTF2 attenuated the synthesis of nascent RNA in PDAC cells (Fig. 5j, k and Supplementary Fig. 6d, e), confirming that CSTF2 action decreased the elongation rate of Pol II. We therefore hypothesized that CSTF2 might facilitate the METTL3 recruitment via prolonging elongation rate of RNA Pol II, thus promoting m6A deposition. We found that forced CSTF2 expression changes in PDAC cells resulted in substantial variations of the RNA Pol II and METTL3 interaction (Fig. 5l, m and Supplementary Fig. 7a, b). 
Similar results were also observed between the activated elongating form of Pol II, phosphorylation of the C-terminal domain (Pol II-Ser2P) and METTL3 upon forced CSTF2 expression changes (Supplementary Fig. 7c, d), whereas global Pol II-Ser2P was not affected, suggesting that CSTF2-retarded Pol II recruited more METTL3.
Notably, knockdown of CSTF2 caused a decrease in the global m6A level comparable to that caused by METTL3 knockdown (Supplementary Fig. 7e, f), and the hypomethylated m6As in cells with METTL3 knockdown overlapped with 69% (1850 of 2676) of the m6As produced by CSTF2 (Fig. 5n). CSTF2 knockdown caused less METTL3 binding around the m6A region of the target transcript, whereas ectopic overexpression of CSTF2 strengthened the interaction (Fig. 5o, p and Supplementary Fig. 7g). It is reported that CSTF2 affects the binding of elongation factors to transcripts co-transcriptionally44. Notably, we found that knocking down or inhibiting the elongation factors AFF1/4 in HEK293T cells markedly attenuated the elongation rate of CSTF2 target genes, while the effects on CSTF2 non-target genes tended to be weaker (Supplementary Fig. 7h−j). CSTF2 knockdown in PDAC cells promoted the recruitment of AFF1/4, ensuring efficient elongation, whereas ectopic overexpression of CSTF2 attenuated the recruitment, leading to slower elongation (Supplementary Fig. 7k, l). These results strongly support that CSTF2 promotes m6A deposition by slowing down Pol II elongation, thereby facilitating the co-transcriptional recruitment of METTL3.
CSTF2-regulated m6As enhance RNA stability
We then explored the effects of m6As on their host RNA levels in PDAC and found that 205 m6As (148 RNAs) out of the 254 differentially methylated m6As between S1 and S2 PDAC subtype had impacts on their host RNA levels (Fig. 6a and Supplementary Data 4). For example, both the m6A level and RNA level of some genes in cancer-related pathways such as cell cycle and epithelial-mesenchymal transition were significantly different between S1 and S2 (Supplementary Fig. 8a). Among the 148 RNAs, the m6A levels and RNA levels of 115 RNAs were both upregulated in S2 PDAC compared to S1 PDAC, while the m6A levels and RNA levels of 33 RNAs were both downregulated (Fig. 6a). We also found higher RNA levels of CSTF2 target genes in PDAC tissues compared with normal tissues (Supplementary Fig. 8b). The positive correlation between m6A level and RNA level was also observed in PDAC cells, where many of the RNAs downregulated upon CSTF2 knockdown (805/5222) showed m6A hypomethylation, but only a few (9/5222) showed a lengthened 3’UTR (Fig. 6b and Supplementary Fig. 8c), indicating that the CSTF2-regulated m6As, but not CSTF2-regulated APA, may contribute to the increased RNA levels. The IGF2BP family is reported to comprise m6A readers that stabilize transcripts46. Since the expression of IGF2BP2 was positively correlated with most of the 254 differentially methylated m6As between S1 and S2 PDAC subtype (Supplementary Fig. 8d), we next investigated the role of IGF2BP2 in the CSTF2-regulated m6As. We found that the expression of IGF2BP2 was not altered by CSTF2 knockdown (Supplementary Fig. 8e). However, CSTF2 knockdown and IGF2BP2 knockdown caused similar changes in the expression levels of CENPF, WNT7B, and NTSR1 (Supplementary Fig. 8f−h). CSTF2 knockdown dampened the binding of IGF2BP2 to the m6A region of the target transcript (Supplementary Fig. 8i), without altering the binding enrichment of YTHDF1/2/3 on these RNAs (Supplementary Fig. 8j).
The effect of CSTF2 knockdown on transcripts could be rescued by ectopic expression of CSTF2, implying that the effect is on-target (Supplementary Fig. 8k). CSTF2 knockdown and IGF2BP2 knockdown caused similar changes in the stabilities of CENPF, WNT7B, and NTSR1 transcripts (Supplementary Fig. 8l). Furthermore, we used dCas13-based m6A editing with a gRNA to specifically manipulate the m6A site (Fig. 6c and Supplementary Fig. 9a). The downregulation of the m6A level was verified (Fig. 6d); it dampened the binding of IGF2BP2 (Fig. 6e), leading to downregulation of both the mRNA levels (Fig. 6f) and the half-lives of the transcripts (Fig. 6g), and these effects were specific (Supplementary Fig. 9b−g). Forced expression of IGF2BP2 failed to rescue the effect of the m6A downregulation on both the mRNA levels and the half-lives of transcripts (Fig. 6h, i and Supplementary Fig. 9h). Taken together, the above results demonstrate that CSTF2-regulated m6As enhance RNA stability via IGF2BP2.
Discussion
Increasing evidence has shown that m6A modifications are important in the pathogenesis of various types of cancer20,21,47,48,49. However, their global function and regulation in cancer are still largely unknown, mainly due to the lack of m6A-seq data from larger sample sizes of cancer patients. In the present study, we have performed transcriptome-wide m6A-seq and mapping of m6A modifications in a large set of PDAC samples from 65 patients. To the best of our knowledge, this is the most comprehensive study on the mRNA m6A methylome landscape in PDAC to date. We have demonstrated that the mRNA m6A modification pattern of PDAC differs from that of adjacent normal tissues, with 68% of the differential m6A sites being hypermethylated and 32% being hypo-methylated.
Recent high-throughput sequencing studies have revealed a great diversity of PDAC at multi-omics levels, such as genomics, transcriptomics, proteomics, and epigenomics8,9,10,11,50,51. However, the current data are far from enough to reveal the complex mechanisms underlying this heterogeneous disease, let alone to guide clinical treatment based on molecular subtyping in PDAC. In this study, we have defined two PDAC subtypes using distinct m6A modification profiles that are related to patients’ survival, offering alternative insight into PDAC and informing the development of superior markers or therapeutic regimens based on this finding.
Another important finding is the discovery of CSTF2 as an m6A deposition mediator that regulates mRNA m6A modification. We have demonstrated that the depletion of CSTF2 in PDAC cells substantially reduced global m6A levels but did not change the expression levels of the m6A writers and erasers. CSTF2 is well known as a member of the cleavage stimulation factor complex regulating the 3’ end cleavage and alternative polyadenylation (APA)52. Our data showed that the knockdown of CSTF2 alone has a limited effect on global APA, consistent with previous studies, as CSTF2T, the paralog of CSTF2 that similarly functions as an APA regulator, can be upregulated upon CSTF2 knockdown39,40. A Pol II termination defect is only observed when CSTF2 and CSTF2T are co-depleted, but not in CSTF2-knockdown cells53. These data suggest that the effect of CSTF2 knockdown on m6A levels is unlikely to be due to its APA-modulating effect. Furthermore, CLIP sequencing shows that CSTF2 binding sites are significantly enriched around the m6A sites, suggesting that CSTF2 influences m6A modifications in an m6A site-dependent manner. Recent studies have proposed that the mRNA m6A modification is a co-transcriptional process depending on slowing or pausing of transcribing RNA Pol II29,43, and that the CSTF complex can directly interact with RNA Pol II and slow down its elongation rate41,42,44. Our results together with previous findings indicate that the CSTF2 effect on m6A deposition likely occurs through slowing of the RNA Pol II elongation rate.
In conclusion, we have comprehensively deciphered the landscape of transcriptome-wide m6A mRNA modification in PDAC. We have identified CSTF2 promoting the m6A modification in mRNAs and IGF2BP2 enhancing the stability of mRNAs with hypermethylated m6As, which forms a CSTF2-m6A-IGF2BP2 axis (Fig. 7). The aberrancy of this m6A modification-related axis may contribute to the development and progression of PDAC and thus has the potential clinical applications in PDAC precision medicine.
Methods
Patients and tissue specimens
This study was performed according to the Declaration of Helsinki and approved by the Institutional Review Board of Sun Yat-sen University. Written informed consent was obtained from each participant, and all data were anonymously analyzed.
For high-throughput m6A-sequencing and disease-relevant molecule analyzing, 65 patients with PDAC were recruited and the distributions of select characteristics are shown in Supplementary Table 1. All patients were recruited at Sun Yat-sen University Sun Yat-sen Memorial Hospital (Guangzhou, China) between 2010 and 2018 and they underwent pancreatectomy and received no treatment before surgery. The diagnosis of PDAC was histopathologically confirmed and tumor stage was classified according to the 7th edition of AJCC Cancer Staging System54. The PDAC tumor and non-tumor tissue (≥5 cm away from tumor) samples were collected at surgery from each patient and immediately placed in liquid nitrogen.
Tissue RNA isolation
Total RNA was isolated from tumor and normal samples with TRIzol reagent (Invitrogen). The tumor and stromal contents were evaluated from the continuous tissue section slides stained with H&E by three board-certified pathologists who were blinded to the patients’ clinicopathological status and only the samples containing ≥60% tumor cells were used. The resultant RNA samples were quantified by measuring absorbance at 260 nm with a UV spectrophotometer and then determined via the RNA6000 Nano assay (Agilent) for an RNA Integrity Number (RIN), and only the samples with RIN ≥ 7.0 were included for further analysis.
High-throughput m6A-sequencing
Total RNA from tissue was digested with DNase I and then subjected to RiboMinus (Illumina) treatment to eliminate ribosomal RNAs (rRNAs). An amount of 1.5 μg RNA was used as input. We used Magna MeRIP m6A Kit (Millipore) for m6A immunoprecipitation (m6A-IP)55. Briefly, 20 μg of rRNA-depleted RNA was sheared to about 100 nucleotides in length by metal-ion-induced fragmentation and then purified and incubated with 10 μg of anti-m6A antibody (Synaptic Systems, 202003). Sequencing libraries for m6A-IP and input were prepared and sequenced using Illumina HiSeq2500 SE50 and Illumina HiSeqX Ten PE150, respectively.
Alignment of m6A-sequencing reads
We used STAR56 to align the m6A-sequencing reads to the human reference genome (hg38). For m6A-IP reads, the 50 base pairs (bp) single-end reads were aligned to the human genome using STAR with the following parameters: --twopassMode Basic --chimSegmentMin 20 --outFilterIntronMotifs RemoveNoncanonical --outFilterMultimapNmax 20 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000. For input reads, the 150 bp paired-end reads were aligned to the human genome using STAR with parameters similar to those used for the m6A-IP reads.
m6A calling, annotation, and motif analysis
The input reads (150 bp) were trimmed to the length of m6A-IP reads (50 bp) using fastx_trimmer from FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/). MACS222 and MeTPeak23 were used to call the peaks based on the m6A-IP reads and the trimmed Input reads for all the normal and tumor samples. The cutoff P value for a significant peak in MACS2 was set at 1.00e-6. The peaks called by the two methods were first merged using intersectBed in BEDTools57. Only those peaks identified by both peak-calling methods were retained. We considered 5’UTR peaks with a transcription start site (TSS) “A” and a “BCA” motif to be m6Am peaks and the other peaks to be m6A peaks. To avoid false positives, only those m6A peaks that occurred in at least five samples were retained for further analysis. Gencode v25 human annotations were downloaded from the Gencode website for peak annotation58. An ad hoc Perl script was used to annotate the m6A peaks. First, BEDTools’ intersectBed was applied to map the peaks to Gencode v25 human annotations. To avoid duplicated mapping, only the canonical transcript for a gene was used. The canonical transcript was defined as described in the UCSC genome browser. We then compared the peaks to the curated m6A sites in RMBASE25 using intersectBed to distinguish known peaks from novel peaks. MEME59 was used to find the motifs enriched in m6A peaks.
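The two filtering steps described above (keeping peaks supported by both callers, then keeping peaks seen in at least five samples) can be sketched in a few lines. This is a simplified illustration and not the BEDTools-based pipeline itself: peaks are assumed to be BED-style half-open intervals, peaks from different samples are assumed to have already been mapped to shared identifiers, and the function names intersect_peaks and recurrent_peaks are hypothetical.

```python
from collections import Counter


def intersect_peaks(peaks_a, peaks_b):
    """Return the overlapping segments between two peak lists.

    Each peak is (chrom, start, end) with a half-open [start, end) interval,
    as in BED files. A quadratic scan is used for clarity only.
    """
    shared = set()
    for chrom_a, start_a, end_a in peaks_a:
        for chrom_b, start_b, end_b in peaks_b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # non-empty overlap
                shared.add((chrom_a, start, end))
    return sorted(shared)


def recurrent_peaks(peak_ids_by_sample, min_samples=5):
    """Keep peak identifiers detected in at least `min_samples` samples."""
    counts = Counter()
    for peak_ids in peak_ids_by_sample.values():
        counts.update(set(peak_ids))  # count each peak once per sample
    return {peak_id for peak_id, n in counts.items() if n >= min_samples}
```

For genome-scale data, a sorted sweep or an interval tree would replace the nested loop; the nested form is kept here only because it makes the overlap rule explicit.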
Analysis of RNA level and differential expression
For quantification of RNA level, RSEM60 was run with the following parameters: --paired-end --star. The R packages DESeq261 and edgeR62 were used for differential gene expression analysis between tumor and normal tissue samples. First, genes with an adjusted P value from DESeq2 < 0.1 were considered significantly differentially expressed. To reduce false positives, edgeR was further applied: the significantly differentially expressed genes obtained from DESeq2 were further filtered at an edgeR adjusted P value cutoff of 0.1.
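The two-stage filter above amounts to keeping the genes that pass the adjusted P value cutoff in both tools. A minimal sketch, assuming the adjusted P values have already been exported from DESeq2 and edgeR into dictionaries keyed by gene (the function name significant_in_both and this data layout are illustrative assumptions):

```python
def significant_in_both(deseq2_padj, edger_padj, cutoff=0.1):
    """Return genes whose adjusted P value is below `cutoff` in both tools.

    deseq2_padj, edger_padj: dicts mapping gene -> adjusted P value.
    Missing or undefined values (None) are treated as not significant.
    """
    def passes(p):
        return p is not None and p < cutoff

    return sorted(
        gene
        for gene, p in deseq2_padj.items()
        if passes(p) and passes(edger_padj.get(gene))
    )
```

Requiring agreement between two methods trades sensitivity for a lower false-positive rate, which matches the stated motivation.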
Analysis of m6A level and differential methylation
The relative m6A level for each m6A was quantified according to the procedure described by Schwartz et al.63 Briefly, multicov in bedtools was used to calculate the read coverage in m6A-IP and Input for each peak. RPKM (Reads Per Kilobase Million) method was then used to normalize the read coverage. The relative m6A level was obtained by calculating the ratio between IP RPKM value and Input RPKM value for each m6A. Following this procedure, we calculated the relative m6A levels for all the 17,996 identified m6As for all 98 samples. To obtain the aberrant m6A modifications in PDAC, we performed paired Wilcoxon rank-sum test on the quantitative difference in all m6A between the 33 paired tumor and normal samples, which resulted in 1108 hypermethylated m6As and 948 hypomethylated m6As at significance level of P < 0.05. To reduce the false positives, we performed FDR multiple testing correction. Finally, 195 hypermethylated- and 93 hypomethylated-m6As were obtained upon setting FDR = 0.1. For identification of differentially m6A methylated loci between two groups using RADAR30, region with an adjusted P value < 0.05 and |log2 fold change|> 0.5 was considered as differential m6A peaks.
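The quantification above can be illustrated with a short sketch: RPKM normalization of the IP and Input coverage, their ratio as the relative m6A level, and a false discovery rate adjustment of the per-site P values. The text does not name the FDR procedure, so the Benjamini-Hochberg method used here is an assumption, as are the function names and the choice to return NaN when the Input coverage is zero.

```python
def rpkm(read_count, peak_length_bp, total_mapped_reads):
    """Reads per kilobase of peak per million mapped reads."""
    return read_count / (peak_length_bp / 1e3) / (total_mapped_reads / 1e6)


def relative_m6a_level(ip_count, input_count, peak_length_bp,
                       ip_total_reads, input_total_reads):
    """Ratio of IP RPKM to Input RPKM for one peak in one sample."""
    ip = rpkm(ip_count, peak_length_bp, ip_total_reads)
    inp = rpkm(input_count, peak_length_bp, input_total_reads)
    if inp == 0:
        return float("nan")  # assumption: undefined without Input coverage
    return ip / inp


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Sites whose adjusted value falls at or below 0.1 would then correspond to the FDR threshold stated above.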
Analysis of the correlation between RBPs and m6A modification
The correlations between the levels of RBPs and m6As were calculated by both random forest and Spearman correlation analyses. The detailed procedures of random forest analysis were as follows: we set the RBPs and clinical factors (sex, age, smoking status, drinking status, tumor stage, differentiation, neural invasion, vascular invasion, and lymph node metastasis) as independent variables (X) and the m6As as dependent variables (Y), as shown in Eq. (1) and Eq. (2), where the n means the number of RBPs and the m means the number of samples.
Then we used the random forest algorithm to construct the regression model between the RBPs (X) and m6As (Y), yielding the contributions of all RBPs to each m6A from Eq. (3), where c is the value corresponding to the root node in the regression tree, k is the feature number in the regression route and contrib (x, k) represents the contribution of independent variable x to dependent variable y at the kth feature. For the regression model with multiple random forest trees, the contribution of each RBP was the average of the contributions from all the trees.
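As a reading aid, the decomposition described in words above is commonly written in the following form. This is a sketch under assumptions: the symbols K (number of features along the decision route), T (number of trees), and the hat and bar notation are introduced here for illustration and are not taken from the original equations.

```latex
\hat{y}(x) = c + \sum_{k=1}^{K} \mathrm{contrib}(x, k),
\qquad
\overline{\mathrm{contrib}}(x_j) = \frac{1}{T} \sum_{t=1}^{T} \mathrm{contrib}_t(x_j)
```

Here the first expression gives the prediction of a single regression tree as the root-node value plus the per-feature contributions along the route, and the second averages the contribution of one independent variable over all trees in the forest.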
Spearman correlation between each RBP and each m6A was calculated for hypermethylated, hypomethylated, and unchanged m6As, respectively. The RBP-m6A pairs with |correlation| > 0.25 and P < 0.05 were considered significant. We performed Fisher’s exact test to evaluate the significance of the differences in the number of these significantly correlated RBP-m6A pairs. The P values from Fisher’s exact test were corrected for multiple comparisons. The contribution score from random forest analysis and the FDR from Spearman correlation analysis were finally combined to evaluate the correlations between RBPs and m6As.
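The Spearman step can be sketched in plain Python as follows. This is an illustrative sketch only: the function names are hypothetical, the P value is assumed to come from a separate statistical routine, and the thresholds mirror those stated above.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(rbp_levels, m6a_levels):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(rbp_levels), average_ranks(m6a_levels))


def is_significant_pair(rho, p_value, min_abs_rho=0.25, alpha=0.05):
    """Apply the |correlation| > 0.25 and P < 0.05 criteria from the text."""
    return abs(rho) > min_abs_rho and p_value < alpha
```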
Identification and characterization of PDAC subtypes based on aberrant m6As
We used the R package ConsensusClusterPlus64 to perform consensus clustering of the aberrant m6As in 65 PDAC tumor samples. The number of bootstraps was 1,000 and the sub-sampling proportion was 0.8. We performed a two-sided Wilcoxon rank-sum test to identify m6As that differed significantly between S1 and S2 PDAC and used the R package clusterProfiler65 for pathway enrichment.
Methylated RNA immunoprecipitation-coupled quantitative real-time PCR (MeRIP qRT-PCR)
Total RNA isolated from each tissue was fragmented and immunoprecipitated with anti-m6A antibody as described above. Purified m6A-containing RNA was reverse transcribed and amplified18. The enrichment of m6A was quantified by quantitative PCR with the gene-specific primers shown in Supplementary Table 3.
Global RNA m6A quantification
Total RNA from cells was extracted using TRIzol as described above. PolyA+ RNA was purified using Dynabeads mRNA purification kit (Invitrogen). Global RNA m6A quantification in polyA+ RNA was conducted by m6A RNA Methylation Quantification Kit (Catalog # P-9005, EpiQuik ™). PolyA+ RNA (200 ng) of each sample was used for analysis performed in triplicate.
Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS)
RNA samples were digested with digestion buffer containing phosphodiesterase I (0.01 U), nuclease S1 (180 U), 1 mM zinc sulfate, 280 mM sodium chloride, and 30 mM sodium acetate at pH 6.8 for 4 h at 37 °C, and dephosphorylated with bacterial alkaline phosphatase (30 U) for 2 h at 37 °C. After enzyme removal, the nucleoside samples were subjected to LC-MS/MS and analyzed on a TripleTOF 6600 mass spectrometer (SCIEX, Framingham, MA, USA). Nucleosides were quantified using the nucleoside-to-base ion mass transitions of 268.1–136.1 for A, 245.1–113.0 for U, 244.1–112.1 for C, 284.1–152.1 for G, and 282.1–150.1 for m6A. Quantification was performed by comparison with the standard curves obtained from the corresponding nucleoside standards. The ratio of m6A to A was calculated from the measured concentrations.
Quantitative real-time PCR (qRT-PCR)
Total RNA from tissue and cell lines was extracted with TRIzol reagent. First-strand cDNA was synthesized using the PrimeScript 1st Strand cDNA Synthesis Kit (Takara). Relative RNA level determined by qRT-PCR was measured in triplicate on a Roche LightCycler 480 using the SYBR Green method66. Beta-ACTIN was employed as an internal control for mRNA quantification. The primer sequences are shown in Supplementary Table 3. All experiments were performed in three biological replicates.
Cell lines and cell culture
Human PDAC cell lines PANC-1 and SW1990 and embryonic kidney 293T cells were purchased from the Cell Bank of Type Culture Collection of the Chinese Academy of Sciences Shanghai Institute of Biochemistry and Cell Biology. All cell lines were authenticated by DNA fingerprinting analysis and tested to be free of mycoplasma infection. PANC-1 and 293T cells were maintained in DMEM medium, while SW1990 cells were maintained in RPMI-1640 medium; both media were supplemented with 10% fetal bovine serum. All cell lines were grown without antibiotics in an atmosphere of 5% CO2 and 99% relative humidity at 37 °C.
Plasmid, RNA interference, and stable cell line generation
The pLKD vectors containing short hairpin RNA (shRNA) sequences targeting CSTF2, and the plenti-CSTF2-puro and pcDNA3.1-IGF2BP2 plasmids, were commercially constructed. The shCSTF2-resistant WT (CSTF2-res) construct was generated by introducing point mutations. Small interfering RNAs (siRNAs) targeting METTL3, IGF2BP2, U2AF2, CAPRIN1, BUD13, CENPF, WNT7B, or NTSR1, and a scramble knockdown control (KD control), were purchased from GenePharma. Transfection with siRNA or plasmid was performed with Lipofectamine 2000 (Life Technologies). Lentivirus was produced in 293T cells by cotransfection of the pLKD constructs along with the psPAX2 and pMD2.G vectors, and the virus-containing media were subsequently collected for lentiviral infection. At 48 h after transduction, cells were harvested (RNAi) or subjected to puromycin selection (2 μg/ml). RNA knockdown sequences are listed in Supplementary Table 4. The PspCas13b-ALKBH5 (dCas13b-ALKBH5) plasmid, gRNA plasmid, and nontargeting gRNA plasmid were kind gifts from Dr. Hongsheng Wang (Sun Yat-sen University, Guangzhou). Specific demethylation of m6A on target RNAs was achieved by cotransfection of dCas13b-ALKBH5 and the corresponding gRNA plasmid. The sequence of the gRNA is listed in Supplementary Table 4.
Western blot assays
Total protein extract from PDAC tissues or cells was prepared using a detergent-containing lysis buffer. For cytoplasmic and nuclear fractionation, lysates were obtained using the NE-PER Nuclear and Cytoplasmic Extraction Reagents (Thermo) following the manufacturer’s instructions. Protein samples (50 μg) were subjected to SDS-PAGE and transferred to PVDF membrane (Millipore). Antibodies against CSTF2 (ab200837), CSTF2T (ab138486), METTL3 (ab195352), METTL14 (ab252562), WTAP (ab195380), FTO (ab126605), ALKBH5 (ab195377), IGF2BP2 (ab128175), WNT7B (ab227607), RNA polymerase II C-terminal domain (CTD) Ser2 (ab193468) and β-ACTIN (ab8227) were from Abcam. Antibody against RNA polymerase II C-terminal domain (CTD) (#05-623) was from Millipore. Antibodies against U2AF2 (68166-1-Ig), CAPRIN1 (15112-1-AP), RBM15 (10587-1-AP), RBM15B (67506-1-Ig), Lamin B1 (12987-1-AP), GAPDH (60004-1-Ig), AFF4 (14662-1-AP) and CENPF (28568-1-AP) were from Proteintech. Antibodies against BUD13 (A303-321A-1) and AFF1 (A302-345A-1) were from Invitrogen, and antibody against NTSR1 (sc-374492) was from Santa Cruz Bio. The membrane was incubated overnight at 4 °C with primary antibody and visualized with a Phototope Horseradish Peroxidase Western Blot Detection kit (Thermo Fisher).
Analysis of cell malignant phenotypes
PANC-1 and SW1990 cells were seeded in 96-well plates (2000 cells per well) for culture. Cell viability was measured using Cell Counting Kit-8 (CCK-8, Dojindo) at 24, 48, 72, and 96 h, respectively. For real-time impedance measurement (Xcelligence)67, cells (4000) were seeded in E-plates and placed into the Real-Time Cell Analyzer (RTCA) station and incubated at 37 °C for 96 h, with impedance measured every 30 min. Cell index values were calculated by the apparatus software (RTCA software 2.0). For colony formation assays, 1000 cells were seeded in six-well plates and allowed to grow in complete growth medium until visible colonies formed (2 weeks). Colonies were fixed with methanol and stained with crystal violet. For migration assays, 5 × 10⁴ cells in 200 μl of serum-free medium were added into the upper chamber. For invasion assays, cells were added after coating filters with 30 μg of Matrigel (Corning). Medium (500 μl) with 20% FBS was used as a chemoattractant in the lower chamber. After 12 h of incubation in 5% CO2 at 37 °C, cells were fixed with methanol and stained with 0.5% crystal violet before measurement.
RNA stability assays
Cells with or without CSTF2 or IGF2BP2 knockdown were treated with actinomycin D at a final concentration of 2 μM for 20, 40 or 60 min before trypsinization and collection. Total RNA was then extracted with TRIzol reagent. Gene expression level was determined by RT-qPCR and the mRNA half-life time was calculated as previously described46.
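The half-life calculation itself follows a previously described procedure that is not reproduced here. As a minimal sketch, assuming simple first-order decay and transcript levels normalized to an untreated (0 min) sample, the half-life can be estimated from a log-linear fit. The function name and the inclusion of a 0 min time point are assumptions made for illustration.

```python
import math


def mrna_half_life(times_min, relative_levels):
    """Estimate mRNA half-life (min) assuming first-order decay.

    Fits ln(level) = ln(level_0) - k * t by ordinary least squares and
    returns ln(2) / k. Levels must be positive.
    """
    logs = [math.log(v) for v in relative_levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_log = sum(logs) / n
    slope = sum((t - mean_t) * (l - mean_log) for t, l in zip(times_min, logs)) / sum(
        (t - mean_t) ** 2 for t in times_min
    )
    decay_rate = -slope
    if decay_rate <= 0:
        raise ValueError("transcript level does not decay over the time course")
    return math.log(2) / decay_rate


# Hypothetical example: the level halves every 20 min after actinomycin D treatment.
example_half_life = mrna_half_life([0, 20, 40, 60], [1.0, 0.5, 0.25, 0.125])
```

A longer half-life in control cells than in knockdown cells would be consistent with a stabilizing effect of the depleted factor on the transcript.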
5-Ethynyluridine incorporation and quantification
5-Ethynyluridine (EU) incorporation was performed using the Cell-Light EU Apollo488 RNA Imaging Kit (RiboBio). Briefly, cells were incubated in complete culture medium containing 500 μM EU for 1 h, washed with PBS, and fixed. The cells were stained with 0.5 μg/ml 4’,6-diamidino-2-phenylindole (DAPI) for 5 min and mounted in anti-fade solution. Image stacks were obtained by fluorescence microscopy (Olympus). Nucleoplasm regions were identified based on DNA (DAPI) staining. The median of each cell’s mean nuclear signal intensity after background subtraction (the signals outside nuclei) was calculated and plotted using ImageJ software.
Protein co-immunoprecipitation assays
Cells grown in 15-cm dishes at 70−80% confluency were lysed with 500 μl of immunoprecipitation buffer. Proteins were immunoprecipitated from 500 μg of cell lysates with 5 μg of antibody against METTL3 (ab195352), RNA Polymerase II (CTD) (#05-623, Millipore), RNA polymerase II C-terminal domain (CTD) Ser2P (ab193468) or IgG. After applying a magnet, proteins associated with Protein A/G Magnetic Beads were washed three times and analyzed by western blotting.
Chromatin immunoprecipitation assays
Chromatin immunoprecipitation (ChIP) assays were performed using the EZ-Magna ChIP™ A/G Kit (17-10086, Millipore). In brief, after cross-linking with 1% formaldehyde, cells were lysed and sonicated on ice to generate DNA fragments with an average length of 200−500 bp. Pre-cleared DNA of each sample was saved as an input fraction. Fragmented DNA was then used for immunoprecipitation with 5 μg of ChIP-grade antibody against AFF1, AFF4, or IgG as control. Bound DNA was eluted and purified, followed by qRT-PCR using the primers shown in Supplementary Table 3.
Cross-linking-immunoprecipitation (CLIP)
CLIP was performed as previously reported18 with some modifications. Briefly, whole-cell lysate from cross-linked (twice with 150 mJ per cm² of 365 nm UV light) PANC-1 cells was isolated and sonicated, followed by treatment with DNase I (0.5 U/μl, 37 °C for 5 min) and RNase T1 (0.2 U/μl, 22 °C for 15 min). Pre-washed Dynabeads protein A/G (Millipore) conjugated with 10 μg of antibody against CSTF2, METTL3, or IGF2BP2 were then incubated with the lysate at 4 °C overnight with rotation. After extensive washing of the beads, end repair was performed using T4 PNK (NEB). RNA was then recovered by proteinase K treatment (37 °C for 30 min), acidic phenol/chloroform extraction, and ethanol precipitation, and was subsequently used for library construction with the NEBNext small RNA library prep kit (E7330S) and sequenced on an Illumina HiSeq 4000. For CLIP-qPCR, the input and immunoprecipitated RNA samples were recovered as described above. cDNA was synthesized with SuperScript III RT (Invitrogen) and random hexamer primers (Invitrogen) and subjected to qRT-PCR using the specific primers shown in Supplementary Table 3.
miCLIP sequencing
m6A individual-nucleotide-resolution cross-linking and immunoprecipitation (miCLIP) sequencing was performed as previously reported18. In brief, total RNA from PANC-1 cells was digested with DNase I and subjected to two rounds of RiboMinus treatment to eliminate rRNAs. Ribo-off RNA (20 μg) was then fragmented and incubated with 10 μg of anti-m6A antibody (Synaptic Systems, 202003) in IP buffer supplemented with 0.2 U/μl RNase inhibitor (NEB) for 2 h at 4 °C. The RNA-antibody mixture was cross-linked and incubated with 100 μl of pre-washed protein A/G beads (Millipore) overnight at 4 °C with rotation. The beads were extensively washed, and end repair was performed using T4 PNK. After recovery by proteinase K treatment, acidic phenol/chloroform extraction, and ethanol precipitation, RNA was used for library construction with the NEBNext small RNA library prep kit (E7330S) and sequenced on an Illumina HiSeq 4000.
Analysis of iCLIP-sequencing data
Read preprocessing was performed essentially as described68. Adaptors and low-quality bases were trimmed by Cutadapt (v1.16) and reads shorter than 20 nucleotides were discarded. Reads were demultiplexed based on their experimental barcode using the pyBarcodeFilter.py script of the pyCRAC tool suite. Sequence-based removal of PCR duplicates was then performed with the pyFastqDuplicateRemover.py script. The reverse reads were reverse complemented and processed in the same way as their forward counterparts. Reads were then mapped to the human genome (hg38) with BWA (v0.7.15), with the parameters bwa aln -n 0.06 -q 20 as recommended by the online CTK Documentation (see URLs). We detected cross-linking-induced mutation sites (CIMS) in iCLIP data of m6A, CSTF2, and using CLIP Tool Kit (CTK). To identify the m6A locus, the mutation calling mode was used69. For each mutation position, the coverage of unique tags (k) and the number of mutations (m) were determined by the CIMS.pl script of CLIP Tool Kit. First, known SNPs (dbSNP 147) were removed from all the mutation positions. Then, C > T mutation positions with m/k ≤ 50% that were located at the +1 position of adenosines were identified as CIMS-based m6A residues.
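The final filtering step can be sketched as follows. This is an illustrative sketch, not the CTK implementation: the record fields (`chrom`, `pos`, `ref`, `alt`, `k`, `m`, `prev_base`) are hypothetical names, and bases are assumed to be already oriented to the transcript strand.

```python
def is_cims_m6a_candidate(site, known_snps, max_ratio=0.5):
    """Apply the SNP, C>T, m/k and +1-of-adenosine filters to one mutation site."""
    if (site["chrom"], site["pos"]) in known_snps:
        return False  # known SNP position (dbSNP in the text)
    if site["ref"] != "C" or site["alt"] != "T":
        return False  # only C > T transitions are considered
    if site["k"] <= 0 or site["m"] / site["k"] > max_ratio:
        return False  # mutation frequency m/k must not exceed 50%
    # The mutated C must sit at +1 of an adenosine, i.e. the preceding base is A.
    return site["prev_base"] == "A"


def filter_cims_m6a(sites, known_snps, max_ratio=0.5):
    """Return the mutation sites that pass all filters."""
    return [s for s in sites if is_cims_m6a_candidate(s, known_snps, max_ratio)]


# Hypothetical records for illustration only.
example_sites = [
    {"chrom": "chr1", "pos": 100, "ref": "C", "alt": "T", "k": 10, "m": 3, "prev_base": "A"},
    {"chrom": "chr1", "pos": 200, "ref": "C", "alt": "T", "k": 10, "m": 8, "prev_base": "A"},
    {"chrom": "chr1", "pos": 300, "ref": "C", "alt": "T", "k": 10, "m": 3, "prev_base": "G"},
    {"chrom": "chr2", "pos": 400, "ref": "C", "alt": "T", "k": 10, "m": 3, "prev_base": "A"},
]
example_snps = {("chr2", 400)}
example_kept = filter_cims_m6a(example_sites, example_snps)
```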
CUT&Tag assays
CUT&Tag assays were carried out as previously described, with some modifications70. Briefly, 1 × 10⁵ cells were treated with 10 μl of Concanavalin A-coated magnetic beads (Bangs Laboratories) for 10 min. Bead-bound cells were then resuspended in dig wash buffer (20 mM HEPES pH 7.5; 150 mM NaCl; 0.5 mM Spermidine; 1× Protease inhibitor cocktail; 0.05% Digitonin; 2 mM EDTA) with a 1:50 dilution of antibody against CSTF2 (ab200837), RNA Polymerase II (CTD) (#05-623), RNA Polymerase II (CTD Ser2P) (#61083), H3K36me3 (ab9050), H3K79me2 (ab3594), RNA Polymerase II (CTD Ser5P) (MA1-46093) or IgG and incubated at 4 °C overnight. After removal of the primary antibody by extensive washing, cells were incubated with secondary antibody (1:100) for 1 h and then incubated with pA-Tn5 adapter complex for 1 h. After washing with Dig-med buffer, cells were resuspended in Tagmentation buffer (10 mM MgCl2 in Dig-med Buffer) and incubated at 37 °C for 1 h. DNA products were purified using phenol-chloroform-isoamyl alcohol extraction and ethanol precipitation. Sequencing libraries were prepared according to the manufacturer’s instructions and cleaned up using XP beads (Beckman Coulter). Sequencing was performed on the Illumina NovaSeq 6000 using PE150.
Analysis of CUT&Tag sequencing data
Raw sequencing reads were examined using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Adaptor and low-quality bases were removed using fastp71. Qualified reads were aligned to the hg38 human genome using Bowtie272 with options: -p 6 --local --very-sensitive-local --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700. MACS222 was used for peak-calling with parameters: macs2 callpeak -t input_file -p 1e-5 -f BAMPE -n out_name. The annotatePeaks.pl script from the Homer software suite73 was used for annotation. Visualization of the depositions along genomic regions was performed with IGV74. Read counts were normalized by RPKM, which was computed in each 10-base pair bin among defined regions and then used for generating profile plots using deepTools75.
DRB/GRO-seq
Cells at 80%–90% of confluence in 15 cm dish were initially treated with DRB for 3.5 h and samples from time points 10 and 25 min after release into the fresh medium were processed. Transcription-competent nuclei were prepared using the Nuclei Isolation Kit according to the manufacturer’s recommendations (Sigma). Nuclear Run-On reactions were carried out with Br-UTP as described76, and Br-UTP run-on labeled RNA was isolated using beads coupled with Br-UTP-specific antibody (sc-32323AC, Santa Cruz). The purified RNA was used for the preparation of strand-specific RNA libraries using standard Illumina protocols and sequenced on NextSeq CN500 using SE75.
GRO-seq data analysis and transcription elongation rate calculation
GRO-seq reads were aligned to the hg38 reference genome using Bowtie2 with standard parameters. To analyze the transcription elongation rate, we calculated the base pair level coverage of the region 10 kb upstream to 120 kb downstream of each transcript’s TSS. Average transcript profiles were generated by taking a trimmed mean (0.01) of read depth over each base pair. The normalized read depth was smoothed using the smooth.spline function from R’s stats package (spar = 0.8). We then calculated the wave peak for each gene as the maximum point on the spline and removed any genes that were lowly expressed, had missing values, had duplicate values, or whose wave did not advance with time. Only genes with a wave peak beyond the first 1 kb in the 25 min sample were retained. A linear model was then fitted to the wave peak positions as a function of time to determine the rate of elongation in kb/min. The significance of the increase in elongation rate in the KD sample relative to WT was assessed using the Kolmogorov–Smirnov test.
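The rate estimate at the end of this procedure can be sketched in Python, assuming the wave peak position of each gene (in bp from the TSS) has already been extracted from the smoothed profile at each time point after DRB release. The function names and the example values are hypothetical.

```python
def keep_gene(peak_10min_bp, peak_25min_bp, min_peak_25min_bp=1000):
    """Keep genes whose wave advances with time and passes 1 kb at 25 min."""
    return peak_25min_bp > peak_10min_bp and peak_25min_bp > min_peak_25min_bp


def elongation_rate_kb_per_min(times_min, peak_positions_bp):
    """Least-squares slope of wave peak position versus time, in kb/min."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_p = sum(peak_positions_bp) / n
    slope_bp_per_min = sum(
        (t - mean_t) * (p - mean_p) for t, p in zip(times_min, peak_positions_bp)
    ) / sum((t - mean_t) ** 2 for t in times_min)
    return slope_bp_per_min / 1000.0


# Hypothetical gene: wave peak at 20 kb (10 min) and 65 kb (25 min).
example_rate = elongation_rate_kb_per_min([10, 25], [20000, 65000])
```

With only two time points the fitted slope reduces to the distance travelled by the wave divided by the elapsed time; the least-squares form is kept so that additional time points could be used unchanged.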
Measurement of the Pol II elongation rate
Measurement of the Pol II elongation rate was conducted as previously described45. PDAC cells were seeded overnight on 6-cm dishes to 70%−80% confluency before treatment with 300 μM 5,6-Dichlorobenzimidazole 1-β-d-ribofuranoside (DRB; Sigma) in culture medium for 5 h. Cells were washed twice with PBS and incubated in fresh culture medium for various time periods. Total RNA was isolated from cells and reverse transcribed with random hexamer primers. Analysis of pre-mRNAs was accomplished by qRT-PCR with amplicons spanning the intron-exon junctions. The primer sequences are shown in Supplementary Table 3.
Animal experiments
Female BALB/c nude mice aged 4–5 weeks were purchased from the Beijing Vital River Laboratory Animal Technology. Two million PDAC cells suspended in 100 μl PBS were injected subcutaneously into the back flank of mice (five in each group). Tumor volume was measured and calculated according to the formula volume = length × width² × 0.5. The sample size was not predetermined for these experiments. For the metastasis model, 0.1 ml of cell suspension containing 2 × 10⁶ luciferase-labeled cells was injected into tail veins. The metastases were detected using the Living Image® software (Perkin Elmer) after intraperitoneal injection of luciferin (Promega) before quantifying fluorescence. All experimenters were blinded to which cells were injected into the mice. All the mice were observed daily for signs of end-point criteria. Mice were euthanized immediately once they showed signs of cachexia, loss of >20% of initial weight, breathing difficulties, or tumors close to 15 mm in diameter. No tumors exceeded this limit. All the animal experiments were approved by the Institutional Animal Care and Use Committee of Sun Yat-sen University Cancer Center, and the animals were handled in accordance with institutional guidelines.
Statistics and reproducibility
We used Chi-square test or Fisher’s exact test to examine the difference between two categorical variables and Wilcoxon rank-sum test to examine the difference between a continuous variable and a binary categorical variable. Spearman’s rank correlation coefficient was used to measure the correlation between two continuous variables and r > 0.25 and P < 0.05 was considered significant. Student’s t test was used to examine the difference between the two means. PFS and OS were estimated by the Kaplan–Meier method and the differences were examined by the log-rank test. Hazard ratios (HRs) and their 95% confidence intervals (CI) were calculated with the Cox proportional hazards model. All statistical tests were two-sided tests and P < 0.05 was considered significant unless indicated. R 3.6.1 (https://www.r-project.org/) was used in our data analysis. Western blots were repeated independently three times with similar results, and representative images were shown.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Public CLIP-seq data of m6A and CSTF2 are accessible under GEO numbers GSE147440 and GSE37398. The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive in BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences [http://bigd.big.ac.cn/] under restricted access: HRA000095, HRA001663, HRA003601, and HRA004744. Researchers can register and log in to the GSA database website [https://ngdc.cncb.ac.cn/gsa-human/] and follow the “Request Data” guidance to request the data step by step [https://ngdc.cncb.ac.cn/gsa-human/document/GSA-Human_Request_Guide_for_Users_us.pdf] and/or contact zuozhx@sysucc.org.cn or zhangjial@sysucc.org.cn. All requests will be reviewed by the corresponding authors and the SYSUCC institutional review board. The approximate response time for access requests is two weeks. Access can be obtained for scientific research and not-for-profit use only. Once access has been granted, the data will be available to download for two months. The remaining data supporting the findings of this study are available within the Article, Supplementary Information, or Source Data file. Source data are provided with this paper.
Code availability
Custom codes developed for data preprocessing, analysis and visualization are available at GitHub: https://github.com/canceromics/CSTF2_m6Asubtype_code.
References
Rahib, L. et al. Projecting cancer incidence and deaths to 2030: the unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Res. 74, 2913–2921 (2014).
Ryan, D. P., Hong, T. S. & Bardeesy, N. Pancreatic adenocarcinoma. N. Engl. J. Med. 371, 1039–1049 (2014).
Wolfgang, C. L. et al. Recent progress in pancreatic cancer. CA Cancer J. Clin. 63, 318–348 (2013).
Nevala-Plagemann, C., Hidalgo, M. & Garrido-Laguna, I. From state-of-the-art treatments to novel therapies for advanced-stage pancreatic cancer. Nat. Rev. Clin. Oncol. 17, 108–123 (2020).
Collisson, E. A., Bailey, P., Chang, D. K. & Biankin, A. V. Molecular subtypes of pancreatic cancer. Nat. Rev. Gastroenterol. Hepatol. 16, 207–220 (2019).
Cancer Genome Atlas Research Network. Integrated genomic characterization of pancreatic ductal adenocarcinoma. Cancer Cell 32, 185–203.e13 (2017).
Waddell, N. et al. Whole genomes redefine the mutational landscape of pancreatic cancer. Nature 518, 495–501 (2015).
Bailey, P. et al. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature 531, 47–52 (2016).
Collisson, E. A. et al. Subtypes of pancreatic ductal adenocarcinoma and their differing responses to therapy. Nat. Med. 17, 500–503 (2011).
Moffitt, R. A. et al. Virtual microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nat. Genet. 47, 1168–1178 (2015).
Law, H. C. et al. The proteomic landscape of pancreatic ductal adenocarcinoma liver metastases identifies molecular subtypes and associations with clinical response. Clin. Cancer Res. 26, 1065–1076 (2020).
Chen, T. et al. m6A RNA methylation is regulated by microRNAs and promotes reprogramming to pluripotency. Cell Stem Cell 16, 289–301 (2015).
Batista, P. J. et al. m6A RNA modification controls cell fate transition in mammalian embryonic stem cells. Cell Stem Cell 15, 707–719 (2014).
Geula, S. et al. Stem cells. m6A mRNA methylation facilitates resolution of naive pluripotency toward differentiation. Science 347, 1002–1006 (2015).
Fustin, J. M. et al. RNA-methylation-dependent RNA processing controls the speed of the circadian clock. Cell 155, 793–806 (2013).
Zhao, X. et al. FTO-dependent demethylation of N6-methyladenosine regulates mRNA splicing and is required for adipogenesis. Cell Res. 24, 1403–1419 (2014).
Su, R. et al. R-2HG exhibits anti-tumor activity by targeting FTO/m6A/MYC/CEBPA signaling. Cell 172, 90–105 e23 (2018).
Zhang, J. et al. Excessive miR-25-3p maturation via N6-methyladenosine stimulated by cigarette smoke promotes pancreatic cancer progression. Nat. Commun. 10, 1858 (2019).
Li, Z. et al. FTO plays an oncogenic role in acute myeloid leukemia as a N6-methyladenosine RNA demethylase. Cancer Cell 31, 127–141 (2017).
Zhang, C. et al. Hypoxia induces the breast cancer stem cell phenotype by HIF-dependent and ALKBH5-mediated m6A-demethylation of NANOG mRNA. Proc. Natl Acad. Sci. USA 113, E2047–E2056 (2016).
Lin, S., Choe, J., Du, P., Triboulet, R. & Gregory, R. I. The m6A methyltransferase METTL3 promotes translation in human cancer cells. Mol. Cell 62, 335–345 (2016).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Cui, X., Meng, J., Zhang, S., Chen, Y. & Huang, Y. A novel algorithm for calling mRNA m6A peaks by modeling biological variances in MeRIP-seq data. Bioinformatics 32, i378–i385 (2016).
Tan, B. et al. Viral and cellular N6-methyladenosine and N6,2’-O-dimethyladenosine epitranscriptomes in the KSHV life cycle. Nat. Microbiol. 3, 108–120 (2018).
Xuan, J. J. et al. RMBase v2.0: deciphering the map of RNA modifications from epitranscriptome sequencing data. Nucleic Acids Res. 46, D327–D334 (2018).
Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201–206 (2012).
Xiong, F. et al. RNA m6A modification orchestrates a LINE-1-host interaction that facilitates retrotransposition and contributes to long gene vulnerability. Cell Res. 31, 861–885 (2021).
Meyer, K. D. et al. Comprehensive analysis of mRNA methylation reveals enrichment in 3’ UTRs and near stop codons. Cell 149, 1635–1646 (2012).
Huang, H. et al. Histone H3 trimethylation at lysine 36 guides m6A RNA modification co-transcriptionally. Nature 567, 414–419 (2019).
Zhang, Z. et al. RADAR: differential analysis of MeRIP-seq data with a random effect model. Genome Biol. 20, 294 (2019).
Chen, H. et al. Centromere protein F is identified as a novel therapeutic target by genomics profile and contributing to the progression of pancreatic cancer. Genomics 113, 1087–1095 (2021).
Li, X. et al. Apoptosis-induced translocation of centromere protein F in its corresponding autoantibody production in hepatocellular carcinoma. Oncoimmunology 10, 1992104 (2021).
Song, S. et al. IDH2 contributes to tumorigenesis and poor prognosis by regulating m6A RNA methylation in multiple myeloma. Oncogene 40, 5393–5402 (2021).
Sun, Y. et al. FZD5 contributes to TNBC proliferation, DNA damage repair and stemness. Cell Death Dis. 11, 1060 (2020).
Arensman, M. D. et al. WNT7B mediates autocrine Wnt/beta-catenin signaling and anchorage-independent growth in pancreatic adenocarcinoma. Oncogene 33, 899–908 (2014).
Wu, Z., Martinez-Fong, D., Tredaniel, J. & Forgez, P. Neurotensin and its high affinity receptor 1 as a potential pharmacological target in cancer therapy. Front. Endocrinol. (Lausanne) 3, 184 (2012).
Takahashi, K. et al. Neurotensin receptor 1 signaling promotes pancreatic cancer progression. Mol. Oncol. 15, 151–166 (2021).
Zhu, Y. et al. POSTAR2: deciphering the post-transcriptional regulatory logics. Nucleic Acids Res. 47, D203–D211 (2019).
Yao, C. et al. Transcriptome-wide analyses of CstF64-RNA interactions in global regulation of mRNA alternative polyadenylation. Proc. Natl Acad. Sci. USA 109, 18773–18778 (2012).
Yao, C. et al. Overlapping and distinct functions of CstF64 and CstF64tau in mammalian mRNA 3’ processing. RNA 19, 1781–1790 (2013).
McCracken, S. et al. The C-terminal domain of RNA polymerase II couples mRNA processing to transcription. Nature 385, 357–361 (1997).
Nag, A., Narsinh, K. & Martinson, H. G. The poly(A)-dependent transcriptional pause is mediated by CPSF acting on the body of the polymerase. Nat. Struct. Mol. Biol. 14, 662–669 (2007).
Slobodin, B. et al. Transcription impacts the efficiency of mRNA translation via co-transcriptional N6-adenosine methylation. Cell 169, 326–337 e12 (2017).
Chen, X. et al. CSTF2-induced shortening of the RAC1 3’UTR promotes the pathogenesis of urothelial carcinoma of the bladder. Cancer Res. 78, 5848–5862 (2018).
Saponaro, M. et al. RECQL5 controls transcript elongation and suppresses genome instability associated with transcription stress. Cell 157, 1037–1049 (2014).
Huang, H. et al. Recognition of RNA N6-methyladenosine by IGF2BP proteins enhances mRNA stability and translation. Nat. Cell Biol. 20, 285–295 (2018).
Deng, X. et al. RNA N6-methyladenosine modification in cancers: current status and perspectives. Cell Res. 28, 507–517 (2018).
Li, M., Zha, X. & Wang, S. The role of N6-methyladenosine mRNA in the tumor microenvironment. Biochim. Biophys. Acta Rev. Cancer 1875, 188522 (2021).
Huang, H., Weng, H. & Chen, J. m6A modification in coding and non-coding RNAs: roles and therapeutic implications in cancer. Cancer Cell 37, 270–288 (2020).
Lomberk, G. et al. Distinct epigenetic landscapes underlie the pathobiology of pancreatic cancer subtypes. Nat. Commun. 9, 1978 (2018).
Eyres, M. et al. TET2 drives 5hmc marking of GATA6 and epigenetically defines pancreatic ductal adenocarcinoma transcriptional subtypes. Gastroenterology 161, 653–668 e16 (2021).
Murthy, K. G. & Manley, J. L. The 160-kD subunit of human cleavage-polyadenylation specificity factor coordinates pre-mRNA 3’-end formation. Genes Dev. 9, 2672–2683 (1995).
Nojima, T. et al. Mammalian NET-seq reveals genome-wide nascent transcription coupled to RNA processing. Cell 161, 526–540 (2015).
Sobin, L. H. & Compton, C. C. TNM seventh edition: what’s new, what’s changed: communication from the International Union Against Cancer and the American Joint Committee on Cancer. Cancer 116, 5336–5339 (2010).
Zhang, C. et al. m6A modulates haematopoietic stem and progenitor cell specification. Nature 549, 273–276 (2017).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).
Bailey, T. L., Williams, N., Misleh, C. & Li, W. W. MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic Acids Res. 34, W369–W373 (2006).
Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Schwartz, S. et al. Perturbation of m6A writers reveals two distinct classes of mRNA methylation at internal and 5’ sites. Cell Rep. 8, 284–296 (2014).
Wilkerson, M. D. & Hayes, D. N. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 26, 1572–1573 (2010).
Yu, G., Wang, L. G., Han, Y. & He, Q. Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).
Schneeberger, C., Speiser, P., Kury, F. & Zeillinger, R. Quantitative detection of reverse transcriptase-PCR products by means of a novel and sensitive DNA stain. PCR Methods Appl. 4, 234–238 (1995).
Zimmermann, G. et al. Small molecule inhibition of the KRAS-PDEdelta interaction impairs oncogenic KRAS signalling. Nature 497, 638–642 (2013).
Moore, M. J. et al. Mapping Argonaute and conventional RNA-binding protein interactions with RNA at single-nucleotide resolution using HITS-CLIP and CIMS analysis. Nat. Protoc. 9, 263–293 (2014).
Linder, B. et al. Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nat. Methods 12, 767–772 (2015).
Kaya-Okur, H. S. et al. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat. Commun. 10, 1930 (2019).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
Thorvaldsdottir, H., Robinson, J. T. & Mesirov, J. P. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinform. 14, 178–192 (2013).
Ramirez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848 (2008).
Acknowledgements
This study was supported by the National Key R&D Program of China (2021YFA1302100), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2017ZT07S096 to D.L.), the National Natural Science Foundation of China (82072617 to J. Zheng and 82003162 to J. Zhang), the Guangdong Basic and Applied Basic Research Foundation (2021B1515020108 to Z.Z.) and Sun Yat-sen University Intramural Funds (to D.L. and to J. Zheng).
Author information
Contributions
Z.Z., J. Zhang, J. Zheng, C.H. and D.L. conceived and designed the entire project. Z.Z. and J. Zhang designed and supervised the research. J. Zhang and R.B. prepared all tissue samples for high-throughput sequencing. Y.Z., H.Z. and J. Zhang performed m6A-seq, CUT&Tag-seq, DRB/GRO-seq, and m6A ELISA assays. Y.Z. and X.H. performed PAR-CLIP sequencing and RNA sequencing. R.B. and L. Zhuang performed qRT-PCR, western blot assays, immunofluorescence staining, and co-immunoprecipitation. S.Z. and J.S. performed cell proliferation, migration, and invasion assays. Y.Z., S.Z. and L. Zeng performed the animal model experiments. X.L., S.D., Y.Y. and X.B. performed statistical and bioinformatics analyses of high-throughput sequencing data. R.L. and S.D. analyzed public data. Z.Z. supervised all bioinformatics analyses. Q.Z., M.L. and R.C. were responsible for patient recruitment and clinical data collection. Z.Z., J. Zheng, J. Zhang, D.L. and C.H. prepared the manuscript, and all authors approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zheng, Y., Li, X., Deng, S. et al. CSTF2 mediated mRNA N6-methyladenosine modification drives pancreatic ductal adenocarcinoma m6A subtypes. Nat Commun 14, 6334 (2023). https://doi.org/10.1038/s41467-023-41861-y
DOI: https://doi.org/10.1038/s41467-023-41861-y