Introduction

Head and neck cancers are the seventh most commonly diagnosed group of malignancies worldwide [1], with oral squamous cell carcinoma (OSCC) having a high prevalence. According to the Global Cancer Observatory, 377,713 new cases of OSCC were reported globally in 2020, with Asia being a major contributor [2]. Developing on the mucosal epithelium of the oral cavity, OSCC has become the third most common cancer in India, accounting for over 30% of all cancers, with a poor five-year survival rate [3]. It arises more often in males, and its incidence increases with age [4]. Diagnosis of OSCC in its early stages is often challenging, leading to misdiagnosis and mistreatment, which in turn result in advanced-stage disease and high mortality. More promising diagnostic techniques and novel biomarkers are required for early diagnosis and improved prognosis.

Pre-mRNA splicing is a critical process in gene expression and demands meticulous precision to excise introns without introducing any aberrant addition or loss of nucleotides. In humans, approximately 95% of genes with multiple exons undergo alternative splicing, generating diverse mRNA and protein variants from a single gene locus [5]. Perturbations in splicing are frequently observed in various cancers and are associated with mutations in, and/or altered expression of, the splicing machinery and its components. Additionally, the tumor microenvironment may influence the expression of specific isoforms in a particular cell type.

Serine-arginine-rich splicing factors (SRSFs) are essential components of the spliceosome, a large macromolecular complex responsible for splicing. These factors not only ensure the efficiency and accuracy of constitutive splicing, but also intricately regulate alternative splicing of precursor mRNAs. These multifunctional factors are also involved in the export of mRNA from the nucleus to the cytoplasm to exert regulatory functions, nonsense-mediated decay, and mRNA translation [6]. Hence they serve as pivotal regulators of mRNA metabolism. The splicing factor family has twelve canonical, highly conserved members in humans, SRSF1–SRSF12. These members exhibit a similar structural organization and display partial functional redundancy [7]. These factors reside in organelles called nuclear speckles [8], which primarily function as a modification or assembly site providing active splicing factors to the site of transcription [9]. Studies in Drosophila, Caenorhabditis elegans, and mouse models, in which knockout of SRSFs resulted in embryonic lethality, reveal their importance in normal development [10, 11]. However, the contribution of SRSFs to disease when mis-regulated remains unclear. Numerous studies have established the association of dysregulated alternative splicing with various forms of cancer. Two pivotal splicing factors, SRSF1 and SRSF3, have consistently emerged as proto-oncoproteins in various reports [12, 13]. SRSF1 in particular is the most widely studied and well-established oncogenic splicing factor. It is implicated in many cancers, including oral, breast, lung and colon cancer, and is mostly overexpressed. It has been shown to interact with the long non-coding RNA LINC01296, promoting cellular proliferation, epithelial-mesenchymal transition, and cell migration in OSCC [14]. Likewise, aberrant expression of SRSF3 resulting in altered splicing patterns has been implicated in several cancers. This altered expression has been linked to pivotal cellular processes encompassing the cell cycle, apoptosis, cell proliferation, drug resistance, cell migration and invasion [15]. However, the role of other SRSF members remains inadequately documented at present.

Table 1 highlights the aberrations of the SRSF family in different cancers. Despite these studies, the precise role of the SRSF family in various cancerous conditions remains elusive. In the present study, we sought to address this gap by examining the expression levels of all SRSF family members across 23 cancerous cell lines of diverse origin using quantitative real-time PCR (qRT-PCR). Our findings revealed a notable divergence in the expression levels of SRSF family members in four distinct oral cancer cell lines. These results prompted us to examine transcript levels of SRSF family members in patients with oral cancer. Additionally, we correlated our patient sample data with the TCGA database for a comprehensive analysis.

Table 1 Dysregulation of SRSF splicing factor family in cancer

Materials and methods

Cell culture

Twenty-seven cell lines (23 cancerous, one transformed, and three non-transformed cell lines of different origins) were routinely maintained in their recommended growth medium (viz. DMEM/RPMI/DMEM-F12) supplemented with 10% FBS and 1% penicillin-streptomycin at 37 °C in a humidified incubator with 5% CO2. Initial passages of all the cell lines were used for the experiments. Cell line details are provided in Supplementary Table S1.

Study subjects

A total of sixty-six oral tissue biopsy samples including OSCC tissue biopsies (n = 40) and adjacent non tumor tissues (n = 26) were collected from All India Institute of Medical Sciences, New Delhi, from 2016 to 2018. Oral cancer patients and patients with erythroplakia or hyperplasia were included in this study. Patients with other associated malignancies, radiation therapy, any other chronic disease, malnutrition or pregnancy were excluded. All study participants provided written consent prior to beginning of the study.

RNA isolation and reverse transcription

Oral tissue biopsy samples were collected in RNA-later (#R0901, Sigma-Aldrich) and stored at -80 °C until further use. Samples were snap-frozen in liquid nitrogen and crushed into a fine powder, which was then processed using the Promega ReliaPrep™ RNA Miniprep System (#Z6110) according to the manufacturer's instructions. RNA was eluted in 10–15 µl of nuclease-free water. Total RNA from cell lines was isolated using RNAzol® RT reagent (R-4533, Sigma-Aldrich) according to the manufacturer's instructions. Briefly, cells were lysed in RNAzol, followed by the addition of nuclease-free water to precipitate DNA, proteins, polysaccharides, and other molecules. RNA was precipitated using 100% isopropanol, washed with 75% ethanol, and eluted in 30–50 µl of nuclease-free water. The integrity of the isolated RNA was checked by electrophoresis on a 0.8% agarose gel stained with ethidium bromide (28S:18S rRNA ratio close to 2:1). Total RNA isolated from tissue biopsies and cell lines was then quantified using a Qubit™ 3 fluorometer (Invitrogen), and 2000 ng of RNA was subsequently reverse-transcribed into cDNA using the Applied Biosystems High-Capacity cDNA Reverse Transcription Kit (#4368814). The cDNA was stored at -20 °C in RNase/DNase-free tubes until further processing.

Quantitative real-time PCR

mRNA expression of the SRSF splicing factor genes was measured by qRT-PCR on a QuantStudio 6 Flex system (Applied Biosystems). Briefly, 50 ng of cDNA was amplified in a 10 µl reaction according to the manufacturer's recommended cycling conditions using SYBR Green intercalating dye (Promega). Each biological sample was run in duplicate, and the relative mRNA expression levels were determined using the average delta Ct (ΔCt) and 2^(−ΔΔCt) methods. The Ct values were first normalized to RPLP0 as the internal control [16] and then to their respective controls. Primer sequences are listed in Supplementary Table S2.
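The normalization described above can be illustrated with a minimal Python sketch. It assumes duplicate Ct readings per sample and a single reference gene; the function names and all Ct values are hypothetical and are not taken from the study data.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Delta Ct: mean Ct of the target gene minus mean Ct of the reference gene."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(sample_dct, control_dct):
    """Relative expression by the 2^(-ddCt) method, where ddCt = sample dCt - control dCt."""
    ddct = sample_dct - control_dct
    return 2 ** (-ddct)


# Hypothetical duplicate Ct readings (target gene, then reference gene such as RPLP0).
tumor_dct = delta_ct([24.0, 24.2], [18.0, 18.2])    # about 6.0
control_dct = delta_ct([25.1, 24.9], [18.1, 17.9])  # about 7.0

fc = fold_change(tumor_dct, control_dct)
print(f"Fold change (tumor vs. control): {fc:.2f}")
```

Because a lower ΔCt corresponds to higher expression, a ΔΔCt of about −1 in this example translates to roughly a two-fold upregulation.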

Patient data analysis from TCGA database

Head and neck squamous cell carcinoma (HNSC) gene expression data were extracted from the GDC TCGA database (http://xena.ucsc.edu). The HNSC dataset included 612 samples, of which gene expression array data and clinical follow-up were available for 500 primary tumor tissues and 44 adjacent non-tumor tissues. Samples from cancers of the pharynx, larynx and tonsil were removed, and 319 samples from oral squamous cell carcinoma (OSCC) patients were retained for further analysis. Expression profiles of SRSF family members were examined in the OSCC cases.

Prognosis analysis

Kaplan-Meier analysis with the log-rank test was used to determine the association between the expression of the SRSF family and the overall survival (OS) of the OSCC patients from the TCGA database. The receiver operating characteristic (ROC) curve was used to assess the SRSF transcript levels (ΔCt) in OSCC samples.
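For readers unfamiliar with these methods, the sketch below shows one way the Kaplan-Meier estimate and a two-group log-rank test can be computed in plain Python. It is an illustrative implementation under stated assumptions, not the software used in this study: the function names are hypothetical, the follow-up times and event flags are invented, and the p-value uses the chi-square approximation with one degree of freedom.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time (event: 1 = death, 0 = censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value with 1 df)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Hypothetical follow-up (months) for high- and low-expression groups.
high_t, high_e = [5, 8, 12, 20], [1, 1, 1, 0]
low_t, low_e = [10, 15, 25, 30], [1, 0, 1, 0]

km_high = kaplan_meier(high_t, high_e)
chi2, p_value = logrank_test(high_t, high_e, low_t, low_e)
print(km_high)
print(f"log-rank chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

In practice, patients would be split into high- and low-expression groups by a chosen cut-off (for example the median expression) before the two survival curves are compared; the sketch does not handle the degenerate case in which no events occur.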

Statistical analysis

All experiments in the study were repeated three times. Statistical analysis was performed using SPSS (version 17.0; Chicago, Illinois) and GraphPad Prism 8.0 (GraphPad Software Inc., USA). GraphPad Prism was also used for data representation. The normality of each data set was examined using the Kolmogorov-Smirnov and Shapiro-Wilk tests, which showed that the data sets were not normally distributed. The Mann-Whitney U test was therefore used to assess differences between two groups. A p-value of < 0.05 was considered statistically significant, with significance levels denoted as < 0.05 (*), < 0.01 (**), < 0.001 (***) and < 0.0001 (****). Qualitative variables were expressed as proportions (%) and quantitative variables as mean ± standard deviation or mean ± standard error of the mean.
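As a worked illustration of the two-group comparison, the following Python sketch computes the Mann-Whitney U statistic with a two-sided p-value from the normal approximation. It is a simplified example: the function names and ΔCt values are hypothetical, no tie or continuity correction is applied, and statistical packages such as those named above may instead report an exact p-value for small samples.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation without corrections."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical delta-Ct values; lower delta-Ct means higher expression.
late_cancer = [5.9, 6.2, 5.5, 6.0]
controls = [7.1, 6.8, 7.4, 7.0]

u_stat, p_val = mann_whitney_u(late_cancer, controls)
print(f"U = {u_stat}, p = {p_val:.4f}")
```

A rank-based test is used here because, as noted above, the data were not normally distributed; the test compares the ordering of values rather than their means.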

Results and discussion

Expression profile of the SRSF family in cancerous cell lines of various origins

To compare the levels of all the SRSF family members (SRSF1-SRSF12) in different cancers, we examined their expression in 27 cell lines of different origin (23 cancerous, 1 transformed and 3 non-transformed cell lines). We observed that the average ΔCt values of the SRSF family were quite similar in the non-transformed cell lines (VH10, HaCaT, IHH) despite their different origin – foreskin fibroblast, skin keratinocyte and hepatocyte (Supplementary Fig. S1A). Hence these three non-transformed cell lines were used as controls to study the expression profile of the SRSF family members in all the cancerous cell lines.

The cancerous cell lines used in the study included oral cancer (n = 4), breast cancer (n = 4), glioblastoma (n = 2), cervical cancer (n = 3), gastric cancer (n = 2), leukemia (n = 2), pancreatic cancer (n = 2), lung cancer (n = 2), neuroblastoma (n = 1), colorectal carcinoma (n = 1) and human embryonic kidney transformed (n = 1). Details of all the cell lines under study are described in Supplementary Table S1. A scatter plot of ΔCt values of expression of SRSFs in all the cancerous cell lines of different origins is provided in Supplementary Fig. S1B-L.

Figure 1A describes the relative expression profile of the SRSF family (SRSF1 to SRSF12) in cancerous cell lines compared to control cell lines. In all oral cancer cell lines (SCC-4, UM-SCC-84, CAL33, SAS-H1), we observed a notable upregulation of SRSF3, SRSF9, and SRSF10, with a near-significant difference (p = 0.057) in average ΔCt values compared to control cell lines (Fig. 1B-D). Increased expression of SRSF3 was previously observed in OSCC patients and oral cancer cell lines [17], as well as in cervical, lung, colon, stomach, and breast cancer [18, 19], suggesting a role in tumor initiation, progression and maintenance. SRSF9 has also been implicated in cancers such as hepatocellular carcinoma [20], colorectal cancer [21], and cervical cancer [22]. The frequent upregulation of SRSF10 in oral cancer has been described earlier, establishing a connection to oncogenesis [23]. No significant difference was observed for other SRSF family members in oral cancer cell lines. These results suggest the involvement of SRSF3, SRSF9, and SRSF10 in oral cancer.

Fig. 1

Expression profile of SRSF family members in cancerous cell lines. (A) Heatmap of the fold change in expression in all 23 cancerous cell lines of different origins and a transformed cell line compared to the three non-transformed cell lines (used as controls). Colour gradation represents fold change; red indicates upregulation and blue indicates downregulation. Fold change was calculated by the 2^(−ΔΔCt) method. Differential gene expression (average ΔCt) of (B) SRSF3; (C) SRSF9; (D) SRSF10 in oral cancer cell lines compared to non-transformed cell lines, with a near-significant difference between them. ΔCt values are inversely related to expression. Data are presented as mean ± SD from three independent experiments

Interestingly, we observed ~2-fold upregulation of SRSF2 in almost all cancerous cell lines. Upregulation of SRSF2 has previously been documented in breast cancer [24], cervical cancer [25] and colorectal cancer [26] cell lines. Similarly, SRSF7 was upregulated (~1.8-fold and above) in almost all the cell lines, except UM-SCC-84 and Kasumi-1, where we observed a 1.5-fold downregulation. Elevated levels of SRSF7 have been previously reported in colon cancer and lung cancer patient samples as well as cell lines [27]. SRSF3 was likewise upregulated by 1.6-fold and above in almost all the cell lines except A549 and Kasumi-1, where a 1.5-fold downregulation was observed. A few cell lines, namely MDA-MB-453, MKN-45, MIA PaCa-2 and HEK293T, showed a significant upregulation of all the SRSFs at the transcript level. In contrast, Kasumi-1 and A549 showed a significant downregulation of all the SRSFs at the transcript level. We observed a lower expression of SRSF6 in lung cancer cell lines, whereas Cohen-Eliav et al. reported its overexpression in lung cancer patients, which may trigger abnormal proliferation [28].

HEK293T, an SV40 large T antigen-transformed cell line derived from embryonic kidney, showed a significant upregulation of all the splicing factors SRSF1 to SRSF12, ranging from 3- to 10-fold. SRSF expression is known to be higher during the developmental phase. Studies on the SRSFs highlight their importance in development and their ability to regulate embryonic pluripotent stem cells [29]. Conditional deletion of SRSFs in animal models results in severe developmental defects, indicating that these factors are important in normal development [30].

Notably, SRSF12 showed highly variable expression across diverse cancer cell lines, including SCC-4, SAS-H1, C-33 A, SH-SY5Y, and HEK293T (Supplementary Fig. S2). While its involvement in cancer remains incompletely characterised, an integrated analysis in soft tissue sarcoma indicated upregulation [31]. Additionally, a study on intravascular NK/T-cell lymphoma reported copy number loss of the SRSF12 gene [32]. The precise role of SRSF12 in cancer biology therefore warrants further dedicated study.

Expression profile of SRSF family in OSCC patients

The aberrant expression profile of SRSF family members in oral cancer cell lines was further investigated in primary tissue samples from OSCC patients. A total of forty patients were enrolled in the study and were divided into three categories based on their clinical diagnosis: pre-cancer (n = 15), early cancer (n = 11), and late cancer (n = 14). Adjacent non-tumor tissues (n = 26) were used as controls. The demographic characteristics of the study subjects are given in Table 2.

Table 2 Demographic characteristics of oral squamous cell carcinoma patients

Figure 2A describes the relative expression profile of the SRSF family (SRSF1 to SRSF12) in patients with pre-cancer, early cancer and late cancer stages of oral cancer compared to controls. A scatter plot of ΔCt values of all the SRSF family members (SRSF1-SRSF12) in control, pre-cancer, early cancer and late cancer samples in the present study is provided in Supplementary Fig. S3.

Fig. 2

Expression profile of SRSF family members in oral cancer samples. (A) Heatmap of the fold change in expression in the three patient groups (pre-cancer, early cancer and late cancer) compared to controls (adjacent non-tumor tissues) for different splicing factors. Colour gradation represents fold change; red indicates upregulation and blue indicates downregulation. Differential gene expression (average ΔCt) of (B) SRSF1; (C) SRSF3; (D) SRSF7; (E) SRSF9; (F) SRSF10; (G) SRSF11 observed in different oral cancer patient groups compared to controls. A significant difference was observed between the late cancer group and controls. Data are presented as mean ± SD from three independent experiments. Adjacent non-tumor tissues were used as controls

Lower transcript levels of SRSFs were observed in pre-cancer patients compared to controls, whereas expression was upregulated in the late cancer patient group relative to controls. The early cancer patient group showed a mixed expression profile of these factors. We observed > 2-fold upregulation of SRSF10 and SRSF11, and 1.6-fold upregulation of SRSF1, SRSF2, SRSF3, SRSF6 and SRSF9, in the late cancer patient group compared to controls. We also noted a significant difference in the expression levels (average ΔCt) of SRSF1, SRSF3, SRSF7, SRSF9, SRSF10 and SRSF11 in late cancer patient samples compared to controls (Fig. 2B-G). Similarly, > 1.6-fold upregulation of SRSF8, SRSF10, SRSF11 and SRSF12 was observed in the early cancer patient group compared to controls.

Interestingly, the levels of SRSF3, SRSF10 and SRSF11 increased from pre-cancer to early cancer to late cancer, indicating that these splicing factors may play a critical role in oral cancer progression (Fig. 2A). A near-significant difference (p = 0.058) in SRSF10 levels emerged in the late cancer patient group compared to the pre-cancer patient group (Fig. 2C). Further, SRSF3 and SRSF11 were upregulated by 1.7-fold in late cancer patients compared to pre-cancer patients (p = 0.058) (Supplementary Fig. S4). Similar findings were reported by a group investigating SRSF3 expression in OSCC, who observed upregulated expression in patients with higher-grade cancer, suggesting an association with cancer development [33]. A comprehensive study on alternative splicing signatures in OSCC revealed that higher expression of SRSF9 was related to poor outcome [34]. Another study showed that SRSF10 overexpression was associated with poor disease outcome and suggested that SRSF10 may act as an oncogenic driver [23]. These studies align with our data, suggesting that more than one SRSF is associated with the progression of OSCC.

There was no significant difference in transcript levels of SRSF2, SRSF4, SRSF5, SRSF6, SRSF8 and SRSF12 among the three patient groups (Supplementary Fig. S5). However, a group studying SRSF5 reported its overexpression in OSCC, proposing that it might be essential for cell proliferation and cancer development [35]. A study on HNSC reported a positive correlation between the expression of isoforms of the proangiogenic factor VEGF (VEGFAxxx and VEGFA165b) and that of SRSF1, SRSF5 and SRSF6, implicating these factors in alternative splicing of the VEGFA gene and linking them with tumor growth and metastasis [36]. The authors reported upregulation of SRSF1 and SRSF5 in pharyngeal tumors and downregulation of SRSF6 in advanced oral tumors.

We used ROC curves to evaluate the diagnostic performance of SRSF family members in different stages of oral cancer (Supplementary Fig. S6). The area under the curve (AUC) for SRSF10 was 0.82 in late-stage oral cancer (p = 0.0009), the highest among the SRSF family members. Other SRSF family members with statistically significant AUC values in the late cancer stage included SRSF11 (AUC = 0.77, p = 0.008), SRSF3 (AUC = 0.77, p = 0.006), SRSF9 (AUC = 0.76, p = 0.008), SRSF1 (AUC = 0.75, p = 0.01) and SRSF7 (AUC = 0.72, p = 0.02). Statistical significance was not observed in the pre-cancer and early cancer stages.
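To make the AUC values easier to interpret, the sketch below computes the area under the ROC curve as the probability that a randomly chosen patient sample scores higher than a randomly chosen control. It is a minimal illustration with invented ΔCt values and a hypothetical function name, and it assumes that the negated ΔCt is used as the score so that higher expression gives a higher score.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical delta-Ct values; lower delta-Ct means higher expression.
late_cancer_dct = [5.9, 6.2, 5.5, 7.2]
control_dct = [7.1, 6.8, 7.4, 7.0]

# Negate delta-Ct so that higher expression corresponds to a higher score.
auc = roc_auc([-v for v in late_cancer_dct], [-v for v in control_dct])
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to no discrimination between groups and 1.0 to perfect separation, so values such as those reported above indicate moderate to good discrimination in the late cancer stage.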

Validation on external database with TCGA-HNSC dataset

The findings from the oral cancer patient tissues in our study were correlated with the transcriptome data available in the TCGA-HNSC database. The mRNA expression of SRSF family members in OSCC tissue samples (n = 319) was compared with that in adjacent non-tumor tissues (n = 32). The OSCC samples were grouped into four categories, viz. Stage I (n = 11), II (n = 79), III (n = 68) and IV (n = 161), according to the AJCC TNM system.

The expression levels of SRSF1, SRSF4, SRSF9, SRSF10 and SRSF11 were significantly upregulated in stage II, III and IV (Fig. 3A, D, I, J, K). There was no difference in the expression level of SRSF members in stage I as compared to adjacent non tumor tissues.

Fig. 3

Box plots of RNA expression (RPKM) of SRSF family members in TCGA-HNSC data. RNA expression of (A) SRSF1; (B) SRSF2; (C) SRSF3; (D) SRSF4; (E) SRSF5; (F) SRSF6; (G) SRSF7; (H) SRSF8; (I) SRSF9; (J) SRSF10; (K) SRSF11; (L) SRSF12. The X-axis represents tissue normal (TN) and stages I-IV of oral cancer according to the AJCC TNM system. The Y-axis corresponds to the Reads Per Kilobase of transcript per Million mapped reads (RPKM) value from RNA-Seq data. The corresponding gene name is provided on the Y-axis

The diagnostic performance of the SRSF family was validated in an additional cohort taken from the TCGA HNSC dataset (Supplementary Fig. S7). AUC values for the genes with higher expression in stage IV were: SRSF4 (AUC = 0.71, p = 0.002), SRSF10 (AUC = 0.70, p = 0.0003), SRSF1 (AUC = 0.69, p = 0.0005), SRSF9 (AUC = 0.67, p = 0.003) and SRSF11 (AUC = 0.64, p = 0.02). Similarly, AUC values of the above-mentioned genes were above 0.65 in stages II and III as well.

Expression levels of SRSF5, SRSF8 and SRSF12 (Fig. 3E, H, L) were significantly downregulated in all clinical stages of cancer (stages I-IV). SRSF6 was also significantly downregulated, but only in the later stages (stages III and IV, Fig. 3F). No remarkable change in the expression of SRSF3 and SRSF7 was observed at any stage. AUC values for the genes with downregulated expression, i.e. SRSF5, SRSF6 and SRSF8, were statistically significant in all four cancer stages. The detailed information is summarized in Supplementary Fig. S7. The Kaplan-Meier survival analysis (Fig. 4) indicated that elevated expression of SRSF9 is associated with reduced overall survival, which was also observed by another study group in 2020 [34]. No significant difference in overall survival was observed for other members of the SRSF family.

Fig. 4

Kaplan-Meier survival analysis of oral cancer patients for SRSF genes using publicly available TCGA-HNSC dataset. KM survival plot for gene (A) SRSF1; (B) SRSF9; (C) SRSF10; (D) SRSF11. Blue line indicates low SRSF gene expression and Red line indicates a higher expression

Oral cancer is a major health concern in India with a multifaceted aetiology. New and improved treatment protocols along with novel biomarkers are required to address this problem. Although SRSFs have frequently been linked to cancer, the underlying mechanisms by which they contribute to disease remain largely unknown. It is challenging to separate their distinct roles from one another due to their functional redundancy in various cellular processes. Understanding their individual roles in RNA metabolism, transcription, translation, and splicing will clarify the networks underlying their functions. Single-cell studies of these proteins might help in understanding their cell- and tissue-type-specific roles and the molecular mechanisms resulting in carcinogenesis. This study is a comparative analysis examining the expression of all members of the serine-arginine-rich splicing factor family in cancerous cell lines of various origins and in oral carcinoma tissues. In addition, cancer stage-specific expression analysis of all 12 members of the SRSF family in OSCC patients and the TCGA database provided novel insight into the deregulated profiles of SRSF1, SRSF9, SRSF10 and SRSF11. Our study suggests that these factors play a critical role in oral cancer progression and may serve as potential predictive markers. Specifically, high expression of SRSF9 may be linked to a decrease in overall survival. However, further validation is warranted before serine/arginine-rich family members can be therapeutically targeted to modulate alternative splicing in oral cancer.

Conclusion

Our findings highlight the pivotal role of multiple SRSF family members in oral squamous cell carcinoma, emphasizing their significance as central players in the dysregulated splicing landscape linked to cancer progression.