Abstract
Genomic rearrangements and copy number variations (CNVs) are major regulators of clustered microRNA (miRNA) expression. Several clustered miRNAs are harbored in and around chromosome fragile sites (CFSs) and cancer-associated genomic hotspots. Aberrant expression of such clusters can lead to oncogenic or tumor suppressor activities. Here, we developed CmirC (Clustered miRNAs co-localized with CNVs), a comprehensive database of clustered miRNAs co-localized with CNV regions. The database consists of 481 clustered miRNAs co-localized with CNVs and their expression patterns in 35 cancer types of the TCGA. The portal also provides information on CFSs, miRNA cluster candidates, genomic coordinates, target gene networks, and gene functionality. The web portal is integrated with tools such as JBrowse, NCBI-BLAST, GeneSCF, visNetwork, and networkD3 to help researchers with data analysis, visualization, and browsing. This portal supports integrated data analytics and offers additional evidence for the complex regulation of clustered miRNAs in cancer. The web portal is freely accessible at http://slsdb.manipal.edu/cmirclust to explore clinically significant miRNAs.
Introduction
Cancer is one of the major non-communicable diseases with high incidence and mortality rates (Sung et al. 2021). Several factors, including genetic and epigenetic alterations, participate in carcinogenesis (You and Jones 2012). Although the human genome consists largely of non-coding regions, a large number of studies have focused on cancer-causing genomic alterations in protein-coding regions (Gloss and Dinger 2018). However, non-coding genomic regions can drive essential biological functions and control the expression of genes that are involved in several diseases, including cancer. Small non-coding miRNAs regulate gene expression by targeting mRNAs and have been shown to act as oncogenes or tumor suppressors under certain conditions (Peng and Croce 2016; Oh et al. 2017). Amplification or deletion of miRNA genes, their abnormal transcriptional regulation, and defects in the biogenesis pathway can alter miRNA expression profiles in cancer patients (Peng and Croce 2016).
miRBase catalogs miRNAs, and its latest version (v22.1) contains 2654 mature human miRNAs (Kozomara et al. 2019). Figure 1 illustrates the 481 miRNAs belonging to 159 clusters spanning all the chromosomes. Each cluster consists of more than one miRNA transcribed from physically adjacent positions driven by a single promoter region (Seitz et al. 2004). The cluster members show high sequence similarity in the seed region, and they often target the same gene or different genes belonging to a specific pathway. Hence, abnormal expression of clustered miRNAs could have a more severe effect than that of non-clustered miRNAs.
Aberrantly expressed clusters are regulated by genetic and epigenetic mechanisms such as mutations, deletions, amplifications, and DNA methylation (Kabekkodu et al. 2018). These in turn can alter mRNA translation, signal transduction pathways, metabolic flux, or protein function. Dysregulated expression of CNV-driven clustered miRNAs in carcinomas of the ovary (Zhang et al. 2008), gastrointestinal tract (An et al. 2013), lung (Xia et al. 2018), and urinary bladder (Ware et al. 2022) has been well documented. However, the interplay between clustered miRNAs and CNVs across all cancer types is largely unknown. Nearly 50% of the miRNA genes are co-localized with CFSs, which exhibit higher genomic instability and frequent copy number changes and can in turn affect the expression of clustered miRNAs (Calin et al. 2004; Sevignani et al. 2007; Kabekkodu et al. 2018).
In recent years, numerous databases such as miRCancer (Xie et al. 2013), miRwayDB (Das et al. 2018), TACCO (Chou et al. 2019), miRactDB (Tan et al. 2021), DriverDBv3 (Liu et al. 2020), and miR-TV (Pan and Lin 2020) have been released for public access to help identify and understand cancer drivers. Additionally, well-established computational approaches such as miRDriver (Bose and Bozdag 2019), CAMIRADA (Shamsizadeh et al. 2019), FCMDAP (Li et al. 2019), and PMAMCA (Ha et al. 2019) can determine the associations between microRNAs and cancers. However, a comprehensive resource for miRNA clusters co-localized with CNV regions and their expression analysis is unavailable. Hence, it is necessary to integrate genetic and epigenetic data associated with miRNA clusters for the critical evaluation of cancer progression and the development of therapeutics.
The present study integrated multi-omics datasets from the 35 TCGA cancer types and developed a user-friendly database, Clustered miRNAs co-localized with CNVs (CmirC). With advanced search and browse options, the portal allows users to explore and analyze the datasets of individual clustered miRNAs co-localized with CNVs, CFSs, and their regulation in different cancer types.
Materials and methods
The study consists of two major parts: (i) integrated analysis of CNV and clustered miRNA data from 35 cancer types, and (ii) development of a database for clustered miRNAs co-localized with CNVs. We used publicly available CNV, clustered miRNA, and RNASeq datasets, together with bioinformatics resources, for the integrated data analysis. Hypertext markup language (HTML), PHP: hypertext pre-processor (PHP), JavaScript, and the MySQL database system were used to develop this interactive database. The schematic representation of data collection, analysis, integration, and CmirC database development is illustrated in Fig. 2.
Data collection and sources
Currently, 159 miRNA clusters are reported from the human genome, and each cluster consists of two or more miRNAs. Primary information on the 159 miRNA clusters, consisting of 481 precursors and 717 mature miRNAs, was retrieved from the miRBase v22.1 database (http://www.mirbase.org/). The transcription start sites (TSS) for clustered miRNAs were retrieved from the FANTOM5 repository (Lizio et al. 2019), and the CFS data were downloaded from the HumCFS database (Kumar et al. 2019). Pre-computed segmented copy number aberration (SCNA) data for 35 cancer types were obtained from the Broad Institute’s FireBrowse portal (http://firebrowse.org/). Level 3 miRNA expression datasets from the TCGA-GDC portal (https://portal.gdc.cancer.gov/) were retrieved using the R package TCGAbiolinks (Colaprico et al. 2016). Potential miRNA target genes were identified and retrieved from the miRTarBase (Huang et al. 2020), DIANA-TarBase (Vlachos et al. 2015), and miRDB (Chen and Wang 2020) repositories. Only miRNA target genes reported by these three databases were considered for downstream analysis.
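The target gene filter can be illustrated with a minimal Python sketch. It reads the filter as requiring a gene to be reported by all three repositories, which is one interpretation of the criterion above. The function name `consensus_targets` and the toy miRNA and gene identifiers are hypothetical placeholders, not records from miRTarBase, DIANA-TarBase, or miRDB.

```python
def consensus_targets(*sources):
    """Return, per miRNA, the target genes reported by every source.

    Each source is a dict mapping a miRNA name to a set of gene symbols.
    """
    consensus = {}
    # Only miRNAs present in all sources can have a consensus target.
    shared_mirnas = set(sources[0]).intersection(*sources[1:])
    for mirna in shared_mirnas:
        genes = set(sources[0][mirna]).intersection(
            *(source[mirna] for source in sources[1:])
        )
        if genes:
            consensus[mirna] = genes
    return consensus


# Toy, made-up records standing in for the three repositories (assumption).
mirtarbase = {"hsa-miR-A": {"GENE1", "GENE2", "GENE3"}, "hsa-miR-B": {"GENE4"}}
tarbase = {"hsa-miR-A": {"GENE1", "GENE2"}, "hsa-miR-B": {"GENE5"}}
mirdb = {"hsa-miR-A": {"GENE2", "GENE6"}, "hsa-miR-B": {"GENE4"}}

result = consensus_targets(mirtarbase, tarbase, mirdb)
print(result)
```

In this toy input, only GENE2 is supported for hsa-miR-A by every source, and hsa-miR-B has no gene shared by all three, so it is dropped.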
Integrated data analysis
Recurrent CNVs (RCNVs) in the samples of 35 cancer types were analyzed using GAIA 3.10, an R/Bioconductor package. The probe metadata file at the Broad Institute’s data portal (ftp://ftp.broadinstitute.org/pub/GISTIC2.0/hg19_support/) was used to obtain information on RCNV cytobands and their locations. RCNVs were defined by a false discovery rate (FDR) Q score < 0.15 from 10 iterations. A segment mean of 0.3 was set as the threshold to identify copy number gain or loss: regions with a segment mean > 0.3 were considered copy number gains (amplifications), and regions with a segment mean ≤ −0.3 were considered losses (deletions). UCSC LiftOver (https://genome.ucsc.edu/cgi-bin/hgLiftOver), a genome coordinate conversion tool, was used to lift all SCNA genomic coordinates to the hg38 genome build. Further, BEDTools (Quinlan and Hall 2010) was used to intersect the genomic coordinates of miRNA clusters with the significant recurrent CNV regions. The intersect function of BEDTools was also used to map the coordinates of the 159 miRNA cluster regions onto fragile sites. The CRAN package “circlize” was used for the graphical representation of significant CNV and miRNA cluster co-localization (Gu et al. 2014).
Quantile filtering with a cut-off value of 0.25 was applied to exclude miRNAs with very low read counts. For the differential expression (DE) analysis, we used the TCGAanalyze_DEA function of TCGAbiolinks, which relies on functions of the edgeR package from Bioconductor (Robinson et al. 2010). The function “glmLRT” was employed to make pair-wise tests and DE analysis between the two groups. The p values obtained were sorted in ascending order and adjusted using the FDR correction to shortlist the top differentially expressed miRNAs. Thresholds for logarithmic fold change (Log2FC) and FDR were set as 1 and 0.1 respectively, such that differentially expressed miRNAs were considered to be significant only if Log2FC > 1 and FDR < 0.05.
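The gain/loss calling and the coordinate intersection can be sketched in plain Python. This is an illustration only, not the GAIA or BEDTools implementation: it assumes the loss cutoff is the negative of the 0.3 gain cutoff, treats coordinates as half-open BED-style intervals, and uses made-up cluster names, segments, and function names.

```python
GAIN_CUTOFF = 0.3    # segment mean above this value is called a gain
LOSS_CUTOFF = -0.3   # assumption: loss cutoff mirrors the gain cutoff


def call_segment(seg_mean):
    """Classify one copy number segment by its segment mean."""
    if seg_mean > GAIN_CUTOFF:
        return "gain"
    if seg_mean <= LOSS_CUTOFF:
        return "loss"
    return "neutral"


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def colocalize(clusters, segments):
    """Pair each miRNA cluster with the gain/loss segments it overlaps."""
    hits = []
    for name, c_chrom, c_start, c_end in clusters:
        for s_chrom, s_start, s_end, seg_mean in segments:
            call = call_segment(seg_mean)
            if call == "neutral" or c_chrom != s_chrom:
                continue
            if overlaps(c_start, c_end, s_start, s_end):
                hits.append((name, call))
    return hits


# Hypothetical coordinates, for illustration only.
clusters = [("cluster_1", "chr1", 1000, 2000), ("cluster_2", "chr2", 500, 900)]
segments = [
    ("chr1", 1500, 5000, 0.45),   # gain overlapping cluster_1
    ("chr2", 100, 600, -0.52),    # loss overlapping cluster_2
    ("chr2", 600, 1200, 0.10),    # neutral segment, ignored
]
hits = colocalize(clusters, segments)
print(hits)
```

The nested loop is quadratic and only suitable for small inputs; a dedicated interval tool is the better choice for genome-scale data.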
Database construction, web interface, and data visualization
LAMP, a Linux-based stack of open-source software consisting of Apache v2.4, MySQL v5.7, and PHP v7.2, was used for the development of CmirC. The database is hosted on a server running the Red Hat Enterprise Linux release 6.10 operating system. Data have been categorized and stored in tabular format using the MySQL relational database management system for efficient administration and management. The interactive user interface was built using HTML, Cascading Style Sheets (CSS), Bootstrap, and JavaScript. The intermediate layer and server-side scripting were implemented in PHP. An interactive genome browser was configured using JBrowse (Skinner et al. 2009) to provide fast and smooth scrolling of the RCNAs identified from 35 cancers and the co-localized miRNA clusters. NCBI BLAST+ v2.5.0 (Altschul et al. 1990) has been integrated into CmirC to provide sequence similarity-based searching. We incorporated two R packages, visNetwork (https://github.com/datastorm-open/visNetwork) and networkD3 (https://github.com/christophergandrud/networkD3), to portray the miRNA-target gene interaction networks. The command-line tool gene set clustering based on functional annotation (GeneSCF) v1.1 (Subhash and Kanduri 2016) has been configured on the CmirC server, allowing the user to perform enrichment analysis of a set of miRNA target genes. The expression profiles of miRNAs are presented as bar graphs using the JavaScript packages Chart.js (https://www.chartjs.org/) and CanvasJS (http://canvasjs.com/).
Results
Integrated data analysis
CmirC is an open-access platform that provides information on miRNA clusters and their co-localization with RCNVs from the TCGA cancer types. A total of 12,496 CNVs and 10,461 miRNA expression samples were downloaded from the TCGA database for the integrated data analysis and database development. We identified 125 miRNA clusters co-localized with RCNVs across 35 cancer types. The 10 cancer types most prone to genomic instability events were glioma (GBMLGG), breast invasive carcinoma (BRCA), glioblastoma multiforme (GBM), uterine corpus endometrial carcinoma (UCEC), ovarian cancer (OV), sarcoma (SARC), stomach adenocarcinoma (STAD), lung adenocarcinoma (LUAD), urothelial carcinoma (BLCA), and hepatocellular carcinoma (LIHC). A total of 48,876 RCNV events were identified in these cancers. The highest number of miRNA clusters co-localized with RCNAs was found in BRCA (78), followed by OV (70), LUAD (66), GBMLGG (64), and BLCA (62). The total number of identified RCNVs and co-localized miRNA clusters from the individual cancer types is illustrated in Fig. 3. Interestingly, the hsa-mir-199a/mir-214 and hsa-mir-657/mir-1250 miRNA clusters were co-localized with amplified regions in 28 and 21 cancer types, respectively. Similarly, the hsa-mir-3926-2/mir-5692a-2 cluster was found in deleted regions of 29 cancer types. The relationship between RCNVs and miRNA clusters is depicted as an interactive network in the information section of the CmirC portal.
A large proportion of genetic and epigenetic reprogramming affects miRNA promoters and transcription start sites (TSS). We therefore identified potential TSS for 149 miRNA clusters from the FANTOM5 repository. By intersecting miRNA cluster coordinates with fragile sites, we also identified 57 miRNA clusters spanning 29 fragile sites on different chromosomes (Table 1). Of these 29 fragile sites, 19 are common, nine are rare, and one is of unknown frequency. The chromosome 19 fragile sites FRA19A and FRA19B have the highest miRNA cluster co-localization (7 and 5 clusters, respectively), and the largest miRNA cluster, hsa-mir-512-1/mir-1283-1 (C19MC), is co-localized with FRA19A. Integrating the RCNAs with the differential expression of cluster candidates showed that 46 miRNA clusters associated with 12 cancer types were significantly correlated. Of these, 32 miRNA clusters were upregulated and 14 were downregulated (Fig. 4).
The CmirC interface
The homepage of the CmirC web portal provides a quick browse option, with which users can explore the repository by selecting specific miRNA clusters, fragile sites, or cancer types (Fig. 5A). All these browsing options are indexed in tabular form for easy and efficient access. CmirC also offers a BLAST-based sequence similarity search against clustered miRNA sequences, which facilitates fast and accurate identification of clustered miRNAs in user-supplied datasets. The help page provides a user manual for navigating the resources provided in CmirC. The datasets can be downloaded in various file formats, such as tab-delimited, comma-separated, XLSX, PDF, HTML, BED, PNG, and JPEG, for downstream data analysis or presentation.
Key features and utilities of CmirC
Browse options
CmirC provides multiple straightforward browsing facilities to access the integrated datasets. Users can browse the database in three ways: (i) by miRNA cluster name, which provides information about the 159 miRNA clusters, including genomic coordinates, cluster candidates, and CNV information; (ii) by fragile site, which lists all the miRNA clusters on the chromosome fragile sites with their genomic location information; and (iii) by cancer type, which retrieves the RCNAs and their co-localized miRNA clusters for the selected cancer type. Hyperlinks lead to detailed information on individual miRNA clusters.
Sequence alignment
The stable version of NCBI BLAST+ (v2.5.0) is configured to perform a sequence similarity search (Fig. 5B). The server runs the BLASTN alignment algorithm on the user-uploaded sequence against the clustered miRNA sequences. This option allows the identification of clustered miRNAs homologous to the query sequence.
Genome browser
The CmirC is integrated with an interactive tool JBrowse to visualize miRNA cluster co-localization in the RCNV regions (Fig. 5C). This genome browser allows users to quickly view CNV regions, fragile sites, and their co-localized miRNA clusters with extended zoom levels for higher resolution. The display includes multiple parallel tracks of annotated features such as reference sequences, precursor and mature miRNAs, fragile sites, and RCNA from each cancer type. This feature facilitates cumulative visualization and seamless navigation between the tracks. Users can conveniently navigate through miRNA clusters, CNV regions, and CFSs. Details about mature, precursor as well as clustered miRNAs, CNV areas, and CFSs in the genome (hg38 build) appear in a pop-up window. Using the highly flexible and customizable option of JBrowse, users can easily upload their data, analyze, and download the reports.
CNV plots and miRNA-gene networks
This comprehensive resource allows users to generate high-quality, publication-ready graphs and plots for scientific reporting. Users can generate circos plots of integrated data for individual cancer types (Fig. 6A). The web server uses the open-source R packages visNetwork and networkD3 to visualize clustered miRNAs and their target genes. These packages provide an easy way to view and adjust the interaction networks (Fig. 6B, C). The miRNA-target gene interaction network generated by visNetwork provides a dropdown list of the nodes in the network, and the interconnected subnetwork of a node can be viewed by selecting it from the dropdown list. Users can directly select and adjust individual nodes or edges of interest. The miRNA and target gene list can be downloaded in tab-delimited file format for further analysis.
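The node-selection behavior can be pictured with a small Python sketch of a miRNA-target gene network. It is only an illustration of the underlying data structure, not the visNetwork or networkD3 code used by the portal; the node names and the functions `build_network` and `subnetwork` are hypothetical.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected adjacency map from (miRNA, gene) pairs."""
    neighbours = defaultdict(set)
    for mirna, gene in edges:
        neighbours[mirna].add(gene)
        neighbours[gene].add(mirna)
    return neighbours


def subnetwork(neighbours, node):
    """Return the edges touching the selected node, sorted for stable output."""
    return sorted((node, other) for other in neighbours.get(node, ()))


# Made-up interactions: two miRNAs of one cluster sharing a target gene.
edges = [("miR-X", "GENE1"), ("miR-X", "GENE2"), ("miR-Y", "GENE2")]
net = build_network(edges)

# Selecting a gene node shows which cluster members converge on it.
selected = subnetwork(net, "GENE2")
print(selected)
```

Selecting a gene rather than a miRNA highlights the case the text emphasizes, where several members of a cluster target the same gene.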
Data analysis module
CmiRClustFinder v2.0 is a data analysis tool with a GUI, upgraded from our previously developed pipeline available on GitHub (https://github.com/msls-bioinfo/CmiRClustFinder_v1.0). The program enables users to identify RCNVs in cancer samples and to perform co-localization analyses between RCNVs and user-specified genomic regions of interest. We have provided three modules: (i) a data analysis portal, in which the user can upload SCNA datasets and gene coordinates in BED format; (ii) analysis with the TCGA cancer types, in which the user can pick any one of the 35 TCGA cancer types and run the co-localization analysis; and (iii) a standalone version that allows the user to download and execute the pipeline on Linux operating systems. The pipeline generates a co-localization report as a “.tsv” file and circos-based plots as images. This web tool is available for public use on the CmirC portal.
Miscellaneous features
The web portal automates data analysis and generates reports in various graphical representations. Publication-ready clustered miRNA expression profiles of cancer types can be visualized as bar graphs (Fig. 6D). The differential expression of miRNAs between tumor and control samples of 35 cancer types is also provided (Fig. 6E). In-house shell scripts embedded in the web server allow convenient gene set functional enrichment analysis of target genes. The complete list of molecular functions (MF), biological processes (BP), cellular components (CC), and network of cancer genes (NCG) can be retrieved in “.tsv” file format. The significance of the enrichment can be assessed from the FDR and the p value calculated using Fisher’s exact test. Further, a bubble plot with the top 20 enriched functions (ranked by log-transformed p value) of miRNA-targeted genes is also provided (Fig. 6F, G).
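The enrichment statistics mentioned above can be sketched in Python. This is not GeneSCF's code: it assumes a one-sided (over-representation) Fisher's exact test computed as a hypergeometric tail, and it assumes the FDR is obtained by the Benjamini-Hochberg procedure, which the text does not specify. The function names and all counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p value for over-representation.

    k: target genes carrying the annotation
    n: size of the target gene set
    K: annotated genes in the background
    N: size of the background
    """
    upper = min(n, K)
    # Sum the hypergeometric probabilities of observing k or more hits.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts for three annotation terms: (k, n, K, N).
terms = {"term_a": (8, 50, 200, 20000), "term_b": (2, 50, 400, 20000),
         "term_c": (1, 50, 1000, 20000)}
pvals = [fisher_enrichment_p(*counts) for counts in terms.values()]
for name, p, q in zip(terms, pvals, benjamini_hochberg(pvals)):
    print(name, p, q)
```

Exact integer arithmetic with `math.comb` avoids rounding error in the tail sum, at the cost of speed for very large backgrounds.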
In a move towards automated web portals for big data analysis, we have developed a specific database for clustered miRNAs co-localized with CNV regions reported from 35 cancer types. The comprehensive resource embedded with data analytics and visualization packages provides a better choice for researchers to comprehend the clustered miRNA regulation during carcinogenesis.
Discussion
Understanding the primary mechanisms underlying carcinogenesis requires comprehensive annotation of integrated cancer genetic data. Non-coding regions make up a major portion of the human genome, and these regions harbor functional elements that can regulate the expression of protein-coding genes (Gloss and Dinger 2018). Genetic and epigenetic reprogramming can also influence miRNA transcription under various physiological conditions (Gulyaeva and Kushlinskiy 2016). Recent evidence suggests that the alteration of protein-coding genes alone cannot constitute the entire molecular basis of tumor development (Xue and He 2014). Also, the vast majority of somatic alterations of the cancer genome are reported in non-coding regions (Cuykendall et al. 2017). The miRNAs are an important class of non-coding genetic elements that regulate gene expression and control multiple biological events (Oh et al. 2017). Altered expression of individual miRNAs in carcinogenesis has been increasingly recognized and explored over the past few decades (Peng and Croce 2016). In addition, groups of miRNAs occur as clusters at various genomic loci. The proportion of clustered miRNAs differs across species, and clusters tend to be evolutionarily conserved. Researchers have proposed that clustered miRNAs can act more efficiently than a single miRNA, as a cluster contains multiple miRNAs (Wang et al. 2016). Deregulation of clustered miRNAs has a potent effect on cancer signaling pathways and contributes to clinical complications such as resistance to therapy (Lin et al. 2020; Becker et al. 2012). Hence, the role of clustered miRNAs as biomarkers for diagnosis, prognosis, treatment, and improved patient care remains to be fully exploited. The genetic and epigenetic reprogramming that alters clustered miRNA-targeted gene regulation is even more complex during carcinogenesis. Considering all these concerns, we performed an integrated analysis using publicly available cancer datasets from the TCGA.
Identification of RCNVs across TCGA cancer types revealed the top 10 genomic instability-prone cancers. It has been reported that strong clustered miRNA expression occurs during carcinogenesis, which suggests that profiling of these miRNA groups could be used for the clinical diagnosis of cancer (Kabekkodu et al. 2018). By mapping the miRNA clusters onto recurrent CNV regions, we found the maximum number of miRNA clusters co-localized with genomic susceptibility loci in BRCA, OV, GBMLGG, LUAD, and BLCA. A few reports have already indicated that abnormal expression of clustered miRNAs may contribute to the acquisition of cancer hallmarks in the above-mentioned tumor types (Molina-Pinelo et al. 2014; Enokida et al. 2016; Yoshida et al. 2021). Interestingly, we observed that specific genomic regions containing miRNA clusters are involved in deletion or amplification events depending on the cancer type. However, members of these clusters can behave either as oncogenes or as tumor suppressors, depending on the alteration, cell type, or transcriptional events. In recent years, CFSs have been acknowledged as a significant aspect of cancer biology, as these are the regions where most cancer-related genes occur (Kumar et al. 2019). Genetic instability at CFSs leads to dysregulated expression of oncogenes or tumor suppressor genes. A thorough understanding of the relationship between CFS-co-localized miRNA clusters and carcinogenesis is needed before therapeutic strategies based on genomic profiles can be determined. Here, we report that ~35% of the miRNA clusters are co-localized with 29 CFSs on various chromosomes. The miRNA differential expression profiles (normal vs tumor samples) were correlated with RCNV co-localized clustered miRNAs. A total of 11 CNV-driven downregulated miRNA clusters were identified in UCEC, followed by LUSC (9 clusters) and BLCA (7 clusters). In contrast, six CNV-driven miRNA clusters were identified as upregulated in HNSC. The CNV-induced miRNA clusters associated with 12 distinct cancer types and their expression patterns are shown in Fig. 4. Ultimately, this analysis offers new insights into the multi-layered complex regulation of clustered miRNAs during tumorigenesis.
Currently, studies on miRNA cluster regulation during tumor development are in their infancy. Moreover, there are no comprehensive resources that provide information on the structural variation and functional regulation of miRNA clusters during carcinogenesis. In this regard, we have developed the web portal CmirC, which provides integrated information on clustered miRNA-CNV co-localization and the expression profiles of 481 clustered miRNAs in 35 cancer types. The current version of CmirC integrates data on CNVs, clustered miRNAs, fragile sites, miRNA expression, and miRNA targets, as well as multiple bioinformatics tools for convenient data retrieval and analysis. CmirC offers interactive networks of miRNA clusters and their target genes with a gene set functional enrichment facility. All the identified RCNA-miRNA cluster co-localization datasets are provided for download along with circos-based graphical representations. A customizable genome browser displays the integrated genetic datasets in individual tracks for quick access. Further, hyperlinks are enabled for browsing all the precursor and mature miRNAs in their parent repository, miRBase.
The CmirC platform is equipped with multiple genetic and epigenetic effectors that could potentially impact miRNA cluster regulation during carcinogenesis. The integrative high throughput genetic data provided in this resource will also be helpful to understand the important characteristic of tumor heterogeneity. We suggest that alterations identified co-localizing with cluster miRNAs must be systematically and functionally tested to investigate their effects during tumorigenesis. The combined impact of multiple events on clustered miRNAs can be utilized to advance cancer therapeutics and prevention. Also, manipulation of the clustered miRNA expression in a tissue-specific manner can be achieved, and early results are promising for cancer diagnosis and prognosis. We believe that this repository is a valuable resource in cancer biology that can be exploited to assess clinically significant cancer biomarker data.
Conclusion
The integrated CNV-miRNA study analyzed 12,496 CNVs and 10,461 miRNA expression samples from the TCGA data to understand cancer type-specific expression. We developed CmirC, an online portal to analyze and retrieve multi-omics data associated with clustered miRNAs from 35 TCGA cancer types. The CmirC database and data analysis platform could help researchers address open questions in cancers mediated by dysregulated clustered miRNAs. This functional genomics approach integrates clustered miRNAs, cancer-associated CNVs, miRNA-targeted genes, and expression datasets in a visualized and interactive manner. We anticipate that CmirC will provide helpful information on the structural variation and functional regulation of miRNA clusters during carcinogenesis. Also, the portal for clustered miRNAs co-localized with RCNV regions may contribute to the development of biomarkers for the diagnosis and prognosis of various cancers.
Data availability
The data stored in CmirC can be freely retrieved, visualized, and downloaded from the portal http://slsdb.manipal.edu/cmirclust/.
References
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215(3):403–410. https://doi.org/10.1016/S0022-2836(05)80360-2
An J, Pan Y, Yan Z, Li W, Cui J, Yuan J, Tian L, Xing R, Lu Y (2013) MiR-23a in amplified 19p13.13 loci targets metallothionein 2A and promotes growth in gastric cancer cells. J Cell Biochem 114(9):2160–2169. https://doi.org/10.1002/jcb.24565
Becker LE, Lu Z, Chen W, Xiong W, Kong M, Li Y (2012) A systematic screen reveals MicroRNA clusters that significantly regulate four major signaling pathways. PLoS ONE 7(11):e48474. https://doi.org/10.1371/journal.pone.0048474
Bose B, Bozdag S (2019) MiRDriver: a tool to infer copy number derived miRNA-gene networks in cancer. bioRxiv. https://doi.org/10.1101/652156
Calin GA, Sevignani C, Dumitru CD, Hyslop T, Noch E, Yendamuri S, Shimizu M, Rattan S, Bullrich F, Negrini M, Croce CM (2004) Human microRNA genes are frequently located at fragile sites and genomic regions involved in cancers. Proc Natl Acad Sci U S A 101(9):2999–3004. https://doi.org/10.1073/pnas.0307323101
Chen Y, Wang X (2020) miRDB: an online database for prediction of functional microRNA targets. Nucleic Acids Res 48(D1):D127–D131. https://doi.org/10.1093/nar/gkz757
Chou PH, Liao WC, Tsai KW, Chen KC, Yu JS, Chen TW (2019) TACCO, a database connecting transcriptome alterations, pathway alterations and clinical outcomes in cancers. Sci Rep 9(1):3877. https://doi.org/10.1038/s41598-019-40629-z
Colaprico A, Silva TC, Olsen C, Garofano L, Cava C, Garolini D, Sabedot TS, Malta TM, Pagnotta SM, Castiglioni I, Ceccarelli M, Bontempi G, Noushmehr H (2016) TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res 44(8):e71. https://doi.org/10.1093/nar/gkv1507
Cuykendall TN, Rubin MA, Khurana E (2017) Non-coding genetic variation in cancer. Curr Opin Syst Biol 1:9–15. https://doi.org/10.1016/j.coisb.2016.12.017
Das SS, Saha P, Chakravorty N (2018) miRwayDB: a database for experimentally validated microRNA-pathway associations in pathophysiological conditions. Database (Oxford) 2018:bay023. https://doi.org/10.1093/database/bay023
Enokida H, Yoshino H, Matsushita R, Nakagawa M (2016) The role of microRNAs in bladder cancer. Investig Clin Urol 57(1):S60–S76. https://doi.org/10.4111/icu.2016.57.S1.S60
Gloss BS, Dinger ME (2018) Realizing the significance of noncoding functionality in clinical genomics. Exp Mol Med 50(8):1–8. https://doi.org/10.1038/s12276-018-0087-0
Gu Z, Gu L, Eils R, Schlesner M, Brors B (2014) circlize implements and enhances circular visualization in R. Bioinformatics 30(19):2811–2812. https://doi.org/10.1093/bioinformatics/btu393
Gulyaeva LF, Kushlinskiy NE (2016) Regulatory mechanisms of microRNA expression. J Transl Med 14(1):143. https://doi.org/10.1186/s12967-016-0893-x
Ha J, Park C, Park S (2019) PMAMCA: prediction of microRNA-disease association utilizing a matrix completion approach. BMC Syst Biol 13(1):33. https://doi.org/10.1186/s12918-019-0700-4
Huang HY, Lin YC, Li J, Huang KY, Shrestha S, Hong HC, Tang Y, Chen YG, Jin CN, Yu Y, Xu JT, Li YM, Cai XX, Zhou ZY, Chen XH, Pei YY, Hu L, Su JJ, Cui SD, Wang F, Xie YY, Ding SY, Luo MF, Chou CH, Chang NW, Chen KW, Cheng YH, Wan XH, Hsu WL, Lee TY, Wei FX, Huang HD (2020) miRTarBase 2020: updates to the experimentally validated microRNA-target interaction database. Nucleic Acids Res 48(D1):D148–D154. https://doi.org/10.1093/nar/gkz896
Kabekkodu SP, Shukla V, Varghese VK, D’Souza J, Chakrabarty S, Satyamoorthy K (2018) Clustered miRNAs and their role in biological functions and diseases. Biol Rev Camb Philos Soc 93(4):1955–1986. https://doi.org/10.1111/brv.12428
Kozomara A, Birgaoanu M, Griffiths-Jones S (2019) miRBase: from microRNA sequences to function. Nucleic Acids Res 47(D1):D155–D162. https://doi.org/10.1093/nar/gky1141
Kumar R, Nagpal G, Kumar V, Usmani SS, Agrawal P, Raghava G (2019) HumCFS: a database of fragile sites in human chromosomes. BMC Genomics 19(9):985. https://doi.org/10.1186/s12864-018-5330-5
Li X, Lin Y, Gu C, Yang J (2019) FCMDAP: using miRNA family and cluster information to improve the prediction accuracy of disease related miRNAs. BMC Syst Biol 13(2):26. https://doi.org/10.1186/s12918-019-0696-9
Lin SC, Wu HL, Yeh LY, Yang CC, Kao SY, Chang KW (2020) Activation of the miR-371/372/373 miRNA cluster enhances oncogenicity and drug resistance in oral carcinoma cells. Int J Mol Sci 21(24):9442. https://doi.org/10.3390/ijms21249442
Liu SH, Shen PC, Chen CY, Hsu AN, Cho YC, Lai YL, Chen FH, Li CY, Wang SC, Chen M, Chung IF, Cheng WC (2020) DriverDBv3: a multi-omics database for cancer driver gene research. Nucleic Acids Res 48(D1):D863–D870. https://doi.org/10.1093/nar/gkz964
Lizio M, Abugessaisa I, Noguchi S, Kondo A, Hasegawa A, Hon CC, de Hoon M, Severin J, Oki S, Hayashizaki Y, Carninci P, Kasukawa T, Kawaji H (2019) Update of the FANTOM web resource: expansion to provide additional transcriptome atlases. Nucleic Acids Res 47(D1):D752–D758. https://doi.org/10.1093/nar/gky1099
Molina-Pinelo S, Pastor MD, Suarez R, Romero-Romero B, González De la Peña M, Salinas A, García-Carbonero R, De Miguel MJ, Rodríguez-Panadero F, Carnero A, Paz-Ares L (2014) MicroRNA clusters: dysregulation in lung adenocarcinoma and COPD. Eur Respir J 43(6):1740–1749. https://doi.org/10.1183/09031936.00091513
Oh M, Rhee S, Moon JH, Chae H, Lee S, Kang J, Kim S (2017) Literature-based condition-specific miRNA-mRNA target prediction. PLoS ONE 12(3):e0174999. https://doi.org/10.1371/journal.pone.0174999
Pan CY, Lin WC (2020) miR-TV: an interactive microRNA target viewer for microRNA and target gene expression interrogation for human cancer studies. Database (Oxford) 2020:baz148. https://doi.org/10.1093/database/baz148
Peng Y, Croce CM (2016) The role of MicroRNAs in human cancer. Signal Transduct Target Ther 1:15004. https://doi.org/10.1038/sigtrans.2015.4
Quinlan AR, Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26(6):841–842. https://doi.org/10.1093/bioinformatics/btq033
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26(1):139–140. https://doi.org/10.1093/bioinformatics/btp616
Seitz H, Royo H, Bortolin ML, Lin SP, Ferguson-Smith AC, Cavaillé J (2004) A large imprinted microRNA gene cluster at the mouse Dlk1-Gtl2 domain. Genome Res 14(9):1741–1748. https://doi.org/10.1101/gr.2743304
Sevignani C, Calin GA, Nnadi SC, Shimizu M, Davuluri RV, Hyslop T, Demant P, Croce CM, Siracusa LD (2007) MicroRNA genes are frequently located near mouse cancer susceptibility loci. Proc Natl Acad Sci U S A 104(19):8017–8022. https://doi.org/10.1073/pnas.0702177104
Shamsizadeh S, Goliaei S, Razaghi MZ (2019) CAMIRADA: cancer microRNA association discovery algorithm, a case study on breast cancer. J Biomed Inform 94:103180. https://doi.org/10.1016/j.jbi.2019.103180
Skinner ME, Uzilov AV, Stein LD, Mungall CJ, Holmes IH (2009) JBrowse: a next-generation genome browser. Genome Res 19(9):1630–1638. https://doi.org/10.1101/gr.094607.109
Subhash S, Kanduri C (2016) GeneSCF: a real-time based functional enrichment tool with support for multiple organisms. BMC Bioinformatics 17(1):365. https://doi.org/10.1186/s12859-016-1250-z
Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F (2021) Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 71(3):209–249. https://doi.org/10.3322/caac.21660
Tan H, Kim P, Sun P, Zhou X (2021) miRactDB characterizes miRNA-gene relation switch between normal and cancer tissues across pan-cancer. Brief Bioinform 22(3):bbaa089. https://doi.org/10.1093/bib/bbaa089
Vlachos IS, Paraskevopoulou MD, Karagkouni D, Georgakilas G, Vergoulis T, Kanellos I, Anastasopoulos IL, Maniou S, Karathanou K, Kalfakakou D, Fevgas A, Dalamagas T, Hatzigeorgiou AG (2015) DIANA-TarBase v7.0: indexing more than half a million experimentally supported miRNA:mRNA interactions. Nucleic Acids Res 43(D1):D153–D159. https://doi.org/10.1093/nar/gku1215
Wang Y, Luo J, Zhang H, Lu J (2016) microRNAs in the same clusters evolve to coordinately regulate functionally related genes. Mol Biol Evol 33(9):2232–2247. https://doi.org/10.1093/molbev/msw089
Ware AP, Kabekkodu SP, Chawla A, Paul B, Satyamoorthy K (2022) Diagnostic and prognostic potential clustered miRNAs in bladder cancer. 3 Biotech 12(8):173. https://doi.org/10.1007/s13205-022-03225-z
Xia E, Kanematsu S, Suenaga Y, Elzawahry A, Kondo H, Otsuka N, Moriya Y, Iizasa T, Kato M, Yoshino I, Yokoi S (2018) MicroRNA induction by copy number gain is associated with poor outcome in squamous cell carcinoma of the lung. Sci Rep 8(1):15363. https://doi.org/10.1038/s41598-018-33696-1
Xie B, Ding Q, Han H, Wu D (2013) miRCancer: a microRNA-cancer association database constructed by text mining on literature. Bioinformatics 29(5):638–644. https://doi.org/10.1093/bioinformatics/btt014
Xue B, He L (2014) An expanding universe of the non-coding genome in cancer biology. Carcinogenesis 35(6):1209–1216. https://doi.org/10.1093/carcin/bgu099
Yoshida K, Yokoi A, Sugiyama M, Oda S, Kitami K, Tamauchi S, Ikeda Y, Yoshikawa N, Nishino K, Niimi K, Suzuki S, Kikkawa F, Yokoi T, Kajiyama H (2021) Expression of the chrXq27.3 miRNA cluster in recurrent ovarian clear cell carcinoma and its impact on cisplatin resistance. Oncogene 40(7):1255–1268. https://doi.org/10.1038/s41388-020-01595-3
You JS, Jones PA (2012) Cancer genetics and epigenetics: two sides of the same coin? Cancer Cell 22(1):9–20. https://doi.org/10.1016/j.ccr.2012.06.008
Zhang L, Volinia S, Bonome T, Calin GA, Greshock J, Yang N, Liu CG, Giannakakis A, Alexiou P, Hasegawa K, Johnstone CN, Megraw MS, Adams S, Lassus H, Huang J, Kaur S, Liang S, Sethupathy P, Leminen A, Simossis VA, Sandaltzopoulos R, Naomoto Y, Katsaros D, Gimotty PA, DeMichele A, Huang Q, Bützow R, Rustgi AK, Weber BL, Birrer MJ, Hatzigeorgiou AG, Croce CM, Coukos G (2008) Genomic and epigenetic alterations deregulate microRNA expression in human epithelial ovarian cancer. Proc Natl Acad Sci U S A 105(19):7004–7009. https://doi.org/10.1073/pnas.0801615105
Acknowledgements
The authors thank DST-FIST, the Government of India, TIFAC-CORE in Pharmacogenomics, and Manipal Academy of Higher Education (MAHE), Manipal, for the support and facilities provided. APW gratefully acknowledges MAHE, Manipal, for the Dr. TMA Pai Ph.D. fellowship and the Indian Council of Medical Research, Government of India, for the Senior Research Fellowship (Reference ID: BMI/11(10)/2022).
Funding
Open access funding provided by Manipal Academy of Higher Education, Manipal. This work is supported by the Vision Group on Science and Technology, Government of Karnataka (RGS-F/GRD No. 997/2020–21/144).
Author information
Contributions
APW analyzed the data, developed the database, and wrote the original draft; KS and BP conceived this study, proofread, and edited the manuscript. All authors read and approved the final version of the manuscript.
Ethics declarations
Ethics approval
The manuscript does not involve any animal study.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ware, A.P., Satyamoorthy, K. & Paul, B. CmirC: an integrated database of clustered miRNAs co-localized with copy number variations in cancer. Funct Integr Genomics 22, 1229–1241 (2022). https://doi.org/10.1007/s10142-022-00909-w
DOI: https://doi.org/10.1007/s10142-022-00909-w