Flanking region sequence information to refine microRNA target predictions

Abstract

The non-coding elements of a genome, many of which were once dismissed as junk, are now gaining long-overdue recognition, with microRNAs as the best current example. MicroRNAs bind preferentially to the 3′ untranslated regions (UTRs) of their target genes and, in most cases, negatively regulate their expression. Several microRNA:target prediction tools have been developed on the basis of various assumptions, and the majority of them consider the free energy of binding between a target and its microRNA, together with seed conservation. However, the average concordance between the predictions made by these tools is limited, a problem compounded by a large number of false-positive results. In this study, we describe a methodology that we developed to refine the microRNA:target predictions made by such tools, based on observations from a comprehensive study. We incorporated information from the dinucleotide content variation patterns recorded for the flanking regions around target sites, using support vector machines (SVMs) trained on two major sources of experimental data, in addition to other sources. We assessed the performance of our methodology with rigorous tests on four different dataset models and compared it with a recently published refinement tool, MirTif. Our methodology attained a higher average accuracy of 0.88, with average sensitivity and specificity of 0.81 and 0.94, respectively, and the areas under the curve (AUCs) for all four models were above 0.9, suggesting better performance by our methodology and a possible role for flanking regions in the control of microRNA targeting. We applied our methodology to genes of three different pathways, namely toll-like receptor (TLR), apoptosis and insulin, to predict the most probable targets. We also investigated their possible regulatory associations and identified a hsa-miR-23a regulatory module.
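The abstract describes dinucleotide content in the flanking regions of a target site as the input to the SVMs. The following Python sketch shows one way such features could be computed. It is only an illustration under stated assumptions: the function names, the 30-nt flank window, the use of overlapping dinucleotides and the 0-based coordinates are all hypothetical choices, since the abstract does not specify how the authors encoded their features.

```python
from itertools import product

# The 16 RNA dinucleotides in a fixed order, so feature vectors are comparable.
DINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=2)]


def dinucleotide_frequencies(seq):
    """Return the frequencies of the 16 overlapping dinucleotides in seq."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DINUCLEOTIDES]


def flank_features(utr, site_start, site_end, flank=30):
    """Concatenate dinucleotide frequencies of the two flanks of a target site.

    site_start and site_end are 0-based, end-exclusive coordinates of the site
    within the UTR. flank=30 is an arbitrary illustrative window (assumption),
    not a value taken from the paper.
    """
    upstream = utr[max(0, site_start - flank):site_start]
    downstream = utr[site_end:site_end + flank]
    return dinucleotide_frequencies(upstream) + dinucleotide_frequencies(downstream)
```

Each site yields a 32-element vector (16 upstream plus 16 downstream frequencies), which could then be passed to any SVM implementation as one training or test instance.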

Abbreviations

Ac: accuracy

AUC: area under the curve

FN: false negative

FP: false positive

MCC: Matthews correlation coefficient

ROC: receiver operating characteristic

Sn: sensitivity

Sp: specificity

SVM: support vector machine

TFBS: transcription factor-binding site

TLR: toll-like receptor

TN: true negative

TP: true positive

UTR: untranslated region

VDR: vitamin D receptor
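Several of the abbreviations above (Ac, Sn, Sp, MCC, TP, TN, FP, FN) refer to standard binary-classification measures. As a minimal sketch, assuming the conventional textbook definitions (this excerpt does not show the paper's own formulas), they can be computed from confusion-matrix counts as follows. The function name and the example counts are hypothetical.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute Ac, Sn, Sp and MCC from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true-positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true-negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Ac": accuracy, "Sn": sensitivity, "Sp": specificity, "MCC": mcc}


# Made-up counts for illustration only; they are not results from the study.
example = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

The AUC is not covered here because it depends on the ranked decision scores of the classifier rather than on a single confusion matrix.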

Corresponding author

Correspondence to Ravi Shankar.

Cite this article

Heikham, R., Shankar, R. Flanking region sequence information to refine microRNA target predictions. J Biosci 35, 105–118 (2010). https://doi.org/10.1007/s12038-010-0013-7

  • DOI: https://doi.org/10.1007/s12038-010-0013-7

Keywords

  • Bioinformatics
  • genome
  • microRNA
  • non-coding
  • RNA