Journal of Biosciences, Volume 35, Issue 1, pp 105–118

Flanking region sequence information to refine microRNA target predictions

Article

Abstract

The non-coding elements of a genome, many of which were earlier dismissed as junk, have now started gaining long-overdue respectability, with microRNAs as the best current example. MicroRNAs bind preferentially to the 3′ untranslated regions (UTRs) of their target genes and, in most cases, negatively regulate their expression. Several microRNA:target prediction tools have been developed on the basis of various assumptions, and the majority of them consider the free energy of binding of a target to its microRNA and seed conservation. However, the average concordance between the predictions made by these tools is limited, a problem compounded by a large number of false-positive results. In this study, we describe a methodology that we developed to refine the microRNA:target predictions made by such tools, based on observations from a comprehensive study. We incorporated information obtained from dinucleotide content variation patterns recorded for the flanking regions around target sites, using support vector machines (SVMs) trained over two different major sources of experimental data, besides other sources. We assessed the performance of our methodology with rigorous tests over four different dataset models and also compared it with a recently published refinement tool, MirTif. Our methodology attained a higher average accuracy of 0.88, with average sensitivity and specificity of 0.81 and 0.94, respectively, and the areas under the curves (AUCs) for all four models were above 0.9, suggesting better performance by our methodology and a possible role of flanking regions in microRNA targeting control. We applied our methodology to genes of three different pathways, namely toll-like receptor (TLR), apoptosis and insulin, to predict the most probable targets. We also investigated their possible regulatory associations, and identified a hsa-miR-23a regulatory module.
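
As a rough illustration of the kind of feature the abstract refers to, the sketch below computes dinucleotide frequencies for the regions flanking a candidate target site. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 30-nucleotide flank length and the choice of plain relative frequencies are all hypothetical, since the abstract does not specify window sizes or the exact encoding.

```python
from itertools import product

# The 16 possible RNA dinucleotides, in a fixed order (AA, AC, ..., UU).
DINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=2)]


def dinucleotide_frequencies(seq):
    """Return the relative frequency of each dinucleotide in seq.

    DNA input is accepted: T is converted to U. Overlapping pairs are
    counted, and pairs containing other symbols (e.g. N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    counts = {d: 0 for d in DINUCLEOTIDES}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DINUCLEOTIDES]


def flank_features(utr, site_start, site_end, flank=30):
    """Build a 32-value feature vector for one candidate target site.

    utr        : 3' UTR sequence containing the site
    site_start : 0-based start of the site (inclusive)
    site_end   : 0-based end of the site (exclusive)
    flank      : flank length in nucleotides (hypothetical default)
    """
    upstream = utr[max(0, site_start - flank):site_start]
    downstream = utr[site_end:site_end + flank]
    return dinucleotide_frequencies(upstream) + dinucleotide_frequencies(downstream)
```

Vectors of this form, labelled as true or false targets from experimental data, could then be passed to any SVM implementation for training. Keeping the upstream and downstream flanks as separate halves of the vector lets a classifier weight the two sides differently.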

Keywords

Bioinformatics; genome; microRNA; non-coding RNA

Abbreviations used

Ac: accuracy
AUC: area under the curve
FN: false negative
FP: false positive
MCC: Matthews correlation coefficient
ROC: receiver operating characteristic
Sn: sensitivity
Sp: specificity
SVM: support vector machine
TFBS: transcription factor-binding site
TLR: toll-like receptor
TN: true negative
TP: true positive
UTR: untranslated region
VDR: vitamin D receptor
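
Several of the abbreviations above (TP, TN, FP, FN, Sn, Sp, Ac, MCC) are related through the standard confusion-matrix definitions. The sketch below shows those textbook formulas; the function name is hypothetical, and the counts in the usage note are invented for illustration and are not taken from the paper's datasets.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute Sn, Sp, Ac and MCC from confusion-matrix counts.

    Uses the standard definitions:
      Sn  = TP / (TP + FN)
      Sp  = TN / (TN + FP)
      Ac  = (TP + TN) / (TP + TN + FP + FN)
      MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Any ratio with a zero denominator is reported as 0.0.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    sn = ratio(tp, tp + fn)
    sp = ratio(tn, tn + fp)
    ac = ratio(tp + tn, tp + tn + fp + fn)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_denominator)
    return {"Sn": sn, "Sp": sp, "Ac": ac, "MCC": mcc}
```

For example, with invented counts of 81 true positives, 19 false negatives, 94 true negatives and 6 false positives, `classification_metrics(81, 94, 6, 19)` gives Sn = 0.81, Sp = 0.94 and Ac = 0.875. On a balanced test set, accuracy is simply the mean of sensitivity and specificity.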

Supplementary material

12038_2010_13_MOESM1_ESM.pdf (878 kb)
Supplementary material, approximately 877 KB.

Copyright information

© Indian Academy of Sciences 2010

Authors and Affiliations

  1. Department of Bioinformatics and Structural Biology, Indian Institute of Advanced Research, Gandhinagar, India
