A New Approach to Link Prediction in Gene Regulatory Networks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9375)

Abstract

Link prediction is an important data mining problem with applications in domains such as social network analysis and computational biology. For example, biologists model gene regulatory networks (GRNs) as directed graphs in which nodes are genes and links represent regulatory relationships between genes. By predicting links in GRNs, biologists can gain a better understanding of cellular regulatory circuits and functional elements. Existing supervised methods for GRN inference build a feature-based classifier from gene expression data and use that classifier to predict links in the GRNs. In this paper we present a new supervised approach for link prediction in GRNs. Our approach combines gene expression data with topological features extracted from the GRNs, and uses three machine learning algorithms: random forests, support vector machines, and neural networks. Experimental results on several datasets show that the proposed approach performs well and outperforms existing methods.
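
As a rough illustration of the idea, the sketch below builds one feature vector per candidate link by concatenating the expression profiles of the two genes with a few topological features of the known network. The toy data, the particular features (out-degree, in-degree, common neighbors), and all function names are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import permutations

# Illustrative sketch only: the toy data, the feature choices, and the helper
# names are assumptions, not the authors' implementation.


def topological_features(edges, source, target):
    """Degree-based features for the candidate link source -> target."""
    # Drop the candidate link itself so its label cannot leak into the features.
    known = {e for e in edges if e != (source, target)}

    def successors(gene):
        return {v for (u, v) in known if u == gene}

    def predecessors(gene):
        return {u for (u, v) in known if v == gene}

    def neighbors(gene):
        return successors(gene) | predecessors(gene)

    return [
        len(successors(source)),                     # out-degree of the regulator
        len(predecessors(target)),                   # in-degree of the target
        len(neighbors(source) & neighbors(target)),  # common neighbors
    ]


def link_features(expression, edges, source, target):
    """Concatenate both expression profiles with the topological features."""
    return (list(expression[source]) + list(expression[target])
            + topological_features(edges, source, target))


def build_dataset(expression, edges):
    """One labeled example per ordered gene pair (1 = known link, 0 = no link)."""
    edge_set = set(edges)
    pairs = list(permutations(sorted(expression), 2))
    X = [link_features(expression, edges, s, t) for s, t in pairs]
    y = [1 if (s, t) in edge_set else 0 for (s, t) in pairs]
    return pairs, X, y


# Hypothetical toy input: three genes, three expression measurements each.
expression = {
    "g1": [0.9, 0.1, 0.4],
    "g2": [0.8, 0.2, 0.5],
    "g3": [0.1, 0.7, 0.3],
}
edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g3")]  # directed regulatory links

pairs, X, y = build_dataset(expression, edges)
for pair, row, label in zip(pairs, X, y):
    print(pair, row, label)
```

Feature vectors of this kind could then be passed to any standard classifier, such as a random forest, a support vector machine, or a neural network.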

Keywords

Machine learning, Data mining, Feature selection, Bioinformatics, Systems biology

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Computer Science Department, King Abdulaziz University, Jeddah, Saudi Arabia
  2. Department of Computer Science, New Jersey Institute of Technology, Newark, USA
