CUDAGRN: Parallel Speedup of Inferring Large Gene Regulatory Networks from Expression Data Using Random Forest

  • Seyed Ziaeddin Alborzi
  • D. A. K. Maduranga
  • Rui Fan
  • Jagath C. Rajapakse
  • Jie Zheng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8626)

Abstract

Reverse engineering of Gene Regulatory Networks (GRNs) from high-throughput gene expression data is one of the most pressing challenges in computational biology. In this paper, we design and implement a GPU-based parallelization of the GRN inference algorithm GENIE3, using the Compute Unified Device Architecture (CUDA) programming model. GENIE3 solves the regulatory network prediction problem with tree-based ensembles, namely Random Forests. Our method significantly improves the computational efficiency of GENIE3 by constructing the forest on the GPU in parallel. Experiments on real and synthetic datasets show that the CUDA implementation outperforms the sequential implementation, achieving a speed-up of 15 times on real data and 14 to 18 times on synthetic data.
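
The decomposition that makes this kind of parallelization possible can be sketched in a few lines. The following is a minimal illustration, not the authors' CUDA implementation: it assumes a GENIE3-style setup in which each gene is regressed on all other genes, and it replaces full regression trees with single-split stumps to stay short. All function names (`sse`, `best_split`, `grow_stump`, `infer_network`) and parameters are hypothetical. The point is only that every (target gene, tree) pair is an independent task, which is the independence a GPU implementation can exploit.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sse(ys):
    """Sum of squared deviations from the mean (sample count times variance)."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(rows, target, candidates):
    """Return (regulator, gain) for the split that most reduces target variance."""
    total = sse([r[target] for r in rows])
    best = (None, 0.0)
    for g in candidates:
        values = sorted({r[g] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r[target] for r in rows if r[g] <= thr]
            right = [r[target] for r in rows if r[g] > thr]
            if not left or not right:
                continue
            gain = total - sse(left) - sse(right)
            if gain > best[1]:
                best = (g, gain)
    return best


def grow_stump(task):
    """One independent unit of work: a stump for one target on one bootstrap sample."""
    expr, target, n_genes, k, seed = task
    rng = random.Random(seed)                    # per-task RNG keeps runs reproducible
    sample = [rng.choice(expr) for _ in expr]    # bootstrap the samples (rows)
    regulators = [g for g in range(n_genes) if g != target]
    candidates = rng.sample(regulators, k)       # random subset of candidate regulators
    return best_split(sample, target, candidates)


def infer_network(expr, n_trees=50, k=None, seed=0, workers=4):
    """Return scores[regulator][target], averaged over n_trees stumps per target.

    expr is a list of samples; each sample is a list of gene expression values.
    """
    n_genes = len(expr[0])
    k = k or max(1, int((n_genes - 1) ** 0.5))
    tasks = [(expr, t, n_genes, k, seed + t * n_trees + i)
             for t in range(n_genes) for i in range(n_trees)]
    scores = [[0.0] * n_genes for _ in range(n_genes)]
    # Tasks share no state, so they can be dispatched in any order or in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for task, (g, gain) in zip(tasks, pool.map(grow_stump, tasks)):
            if g is not None:
                scores[g][task[1]] += gain / n_trees
    return scores
```

Calling `infer_network(expr)` on a samples-by-genes list of lists returns a matrix whose entry `[i][j]` scores gene `i` as a candidate regulator of gene `j`. The thread pool here only demonstrates that the tasks are independent; CPython threads will not yield a real speed-up for this CPU-bound work, and the mapping of trees onto GPU threads used in the paper is not reproduced here.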

Keywords

Gene regulatory network; Random forests; GPU; Compute unified device architecture (CUDA)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Seyed Ziaeddin Alborzi (1)
  • D. A. K. Maduranga (1)
  • Rui Fan (1)
  • Jagath C. Rajapakse (1, 2)
  • Jie Zheng (1, 3)

  1. Bioinformatics Research Centre, School of Computer Engineering, Nanyang Technological University, Singapore
  2. Department of Biological Engineering, Massachusetts Institute of Technology, USA
  3. A*STAR (Agency for Science, Technology, and Research), Genome Institute of Singapore, Singapore