Parallel Information Theory Based Construction of Gene Regulatory Networks

  • Jaroslaw Zola
  • Maneesha Aluru
  • Srinivas Aluru
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5374)


We present a parallel method for construction of gene regulatory networks from large-scale gene expression data. Our method integrates mutual information, data processing inequality and statistical testing to detect significant dependencies between genes, and efficiently exploits parallelism inherent in such computations. We present a novel method to carry out permutation testing for assessing statistical significance while reducing its computational complexity by a factor of Θ(n²), where n is the number of genes. Using both synthetic and known regulatory networks, we show that our method produces networks of quality similar to ARACNE, a widely used mutual information based method. We present a parallelization of the algorithm that, for the first time, allows construction of whole genome networks from thousands of microarray experiments using rigorous mutual information based methodology. We report the construction of a 15,147 gene network of the plant Arabidopsis thaliana from 2,996 microarray experiments on a 2,048-CPU Blue Gene/L in 45 minutes, thus addressing a grand challenge problem in the NSF Arabidopsis 2010 initiative.
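The three ingredients named in the abstract (pairwise mutual information, a permutation-based significance cutoff, and pruning with the data processing inequality) can be illustrated with a small sequential sketch. It is not the authors' implementation: the equal-width histogram estimator, the single cutoff drawn from randomly sampled permuted pairs, and every function and parameter name below are assumptions made for illustration, and no parallelization is attempted.

```python
import math
import random
from collections import Counter
from itertools import combinations


def discretize(values, bins=4):
    """Map each value to an equal-width bin index in [0, bins - 1] (assumed estimator)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant profiles fall into bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Plug-in mutual information (in nats) of two equal-length discrete sequences."""
    m = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / m) * math.log(c * m / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def null_threshold(profiles, n_perm=200, alpha=0.01, seed=0):
    """Estimate one MI cutoff from randomly chosen gene pairs with one profile shuffled."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_perm):
        x, y = rng.sample(profiles, 2)
        shuffled = y[:]
        rng.shuffle(shuffled)  # shuffling breaks any dependence between x and y
        null.append(mutual_information(x, shuffled))
    null.sort()
    return null[min(int((1 - alpha) * n_perm), n_perm - 1)]


def apply_dpi(mi, n):
    """In every fully connected gene triple, drop the edge with the smallest MI."""
    removed = set()
    for i, j, k in combinations(range(n), 3):
        triangle = [(i, j), (i, k), (j, k)]
        if all(edge in mi for edge in triangle):
            removed.add(min(triangle, key=mi.get))
    return {edge: v for edge, v in mi.items() if edge not in removed}


def build_network(expr, bins=4, n_perm=200, alpha=0.01, seed=0):
    """expr: list of gene profiles (one list of expression values per gene)."""
    profiles = [discretize(row, bins) for row in expr]
    cutoff = null_threshold(profiles, n_perm, alpha, seed)
    mi = {}
    for i, j in combinations(range(len(profiles)), 2):
        value = mutual_information(profiles[i], profiles[j])
        if value > cutoff:  # keep only pairs above the permutation cutoff
            mi[(i, j)] = value
    return apply_dpi(mi, len(profiles))
```

Calling `build_network(expr)` on a list of per-gene expression profiles returns a dictionary that maps gene index pairs `(i, j)` with `i < j` to their estimated mutual information. The pairwise loop is quadratic and the triple loop cubic in the number of genes, which gives a sense of why whole-genome inputs call for the parallel treatment the abstract describes.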


Keywords: gene networks; mutual information; parallel computational biology; systems biology





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jaroslaw Zola (1)
  • Maneesha Aluru (2)
  • Srinivas Aluru (1)
  1. Department of Electrical and Computer Engineering, Iowa State University, Ames, USA
  2. Department of Genetics, Cellular, and Developmental Biology, Iowa State University, Ames, USA
