Predictive Minimum Description Length Principle Approach to Inferring Gene Regulatory Networks

  • Vijender Chaitankar
  • Chaoyang Zhang
  • Preetam Ghosh
  • Ping Gong
  • Edward J. Perkins
  • Youping Deng
Part of the Advances in Experimental Medicine and Biology book series (AEMB, volume 696)


Reverse engineering of gene regulatory networks with information-theoretic models has received much attention because of its simplicity, low computational cost, and ability to infer large networks. A major problem with these models is determining the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been applied to overcome this problem. Under the MDL principle, the description length is the sum of the model length and the data encoding length. A user-specified fine-tuning parameter controls the trade-off between model and data encoding, but the optimal value of this parameter is difficult to find. In this work, we propose a new inference algorithm that combines mutual information (MI), conditional mutual information (CMI), and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information-theoretic quantities MI and CMI determine the regulatory relationships between genes, and the PMDL principle is used to determine the best MI threshold without the need for a user-specified fine-tuning parameter. The performance of the proposed algorithm is evaluated using both synthetic time series data sets and a biological time series data set (Saccharomyces cerevisiae). The results show that the proposed algorithm produced fewer false edges and significantly improved precision compared with the existing MDL algorithm.
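
To make the two information-theoretic quantities concrete, the following is a minimal sketch, not the authors' implementation. It assumes the expression values have already been discretized (here, binarized) and uses the simple plug-in (empirical frequency) estimator. The function names, the toy gene profiles, and the fixed threshold of 0.5 bits are hypothetical; choosing the threshold by the PMDL principle is not shown.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Plug-in joint Shannon entropy (bits) of one or more discrete sequences."""
    joint = list(zip(*columns))
    n = len(joint)
    counts = Counter(joint)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(x, y):
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z)
    return entropy(x, z) + entropy(y, z) - entropy(z) - entropy(x, y, z)


def candidate_edges(data, threshold):
    """Return gene pairs whose MI is at least `threshold`.

    `data` maps a gene name to its list of discretized expression levels.
    The threshold is supplied by the caller here (an assumption of this
    sketch); it is not selected by PMDL.
    """
    genes = sorted(data)
    return [
        (a, b)
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
        if mutual_information(data[a], data[b]) >= threshold
    ]


# Hypothetical binarized expression profiles over eight time points.
g1 = [0, 0, 1, 1, 0, 0, 1, 1]
g2 = list(g1)                          # identical to g1
g3 = [0, 1, 0, 1, 0, 1, 0, 1]          # independent of g1
g4 = [a ^ b for a, b in zip(g1, g3)]   # XOR of g1 and g3

mi_same = mutual_information(g1, g2)
mi_indep = mutual_information(g1, g3)
cmi_xor = conditional_mutual_information(g1, g3, g4)
edges = candidate_edges({"g1": g1, "g2": g2, "g3": g3}, threshold=0.5)
```

In this toy data, g1 and g3 share no information on their own, yet become fully dependent once g4 is known, which is the kind of relationship that pairwise MI misses and CMI can expose.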


Keywords: Mutual Information; Gene Regulatory Network; Boolean Network; Minimum Description Length; Conditional Mutual Information



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Vijender Chaitankar
  • Chaoyang Zhang
    • 1
  • Preetam Ghosh
  • Ping Gong
  • Edward J. Perkins
  • Youping Deng
  1. School of Computing, The University of Southern Mississippi, Hattiesburg, USA
