Abstract
New antimalarial drugs are urgently needed because of increased drug resistance in the causative agents, Plasmodium parasites. An intervention strategy based on interrupting the parasite cell cycle could be pursued using a systems biology-aided drug discovery approach. However, little is known to date about the components or the mechanism of parasite cell cycle control. In this proof-of-concept study, we attempted to infer the skeleton components using comparative genomic analysis and to uncover the genetic regulatory network (GRN) ab initio using a variational Bayesian expectation maximization (VBEM) approach.
Abbreviations
- APP: A posteriori probability
- AUC: Area under the curve
- BLAST: Basic local alignment search tool
- GRN: Genetic regulatory network
- KEGG: Kyoto encyclopedia of genes and genomes
- MAP: Maximum a posteriori
- MAPK: Mitogen-activated protein kinase
- MCM: Minichromosome maintenance
- ORF: Open reading frame
- PCNA: Proliferating cell nuclear antigen
- ROC: Receiver operating characteristic
- VBEM: Variational Bayesian expectation maximization
Acknowledgements
This work is supported in part by NSF grant CCF-0546345 to Y. Huang, and by NIH grant 1R21AI067543-01A1, San Antonio Area Foundation Biomedical Research funds, and a UTSA Faculty Research Award to Y. Wang. Y. Wang is also supported by NIH RCMI grant 2G12 RR013646-06A1. I. M. Tienda-Luna and M. C. Carrion are supported by MCyT under project TEC 2004-06096-C03-02/TCM. M. Sanchez is supported by an NIH MBRS-RISE (Minority Biomedical Research Support Research Initiative for Scientific Enhancement) fellowship.
Appendix
Conjugate priors of parameters and latent variables
We choose conjugate priors for the parameters and latent variables:
where \({\varvec{\upmu}}_{0}\) and \({\mathbf{C}}_{0}\) are the mean and covariance of the prior probability density \(p({\mathbf{b}}_{i})\), which are discussed in a later section. In general, \({\varvec{\upmu}}_{{{\mathbf{w}}_i}}\) is simply set to the zero vector, while \(\nu_{0}\) and \(\gamma_{0}\) are set to small positive real values. These priors satisfy the conditions for conjugate-exponential (CE) models, which facilitates the derivation of the VBE and VBM steps.
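The practical benefit of conjugacy can be illustrated with a minimal sketch. It assumes a much simpler setting than the model in this paper: a scalar unknown mean with a Gaussian prior and Gaussian noise of known variance. The function name `gaussian_posterior` and all numerical values are hypothetical and chosen only for illustration. Because the prior is conjugate to the likelihood, the posterior stays Gaussian and is available in closed form.

```python
def gaussian_posterior(prior_mean, prior_var, observations, noise_var):
    """Closed-form posterior over an unknown mean (Gaussian prior, Gaussian noise)."""
    n = len(observations)
    # Precisions (inverse variances) add under a conjugate Gaussian update.
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    # The posterior mean is a precision-weighted combination of prior mean and data.
    post_mean = post_var * (prior_mean / prior_var + sum(observations) / noise_var)
    return post_mean, post_var


# Illustrative values only: zero prior mean and a broad (weakly informative) prior.
mean, var = gaussian_posterior(
    prior_mean=0.0,
    prior_var=100.0,
    observations=[1.2, 0.8, 1.1, 0.9],
    noise_var=0.25,
)
print(mean, var)
```

With a broad prior the posterior mean lies close to the sample mean, and the posterior variance shrinks as more observations are added.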
Derivation of the VBE and VBM steps
Starting from an arbitrary distribution over the parameters, the VBE step computes an approximation to the APPs of the latent variables. By applying the CE model, \(q({\mathbf{b}}_{i})\) can be shown to have the following expression
where
with
We now turn to the VBM step, in which we compute \(q({\varvec{\theta}}_{i})\). Again, from the CE model, we obtain
where
with
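The alternation between the two steps can be sketched on a toy conjugate-exponential model. This is not the regulatory network model of this paper: the sketch assumes a two-component Gaussian mixture with known unit-variance components, binary latent labels, and a single unknown mixing weight with a Beta prior. The names `vbem_mixing_weight`, `digamma`, and `normal_logpdf`, and all numerical values, are hypothetical. The VBE step updates the distribution over the latent labels, and the VBM step updates the distribution over the parameter.

```python
import math


def digamma(x):
    """Digamma function for x > 0 via recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def normal_logpdf(x, mean):
    """Log density of a unit-variance Gaussian."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (x - mean) ** 2


def vbem_mixing_weight(data, means=(-2.0, 2.0), a0=1.0, b0=1.0, iters=50):
    """Toy VBEM: latent labels z_n in {0, 1}, mixing weight pi ~ Beta(a0, b0)."""
    a, b = a0, b0  # arbitrary starting distribution q(pi) = Beta(a, b)
    resp = []
    for _ in range(iters):
        # VBE step: q(z_n) from expectations of ln(pi), ln(1 - pi) under q(pi).
        e_log_pi = digamma(a) - digamma(a + b)
        e_log_1m = digamma(b) - digamma(a + b)
        resp = []
        for x in data:
            l1 = e_log_pi + normal_logpdf(x, means[1])
            l0 = e_log_1m + normal_logpdf(x, means[0])
            m = max(l0, l1)  # subtract the maximum for numerical stability
            w1, w0 = math.exp(l1 - m), math.exp(l0 - m)
            resp.append(w1 / (w0 + w1))
        # VBM step: conjugacy keeps q(pi) in the Beta family.
        a = a0 + sum(resp)
        b = b0 + len(data) - sum(resp)
    return a, b, resp


data = [2.1, 1.8, 2.4, -1.9, 2.2, -2.3, 1.7, 2.0]
a, b, resp = vbem_mixing_weight(data)
print("posterior mean of mixing weight:", a / (a + b))
```

Because both factors stay in conjugate families, each step reduces to updating a few hyperparameters, which is the same property the CE model provides for the updates in this appendix.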
Computation of the lower bound \({{\mathcal{F}}}\)
The convergence of the VBEM algorithm is tested using a lower bound on \(\ln p({\mathbf{y}}_{i})\). In this paper, we denote this lower bound by \({{\mathcal{F}}}\) and compute it using the most recent \(q({\mathbf{b}}_{i})\) and \(q({\varvec{\theta}}_{i})\) obtained in the iterative process. \({{\mathcal{F}}}\) can be written more succinctly using the KL divergence. We first review the definition of the KL divergence and then derive an analytical expression for \({{\mathcal{F}}}\).
The KL divergence, also termed relative entropy, measures the difference between two probability distributions. Using this definition, we can write the difference between the true and the approximate distributions in the following way:
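As a small numerical illustration, the sketch below evaluates the standard discrete form of the divergence, \(\mathrm{KL}(q\,\|\,p) = \sum_{x} q(x)\ln\frac{q(x)}{p(x)}\), for two hypothetical distributions. The function name `kl_divergence` and the example probabilities are assumptions made for illustration. The distributions in this paper are continuous, so the sum would be replaced by an integral.

```python
import math


def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions given as lists of probabilities."""
    if len(q) != len(p):
        raise ValueError("distributions must have the same number of outcomes")
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue  # terms with q(x) = 0 contribute nothing
        if pi == 0.0:
            return math.inf  # q puts mass where p has none
        total += qi * math.log(qi / pi)
    return total


q = [0.5, 0.5]
p = [0.9, 0.1]
print(kl_divergence(q, q))  # zero when the two distributions coincide
print(kl_divergence(q, p), kl_divergence(p, q))  # non-negative and asymmetric
```

The divergence is zero only when the two distributions agree, and it is not symmetric in its arguments, so the order of the true and approximate distributions matters.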
Finally, the lower bound \({\mathcal{F}}\) can be written in terms of the previous definitions as:
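The relation between the lower bound and the KL divergence can be checked numerically on a toy model. The sketch below assumes a single discrete latent variable with three states and no unknown parameters, so it is a simplification of the bound used in this paper, which involves both \(q({\mathbf{b}}_{i})\) and \(q({\varvec{\theta}}_{i})\). All probabilities are hypothetical. It verifies the general identity that the log evidence equals the lower bound plus the KL divergence between the approximate and the exact posterior.

```python
import math

# Hypothetical toy model: latent b in {0, 1, 2}, one observed y.
prior = [0.5, 0.3, 0.2]          # p(b)
likelihood = [0.10, 0.40, 0.70]  # p(y | b) for the observed y

joint = [pb * ly for pb, ly in zip(prior, likelihood)]  # p(y, b)
evidence = sum(joint)                                   # p(y)
posterior = [j / evidence for j in joint]               # p(b | y)


def lower_bound(q):
    """F(q) = sum_b q(b) * ln[p(y, b) / q(b)]."""
    return sum(qb * math.log(jb / qb) for qb, jb in zip(q, joint) if qb > 0.0)


def kl(q, p):
    """KL(q || p) for discrete distributions with p(b) > 0."""
    return sum(qb * math.log(qb / pb) for qb, pb in zip(q, p) if qb > 0.0)


q = [1 / 3, 1 / 3, 1 / 3]  # an arbitrary approximate posterior
gap = math.log(evidence) - lower_bound(q)
print("ln p(y)          :", math.log(evidence))
print("F(q)             :", lower_bound(q))
print("gap vs KL(q||post):", gap, kl(q, posterior))
```

The gap between the log evidence and the bound is exactly the KL divergence, so the bound is tight only when the approximate posterior equals the exact one. This is why an increase in \({\mathcal{F}}\) that levels off can serve as a convergence check.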
Cite this article
Tienda-Luna, I.M., Yin, Y., Carrion, M.C. et al. Inferring the skeleton cell cycle regulatory network of malaria parasite using comparative genomic and variational Bayesian approaches. Genetica 132, 131–142 (2008). https://doi.org/10.1007/s10709-007-9155-4