Inferring the skeleton cell cycle regulatory network of malaria parasite using comparative genomic and variational Bayesian approaches

Genetica

Abstract

The development of new antimalarial drugs is urgently needed because of rising drug resistance in the causative agents, Plasmodium parasites. An intervention strategy based on interrupting the parasite cell cycle could be pursued using a systems-biology-aided drug discovery approach. However, little is known to date about the components or the mechanism of parasite cell cycle control. In this proof-of-concept study, we attempted to infer the skeleton components using comparative genomic analysis and to uncover the genetic regulatory network (GRN) ab initio using a variational Bayesian expectation maximization (VBEM) approach.

Abbreviations

APP: A posteriori probability
AUC: Area under the curve
BLAST: Basic local alignment search tool
GRN: Genetic regulatory network
KEGG: Kyoto encyclopedia of genes and genomes
MAP: Maximum a posteriori
MAPK: Mitogen-activated protein kinase
MCM: Minichromosome maintenance
ORF: Open reading frame
PCNA: Proliferating cell nuclear antigen
ROC: Receiver operating characteristic
VBEM: Variational Bayesian expectation maximization

Acknowledgements

This work is supported in part by NSF grant CCF-0546345 to Y. Huang, and by NIH 1R21AI067543-01A1, San Antonio Area Foundation Biomedical Research funds, and a UTSA Faculty Research Award to Y. Wang. Y. Wang is also supported by NIH RCMI grant 2G12 RR013646-06A1. I. M. Tienda-Luna and M. C. Carrion are supported by MCyT under project TEC 2004-06096-C03-02/TCM. M. Sanchez is supported by an NIH MBRS-RISE (Minority Biomedical Research Support, Research Initiative for Scientific Enhancement) fellowship.

Author information

Corresponding author

Correspondence to Yufeng Wang.

Appendix

Conjugate priors of parameters and latent variables

We choose conjugate priors for the parameters and latent variables:

$$ \begin{aligned} p({\mathbf{b}}_{i})&={\mathcal{N}}({\mathbf{b}}_{i} \vert{\varvec{\upmu}}_{0},{\mathbf{C}}_{0})\\ p({\varvec{\theta}}_{i})&=p({\mathbf{w}}_{i}, \sigma_{i}^{2})= p({\mathbf{w}}_{i}|\sigma_{i}^{2})p(\sigma_{i}^{2})\\ &={\mathcal{N}}({\mathbf{w}}_{i}|{\varvec{\upmu}}_{{\mathbf{w}}_{i}},\sigma_{i}^{2}{\mathbf{I}}_{G})I{\mathcal{G}}\left(\frac{\nu_{0}}{2}, \frac{\gamma_{0}}{2}\right)\\ \end{aligned} $$
(8)

where \({\varvec{\upmu}}_{0}\) and \({\mathbf{C}}_{0}\) are the mean and the covariance of the prior probability density \(p({\mathbf{b}}_{i})\); they will be discussed in a later section. In general, \({\varvec{\upmu}}_{{{\mathbf{w}}_i}}\) is simply set to the zero vector, and \(\nu_{0}\) and \(\gamma_{0}\) are set to small positive real values. These priors satisfy the conditions for conjugate-exponential (CE) models, which facilitates the derivation of the VBE and VBM steps.
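To make the prior structure of Eq. 8 concrete, the following minimal Python sketch draws one sample from it. Several choices here are assumptions rather than statements from the text: the inverse-gamma density is read as having shape \(\nu_{0}/2\) and scale \(\gamma_{0}/2\), the prior mean of \({\mathbf{w}}_{i}\) is the zero vector, and the function name `sample_priors` is hypothetical. The default hyperparameters are deliberately larger than the "small positive values" mentioned above, only so that the draw is numerically well behaved.

```python
import numpy as np

def sample_priors(mu0, C0, nu0=4.0, gamma0=2.0, seed=0):
    """Draw one (b_i, w_i, sigma_i^2) sample from the priors of Eq. 8 (sketch)."""
    rng = np.random.default_rng(seed)
    mu0 = np.asarray(mu0, dtype=float)
    G = mu0.shape[0]
    # b_i ~ N(mu0, C0)
    b = rng.multivariate_normal(mu0, C0)
    # sigma_i^2 ~ IG(shape=nu0/2, scale=gamma0/2), drawn as the reciprocal of a
    # Gamma(shape=nu0/2, rate=gamma0/2) variate (assumed parameterization).
    sigma2 = 1.0 / rng.gamma(shape=nu0 / 2.0, scale=2.0 / gamma0)
    # w_i | sigma_i^2 ~ N(0, sigma_i^2 I_G), prior mean set to zero.
    w = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=G)
    return b, w, sigma2
```

Sampling the variance first and the weights second mirrors the factorization \(p({\mathbf{w}}_{i}|\sigma_{i}^{2})p(\sigma_{i}^{2})\) in Eq. 8.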

Derivation of the VBE and VBM steps

Starting from an arbitrary distribution over the parameters, the VBE step computes an approximation to the APPs of the latent variables. For the CE model, \(q({\mathbf{b}}_{i})\) can be shown to take the form

$$ q({\mathbf{b}}_{i})={\mathcal{N}}\left({\mathbf{b}}_{i}|{\varvec{\upmu}}_{{\mathbf{b}}_{i}},{\varvec{\Upsigma}}_{{\mathbf{b}}_i}\right) $$
(9)

where

$$ \begin{aligned} {\varvec{\upmu}}_{{\mathbf{b}}_{i}}&={\varvec{\Upsigma}}_{{\mathbf{b}}_{i}}(\sigma_{0}^{-2}{\varvec{\upmu}}_{0}+{\mathbf{f}})\\ {\varvec{\Upsigma}}_{{\mathbf{b}}_{i}}&=(\sigma_{0}^{-2}{\mathbf{I}}_{G}+{\mathbf{D}})^{-1}\\ \end{aligned} $$
(10)

with

$$ \begin{aligned} {\mathbf{D}}&={\mathbf{B}}\otimes\left[{\mathbf{m}}_{{\mathbf{w}}_{i}}{\mathbf{m}}_{{\mathbf{w}}_{i}}^{\rm T} \langle\sigma_{i}^{-2}\rangle_{q({\varvec{\theta}}_{i})}+{\mathbf{A}}^{-1}\right]\\ {\mathbf{f}}^{\rm T}&={\mathbf{y}}_{i}^{\rm T}{\mathbf{R}}\,{\rm diag}\,({\mathbf{m}}_{{\mathbf{w}}_{i}})\langle \sigma_{i}^{-2}\rangle_{q({\varvec{\theta}}_{i})}\\ {\mathbf{B}}&={\mathbf{R}}^{\rm T}{\mathbf{R}}\\ {\mathbf{A}}&={\mathbf{I}}_{G}+{\mathbf{K}}\\ {\mathbf{K}}&={\mathbf{B}}\otimes({\varvec{\Upsigma}}_{{\mathbf{b}}_{i}}+{\varvec{\upmu}}_{{\mathbf{b}}_{i}}{\varvec{\upmu}}_{{\mathbf{b}}_{i}}^{\rm T})\\ \end{aligned} $$
(11)
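The VBE update of Eqs. 10 and 11 can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation: \(\otimes\) is read as the element-wise (Hadamard) product, which keeps every matrix \(G \times G\); the bracketed term in \({\mathbf{D}}\) is read as the outer product of \({\mathbf{m}}_{{\mathbf{w}}_{i}}\) with itself; \({\mathbf{R}}\) is taken to be an \(N \times G\) matrix and \({\mathbf{y}}_{i}\) a length-\(N\) vector; and the prior covariance is taken as \(\sigma_{0}^{2}{\mathbf{I}}_{G}\). The function and argument names are hypothetical.

```python
import numpy as np

def vbe_step(y, R, m_w, A, e_inv_sigma2, mu0, sigma0_sq):
    """One VBE update for q(b_i), following Eqs. 10-11 (sketch).

    Assumptions: the circled-times symbol is the element-wise product,
    R has shape (N, G), and e_inv_sigma2 is the expectation <sigma_i^-2>.
    """
    G = R.shape[1]
    B = R.T @ R                                   # B = R^T R
    # D = B (.) [m_w m_w^T <sigma^-2> + A^{-1}]
    D = B * (np.outer(m_w, m_w) * e_inv_sigma2 + np.linalg.inv(A))
    # f^T = y^T R diag(m_w) <sigma^-2>, stored as a 1-D array
    f = (y @ R) * m_w * e_inv_sigma2
    # Sigma_b = (sigma0^-2 I_G + D)^{-1}
    Sigma_b = np.linalg.inv(np.eye(G) / sigma0_sq + D)
    # mu_b = Sigma_b (sigma0^-2 mu0 + f)
    mu_b = Sigma_b @ (mu0 / sigma0_sq + f)
    # Refresh K and A with the new moments of b_i
    K = B * (Sigma_b + np.outer(mu_b, mu_b))
    A_new = np.eye(G) + K
    return mu_b, Sigma_b, A_new
```

Because \({\mathbf{A}}\) depends on the moments of \(q({\mathbf{b}}_{i})\) through \({\mathbf{K}}\), the sketch takes the previous \({\mathbf{A}}\) as input and returns the refreshed one for the next iteration.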

We now turn to the VBM step in which we compute \(q({\varvec{\theta}}_{i}).\) Again, from the CE model, we obtain

$$ q({\varvec{\theta}}_{i})={\mathcal{N}}({\mathbf{w}}_{i}|{\mathbf{m}}_{{\mathbf {w}}_{i}},{\varvec{\Upsigma}}_{{\mathbf{w}}_{i}})I{\mathcal{G}}\left(\frac{\alpha}{2},\frac{\beta}{2}\right) $$
(12)

where

$$ \begin{aligned} {\mathbf{m}}_{{\mathbf{w}}_i}&=\left({\mathbf{I}}_G+{\mathbf{K}}\right)^{-1}\left({\mathbf{y}}_i^{\rm T}{\mathbf{R}}{\mathbf{M}}_x\right)^{\rm T}\\ {\varvec{\Upsigma}}_{{\mathbf{w}}_i}&=\sigma_i^2 \left({\mathbf{I}}_G+{\mathbf{K}}\right)^{-1} \\ \frac{\alpha}{2}&=\frac{N\left({\eta +1}\right)-2}{2} \\ \frac{\beta}{2}&=-\frac{c}{2}\\ \end{aligned} $$
(13)

with

$$ \begin{aligned} {\mathbf{M}}_x&={\rm diag}\left({\varvec{\upmu}}_{{\mathbf{b}}_i}\right) \\ c&={\mathbf{y}}_i^{\rm T} {\mathbf{R}}{\mathbf{M}}_x {\mathbf{m}}_{{\mathbf{w}}_i}-{\mathbf{y}}_i^{\rm T} {\mathbf{y}}_i-\nu_0 \\ \end{aligned} $$
(14)
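A matching sketch of the VBM update in Eqs. 13 and 14 is given below. As with any illustration of this kind, it rests on assumptions: \(\otimes\) is again read as the element-wise product, \({\mathbf{M}}_x\) is built from the mean of \(q({\mathbf{b}}_{i})\), the constant \(\eta\) (not defined in this appendix) is passed in as a plain argument, and the inverse-gamma factor is read as having shape \(\alpha/2\) and scale \(\beta/2\), so that \(\langle\sigma_{i}^{-2}\rangle = \alpha/\beta\). The function name `vbm_step` is hypothetical.

```python
import numpy as np

def vbm_step(y, R, mu_b, Sigma_b, nu0, eta):
    """One VBM update for q(theta_i), following Eqs. 13-14 (sketch).

    Assumptions: element-wise product for the circled-times symbol,
    R of shape (N, G), and eta supplied by the caller.
    """
    N, G = R.shape
    B = R.T @ R
    K = B * (Sigma_b + np.outer(mu_b, mu_b))      # K = B (.) (Sigma_b + mu_b mu_b^T)
    A = np.eye(G) + K                             # A = I_G + K
    M_x = np.diag(mu_b)                           # M_x = diag(mean of q(b_i))
    rhs = M_x @ R.T @ y                           # (y^T R M_x)^T
    m_w = np.linalg.solve(A, rhs)                 # m_w = (I_G + K)^{-1} rhs
    c = rhs @ m_w - y @ y - nu0                   # c of Eq. 14
    alpha_half = (N * (eta + 1) - 2) / 2.0
    beta_half = -c / 2.0
    # Mean of 1/sigma^2 under IG(shape=alpha/2, scale=beta/2)
    e_inv_sigma2 = alpha_half / beta_half
    return m_w, A, alpha_half, beta_half, e_inv_sigma2
```

Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse; the covariance \({\varvec{\Upsigma}}_{{\mathbf{w}}_i}\) is then \(\sigma_i^2\) times the inverse of the returned `A`.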

Computation of the lower bound \({{\mathcal{F}}}\)

The convergence of the VBEM algorithm is tested using a lower bound on \(\ln p({\mathbf{y}}_{i})\). We denote this lower bound by \({{\mathcal{F}}}\) and compute it from the most recent \(q({\mathbf{b}}_{i})\) and \(q({\varvec{\theta}}_{i})\) obtained in the iterative process. \({{\mathcal{F}}}\) can be written more succinctly using the KL divergence. We first review the definition of the KL divergence and then derive an analytical expression for \({{\mathcal{F}}}.\)

The KL divergence, also termed relative entropy, measures the difference between two probability distributions. Using this definition, we can write the difference between the true and the approximate distributions as:

$$ \begin{aligned} KL\left[q\left({\mathbf{b}}_{i}\right)\parallel p\left({\mathbf{b}}_{i},{\mathbf{y}}_i| {\varvec{\theta}}_{i}\right)\right]&=-\int {\rm d}{\mathbf{b}}_i\; q\left({\mathbf{b}}_i\right)\ln \frac{p\left({\mathbf{b}}_i,{\mathbf{y}}_i|{\varvec{\theta}}_i\right)} {q\left({\mathbf{b}}_i\right)}\\ KL\left[q\left({\varvec{\theta}}_i\right)\parallel p\left({\varvec{\theta}}_i\right)\right]&= -\int {\rm d}{\varvec{\theta}}_i\; q\left({\varvec{\theta}}_i\right)\ln \frac{p\left({\varvec{\theta}}_i \right)} {q\left({\varvec{\theta}}_i\right)}\\ \end{aligned} $$
(15)
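As a quick numerical check of the KL definition in Eq. 15, the sketch below evaluates \(-\int q \ln (p/q)\) on a grid for two univariate Gaussians and compares it with the standard closed-form Gaussian KL divergence. The two densities and the grid are arbitrary illustrative choices, not quantities from the model, and the helper names are hypothetical.

```python
import numpy as np

def gauss_logpdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - 0.5 * ((x - mean) / std) ** 2

def kl_numeric(mean_q, std_q, mean_p, std_p, lo=-12.0, hi=12.0, n=20001):
    """KL[q || p] = -int q ln(p/q), via the trapezoidal rule on a grid."""
    x = np.linspace(lo, hi, n)
    log_q = gauss_logpdf(x, mean_q, std_q)
    log_p = gauss_logpdf(x, mean_p, std_p)
    integrand = -np.exp(log_q) * (log_p - log_q)
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))

def kl_closed_form(mean_q, std_q, mean_p, std_p):
    """Closed-form KL divergence between two univariate Gaussians."""
    return (np.log(std_p / std_q)
            + (std_q**2 + (mean_q - mean_p) ** 2) / (2.0 * std_p**2)
            - 0.5)

# Illustrative comparison: q = N(0, 1), p = N(1, 2^2)
numeric = kl_numeric(0.0, 1.0, 1.0, 2.0)
exact = kl_closed_form(0.0, 1.0, 1.0, 2.0)
```

The two values should agree to several decimal places, and swapping the roles of `q` and `p` gives a different number, which is a reminder that the KL divergence is not symmetric.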

Finally, the lower bound \({\mathcal{F}}\) can be written in terms of these definitions as:

$$ \begin{aligned} {\mathcal{F}}&=\int {\rm d} {\varvec{\theta}}_i\; q\left({\varvec{\theta}}_i \right)\left[\int {\rm d}{\mathbf{b}}_i\; q\left({\mathbf{b}}_i \right) \ln \frac{p\left({\mathbf{b}}_i,{\mathbf{y}}_i|{\varvec{\theta}}_i \right)}{q\left({\mathbf{b}}_i\right)}+ \ln \frac{p\left({\varvec{\theta}}_i \right)}{q\left({\varvec{\theta}}_i \right)}\right]\\ &=-\int {\rm d} {\varvec{\theta}}_i\; q\left({\varvec{\theta}}_i\right)KL\left[q\left({\mathbf{b}}_i \right) \parallel p\left({\mathbf{b}}_i,{\mathbf{y}}_i|{\varvec{\theta}}_i \right)\right]-KL\left[q\left({\varvec{\theta}}_i \right)\parallel p\left({\varvec{\theta}}_i \right)\right]\\ \end{aligned} $$
(16)
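For completeness, the reason \({\mathcal{F}}\) is a lower bound on \(\ln p({\mathbf{y}}_i)\) can be spelled out with Jensen's inequality. The step below is a standard argument and assumes only the factorized approximation \(q({\mathbf{b}}_i)q({\varvec{\theta}}_i)\) used in the VBE and VBM steps:

$$ \begin{aligned} \ln p({\mathbf{y}}_i)&=\ln \int {\rm d}{\varvec{\theta}}_i \int {\rm d}{\mathbf{b}}_i\; q({\varvec{\theta}}_i)\,q({\mathbf{b}}_i)\,\frac{p({\mathbf{b}}_i,{\mathbf{y}}_i|{\varvec{\theta}}_i)\,p({\varvec{\theta}}_i)}{q({\mathbf{b}}_i)\,q({\varvec{\theta}}_i)}\\ &\ge \int {\rm d}{\varvec{\theta}}_i \int {\rm d}{\mathbf{b}}_i\; q({\varvec{\theta}}_i)\,q({\mathbf{b}}_i)\,\ln \frac{p({\mathbf{b}}_i,{\mathbf{y}}_i|{\varvec{\theta}}_i)\,p({\varvec{\theta}}_i)}{q({\mathbf{b}}_i)\,q({\varvec{\theta}}_i)}={\mathcal{F}}\\ \end{aligned} $$

The gap \(\ln p({\mathbf{y}}_i)-{\mathcal{F}}\) equals the KL divergence between \(q({\mathbf{b}}_i)q({\varvec{\theta}}_i)\) and the true posterior \(p({\mathbf{b}}_i,{\varvec{\theta}}_i|{\mathbf{y}}_i)\), so it is non-negative, and increasing \({\mathcal{F}}\) tightens the approximation.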

About this article

Cite this article

Tienda-Luna, I.M., Yin, Y., Carrion, M.C. et al. Inferring the skeleton cell cycle regulatory network of malaria parasite using comparative genomic and variational Bayesian approaches. Genetica 132, 131–142 (2008). https://doi.org/10.1007/s10709-007-9155-4
