Bayesian variable selection for gene expression modeling with regulatory motif binding sites in neuroinflammatory events

  • Original Article
  • Neuroinformatics

Abstract

Multiple transcription factors (TFs) coordinately control transcriptional regulation of genes in eukaryotes. Although numerous computational methods focus on identifying individual TF-binding sites (TFBSs), very few consider the interdependence among these sites. In this article, we studied the relationship between TFBSs and microarray gene expression levels using both family-wise and member-specific motifs, under various combinations of regression models with Bayesian variable selection, as well as motif scoring and sharing conditions, in order to account for the coordination complexity of transcription regulation. We proposed a three-step approach to model the relationship. In the first step, we preprocessed the microarray data and used p-values and expression ratios to preselect upregulated and downregulated genes. The second step aimed to identify and score individual TFBSs within the DNA sequence of each gene. A method based on the degree of similarity and the number of TFBSs was employed to calculate the score of each TFBS in each gene sequence. In the last step, linear regression and probit regression were used to build a predictive model of gene expression outcomes using these TFBSs as predictors. Given a certain number of predictors to be used, a full search of all possible predictor sets is usually combinatorially prohibitive. Therefore, this article considered Bayesian variable selection for prediction using either of the regression models. Bayesian variable selection has been applied in the context of gene selection, missing value estimation, and regulatory motif identification. In our modeling, the regressor was approximated as a linear combination of the TFBSs, and a Gibbs sampler was employed to find the strongest TFBSs. We applied these regression models with Bayesian variable selection to a spinal cord injury gene expression data set. The TFs identified in this way demonstrated intricate regulatory roles either as a family or as individual members in neuroinflammatory events. Our analysis can be applied to create plausible hypotheses for combinatorial regulation by TFBSs while avoiding false-positive candidates in the modeling process. Such a systematic approach makes it possible to dissect transcription regulation from a more comprehensive perspective, in which phenotypical events at the cellular and tissue levels are driven by molecular events at the gene transcription and translation levels.
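
To make the last step more concrete, the following is a minimal sketch of Bayesian variable selection for linear regression with a Gibbs sampler over inclusion indicators. It is not the authors' implementation: the abstract does not state the priors, so the sketch assumes a standard g-prior on the coefficients, a flat prior on the log noise variance, and independent Bernoulli priors on inclusion. The function names `log_marginal`, `gibbs_select`, and `simulate_data`, the hyperparameters `c` and `prior_incl`, and the synthetic TFBS score matrix are all hypothetical and chosen for illustration only.

```python
import numpy as np


def log_marginal(X, y, gamma, c):
    """Log marginal likelihood of y given inclusion vector gamma, up to a constant.

    Assumes a g-prior with scale c on the selected coefficients and centered data.
    """
    n = y.shape[0]
    q = int(gamma.sum())
    yty = float(y @ y)
    if q == 0:
        return -0.5 * n * np.log(yty)
    Xg = X[:, gamma]
    # Least-squares fit on the selected columns only.
    beta_hat, *_ = np.linalg.lstsq(Xg, y, rcond=None)
    fit = float(y @ (Xg @ beta_hat))
    s = yty - (c / (1.0 + c)) * fit
    return -0.5 * q * np.log1p(c) - 0.5 * n * np.log(s)


def gibbs_select(X, y, n_iter=1000, burn_in=200, c=100.0, prior_incl=0.2, seed=0):
    """Return estimated posterior inclusion probabilities for each column of X."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    gamma = np.zeros(p, dtype=bool)  # start from the empty model
    counts = np.zeros(p)
    prior_log_odds = np.log(prior_incl) - np.log1p(-prior_incl)
    for it in range(n_iter):
        for j in range(p):
            # Compare the model with and without predictor j, all else fixed.
            gamma[j] = True
            log_in = log_marginal(X, y, gamma, c)
            gamma[j] = False
            log_out = log_marginal(X, y, gamma, c)
            logit = np.clip(log_in - log_out + prior_log_odds, -50.0, 50.0)
            gamma[j] = rng.random() < 1.0 / (1.0 + np.exp(-logit))
        if it >= burn_in:
            counts += gamma
    return counts / (n_iter - burn_in)


def simulate_data(seed=1, n_genes=80, n_tfbs=10):
    """Synthetic data: rows are genes, columns are TFBS scores (illustration only)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_genes, n_tfbs))
    # Only columns 1 and 4 influence the simulated expression level.
    y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(scale=0.5, size=n_genes)
    return X - X.mean(axis=0), y - y.mean()


if __name__ == "__main__":
    X, y = simulate_data()
    print(np.round(gibbs_select(X, y), 2))
```

Sampling one indicator at a time avoids enumerating all 2^p predictor subsets, which is the combinatorial cost the abstract refers to. The probit variant mentioned in the abstract would additionally sample a latent continuous response at each iteration and is omitted here.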

Author information

Corresponding author

Correspondence to Kuang-Yu Liu.

About this article

Cite this article

Liu, KY., Zhou, X., Kan, K. et al. Bayesian variable selection for gene expression modeling with regulatory motif binding sites in neuroinflammatory events. Neuroinform 4, 95–117 (2006). https://doi.org/10.1385/NI:4:1:95

  • DOI: https://doi.org/10.1385/NI:4:1:95
