Combinatorial Genetic Regulatory Network Analysis Tools for High Throughput Transcriptomic Data

  • Elissa J. Chesler
  • Michael A. Langston
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4023)

Abstract

A series of genome-scale algorithms and high-performance implementations is described and shown to be useful in the genetic analysis of gene transcription. With them it is possible to address common questions such as: “are the sets of genes co-expressed under one type of condition the same as those co-expressed under another?” A new noise-adaptive graph algorithm, dubbed “paraclique,” is introduced and analyzed for use in biological hypothesis testing. A notion of vertex coverage is also devised, based on vertex-disjoint paths within correlation graphs, and used to determine the identity, proportion and number of transcripts connected to individual phenotypes and quantitative trait loci (QTL) regulatory models. A major goal is to identify which, among a set of candidate genes, are the most likely regulators of trait variation. These methods are applied in an effort to identify multiple-QTL regulatory models for large groups of genetically co-expressed genes, and to extrapolate the consequences of this genetic variation for phenotypes observed across levels of biological scale through the evaluation of vertex coverage. This approach is furthermore applied to definitions of homology-based gene sets, and to the incorporation of categorical data such as known gene pathways. In all these tasks, discrete mathematics and combinatorial algorithms form the organizing principles upon which methods and implementations are based.
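
The abstract names the paraclique without defining it. As a rough illustration only, and not the authors' implementation, the sketch below assumes one common formulation: build a correlation graph by thresholding pairwise Pearson correlations, find a maximum clique, then repeatedly add ("glom onto") any vertex that is adjacent to all but at most a few current members. The function names, the `glom` parameter, the threshold value, and the brute-force clique search are all assumptions made for this example; real transcriptomic graphs would need far more scalable clique algorithms.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation; treat as 0
    return sxy / sqrt(sxx * syy)


def correlation_graph(profiles, threshold):
    """Adjacency sets with an edge wherever |r| >= threshold (hypothetical cutoff)."""
    adj = {gene: set() for gene in profiles}
    for u, v in combinations(profiles, 2):
        if abs(pearson(profiles[u], profiles[v])) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def maximum_clique(adj):
    """Exhaustive search from largest size down; only viable for tiny graphs."""
    nodes = sorted(adj)
    for size in range(len(nodes), 0, -1):
        for cand in combinations(nodes, size):
            if all(v in adj[u] for u, v in combinations(cand, 2)):
                return set(cand)
    return set()


def paraclique(adj, glom=1):
    """Grow a maximum clique by adding vertices that miss at most `glom` edges
    to the current member set (assumed formulation, see lead-in)."""
    members = maximum_clique(adj)
    grew = True
    while grew:
        grew = False
        for v in sorted(set(adj) - members):
            if len(members - adj[v]) <= glom:
                members.add(v)
                grew = True
    return members
```

With `glom=0` the function returns the maximum clique unchanged; larger values tolerate more missing edges, which is one way to read "noise-adaptive" for correlation data in which a few true edges fall just below the threshold.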

Keywords

Microarray Analysis; Putative Co-Regulation; Quantitative Trait Loci; Regulatory Models

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Elissa J. Chesler (1)
  • Michael A. Langston (2)
  1. Life Sciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831-6124, USA
  2. Department of Computer Science, University of Tennessee, Knoxville, TN 37996-3450, USA
