Genomic Signal Processing: From Matrix Algebra to Genetic Networks

  • Orly Alter
Part of the Methods in Molecular Biology™ book series (MIMB, volume 377)


DNA microarrays make it possible, for the first time, to record the complete genomic signals that guide the progression of cellular processes. Future discovery in biology and medicine will come from the mathematical modeling of these data, which hold the key to fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment, and drug development. This chapter reviews the first data-driven models that were created from these genome-scale data, through adaptations and generalizations of mathematical frameworks from matrix algebra that have proven successful in describing the physical world, in such diverse areas as mechanics and perception: the singular value decomposition model, the generalized singular value decomposition comparative model, and the pseudoinverse projection integrative model. These models provide mathematical descriptions of the genetic networks that generate and sense the measured data, where the mathematical variables and operations represent biological reality. The variables, which are patterns uncovered in the data, correlate with activities of cellular elements, such as regulators or transcription factors, that drive the measured signals, and with cellular states where these elements are active. The operations, such as data reconstruction, rotation, and classification in subspaces of selected patterns, simulate experimental observation of only the cellular programs that these patterns represent. These models are illustrated in the analyses of RNA expression data from yeast and human during their cell cycle programs and DNA-binding data from yeast cell cycle transcription factors and replication initiation proteins.
Two alternative pictures of RNA expression oscillations during the cell cycle that emerge from these analyses, which parallel well-known designs of physical oscillators, convey the capacity of the models to elucidate the design principles of cellular systems, as well as guide the design of synthetic ones. In these analyses, the power of the models to predict previously unknown biological principles is demonstrated with a prediction of a novel mechanism of regulation that correlates DNA replication initiation with cell cycle-regulated RNA transcription in yeast. These models may become the foundation of a future in which biological systems are modeled as physical systems are today.
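
The two matrix operations named above that have direct NumPy counterparts, data reconstruction in a subspace of selected patterns and pseudoinverse projection onto a basis, can be sketched as follows. This is a minimal illustration on synthetic data, not the chapter's code or data: the matrix sizes, the `reconstruct` helper, and the random `basis` matrix are all assumptions made for the example. The generalized SVD is omitted because NumPy does not provide a routine for it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a genes x arrays expression matrix (not real data):
# a steady-state offset plus two oscillating patterns and a little noise.
n_genes, n_arrays = 200, 12
t = np.linspace(0.0, 2.0 * np.pi, n_arrays, endpoint=False)
true_patterns = np.vstack([np.ones(n_arrays), np.cos(t), np.sin(t)])
loadings = rng.normal(size=(n_genes, 3)) * np.array([5.0, 2.0, 2.0])
data = loadings @ true_patterns + 0.1 * rng.normal(size=(n_genes, n_arrays))

# SVD: data = U @ diag(s) @ Vt. Rows of Vt are patterns across arrays,
# columns of U are the matching patterns across genes.
U, s, Vt = np.linalg.svd(data, full_matrices=False)
fraction = s**2 / np.sum(s**2)  # relative weight of each pattern


def reconstruct(U, s, Vt, keep):
    """Rebuild the data using only the patterns whose indices are in `keep`."""
    keep = np.asarray(keep)
    return (U[:, keep] * s[keep]) @ Vt[keep, :]


# Reconstruction in the subspace of selected patterns: here the second and
# third patterns are kept and all others are filtered out.
filtered = reconstruct(U, s, Vt, keep=[1, 2])

# Pseudoinverse projection: express the data as a least-squares superposition
# of the columns of a basis matrix (hypothetical here; random numbers stand in
# for, e.g., genome-scale binding profiles measured over the same genes).
basis = rng.normal(size=(n_genes, 4))
coefficients = np.linalg.pinv(basis) @ data  # shape: 4 x n_arrays
projected = basis @ coefficients             # part of the data the basis spans
```

Keeping every pattern in `reconstruct` returns the original matrix, so selecting a subset of indices acts as a filter. The residual `data - projected` is orthogonal to every column of `basis`, which is what makes `coefficients` the least-squares fit.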

Key Words

Singular value decomposition (SVD); generalized SVD (GSVD); pseudoinverse projection; blind source separation (BSS) algorithms; genome-scale RNA expression and proteins’ DNA-binding data; cell cycle; yeast Saccharomyces cerevisiae; human HeLa cell line; analog harmonic and digital ring oscillators



Copyright information

© Humana Press Inc., Totowa, NJ 2007

Authors and Affiliations

  • Orly Alter
    • 1
  1. Department of Biomedical Engineering, Institute for Cellular and Molecular Biology, and Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin
