Data Mining in Proteomics pp 61-77

Part of the Methods in Molecular Biology book series (MIMB, volume 696)

The Origin and Early Reception of Sequence Databases


Emerging areas of scientific research never arise in a social or intellectual vacuum; they must establish themselves in relation to well-established disciplines. This necessity poses challenges for scientists, who must not only create a new disciplinary identity but also defend their research against criticism and even condescension from other scientists. The early use of sequence databases provides an excellent case study for examining the challenges facing novel sciences. The need for sequence databases grew out of protein sequencing in biochemistry beginning in the late 1950s. The rapid increase in the number of sequences made databases an attractive resource, but protein biochemists often considered building, managing, and doing research with databases a “second-rate” science. Similarly, computational biologists who used databases and digital computers to study evolutionary phenomena faced criticism from more traditional evolutionary biologists. In retrospect, one can see this early computational biology as laying important foundations for the bioinformatics, molecular evolution, and molecular systematics of today. Within the context of the 1960s, however, establishing a scientific identity posed serious challenges for Margaret Dayhoff, Walter Fitch, Russell Doolittle, and other computational biologists who used computers and databases to investigate evolutionary problems.



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Department of Biology, Radford University, Radford, USA
