Supercomputer ’92, pp. 32–48
The Human Genome and High Performance Computing in Molecular Biology
Abstract
Genetic sequences contain the basic instruction code of living systems: a basic book of life. The period 1992–2010 will see the deciphering of much of this information in many organisms, including the human genome. Unfortunately, the code is written in a biological assembler language and needs to be deciphered. The translation rules from the basic code to biological function are not yet fully known. Here, computational molecular biology is challenged to make major contributions. The potential benefits to medical science and biotechnology are huge.
Four of the basic components of genome-related data are the genetic sequences of DNA/RNA, the sequences of protein molecules derived from genes, the specific three-dimensional shapes of these proteins, and the biological functions of the protein molecules and their molecular partners. Now and in the near future, there are two serious information gaps: the protein sequence-structure gap and the protein sequence-function gap. Key computational problems in closing these gaps are molecular dynamics simulations of protein behavior and selective database searches for biologically significant similarities between protein molecules. These are presented in some detail in this paper.
The advent of high performance computing hardware now on the drawing boards is a necessary but not sufficient condition for a possible solution to some of the key problems of computational biology. We will have to concentrate on the development of software and the training of a new generation of interdisciplinary experts in this emerging part of the life sciences.