
Abstract

Nucleic acid and protein sequences contain a wealth of information of interest to molecular biologists, since the genome forms the blueprint of the cell. Currently, a database search for sequence similarities represents the most direct computational approach to decipher the codes connecting molecular sequences with protein structure and function (Doolittle, 1990). If the unknown protein is related to one of known structure/function, inferences based on the known structure/function and the degree of the relationship can provide the most reliable clues to the nature of the unknown protein. This technique has proved successful and has led to new understanding in a wide variety of biological studies (Boswell and Lesk, 1988).


References

  • Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990): Basic local alignment search tool. J Mol Biol 215:403–410


  • Asoh H, Otsu N (1990): An approximation of nonlinear discriminant analysis by multilayer neural networks. Proc Intn’l Joint Conf on Neural Networks (June) 111:211–216


  • Bairoch A (1992): PROSITE: A dictionary of sites and patterns in proteins. Nuc Acids Res (Suppl) 20:2013–2018


  • Barker WC, George DG, Mewes H-W, Tsugita A (1992): The PIR-international protein sequence database. Nuc Acids Res (Suppl) 20:2023–2026


  • Barton (1991): A matrix method for optimizing a neural network. Neural Computation 3:450–459


  • Berry MW (1992): Large-scale sparse singular value computations. Int J Supercomputer Applications 6:13–49


  • Bohr H, Bohr J, Brunak S, Cotterill RMJ, Fredholm H, Lautrup B, Petersen SB (1990): A novel approach to prediction of the 3-dimensional structures of protein backbones by neural networks. FEBS Letters 261:43–46


  • Boswell DR, Lesk AM (1988): Sequence comparison and alignment: The measurement and interpretation of sequence similarity. In: Computational Molecular Biology: Sources and Methods for Sequence Analysis, pp. 161–178. Lesk AM, ed. New York: Oxford University Press


  • Brutlag DL, Dautricourt J-P, Fier RDJ, Moxon B, Stamm R (1992): BLAZE: An implementation of the Smith-Waterman sequence comparison algorithm on a massively parallel computer. Extended Abstracts of the 2nd International Workshop of Open Problems in Computational Molecular Biology:60–68


  • Chen S (1993): Characterization and learning of protein conformations. Proc 2nd International Conference on Bioinformatics, Supercomputing and Complex Genome Analysis:391–399


  • Cherkassky V, Vassilas N (1989): Performance of back propagation networks for associative database retrieval. Proc Int Joint Conf on Neural Networks 1:77–83


  • Claverie J-M, Sauvaget I, Bougueleret L (1990): K-tuple frequency analysis: From intron/exon discrimination to T-cell epitope mapping. In: Molecular Evolution: Computer Analysis of Proteins and Nucleic Acid Sequences, Methods in Enzymology, Vol. 183. Doolittle RF, ed. New York: Academic Press, pp. 237–252


  • Davison DB (1990): Sequence searching on supercomputers. In: Computers and DNA, Santa Fe Institute Studies in the Sciences of Complexity. Bell G, Marr TG, eds. Reading, MA: Addison-Wesley, pp. 93–97


  • Dayhoff MO (ed.) (1972): Atlas of Protein Sequence and Structure, Volume 5. National Biomedical Research Foundation, Washington, D.C.


  • Dayhoff J (1990): Neural Network Architectures, An Introduction. New York: Van Nostrand Reinhold


  • Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990): Indexing by latent semantic analysis. J Amer Soc for Information Science 41:391–407


  • Demeler B, Zhou G (1991): Neural network optimization for E. coli promoter prediction. Nuc Acids Res 19:1593–1599


  • Devereux J (1988): A rapid method for identifying sequences in large nucleotide sequence databases. Ph.D. Thesis, University of Wisconsin


  • Doolittle RF (1990): Searching through sequence databases. In: Molecular Evolution: Computer Analysis of Proteins and Nucleic Acid Sequences, Methods in Enzymology, Vol. 183. Doolittle RF, ed. New York: Academic Press, pp. 99–110


  • Gallinari P, Thiria S, Soulie FF (1988): Multilayer perceptrons and data analysis. Proc Intn’l Joint Conf on Neural Networks 1:391–399


  • Gribskov M, Devereux J (eds.) (1991): Sequence Analysis Primer. New York: Stockton Press


  • Harris N, Hunter L, States D (1992): Megaclassification: Discovering motifs in massive datastreams. Proceedings of 10th National Conference on Artificial Intelligence, AAAI Press


  • Henikoff S, Henikoff JG (1991): Automated assembly of protein blocks for database searching. Nuc Acids Res 19:6565–6572


  • Hirst JD, Sternberg MJE (1992): Prediction of structural and functional features of protein and nucleic acid sequences by artificial neural networks. Biochemistry 31:7211–7218


  • Holley LH, Karplus M (1989): Protein secondary structure prediction with a neural network. Proc Natl Acad Sci USA 86:152–156


  • Horton PB, Kanehisa M (1992): An assessment of neural network and statistical approaches for prediction of E. coli promoter sites. Nuc Acids Res 20:4331–4338


  • Karlin S, Ost F, Blaisdell BE (1989): Patterns in DNA and amino acid sequences and their statistical significance. In: Mathematical Methods for DNA Sequences, Waterman MS, ed. Boca Raton, FL: CRC Press, Inc. pp. 133–157


  • Kimoto T, Asakawa K, Yoda M, Takeoka M (1990): Stock market prediction system with modular neural networks. Proc Int Joint Conf on Neural Networks (June) 1:1–6


  • Kneller DG, Cohen FE, Langridge R (1990): Improvements in protein secondary structure prediction by an enhanced neural network. J Mol Biol 214:171–182


  • Konopka AK, Owens J (1990): Non-continuous patterns and compositional complexity of nucleic acid sequences. In: Computers and DNA, SFI Studies in the Sciences of Complexity, Vol. VII. Bell G, Marr T, eds. Addison-Wesley. pp. 147–155


  • Lapedes A, Barnes C, Burks C, Farber R, Sirotkin K (1990): Application of neural networks and other machine learning algorithms to DNA sequence analysis. In: Computers and DNA, SFI Studies in the Sciences of Complexity, Vol. VII. Bell G, Marr T, eds. Addison-Wesley, pp. 157–182


  • Le Cun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989): Backpropagation applied to handwritten zip code recognition. Neural Computation 1:541–551


  • Lendaris GG, Harb LA (1990): Improved generalization in ANNs via use of conceptual graphs: A character recognition task as an example case. Proc Intn’l Joint Conf on Neural Networks (June) I:551–556


  • Liebman MN (1993): Application of neural networks to the analysis of structure and function in biologically active macromolecules. Proc 2nd International Conference on Bioinformatics, Supercomputing and Complex Genome Analysis:331–347


  • Lipman DJ, Pearson WR (1985): Rapid and sensitive protein similarity searches. Science 227:1435–1441


  • Needleman SB, Wunsch CD (1970): A general method applicable to the search for similarities in the amino acid sequences of two proteins. J Mol Biol 48:443–453


  • Olsen GJ, Overbeek R, Larsen N, Marsh TL, McCaughey MJ, Maciukenas MA, Kuan W-M, Macke TJ, Xing Y, Woese CR (1992): The ribosomal RNA database project. Nuc Acids Res (Suppl) 20:2199–2200


  • O’Neill MC (1992): Escherichia coli promoters: Neural networks develop distinct descriptions in learning to search for promoters of different spacing classes. Nuc Acids Res 20:3471–3477


  • Pabo CO (1987): New generation databases for molecular biology. Nature 327:467


  • Pearson WR, Lipman DJ (1988): Improved tools for biological sequence comparisons. Proc Natl Acad Sci USA 85:2444–2448


  • Qian N, Sejnowski TJ (1988): Predicting the secondary structure of globular proteins using neural network models. J Mol Biol 202:865–884


  • Rumelhart DE, McClelland JL (eds.) (1986): Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations. MIT Press.


  • Smith TF, Waterman M (1981): Identification of common molecular subsequences. J Mol Biol 147:195–197


  • Stormo GD, Schneider TD, Gold L, Ehrenfeucht A (1982): Use of the ‘Perceptron’ algorithm to distinguish translation initiation sites in E. coli. Nuc Acids Res 10:2997–3011


  • Uberbacher EC, Mural RJ (1991): Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach. Proc Natl Acad Sci USA 88:11261–11265


  • van Heel M (1991): A new family of powerful multivariate statistical sequence analysis techniques. J Mol Biol 220:877–887


  • Webb AR, Lowe D (1990): The optimized internal representation of multilayered classifier networks performs nonlinear discriminant analysis. Neural Networks 3:367–375


  • Woese CR (1987): Bacterial evolution. Microbiological Reviews 51:221–271


  • Wu CH, Ermongkonchai A, Chang TC (1991): Protein classification using a neural network protein database (NNPDB) system. Proceedings of the Analysis of Neural Network Applications Conference:29–41


  • Wu CH, Whitson G, McLarty J, Ermongkonchai A, Chang T (1992): Protein classification artificial neural system. Protein Science 1:667–677


  • Wu CH (1993): Classification neural networks for rapid sequence annotation and automated database organization. Computers & Chemistry 17:219–227


  • Xin Y, Carmeli T, Liebman M, Wilcox GL (1993): Use of the backpropagation neural network algorithm for prediction of protein folding patterns. Proc 2nd International Conference on Bioinformatics, Supercomputing and Complex Genome Analysis:359–375




© 1994 Birkhäuser Boston

About this chapter

Cite this chapter

Wu, C.H. (1994). Neural Networks for Molecular Sequence Classification. In: Merz, K.M., Le Grand, S.M. (eds) The Protein Folding Problem and Tertiary Structure Prediction. Birkhäuser Boston. https://doi.org/10.1007/978-1-4684-6831-1_9


  • DOI: https://doi.org/10.1007/978-1-4684-6831-1_9

  • Publisher Name: Birkhäuser Boston

  • Print ISBN: 978-1-4684-6833-5

  • Online ISBN: 978-1-4684-6831-1

  • eBook Packages: Springer Book Archive
