Molecular Life Sciences

Living Edition
Editors: Robert D. Wells, Judith S. Bond, Judith Klinman, Bettie Sue Siler Masters, Ellis Bell

Using Sequence Information to Identify Motifs

  • Scott Cooper
  • Anton Sanderfoot
Living reference work entry


Identifying potential functions for an unknown protein sequence has become easier thanks to the proliferation of powerful algorithms that compare the sequence against large databases of consensus domains and motifs. Understanding both the use and the potential pitfalls of such algorithms can greatly speed any work with novel sequences. The most important thing to remember is that such programs can only generate a hypothesis about the shape or function of a sequence; they cannot prove any function. Such programs are the first step in any study of a new protein, not the last.
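As a minimal illustration of motif scanning, the sketch below searches a protein sequence for the N-glycosylation consensus N-{P}-[ST]-{P} (an asparagine, any residue except proline, a serine or threonine, then any residue except proline). The function name and the example sequence are hypothetical and chosen only for illustration. A match marks a candidate site, a hypothesis to be tested experimentally, not evidence that the site is modified.

```python
import re

# Consensus N-{P}-[ST]-{P} written as a regular expression.
# The lookahead wrapper lets overlapping candidate sites be reported.
N_GLYC_SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")


def find_candidate_sites(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched residues) for each candidate site."""
    seq = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in N_GLYC_SEQUON.finditer(seq)]


if __name__ == "__main__":
    example = "MKNVSAANPTLNGSWNNTSK"  # made-up sequence, not a real protein
    for position, site in find_candidate_sites(example):
        print(position, site)
```

Short patterns like this one occur by chance in many proteins, which is why a hit from any pattern-matching tool should be read as a starting point for further work.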


The flow of biological information from DNA to RNA to protein requires that the final shape of the protein be encoded in the DNA. This essential fact of biology is the basis for the field of bioinformatics. Bioinformatics greatly expands what researchers can learn about the potential functions of a given gene simply by examining the nucleotide or encoded protein sequence. Another essential fact...


Malate Dehydrogenase; Unknown Sequence; Peroxisomal Target Signal; Bipartite Nuclear Localization Signal; Diguanylate Cyclase
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. Bagos PG, Liakopoulos TD, Spyropoulos IC, Hamodrakas SJ (2004) PRED-TMBB: a web server for predicting the topology of beta-barrel outer membrane proteins. Nucleic Acids Res 32(Webserver Issue):W400–W404
  2. Emanuelsson O, Brunak S, von Heijne G, Nielsen H (2007) Locating proteins in the cell using TargetP, SignalP, and related tools. Nat Protoc 2:953–971
  3. Krogh A, Larsson B, von Heijne G, Sonnhammer ELL (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305:567–580
  4. Pei J, Grishin NV (2001) GGDEF domain is homologous to adenylyl cyclase. Proteins 42:210–216
  5. Petersen TN, Brunak S, von Heijne G, Nielsen H (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods 8:785–786
  6. Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, Pang N, Forslund K, Ceric G, Clements J, Heger A, Holm L, Sonnhammer ELL, Eddy SR, Bateman A, Finn RD (2012) The Pfam protein families database. Nucleic Acids Res 40(Database Issue):D290–D301
  7. Schwarz F, Aebi M (2011) Mechanisms and principles of N-linked protein glycosylation. Curr Opin Struct Biol 21:576–582

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Biology, University of Wisconsin – La Crosse, La Crosse, USA