Using Sequence Information to Identify Motifs
Identifying potential functions for an unknown protein sequence has become easier thanks to the proliferation of powerful algorithms that compare the sequence against large databases of consensus domains and motifs. Understanding both the use and the potential pitfalls of such algorithms can greatly speed any work with novel sequences. The most important thing to remember is that such programs can only generate a hypothesis about the shape or function of a sequence; they cannot prove a function. They are the first step in any study of a new protein, not the last.
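As a rough illustration of what comparing a sequence to a consensus motif means in practice, the sketch below scans a protein sequence for the PROSITE N-glycosylation pattern N-{P}-[ST]-{P} (PS00001) by translating it into a regular expression. The function name find_motif and the toy sequence are hypothetical, chosen only for this example; real motif databases use more elaborate scoring than a plain pattern match.

```python
import re

# PROSITE PS00001 (N-glycosylation site): N-{P}-[ST]-{P}
# i.e. Asn, any residue except Pro, Ser or Thr, any residue except Pro.
# The lookahead wrapper lets the scan report overlapping matches.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(sequence, pattern=N_GLYCOSYLATION):
    """Return (1-based start position, matched residues) for every match.

    Hypothetical helper for illustration; not part of any library.
    """
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Toy sequence invented for this example (not a real protein).
toy_sequence = "MKNGSAANPTLNNTSQ"
hits = find_motif(toy_sequence)

for position, residues in hits:
    print(f"candidate site at residue {position}: {residues}")
```

Each hit is only a candidate site. A short pattern like this one matches many sequences by chance, which is why a match should be treated as a hypothesis to test rather than as evidence of function.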
The flow of biological information from DNA to RNA to protein requires that the final shape of the protein be encoded in the DNA. This essential fact of biology is the basis for the field of bioinformatics. Bioinformatics greatly expands what researchers can learn about the potential functions of a given gene simply by examining its nucleotide or encoded protein sequence. Another essential fact...