Protocol

Bioinformatics

Volume 452 of the series Methods in Molecular Biology™ pp 231-251

Discovering Sequence Motifs

  • Timothy L. Bailey, affiliated with ARC Centre of Excellence in Bioinformatics, and Institute for Molecular Bioscience, The University of Queensland

Abstract

Sequence motif discovery algorithms are an important part of the computational biologist's toolkit. The purpose of motif discovery is to discover patterns in biopolymer (nucleotide or protein) sequences in order to better understand the structure and function of the molecules the sequences represent. This chapter provides an overview of the use of sequence motif discovery in biology and a general guide to the use of motif discovery algorithms. The chapter discusses the types of biological features that DNA and protein motifs can represent and their usefulness. It also defines what sequence motifs are, how they are represented, and general techniques for discovering them. The primary focus is on one aspect of motif discovery: discovering motifs in a set of unaligned DNA or protein sequences. Also presented are steps useful for checking the biological validity and investigating the function of sequence motifs using methods such as motif scanning—searching for matches to motifs in a given sequence or a database of sequences. A discussion of some limitations of motif discovery concludes the chapter.
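As an illustration of the motif scanning mentioned above, the following is a minimal sketch rather than the chapter's own method. It assumes a motif is given as per-position base probabilities, converts it to a log-odds position-specific scoring matrix (PSSM) against a uniform background, and reports every window of a DNA sequence whose score reaches a threshold. The motif values, the background, the threshold, and the names `MOTIF`, `BACKGROUND`, `to_log_odds`, and `scan` are all hypothetical and chosen only for the example.

```python
import math

# Hypothetical 4-column DNA motif: per-position base probabilities.
# These values are illustrative only and favour the word "ACGT".
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]

# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def to_log_odds(motif, background):
    """Convert base probabilities to log2-odds scores against the background."""
    return [
        {base: math.log2(column[base] / background[base]) for base in column}
        for column in motif
    ]


def scan(sequence, pssm, threshold):
    """Return (start, window, score) for each window scoring >= threshold.

    Assumes the sequence contains only uppercase A, C, G, and T.
    """
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Sum the per-position scores of the bases observed in this window.
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


PSSM = to_log_odds(MOTIF, BACKGROUND)
hits = scan("TTACGTAACGTT", PSSM, threshold=4.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

Scanning a database of sequences amounts to calling `scan` on each sequence in turn. In practice the threshold is usually derived from a statistical significance estimate rather than fixed by hand as it is here.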

Key words

Motif discovery; sequence motif; sequence pattern; protein domain; multiple alignment; position-specific scoring matrix; PSSM; position-specific weight matrix; PWM; transcription factor binding site; transcription factor; promoter; protein features