In Silico Characterization of Proteins: InterPro and Proteome Analysis
The main problem we aim to solve in this chapter is the quick and reliable elucidation of protein function and the large-scale analysis of whole proteomes (the protein component of genomes). This problem arose with the advancement of DNA sequencing technologies and the dawning of the genome sequencing era. Previously, unclassified DNA sequences trickled into the public databases from bench scientists who were experimentally investigating the function of the gene products. Now raw sequences are flooding in with little or no accompanying annotation, creating a need for automatic in silico protein sequence analysis tools. Traditionally, scientists have used sequence similarity searches to compare a query sequence with sequences of known function, but this method has its limitations and depends on the quality of the existing data. Here we describe improved methods for protein sequence classification using protein signatures.
Keywords: Gene Ontology, Hidden Markov Model, Query Sequence, Regular Expression, Enzyme Commission Number