Summary
Protein sequence classification and comparison have become increasingly important in the current “omics” revolution, in which scientists are developing functional genomics and proteomics technologies for large-scale protein function prediction. Functional classification is equally important for the bench scientist who wants to analyze a single protein, a small set of proteins, or a single genome. A number of tools are available for sequence classification, including sequence similarity searches, motif- or pattern-finding software, and protein signatures for identifying protein families and domains. One such tool, InterPro, is a documentation resource that integrates the major protein signature databases into a single resource for protein annotation. Protein sequences are searched using the InterProScan software to identify signatures from the InterPro member databases: Pfam, PROSITE, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D, and PANTHER. The InterPro database can be searched to retrieve precalculated matches for UniProtKB proteins, or to find additional information on protein families and domains. For completely sequenced genomes, the user can retrieve InterPro-based analyses of all nonredundant proteins in the proteome and can run user-selected proteome comparisons. This chapter describes how to use InterPro and InterProScan for protein sequence classification and comparative proteomics.
Copyright information
© 2007 Humana Press Inc.
About this protocol
Cite this protocol
Mulder, N., Apweiler, R. (2007). InterPro and InterProScan. In: Bergman, N.H. (ed.) Comparative Genomics. Methods in Molecular Biology™, vol 396. Humana Press. https://doi.org/10.1007/978-1-59745-515-2_5
DOI: https://doi.org/10.1007/978-1-59745-515-2_5
Publisher Name: Humana Press
Print ISBN: 978-1-934115-37-4
Online ISBN: 978-1-59745-515-2
eBook Packages: Springer Protocols