InterPro and InterProScan

Tools for Protein Sequence Classification and Comparison

  • Protocol
Comparative Genomics

Part of the book series: Methods In Molecular Biology™ ((MIMB,volume 396))

Summary

Protein sequence classification and comparison have become increasingly important in the current “omics” revolution, in which scientists are developing functional genomics and proteomics technologies for large-scale protein function prediction. Functional classification is equally important for the bench scientist who wants to analyze a single protein, a small set of proteins, or even a single genome. A number of tools are available for sequence classification, such as sequence similarity searches, motif- or pattern-finding software, and protein signatures for identifying protein families and domains. One such tool, InterPro, is a documentation resource that integrates the major protein signature databases into a single resource for protein annotation. Protein sequences are searched using the InterProScan software to identify signatures from the InterPro member databases: Pfam, PROSITE, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D, and PANTHER. The InterPro database can be searched to retrieve precalculated matches for UniProtKB proteins, or to find additional information on protein families and domains. For completely sequenced genomes, the user can retrieve InterPro-based analyses of all nonredundant proteins in the proteome and can run user-selected proteome comparisons. This chapter describes how to use InterPro and InterProScan for protein sequence classification and comparative proteomics.
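To make the workflow in the summary concrete, the following is a minimal sketch, not the chapter's own method, of how signature matches might be grouped by InterPro entry and how two proteomes might be compared. It assumes matches are available as tab-separated text in the column order used by recent InterProScan releases (protein, MD5, length, member database, signature accession, signature description, start, stop, score, status, date, InterPro accession, InterPro description); that layout should be checked against the version in use. The function names and all sample rows are hypothetical placeholders, not real results.

```python
from collections import defaultdict

# Assumed column order of InterProScan tab-separated output.
# This is an assumption; verify it against your InterProScan version.
COLUMNS = [
    "protein", "md5", "length", "analysis", "signature_acc", "signature_desc",
    "start", "stop", "score", "status", "date", "interpro_acc", "interpro_desc",
]


def parse_matches(text):
    """Return one dict per signature match found in tab-separated text."""
    matches = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        # Pad short rows so signatures without an InterPro entry still parse.
        fields += ["-"] * (len(COLUMNS) - len(fields))
        matches.append(dict(zip(COLUMNS, fields)))
    return matches


def entries_per_protein(matches):
    """Map each protein to the set of InterPro entries it matches."""
    result = defaultdict(set)
    for match in matches:
        if match["interpro_acc"] not in ("", "-"):
            result[match["protein"]].add(match["interpro_acc"])
    return dict(result)


def compare_proteomes(proteome_a, proteome_b):
    """Return (shared, only_a, only_b) sets of InterPro entries."""
    entries_a = set().union(*proteome_a.values())
    entries_b = set().union(*proteome_b.values())
    return entries_a & entries_b, entries_a - entries_b, entries_b - entries_a


def make_row(protein, analysis, signature, interpro):
    """Build one placeholder row (hypothetical values, for illustration only)."""
    return "\t".join([protein, "-", "100", analysis, signature, "example",
                      "1", "90", "1e-10", "T", "01-01-2007", interpro, "example"])


# Hypothetical sample data: accessions are placeholders, not real annotations.
SAMPLE_A = "\n".join([
    make_row("protA1", "Pfam", "PF00001", "IPR000001"),
    make_row("protA1", "PRINTS", "PR00001", "IPR000002"),
    make_row("protA2", "SMART", "SM00001", "-"),  # signature not integrated
])
SAMPLE_B = "\n".join([
    make_row("protB1", "Pfam", "PF00001", "IPR000001"),
    make_row("protB1", "Gene3D", "G3DSA:1.10.10.10", "IPR000003"),
])

proteome_a = entries_per_protein(parse_matches(SAMPLE_A))
proteome_b = entries_per_protein(parse_matches(SAMPLE_B))
shared, only_a, only_b = compare_proteomes(proteome_a, proteome_b)
print("shared:", sorted(shared))
print("only in A:", sorted(only_a))
print("only in B:", sorted(only_b))
```

Grouping by InterPro accession rather than by member-database signature reflects the integration described above: signatures from different member databases that describe the same family or domain collapse into one entry, so the comparison counts each family or domain once.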

Copyright information

© 2007 Humana Press Inc.

About this protocol

Cite this protocol

Mulder, N., Apweiler, R. (2007). InterPro and InterProScan. In: Bergman, N.H. (ed.) Comparative Genomics. Methods In Molecular Biology™, vol 396. Humana Press. https://doi.org/10.1007/978-1-59745-515-2_5

  • DOI: https://doi.org/10.1007/978-1-59745-515-2_5

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-934115-37-4

  • Online ISBN: 978-1-59745-515-2

  • eBook Packages: Springer Protocols
