PepPat, a pattern-based oligopeptide homology search method and the identification of a novel tachykinin-like peptide

Mammalian Genome

Abstract

PepPat, a hybrid method that combines pattern matching with similarity scoring, is described, together with its application to the identification of a novel tachykinin-like peptide. PepPat takes as input a query peptide and a user-specified regular expression pattern that occurs within the peptide. It first performs a pattern match against the database and then ranks the candidates by their similarity to the query peptide. Similarity is calculated over the pattern-spanning region, which enhances PepPat's sensitivity for short query peptides. PepPat can also search for a user-specified number of occurrences of a repeated pattern within the target sequence. We illustrate PepPat's application in short peptide ligand mining. As a validation example, we report the identification of a novel tachykinin-like peptide, C14TKL-1, and show that it is an NK1 (neurokinin receptor 1) agonist whose message is widely expressed in the human periphery. Availability: PepPat is offered online at http://peppat.cbi.pku.edu.cn
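The two-stage search described in the abstract (pattern match first, then ranking by similarity over the pattern-spanning region) can be sketched roughly as follows. This is only an illustration under stated assumptions, not PepPat's implementation: the abstract does not specify the scoring scheme, so a simple count of identical residues stands in for it, and the function names, the `min_repeats` parameter, and the toy sequences are all hypothetical.

```python
import re


def identity_score(a, b):
    """Toy similarity (an assumption): identical residues at ungapped, left-aligned positions."""
    return sum(1 for x, y in zip(a, b) if x == y)


def search(query, pattern, database, min_repeats=1):
    """Stage 1: keep targets with at least `min_repeats` non-overlapping pattern matches.
    Stage 2: rank them by similarity to the query's pattern-spanning region."""
    m = re.search(pattern, query)
    if m is None:
        raise ValueError("pattern does not occur in the query peptide")
    query_region = m.group(0)

    hits = []
    for name, seq in database.items():
        # re.finditer yields non-overlapping matches, scanning left to right
        matches = [hit.group(0) for hit in re.finditer(pattern, seq)]
        if len(matches) < min_repeats:
            continue
        best = max(matches, key=lambda s: identity_score(query_region, s))
        hits.append((identity_score(query_region, best), name, best))

    # Highest score first; ties broken by target name for a stable order
    hits.sort(key=lambda h: (-h[0], h[1]))
    return hits


# Toy data, made up for illustration only
QUERY = "RPKPQQFFGLM"
PATTERN = r"F.GLM"
DATABASE = {
    "toy_a": "MKTDSFVGLMGKR",
    "toy_b": "AAFFGLMAA",
    "toy_c": "ACDEFGHIK",
    "toy_d": "FYGLMKRFFGLMGR",
}

if __name__ == "__main__":
    for score, name, region in search(QUERY, PATTERN, DATABASE):
        print(score, name, region)
```

Scoring only the matched region, rather than the whole target, keeps flanking residues from diluting the score of a short query, which is the effect the abstract attributes to calculating similarity over the pattern-spanning region. Passing `min_repeats=2` restricts the results to targets that contain the pattern at least twice.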

Acknowledgements

The authors thank Professor Gu Xiaochen of Peking University, China for helpful discussions.

Author information

Corresponding author

Correspondence to Ying Jiang.

http://www.ncbi.nlm.nih.gov/BLAST/


http://www.ebi.ac.uk/interpro

http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/

http://pfam.wustl.edu

http://protein.toulouse.inra.fr/prodom.html

http://hmmer.wustl.edu

http://www.expasy.ch/prosite/

http://www.infobiogen.fr/~gracy/domo/home.htm

http://blocks.fhcrc.org

http://motif.stanford.edu/emotif/

http://www.ebi.ac.uk/fasta3/

http://bighost.area.ba.cnr.it/BIG/Patsearch/Patsearch.html

http://blocks.fhcrc.org/blocks/impala.html

http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi


Cite this article

Jiang, Y., Gao, G., Fang, G. et al. PepPat, a pattern-based oligopeptide homology search method and the identification of a novel tachykinin-like peptide. Mamm Genome 14, 341–349 (2003). https://doi.org/10.1007/s00335-002-3061-y

  • DOI: https://doi.org/10.1007/s00335-002-3061-y
