Identification of Protein Motifs in Phytoplasma Associated with Root (Wilt) Disease of Coconut (Cocos nucifera) Using an Improved Statistical Measure

  • Full-Length Research Article
  • Agricultural Research

Abstract

A protein motif is a short conserved region within a larger sequence, and it provides an important way to obtain functional or structural information about a novel protein sequence. This study presents a new approach to identifying motifs, which are considered to play an important role in determining the functional or structural class of a protein. Root (wilt) disease of coconut palm (Cocos nucifera L.), abbreviated CRWD, has become a serious concern for coconut cultivation in the coastal region of South India. It has already been reported that a phytoplasma species belonging to ribosomal group 16SrXI is primarily associated with CRWD. Our objective is to modify an existing motif identification algorithm and to identify motifs that describe the conserved regions within a CRWD phytoplasma protein sequence. Accordingly, the existing information content formulas for ungapped and gapped aligned sequence sets were reformulated. Both measures were then further modified by incorporating prior information from the BLOSUM90 substitution matrix. The proposed modifications successfully identified the N-terminal myristoylation motif within an alignment matrix of Arabidopsis thaliana sequences, which also served to validate the modification. The two approaches, frequency-based and BLOSUM90-based information content, were compared in terms of the motifs they identified. Finally, the novel BLOSUM90-based approach was also tested and correctly predicted conserved regions at locations 163–171 and 117–138 in an aligned secA gene sequence set of phytoplasma, an etiological agent of CRWD.
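To make the idea of a frequency-based information content measure concrete, here is a minimal sketch. It does not reproduce the reformulated measures of this study, which the abstract does not state. It assumes the standard relative-entropy form of per-column information content for an ungapped alignment, with a uniform background over the 20 amino acids. The function names and the toy sequences are hypothetical, and the BLOSUM90-based variant and the gapped case are not covered.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_information(column, background=None):
    """Relative entropy (bits) of one ungapped alignment column.

    Assumption: uniform background frequencies unless others are supplied.
    """
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    counts = Counter(column)
    total = len(column)
    info = 0.0
    for residue, count in counts.items():
        freq = count / total
        # Each observed residue contributes f * log2(f / p).
        info += freq * math.log2(freq / background[residue])
    return info


def alignment_information(sequences):
    """Per-column information content for equal-length, gap-free sequences."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_information(col) for col in zip(*sequences)]


def best_window(scores, width):
    """Start index and summed score of the highest-scoring window."""
    sums = [sum(scores[i:i + width]) for i in range(len(scores) - width + 1)]
    start = max(range(len(sums)), key=sums.__getitem__)
    return start, sums[start]


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from the study.
    toy = ["MGCAKL", "MGSAKV", "MGNARL", "MGQTKI"]
    scores = alignment_information(toy)
    print([round(s, 3) for s in scores])
    print(best_window(scores, width=2))
```

A fully conserved column reaches the maximum of log2(20) bits under the uniform background, while a column in which all 20 residues are equally frequent scores zero. Scanning windows of summed column scores is one simple way to flag candidate conserved regions, and no small-sample correction is applied here.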

Corresponding author

Correspondence to Sandip Shil.

Cite this article

Shil, S., Das, K.K. & Saxena, V.K. Identification of Protein Motifs in Phytoplasma Associated with Root (Wilt) Disease of Coconut (Cocos nucifera) Using an Improved Statistical Measure. Agric Res 8, 56–64 (2019). https://doi.org/10.1007/s40003-018-0351-5
