Phylogenetic and Evolutionary Analysis of Plant ARGONAUTES

Part of the book series: Methods in Molecular Biology (MIMB, volume 1640)

Abstract

Comparative sequence analysis is widely used to reconstruct phylogenies and to understand the evolutionary history of gene families. Here, we describe methods for reconstructing the phylogenetic and evolutionary history of a gene family across genomes, with a focus on the ARGONAUTE (AGO) family of proteins in plants. The approach can readily be adapted to study the molecular evolution of a wide variety of gene families. We list methods and parameters for collecting molecular data (nucleic acid and peptide sequences), preparing datasets, selecting evolutionary models, and carrying out phylogenetic and evolutionary analyses such as maximum likelihood and Bayesian inference.

Acknowledgments

Financial assistance from the MPG India partner program of the Max Planck Society and the Department of Science and Technology, India, the WHEAT Competitive Grants Initiative, CIMMYT and the CGIAR (A4031.09.10), and core funding from IISER-Kolkata are gratefully acknowledged.

Author information

Corresponding author

Correspondence to Shree P. Pandey.

Copyright information

© 2017 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Singh, R.K., Pandey, S.P. (2017). Phylogenetic and Evolutionary Analysis of Plant ARGONAUTES. In: Carbonell, A. (ed.) Plant Argonaute Proteins. Methods in Molecular Biology, vol 1640. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7165-7_20

  • DOI: https://doi.org/10.1007/978-1-4939-7165-7_20

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7164-0

  • Online ISBN: 978-1-4939-7165-7

  • eBook Packages: Springer Protocols
