Orthology Prediction and Phylogenetic Analysis Methods in Plants

Plant Comparative Genomics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2512))

Abstract

In this chapter, we outline a pipeline for ortholog prediction and phylogenetic analysis in plants. The pipeline combines algorithms from several software packages so that biologists who are new to bioinformatics can predict orthologs shared among many distinct model and nonmodel plant species and identify gene loss events. Prediction of orthologs allows investigation of (1) the evolutionary relationships of plant genomes, (2) their origin and function, and (3) the impact of their adaptability to the environment.

We developed the pipeline to suit not only eukaryotic but also prokaryotic organisms, with small or large genomes. The results of ortholog prediction enable phylogenetic tree construction using both gene and species (phylogenomic) phylogeny approaches.

Acknowledgments

This work was supported by the Czech Academy of Sciences, ERC-CZ, grant number [ERC200961901]. The work was also supported by the Science and Technology Development Fund (STDF) and the Partnership for Research and Innovation in the Mediterranean Area (PRIMA), GENDIBAR project. Computational resources were provided by the CESNET LM2015042 and the CERIT Scientific Cloud LM2015085, funded under the programme “Projects of Large Research, Development, and Innovations Infrastructures.” The authors thank Dr. Chayma Ben Saoud for improving the graphs and Dr. Iva Mozgová for proofreading the manuscript.

Author information

Corresponding author

Correspondence to Abdoallah Sharaf.

1 Electronic Supplementary Material

File S1

A jackhmmer output example file containing homology search results (TXT 801 bytes)

File S2

An emapper output example file with the query protein sequence rbcL annotation, including the target COG/KOG category (TXT 2 kb)

File S3

An emapper output example file with the annotation of the Selaginella moellendorffii rbcL hit sequences, including the COG/KOG category (TXT 3 kb)

File S4

An emapper output example file with the annotation of the Zea mays rbcL hit sequences. The only hit with a different annotation and COG category, “tr|A0A1D6EVW0|A0A1D6EVW0_MAIZE”, is highlighted in yellow (XLSX 6 kb)

File S5

An hmmscan output example file containing domain screening search results (TXT 1 kb)

File S6

A list of required software and packages which are available in conda (TXT 1 kb)

File S7

A Bayesian Inference phylogeny using the example data (TXT 985 bytes)

File S8

A map file for the ASTRAL-III tool based on the example phylogenetic trees (TXT 725 bytes)

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

Cite this protocol

Sharaf, A., Elateek, S. (2022). Orthology Prediction and Phylogenetic Analysis Methods in Plants. In: Pereira-Santana, A., Gamboa-Tuz, S.D., Rodríguez-Zapata, L.C. (eds) Plant Comparative Genomics. Methods in Molecular Biology, vol 2512. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2429-6_1

  • DOI: https://doi.org/10.1007/978-1-0716-2429-6_1

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2428-9

  • Online ISBN: 978-1-0716-2429-6

  • eBook Packages: Springer Protocols
