Abstract
In this chapter, we outline a pipeline for ortholog prediction and phylogenetic analysis in plants. The pipeline combines algorithms from several software packages so that biologists who are new to bioinformatics can predict orthologs shared among many model and nonmodel plant species and identify gene loss events. Prediction of orthologs allows (1) investigation of the evolutionary relationships of plant genomes, (2) discovery of the origin and function of orthologous genes, and (3) assessment of their contribution to environmental adaptability.
We developed the pipeline to suit not only eukaryotic but also prokaryotic organisms, with small or large genomes. The results of ortholog prediction can then be used to construct phylogenetic trees with both gene-based and species-based (phylogenomic) approaches.
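As a rough illustration of how predicted orthologs can point to gene loss events, the sketch below builds a presence/absence profile across species and flags groups that are missing from only a few of them. It is not the chapter's pipeline: the function names, the `min_present_fraction` threshold, and the placeholder group and species labels are all assumptions made for this example, and the input is invented toy data.

```python
from typing import Dict, List, Set


def presence_absence(hits: Dict[str, Set[str]],
                     species: List[str]) -> Dict[str, Dict[str, bool]]:
    """Turn per-group species hits into a presence/absence profile.

    `hits` maps a (hypothetical) ortholog group label to the set of
    species in which at least one member was predicted.
    """
    return {group: {sp: sp in found for sp in species}
            for group, found in hits.items()}


def candidate_losses(profile: Dict[str, Dict[str, bool]],
                     min_present_fraction: float = 0.5) -> Dict[str, List[str]]:
    """List species lacking a group that most sampled species carry.

    The 0.5 default is an arbitrary illustrative threshold, not a
    recommended value.
    """
    losses: Dict[str, List[str]] = {}
    for group, row in profile.items():
        if not row:
            continue  # no species sampled, nothing to report
        present = sum(1 for has in row.values() if has)
        if present / len(row) >= min_present_fraction:
            absent = sorted(sp for sp, has in row.items() if not has)
            if absent:
                losses[group] = absent
    return losses


# Toy input with placeholder labels (not real results).
species = ["sp1", "sp2", "sp3", "sp4"]
hits = {
    "group_A": {"sp1", "sp2", "sp3", "sp4"},  # present everywhere
    "group_B": {"sp1", "sp2", "sp3"},         # missing from sp4
    "group_C": {"sp2"},                       # rare, so not treated as a loss
}

profile = presence_absence(hits, species)
print(candidate_losses(profile))  # {'group_B': ['sp4']}
```

An absence flagged this way is only a candidate: a missing ortholog can also result from an incomplete genome assembly or annotation, so such calls need to be checked against the completeness of the underlying data.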
Acknowledgments
This work was supported by the Czech Academy of Sciences, ERC-CZ, grant number [ERC200961901]. The work was also supported by the Science and Technology Development Fund (STDF) and the Partnership for Research and Innovation in the Mediterranean Area (PRIMA), GENDIBAR project. Computational resources were provided by the CESNET LM2015042 and the CERIT Scientific Cloud LM2015085, funded under the programme “Projects of Large Research, Development, and Innovations Infrastructures.” The authors thank Dr. Chayma Ben Saoud for improving the graphs and Dr. Iva Mozgová for proofreading the manuscript.
1 Electronic Supplementary Material
File S1
A jackhmmer output example file containing homology search results (TXT 801 bytes)
File S2
An emapper output example file with the annotation of the query protein sequence rbcL, including the target COG/KOG category (TXT 2 kb)
File S3
An emapper output example file with the annotation of the Selaginella moellendorffii rbcL hit sequences, including the COG/KOG category (TXT 3 kb)
File S4
An emapper output example file with the annotation of the Zea mays rbcL hit sequences. The only hit with a different annotation and COG category, “tr|A0A1D6EVW0|A0A1D6EVW0_MAIZE”, is highlighted in yellow (XLSX 6 kb)
File S5
An hmmscan output example file containing domain screening search results (TXT 1 kb)
File S6
A list of required software and packages which are available in conda (TXT 1 kb)
File S7
A Bayesian Inference phylogeny using the example data (TXT 985 bytes)
File S8
A map file for the ASTRAL-III tool based on the example phylogenetic trees (TXT 725 bytes)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Sharaf, A., Elateek, S. (2022). Orthology Prediction and Phylogenetic Analysis Methods in Plants. In: Pereira-Santana, A., Gamboa-Tuz, S.D., Rodríguez-Zapata, L.C. (eds) Plant Comparative Genomics. Methods in Molecular Biology, vol 2512. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2429-6_1
DOI: https://doi.org/10.1007/978-1-0716-2429-6_1
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2428-9
Online ISBN: 978-1-0716-2429-6
eBook Packages: Springer Protocols