What lies beneath? Molecular evolution during the radiation of caecilian amphibians
Evolution leaves an imprint in species through genetic change. At the molecular level, evolutionary changes can be explored by studying ratios of nucleotide substitutions. The interplay among molecular evolution, derived phenotypes, and ecological ranges can provide insights into adaptive radiations. Caecilians (order Gymnophiona), probably the least known of the major lineages of vertebrates, are limbless tropical amphibians, with adults of most species burrowing in soils (fossoriality). This enigmatic order of amphibians is very distinct phenotypically from other extant amphibians and likely from the ancestor of Lissamphibia, but little to nothing is known about the molecular changes underpinning their radiation. We hypothesised that colonization of various depths of tropical soils and of freshwater habitats presented new ecological opportunities to caecilians.
A total of 8540 candidate groups of orthologous genes from transcriptomic data of five species of caecilian amphibians and the genome of the frog Xenopus tropicalis were analysed in order to investigate the genetic machinery behind caecilian diversification. We found a total of 168 protein-coding genes with signatures of positive selection at different evolutionary times during the radiation of caecilians. The majority of these genes were related to functional elements of the cell membrane and extracellular matrix with expression in several different tissues. The first colonization of the tropical soils was connected to the largest number of protein-coding genes under positive selection in our analysis. From the results of our study, we highlighted molecular changes in genes involved in perception, reduction-oxidation processes, and aging that likely were involved in the adaptation to different soil strata.
The genes inferred to have been under positive selection provide valuable insights into caecilian evolution, potentially underpin adaptations of caecilians to their extreme environments, and contribute to a better understanding of fossorial adaptations and molecular evolution in vertebrates.
Keywords: Ecological opportunity; Gene ontology; Gymnophiona; Positive selection signatures; Vertebrate evolution
BEB: Bayes empirical Bayes
dN: Non-synonymous nucleotide substitutions
dS: Synonymous nucleotide substitutions
FDR: False discovery rate
LRT: Likelihood ratio test
TPM: Transcripts Per Million
Understanding the diversity of life and how species have evolved into their different and specialized forms is an ultimate goal of evolutionary biology. The events that lead to macroevolutionary diversification by adaptive radiation have been related to ecological opportunities: the availability of new ecological resources for exploitation [1, 2, 3]. The potential for ecological opportunities to trigger adaptive radiation (diversification of species from a common ancestor into different ecomorphological forms) is widely recognised (e.g. [4, 5, 6, 7, 8]). Phenotypic evolutionary changes accumulated during adaptive radiations ultimately have a molecular basis that can involve a variety of genetic changes, including gene gain and loss, beneficial mutations, regulatory changes or other innovations [9, 10, 11]. As more genomic data become available, a better understanding of the evolutionary mechanisms underpinning biodiversity should follow. Molecular evolutionary processes can be investigated by studying regulatory and/or functional elements of genomes. In protein-coding genes, sources of evolutionary variation can be explored by comparing rates of nucleotide substitutions at synonymous (dS) and non-synonymous (dN) sites; substitutions at the latter sites result in a change of amino acid sequence and consequently can result in a change of phenotype. The ratio between these rates, ω (ω = dN/dS), provides a widely used means of identifying selective pressures in proteins.
Adaptive radiation of vertebrates is in part explained by genetic changes that allowed new functions to emerge [13, 14, 15], increasing the fitness of the organisms in new environments. One of these environments, the soil, presents several restrictive conditions, including low levels of light, high resistance to locomotion, low airborne transmission of sound and scent, and low oxygen (O2) and high carbon dioxide (CO2) levels (hypoxia and hypercapnia respectively). In addition, many microorganisms (fungi, protozoans, bacteria) and diverse invertebrates (often pathogenic) abound in especially humid and thermally stable soils. Despite these challenges, several groups of vertebrates are well adapted to life in soil [17, 18, 19], including one of the most ancient lineages of extant terrestrial vertebrates, the caecilian amphibians that radiated in the edaphic environment during the early Mesozoic [20, 21]. Caecilians (order Gymnophiona) are limbless, elongate, mostly tropical amphibians. Adults of most species burrow in soil. Many other extant amphibians spend time in soil but feed and breed above ground [22, 23]. In contrast, many adult terrestrial caecilians are highly fossorial, dedicated burrowers that feed and breed within moist soils. Terrestrial caecilians inhabit different layers of soil, from leaf litter to deeper strata, while species of one family (Typhlonectidae) are secondarily semi- or fully aquatic. Caecilian evolution has clearly involved the colonisation of tropical soils. We hypothesise that, as well as providing distinctive challenges, the soil offered new ecological opportunities to caecilians with new resources and absence of, or reduction in, competitors and predators, perhaps similar to emergent islands [25, 26, 27, 28], newly formed lakes [29, 30], and post-mass extinction environments [31, 32] for other organisms.
Regions with high above-ground biodiversity, such as the tropics, exhibit low below-ground biodiversity, where caecilians might have encountered lower competitive pressure. In addition to the suggested reduction in competition, soil is potentially more stable and less subject to harmful fluctuations in humidity and temperature. Ancestral caecilians adapted to life in soil, developed important innovations and diversified. Given that fossoriality is a derived condition among amphibians, several morphological features of caecilians are clearly adaptations to life in soil, some of which are shared convergently with other edaphic animals. These include modified skull architecture for head-first burrowing and feeding underground, elongated limbless bodies with modified axial musculature [35, 36], reduced visual and hearing systems, and novel sensory tentacles [37, 38, 39]. The molecular changes underlying the evolutionary origin and diversification of caecilians remain unexplored thus far. In this study, we investigated molecular processes involved in the exploitation of (i) soil surface habitats, (ii) deeper soil habitats, and (iii) freshwaters and associated muds.
[Table 1. Number of genes under positive selection. Columns: Genes under positive selection (FDR < 10%); Genes with description; Genes with GO; Biological process domains; Molecular function domains; Cellular component domains. Surviving row label: Caecilia + Typhlonectes. Cell values not recovered.]
Just two of the nine studied branches of the phylogeny account for almost 50% of the identified signatures of positive selection: the branch subtending the clade comprising all sampled caecilians, hereafter the “Gymnophiona branch” (branch 1 in Fig. 1, 50 genes, 29.58%: acot2, wdr1, slc34a2, sod3, col4a2, akr1a1, als2cl, nup155, c10orf35, ddx17, adamts7, nckipsd, esyt1, msn, aqp9, slc22a31, rph3a, lamc1, tet2, gstcd, nup153, gdpd5, tacc2, klhdc10, golga1, pigr, gigyf1, cul9, cdhr2, hprt1, cgn, itga3, p2ry11, ptprh, SPEN, qsox1, vwf, cdk12, tbrg4, tcf19, spg11, rps13, gsto2, tnrc6a, cp, col17a1, acadvl, sult1c1, col16a1 and slc18a1; see Additional file 1: Table S1), and the terminal branch subtending M. dermatophaga (branch 6 in Fig. 1, 33 genes, 19.52%). There are significantly more genes with a signal of positive selection on the terminal branches subtending M. dermatophaga and T. compressicauda than on the terminal branches subtending their respective sister species in the sampled phylogeny (Microcaecilia sister group: branches 6 and 7; Typhlonectes-Caecilia sister group: branches 8 and 9; two-tailed binomial test p-values 0.021 and 0.043, respectively). Different proportions of genes under positive selection were also found on the branches that represent hypothesized ecological opportunities (branches 1, 2, 4 and 8 in Fig. 1). Some of the positively selected genes on these four branches could have been involved in the adaptation to the new edaphic environments that caecilians were colonising.
In addition to the 50 genes on the Gymnophiona branch, which is related to the initial hypothesized ecological opportunity in soil surface habitats, we found eight genes (fam3b, aoc3, mbd5, hgs, masp1, pcdh7, tnc, and sypl1; see Additional file 1: Table S1) with signatures of positive selection (5.32% of the total of genes with signatures of positive selection in this study) on the branch in which the hypothesized conquest of deeper soil habitats might have happened (branch 2 in Fig. 1, hereafter “Teresomata branch”). The adaptation to freshwater habitats and associated muds that occurred on the branch subtending T. compressicauda (branch 8 in Fig. 1) could have been mediated by some of the 18 genes under positive selection on this branch (f2, col4a1, slc30a10, camkmt, klkb1, mios, polr2a, prkag3, cwc22, ate1, myh4, thoc5, arhgap33, clcn3, fam13a, adgrg6, dsg2, and fr47; see Additional file 1: Table S1). Finally, adaptation to more extreme fossoriality linked to the branch subtending Microcaecilia (branch 4 in Fig. 1, hereafter “Microcaecilia branch”) could have been facilitated by some of the 13 genes (pinx1, col4a2, fam3b, iqsec2, ddx24, mrps7, elovl5, ca5b, yes1, basp1, tspan36, acp1, and plg; see Additional file 1: Table S1) with signatures of positive selection (7.73% of the total of genes with signatures of positive selection in this study) on this Microcaecilia branch.
Finally, several of the positively selected protein-coding genes might be related (potentially causally) to unique traits of caecilian amphibians beyond their adaptations to the four hypothesised new environments. Among them, six protein-coding genes annotated as collagen chains were found to bear evidence of positive selection on several branches (col4a2 on the Gymnophiona and the Microcaecilia branches; col17a1 and col16a1 on the Gymnophiona branch; col4a1 on the T. compressicauda branch; col12a1 on the M. dermatophaga and the M. unicolor branches; and col5a2 on the R. bivittatum and the M. unicolor branches; see Additional file 1: Table S1); nine genes related to lipid metabolism and fatty acid metabolism (acot2 on the Gymnophiona branch; gdpd5 on the Gymnophiona branch; plpp1 on the R. bivittatum branch; elovl5 on the Microcaecilia branch; sptlc3 on the M. unicolor branch; cyp17a1 on the M. unicolor branch; lcat, asah1, and cers6 on the M. dermatophaga branch; see Additional file 1: Table S1, and biological process category: Fatty acid biosynthesis, metabolism and modification in Fig. 2); and at least five components involved in immune system and related mechanisms (tet2 on the Gymnophiona branch; masp1 on the Teresomata branch; enpp3 on the Caecilia + Typhlonectes branch; yes1 on the Microcaecilia branch; fyn on the M. unicolor branch; see Additional file 1: Table S1, and biological process category: Immune system and Homeostasis in Fig. 2).
General view of caecilian molecular evolution
Our analyses identified 168 protein-coding genes with signatures of having been under positive selection at least once during the evolution of caecilians. The reliability of the selection signals is supported by the adequacy of the alignments, quantified by the GUIDANCE2 alignment score, and by the small proportion of adjacent codons with ω > 1, which supports the independence of the nucleotide changes that is required by the applied selection tests. The identified genes represent only 1.97% of the total surveyed genes, a small proportion compared with studies of other taxa [45, 46, 47], and presented a lack of connectivity that perhaps reflects lack of knowledge about the identified genes. These 168 candidate genes under positive selection are almost certainly a substantial underestimate due to our conservative selection of orthologous sequences (only those present in every species, including X. tropicalis; no paralogs within species; and stringent filtering; see Methods section) intended to reduce false positives caused by alignment artefacts, to which positive selection inference methods are known to be sensitive. We are also probably missing signal from genes whose evolutionary history is not congruent with the species tree topology. An additional source of underestimation in the detection of positive selection could come from genes that are saturated.
Valuable insights into the molecular evolution of caecilians can be extracted from the functional annotations of the genes bearing signatures of positive selection. The high prevalence of GO terms related to cell membrane and its integral components in the set of genes with signatures of positive selection seems to underline the important role of the membrane components during the evolution of caecilian amphibians, and is consistent with positive selection signals found in other species of vertebrates [51, 52] and with previously identified regulatory innovations related to extracellular signaling in the evolution of other major tetrapod groups . Molecular changes in functional elements of the cell membrane and the ECM are likely an additional important genetic aspect of vertebrate macroevolution.
Ancient genetic toolkit for caecilians
The evolutionary changes on the Gymnophiona branch occurred subsequent to the divergence of caecilians from the other extant amphibians, leading to the last common ancestor of all extant caecilians. During this period in evolution, caecilian ancestors would have started to colonise soil environments and exploit the ecological opportunities they provided, and we would expect molecular changes linked to fossorial adaptation. From the 50 genes with signatures of positive selection that are involved in 96 biological processes based on their GO annotation grouped in 28 general categories (Table 1, Fig. 2 and Additional file 1: Table S1), we highlight the identified genes involved in development-related processes (lamc1, tet2, nup153, tacc2, spg11, see Additional file 1: Table S1 and Fig. 2); and in oxidation-reduction (redox) processes (sod3, akr1a1, qsox1, cp, see Additional file 1: Table S1 and Fig. 2).
Related to development, one of the candidate genes deserves special mention: the laminin subunit gamma 1 (lamc1), a component of the extracellular glycoprotein matrix of the membrane that is essential for basement membrane assembly during mouse embryogenesis [54, 55, 56]. The lamc1 gene is associated with several development and morphogenesis processes (GO:0007420, GO:0048854, GO:0048731, and GO:0061053). Additionally, lamc1 is one of the four elements of the detected functional gene-network (Fig. 3). Its function is linked to ECM interaction mechanisms, such as cell adhesion and cell-to-cell communication (ECM-receptor interaction: KEGG pathway ID 04512; see Fig. 3). Among other functions, lamc1 is related to light perception (GO:0050908), retinal and eye development (GO:0031290, GO:0001654, respectively), and optokinetic behavior (GO:0007634). The gene lamc1 has also been related to mechanosensitive processes in zebrafish and studied as part of a set of important genes for perception in mammals. Unlike other extant amphibians, caecilians are rod-only monochromats with small eyes covered by skin and sometimes also bone. Light is not only important for visual perception, but also plays other important roles controlling, for example, the circadian rhythms vital for synchronization of biological cycles. We hypothesize that molecular innovation in lamc1 might be involved in sensory adaptation, perhaps related to circadian rhythms, in underground environments.
Oxidation-reduction (redox) processes are associated (by GO terms) with four protein-coding genes inferred to be under positive selection on the Gymnophiona branch. Environmental conditions could have driven the emergence of molecular changes to tolerate the chronic low O2 and high CO2 levels that characterise soils. At higher concentrations, CO2 is converted to acid by ionic dissociation and can cause oxidative stress, in turn related to disease and ageing. Additionally, O2 deprivation can affect synaptic transmission and ultimately cause cell death by cytosolic accumulation of calcium ions (Ca2+). The gene rph3a (see Additional file 1: Table S1) is a candidate gene under positive selection that could be related to redox processes. It is involved in the regulation of synaptic vesicle traffic that mediates the release of a neurotransmitter when Ca2+ cytosolic levels rise (GO:0048854: calcium ion-regulated exocytosis of neurotransmitter). Innovations in redox processes might have contributed to the development of better protective mechanisms against the increased cytotoxic threats of the edaphic atmosphere. Similar adaptations have been reported in the most studied fossorial mammal, the naked mole-rat Heterocephalus glaber Rüppell, 1842, where hypoxia experiments have revealed an attenuation of the accumulation of intracellular calcium and the importance of redox processes [65, 66] during O2 deprivation. Naked mole-rats have a surprisingly low metabolic rate in comparison with other mammals. Caecilians also maintain a low metabolism, notably the lowest among extant amphibian groups [67, 68].
Evolvability in Teresomata ancestors
After the colonization of surface soil habitats and initial diversification of caecilians, the origin of the Teresomata probably involved colonisation of and adaptation to deeper soil habitats and a second wave of ecological opportunity. Several major events in caecilian evolution occurred along the Teresomata branch (branch 2 in Fig. 1), including the loss of a free-living larval stage and the origin of maternal feeding. Given the number of evolutionary changes, it is surprising that only eight genes were found with signatures of positive selection. Some of these genes were associated with different GO terms including redox processes (see Additional file 1: Table S1 and Fig. 2). Given that gas exchange (O2 and CO2) becomes increasingly hampered deeper within soils, innovations in redox processes, in this case through changes in the aoc3 gene (a candidate gene under positive selection on the Teresomata branch), might have helped caecilians to cope with the increasingly extreme conditions in this habitat. The highlighted gene (aoc3) encodes vascular adhesion protein 1, whose expression increases under hypoxia.
Within Teresomata, a major ecological shift occurred in the sampled evolutionary tree along the terminal branch subtending T. compressicauda (branch 8 in Fig. 1), with the evolution of fully aquatic adults and viviparity. Among the genes identified on the T. compressicauda branch, the gene fam13a is involved in signal transduction (GO:0007165) and has been related to different lung diseases [71, 72], with its activity induced by low levels of O2. While cutaneous gas exchange is important in Amphibia, T. compressicauda has the largest lungs of any caecilian, is reported to have more than 90% pulmonary oxygen uptake, and is able to tolerate hypoxic and hypercapnic conditions. Thus, changes in fam13a might be related to enhanced pulmonary function.
The expert burrowers
The genus Microcaecilia Taylor, 1968 with 14 described species so far (two sampled in our study, M. unicolor and M. dermatophaga) is the most speciose genus of caecilians after Ichthyophis Fitzinger, 1826 (50 species) and Caecilia Linnaeus, 1758 (34 species). Microcaecilia have bullet-shaped heads and heavily ossified skulls with prominent snouts and rudimentary eyes that are covered by bone. They are the most dedicated burrowers among our sampled taxa. This more extreme fossoriality perhaps led to another wave of ecological opportunity for caecilian radiation. Among the genes found under positive selection on the Microcaecilia branch, the gene pinx1 inhibits telomere elongation (GO:0010521: telomerase inhibitor activity; GO:0051974: negative regulation of telomerase activity; GO:0003676: nucleic acid binding) and has been related to aging; it was also found to have been under positive selection in a study of molecular adaptations to fossorial life in African mole-rats. Changes in pinx1 might be an indication of a relatively extended lifespan in Microcaecilia compared to other amphibians, as in mole-rats among mammals. However, little is known about longevity in caecilians, especially in the wild.
Another gene inferred to have been under positive selection along the Microcaecilia branch that drew our attention is linked to pigmentation by the GO term GO:0043473. This protein-coding gene is annotated as a tetraspanin (tspan36, see Additional file 1: Table S1). Tetraspanins are a large family of transmembrane proteins (38 homologous genes in vertebrates) that are involved in diverse biological processes, acting as organizers in the membranes of many kinds of animal cells . The functions of all the tetraspanins are not well known, but some members of this gene family have been associated with pigment cell interactions and pigment pattern formation . The tspan36 gene seems to play an important role in melanocyte biology . Despite spending all or most of their lives in soil, many caecilian species are pigmented, and some are brightly coloured and visually striking, perhaps aposematically in some cases , although many are also more drably coloured. Adaptive innovation in tspan36 might be related to evolutionary changes in pigmentation. Species of Microcaecilia have a range of colours and patterns  and the ancestral phenotype is currently unclear.
Other specific traits
Unlike other extant amphibians, many caecilian amphibians have collagenous scales hidden in annular folds in the skin, the function of which is unknown [86, 87]. Some ancestors of caecilians are expected to have had scales over the entire external surface of the body rather than embedded in the skin. The peculiar disposition (when present) of scales within the skin in modern caecilians and their varied patterns of reduction and loss are derived traits plausibly associated with the evolution of a burrowing habit. In extant caecilians, scales are diverse in their number, form and distribution along the body with each of our sampled species presenting a distinct pattern. Collagen chains are structural proteins classified under different types, and they are the main components of skin, connective tissues, bone, teeth and epithelia . We hypothesise that some of the collagen protein-coding genes might code for collagen chains involved in the formation of caecilian scales, particularly col17a1 that presented a positive selection signature on the Gymnophiona branch and skin tissue specificity expression. We are aware that collagen chains are involved in many other important biological processes, for instance col4a2 is part of the ECM-receptor interaction pathway on the Gymnophiona branch. Also, one of the candidate collagen genes under positive selection is found in the branch subtending T. compressicauda (col4a1), a species that lacks annular scales. Mutations in different genes of type IV collagen cause the Alport syndrome in humans, characterized by hearing and eyesight loss among other symptoms .
Lipid metabolism and fatty acid metabolism are biological processes associated with several of the genes that bear evidence of positive selection. Lipids have very diverse biological functions and play important roles such as energy storage, signaling, and formation of barriers in the cell membrane. They are also involved in other vital and apomorphic roles in caecilians, including the provision of nutrition to large yolky eggs in oviparous taxa, and to developing fetuses and/or newborns during oviductal and/or skin feeding among teresomatan caecilians [90, 91]. Some of these genes might be related to the synthesis, transformation and/or storage of lipids for these traits. For instance, some of these genes have been found expressed in the yolk of zebrafish (elovl5), in mouse embryoid bodies (cers6), and during vitellogenesis in teleost fish (cyp17a1).
Some of the candidate positively selected genes have different important roles linked to the immune system: for example, tet2 is expressed in T cells, masp1 has multiple roles in the innate immune response, enpp3 regulates allergic responses, and fyn controls immune receptor signaling status. Innovations in immune system genes within Gymnophiona are unsurprising. The innate amphibian immune system is likely under strong selective pressure, evolving in arms races via interactions with pathogens. The vast majority of adult caecilians live with their bodies in close proximity to moist (probably microbially rich) tropical soil substrates, and it can be expected that the ecomorphological disparity of caecilians relative to their closest relatives and their ecological diversity promoted immunological molecular genetic changes within the group. Amphibians, survivors of the Earth’s last four mass extinction events, are facing an unprecedentedly high risk of extinction that seems to be linked, in part, to challenges to their immune systems [99, 100]. Caecilian conservation biology is very poorly understood and immune system mechanisms are in need of better understanding.
Molecular adaptive changes in caecilian amphibians are found to be associated mostly with protein-coding gene products with membrane or extracellular location. These genes present low levels of conservation and connectivity (no PPIs and only one functional network were found). The 168 genes that we infer to have been under positive selection are candidate genes with potential to further clarify adaptations of caecilians linked to their unique and variable natural histories. Several of these candidate genes are possibly causally related to differing degrees of fossoriality and hypothesized ecological shifts that might each have led to new ecological opportunities. Experiments (e.g. transfecting cell-lines with a candidate gene and in silico reconstructions of the protein structure) are required to test the function of these protein-coding genes and to identify their particular roles in important processes, such as perception, reduction-oxidation, and aging in caecilians. Functional experiments can be prompted and focused based on genome-wide studies that have narrowed down candidate genes for more thorough investigation. In this study, we identify a set of candidate genes plausibly involved in ecological and evolutionary key processes. Much biological research relies upon a small number of animal models to investigate biological processes but insights from a broader spectrum of organismal diversity, especially from neglected taxa such as caecilians, are also helpful .
The inclusion of representatives of additional caecilian lineages in future studies (especially to expand the phylogenetic, ecological, and geographic sampling), and more complete sets of genes from the sampled species (available transcriptome data thus far correspond mostly to adult animals) could provide further insights into the selective pressures shaping caecilian molecular evolution. Adaptations are not necessarily only associated with positive selection in protein-coding genes. Changes in regulation can also allow adaptation to new environments, and are thus far unexplored for caecilians. The findings reported here will hopefully provide a foundation for further analyses of the molecular bases of the radiation of Gymnophiona and of molecular evolution in vertebrates more generally.
The source data of this study were the protein-coding gene sequences (both nucleotide and amino-acid level) from reference transcriptomes of five caecilian species (R. bivittatum, C. tentaculata, T. compressicauda, M. unicolor and M. dermatophaga; assemblies are available from NCBI through BioProject ID number PRJNA387587) as well as those for the frog X. tropicalis, the only amphibian currently represented in the Ensembl database. Species-specific caecilian transcriptomes were de novo assembled from paired-end RNA-seq samples of multiple tissues (kidney, liver, and skin samples for each of the five species plus a selection of other tissues for subsets of the five species: foregut, heart, lung, muscle, spleen, and testis) yielding five reference transcriptomes with a high percentage of completeness. Protein-coding sequences were identified from these assembled sequences with an open reading frame. For each X. tropicalis gene, the isoform encoding the longest protein was chosen for analysis, and BLAST searches (blastp tool, version 2.2.28; E-value < 10^-10) were conducted against the proteins of each of the caecilian transcriptomes. Likewise, each caecilian protein sequence was used as a query in a BLAST search against the X. tropicalis proteome. Pairs of best reciprocal hits were considered orthologs. Only X. tropicalis genes with putative orthologs in all five caecilian species were used in downstream analyses.
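The reciprocal-best-hit step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, hits are assumed to have already passed the E-value filter, and ranking hits by bit score is an assumption (the text does not state how the best hit was chosen).

```python
def best_hits(hits):
    """Map each query to its top-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples, assumed to be
    pre-filtered by E-value. Ranking by bit score is an assumption.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Keep query/subject pairs that are each other's best hit."""
    fwd = best_hits(forward)   # e.g. X. tropicalis -> caecilian
    rev = best_hits(reverse)   # e.g. caecilian -> X. tropicalis
    return {q: s for q, s in fwd.items() if rev.get(s) == q}


def shared_orthologs(rbh_per_species):
    """Keep reference genes that have an ortholog in every species.

    rbh_per_species: {species: {reference_gene: ortholog}}.
    """
    if not rbh_per_species:
        return {}
    species = list(rbh_per_species)
    common = set(rbh_per_species[species[0]])
    for sp in species[1:]:
        common &= set(rbh_per_species[sp])
    return {g: {sp: rbh_per_species[sp][g] for sp in species}
            for g in sorted(common)}
```

With five caecilian species, `shared_orthologs` would be called on five reciprocal-best-hit tables, and only the reference genes present in all of them would be carried forward.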
For each group of orthologs, the inferred amino acid sequences were aligned using PRANK with default parameters. Given the sensitivity of positive selection analyses to alignment errors, we carried out a thorough filtering of the alignments. First, Gblocks version 0.91b with default settings was used to remove problematic regions. Second, two ad hoc sliding window filters (of 15 and 5 residues) were used to eliminate regions coding for amino acids that are unique to one species (with 10 or more amino acid singletons, or where all five were singletons, respectively; as in [107, 108]) because such regions are often associated with annotation or sequencing errors. The resulting amino acid sequence alignments were used to guide the alignment of the corresponding codon sequences.
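One possible reading of the two sliding-window filters is sketched below. The interpretation is an assumption: a "singleton" is taken to be a residue of one sequence that occurs in no other sequence at that column, gaps are not counted as singletons, and every column of an offending window is removed. The function names are hypothetical and the cited filters [107, 108] may differ in detail.

```python
def singleton_columns(alignment, seq_index):
    """Flag columns where one sequence carries a residue seen in no other."""
    target = alignment[seq_index]
    others = [s for i, s in enumerate(alignment) if i != seq_index]
    return [residue != "-" and all(o[col] != residue for o in others)
            for col, residue in enumerate(target)]


def columns_to_mask(alignment, window, min_singletons):
    """Collect columns of every window holding enough singletons in one sequence."""
    n_cols = len(alignment[0])
    masked = set()
    for seq_index in range(len(alignment)):
        flags = singleton_columns(alignment, seq_index)
        for start in range(n_cols - window + 1):
            if sum(flags[start:start + window]) >= min_singletons:
                masked.update(range(start, start + window))
    return masked


def filter_alignment(alignment, filters=((15, 10), (5, 5))):
    """Apply the (window, threshold) filters and drop the masked columns."""
    masked = set()
    for window, min_singletons in filters:
        masked |= columns_to_mask(alignment, window, min_singletons)
    keep = [c for c in range(len(alignment[0])) if c not in masked]
    return ["".join(seq[c] for c in keep) for seq in alignment]
```

The default `filters` argument mirrors the two windows in the text: 15 residues with 10 or more singletons, and 5 residues in which all five are singletons.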
Tests of positive selection
To infer positive selection, we performed branch-site model tests [109, 110] for every group of orthologous genes and for every branch of the studied subset of the caecilian phylogeny (based on [40, 69]; Fig. 1), excluding the X. tropicalis branch, and computing branch lengths each time for each group of genes using the CODEML program in PAML 4.6. The branch-site model test (model A vs. null model A) assumes that only a fraction of sites might have undergone positive selection and only along a single a priori identified branch (foreground lineage) on the phylogeny. The test assumes four classes of sites: codons that are conserved (ω < 1), codons that are evolving neutrally (ω = 1), and codons under positive selection (ω > 1) on the foreground branch but conserved (2a) or neutral (2b) on the other (background) branches. Model A was implemented with the default starting value (0.4) for ω (model = 2, NSsites = 2, cleandata = 1 and fix_blength = 0) and used as the alternative hypothesis for the Likelihood Ratio Test (LRT). The null model of the LRTs was the null model A with only one change in the parameters from model A: ω fixed at 1 for sites under positive selection on the foreground branch (2a and 2b sites). P-values for the LRTs were computed using the χ² distribution with one degree of freedom, and divided by two [111, 112]. Multiple-testing corrections were conducted following Benjamini and Hochberg’s method in order to control for a false discovery rate (FDR) using R. Genes with a q-value < 0.1 and ω > 1 for the foreground branch (2a and 2b sites) were interpreted as being genes under positive selection. Sites under positive selection were identified by computing posterior probabilities using the Bayes empirical Bayes (BEB) approach. To obtain an estimation of the proportion of tandem complex mutations in our results, we analysed the position of the codons under positive selection from BEB test outputs.
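The statistical decision rule described above (halved χ² p-value with one degree of freedom, Benjamini-Hochberg correction, q-value < 0.1 and foreground ω > 1) can be sketched in a few lines. This is an illustrative re-implementation in Python, not the R code used in the study; the function names are hypothetical, and the log-likelihoods are assumed to have been read from the CODEML outputs beforehand.

```python
import math


def lrt_pvalue(lnl_alt, lnl_null):
    """Halved chi-square (1 d.f.) p-value for model A vs. null model A."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def is_candidate(qvalue, foreground_omega, fdr=0.1):
    """Apply the two criteria stated in the text."""
    return qvalue < fdr and foreground_omega > 1.0
```

In practice the correction would be applied across all gene-by-branch tests at once, so that each q-value reflects the full number of comparisons.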
In addition, the suitability of the multiple sequence alignments of the genes with signatures of selection for the positive selection analyses was tested using the GUIDANCE2 methodology, computing the GUIDANCE alignment score for each inference (alignments are available from the GitHub repository: TorresSanchezM/alignments). The numbers of genes with signatures of positive selection in sister branches were compared by two-tailed binomial tests under the hypothesis of equal probability of being or not under positive selection (p = 0.5) with a 95% confidence level.
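The sister-branch comparison can be sketched with an exact two-tailed binomial test written in standard-library Python. The sketch assumes that the count for one branch is tested as k successes out of n, where n is the combined count of both sister branches; this reading of the design and the function name are our assumptions, not details given in the text.

```python
from math import comb


def binomial_two_tailed(k, n, p=0.5):
    """Exact two-tailed binomial P-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one (the convention used by many exact binomial tests).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(prob for prob in pmf if prob <= threshold))


def branches_differ(count_a, count_b, alpha=0.05):
    """True if the two sister-branch counts differ at the 95% confidence level."""
    return binomial_two_tailed(count_a, count_a + count_b) < alpha
```

With p = 0.5 the distribution is symmetric, so the result does not depend on which of the two sister branches is treated as the "success" count.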
For each of the putative orthologous groups inferred to be under positive selection, we obtained the associated GO terms from the X. tropicalis annotation using the BioMart data-mining tool (Ensembl Genes 95, Xenopus genes JGI 4.2). Novel or uncharacterised genes from X. tropicalis were annotated by BLAST searches (blastx tool) against the Non-redundant protein sequences database. We summarized and visualized the common GO terms of the selected genes and their frequencies of occurrence using REVIGO, applying an allowed similarity of 0.7 (by the semantic similarity method) and using the whole UniProt database to define the size of each GO term. The exploratory REVIGO networks were manually processed to build a single explanatory plot with the general GO categories presented in the REVIGO networks for each of the nine analysed branches. Enrichment analysis for each branch was performed using the GO enrichment analysis tool. Additionally, protein-protein interactions (PPIs) were inferred using STRING with X. tropicalis as the reference organism and default settings, comparing the interactions among the caecilian protein-coding genes on each branch with those of a random set of proteins of similar size drawn from the chosen genome. Finally, after counting Transcripts Per Million (TPM) units with the RSEM program for each gene under positive selection in the caecilian transcriptomes, a gene was considered to be expressed in a tissue type when more than 5% of its total TPM was found in that tissue type. Tissue specificity was identified when more than 95% of the total TPM was found in one particular tissue type.
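The two TPM thresholds can be expressed as a short Python sketch. It assumes that the TPM values of one gene have already been summed per tissue type; the function name and the tissue names in the usage note are hypothetical.

```python
def tissue_profile(tpm_by_tissue, presence_cutoff=0.05, specificity_cutoff=0.95):
    """Classify the expression of one gene across tissue types.

    Returns (present, specific): the sorted tissue types holding more than 5%
    of the gene's total TPM, and the single tissue type holding more than 95%
    of it (or None if there is no such tissue).
    """
    total = sum(tpm_by_tissue.values())
    if total == 0:
        return [], None  # gene not expressed in any sampled tissue
    fractions = {tissue: tpm / total for tissue, tpm in tpm_by_tissue.items()}
    present = sorted(t for t, f in fractions.items() if f > presence_cutoff)
    specific = next((t for t, f in fractions.items() if f > specificity_cutoff), None)
    return present, specific
```

For example, `tissue_profile({"skin": 96.0, "liver": 3.0, "muscle": 1.0})` would report the gene as present in, and specific to, the hypothetical "skin" sample, because 96% of its TPM falls there and no other tissue exceeds 5%.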
We thank Iván de la Hera, José A. Díaz, Karen Siu-Ting, David Buckley, Iván Gómez-Mestre, Antonio González-Martín, Javier Pérez-Tris, and Jeff Streicher for insightful comments and advice. Computational analyses were performed at the Altamira HPC cluster of the Institute of Physics of Cantabria (IFCA-CSIC), which is part of the Spanish Supercomputing Network. We thank Myrian Virevaire (Direction de l’Environment de l’Amenagement et du Logement) and Le Comitee Scientifique Regional du Patrimonie Naturel for supporting and facilitating fieldwork in French Guiana.
This study was financially supported by the Spanish Ministry of Economy and Competitiveness (MINECO: RYC-2011-09321 and CGL2012–40082 grants to DSM; BES-2013-062723 FPI predoctoral fellowship, EEBB-I-16-11395 and EEBB-I-17-12039 research stays to MTS). The MINECO also provided support through the AdaptNET project (CGL2015–71726-REDT grant). DAP was supported by the National Science Foundation and National Institutes of Health (MCB 1818288, P20GM103440 and 5P30GM110767–04 grants). The funding bodies had no role in the design of the study, in the collection, analysis, and interpretation of data, or in the writing of the manuscript.
Availability of data and materials
DJG, CJC, MW and DSM devised the conceptual project. MTS, DJG and MW designed the study. MTS and DAP analyzed the data. MTS wrote the manuscript with the contribution of DJG, DAP, CJC, MW, and DSM in the interpretation of results and discussion of principal findings. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
- 1. Darwin C. On the origin of species. 6th ed. London: John Murray; 1859.
- 3. Schluter D. The ecology of adaptive radiation. Oxford: Oxford Univ. Press; 2000.
- 4. Givnish TJ. 1997. Adaptive radiation and molecular systematics: issues and approaches. See Givnish & Sytsma 1997, pp. 1–54.
- 5. Losos JB, Mahler DL. 2010. Adaptive radiation: the interaction of ecological opportunity, adaptation, and speciation. In Evolution since Darwin: The First 150 Years, ed. MA Bell, DJ Futuyma, WF Eanes, JS Levinton, pp. 381–420. Sunderland, MA: Sinauer.
- 13. Ohno S. Evolution by gene duplication. New York: Springer-Verlag; 1970.
- 18. Wake MH. 1993. The skull as a locomotor organ. In: Hanken J, Hall BK, eds. The skull: functional and evolutionary mechanisms. Chicago: Univ. Chicago Press; pp. 422–453.
- 22. Wells KD. 2008. The ecology and behavior of amphibians. Chicago (Illinois): University of Chicago Press.
- 24. Wilkinson M. Caecilians. Curr Biol. 2012;22.
- 28. Losos JB. Lizards in an evolutionary tree: ecology and adaptive radiation of anoles. Berkeley: Univ. Calif. Press; 2009.
- 36. O’Reilly JC, Summers AP, Ritter DA. The evolution of the functional role of trunk muscles during locomotion in adult amphibians. Am Zool. 2000;40:123–35.
- 38. Wilkinson M, Garbout A, Mohun SM. 2018. The visual system of caecilian amphibians. Integr Comp Biol. 58: E252-E252.
- 49. Diekmann Y, Pereira-Leal JB. Gene tree affects inference of sites under selection by the branch-site test of positive selection. Evol Bioinforma. 2016;11:11–7.
- 57. Rayagiri SS, Ranaldi D, Raven A, Mohamad Azhar NIF, Lefebvre O, Zammit PS, Borycki AG. Basal lamina remodeling at the skeletal muscle stem cell niche mediates stem cell self-renewal. Nat Commun. 2018;9.
- 62. Davalli P, Mitic T, Caporali A, Lauriola A, D’Arca D. ROS, cell senescence, and novel molecular mechanisms in aging and age-related diseases. Oxidative Med Cell Longev. 2016;2016.
- 66. Fang X, Nevo E, Han L, Levanon EY, Zhao J, Avivi A, Larkin D, Jiang X, Feranchuk S, Zhu Y, et al. Genome-wide adaptive complexes to underground stresses in blind mole rats Spalax. Nat Commun. 2014;5:3966.
- 68. Smits AW, Flanagin JI. Bimodal respiration in aquatic and terrestrial apodan amphibians. Integr Comp Biol. 1994;34(2):247–63.
- 72. Ziółkowska-Suchanek I, Mosor M, Gabryel P, Grabicki M, Zurawek M, Fichna M, Strauss E, Batura-Gabryel H, Dyszkiewicz W, Nowak J. Susceptibility loci in lung cancer and COPD: association of IREB2 and FAM13A with pulmonary diseases. Sci Rep. 2015;5.
- 75. Wilkinson M, Nussbaum RA. Comparative morphology and evolution of the lungless caecilian Atretochoana eiselti (Taylor) (Amphibia: Gymnophiona: Typhlonectidae). Biol J Linn Soc. 1997;62:39–109.
- 76. Sawaya P. Metabolismo respiratorio de anfibio gymnophiona, Typhlonectes compressicauda. Bol Fac Filos Cienc Letras Univ San Paulo Ser Zool. 1947;12:51–6.
- 78. Renous S. Cranial morphology of an American siphonopid, Microcaecilia unicolor (Amphibia, Gymnophiona) and its functional interpretation. Gegenbaurs Morphol Jahrb. 1990;136:781–806.
- 83. Lapedriza A. 2015. Gene regulatory network of melanocyte development. PhD University of Bath.
- 86. Taylor EH. Squamation in caecilians, with an atlas of scales. Univ Kansas Sci Bull. 1972;49:989–1164.
- 88. Lodish H, Berk A, Zipursky S. 2000. Collagen: The Fibrous Proteins of the Matrix. In: Molecular Cell Biology, Section 22.3.
- 89. Lemmink HH, Mochizuki T, van den Heuvel LPWJ, Schröder CH, Barrientos A, Monnens LAH, van Oost BA, Brunner HG, Reeders ST, Smeets HJM. Mutations in the type IV collagen α3 (COL4A3) gene in autosomal recessive Alport syndrome. Hum Mol Genet. 1994;3:1269–73.
- 97. Tsai SH, Kinoshita M, Kusu T, Kayama H, Okumura R, Ikeda K. The ectoenzyme E-NPP3 negatively regulates ATP-dependent chronic allergic responses by basophils and mast cells. Immunity. 2015;42(2):279–93.
- 98. Mkaddem SB, Murua A, Flament H, Titeca-Beauport D, Bounaix C, Danelli L, Launay P, Benhamou M, Blank U, Daugas E, et al. Lyn and Fyn function as molecular switches that control immunoreceptors to direct homeostasis or inflammation. Nat Commun. 2017;8.
- 108. Chakraborty S, Alvarez-Ponce D. Positive selection and centrality in the yeast and fly protein-protein interaction networks. Biomed Res Int. 2016;2016.
- 113. R Development Core Team. 2016. R: A Language and Environment for Statistical Computing. R Found Stat Comput. Vienna, Austria.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.