Phylogenetic mapping of scale nanostructure diversity in snakes
- 194 Downloads
Many species of snakes exhibit epidermal surface nanostructures that form complex motifs conferring self-cleaning properties, and sometimes structural iridescence, to their skin.
Using confocal microscopy, we show that these specialised cells can be greatly elongated along their left-right axis and that different types of nanostructures are generated by cell borders and cell surface. To characterise the complexity and diversity of these surface gratings, we analysed scanning electron microscopy images of skin sheds from 353 species spanning 19 of the 26 families of snakes and characterised the observed nanostructures with four characters. The full character matrix, as well as one representative SEM image of each of the corresponding species, is available as a MySQL relational database at https://snake-nanogratings.lanevol.org. We then performed continuous-time Markov phylogenetic mapping on the snake phylogeny, providing an evolutionary dynamical estimate for the different types of nanostructures. These analyses suggest that the presence of cell border digitations is the ancestral state for snake skin nanostructures which was subsequently and independently lost in multiple lineages. Our analyses also indicate that cell shape and cell border shape are co-dependent characters whereas we did not find correlation between a simple life habit classification and any specific nanomorphological character.
These results, compatible with the fact that multiple types of nanostructures can generate hydrophobicity, suggest that the diversity and complexity of snake skin surface nano-morphology are dominated by phylogenetic rather than habitat-specific functional constraints. The present descriptive study opens the perspective of investigating the cellular self-organisational cytoskeletal processes controlling the patterning of different skin surface nanostructures in snakes and lizards.
KeywordsMicrostructure Nanostructure Nanograting Snake Scale Phylogenetic mapping Continuous-time Markov model Hydrophobicity Structural colour
Sample-size corrected Akaike information criterion
Normalised Akaike weights
Model with all rates different
Area, solidity and contour-surface-to-bounding-convex-polygon-surface ratio
Equal rate model
Markov Chain Monte Carlo approximation
- Mk models
Phylogenetic generalised linear mixed model
Phylogenetic generalised least squares regression approach
Scanning electron microscopy
Symmetrical rate model
Nanograting morphologies not only differ among species and between body areas but also during post-natal development and within individual scales [13, 14]. Indeed, each scale exhibits a smooth gradient of nanostructure morphologies between their cranial and caudal ends, with the cranial pattern (i.e., at the base of the scale) being less derived, i.e., similar to the one covering the scales of neonatal individuals.
A recent analysis  attempted for the first time to probe the diversity of nanostructures in snakes by described the ventral and dorsal surface nanogratings found in 41 species from the Boidae, Pythonidae, and Elapidae families. This study established characters such as cell shape, cell boundary, and cell surface morphology and enumerated hypotheses for potential links between specific structural features and ecological characters (e.g., the potential presence of longer digitations in arboreal species) but the authors did not find any conclusive correlation between scale nanomorphology and ecological characters.
Here, we provide a more extensive analysis (with a much larger number of species) of the evolution of nanogratings in snakes. First, we use scanning electron microscopy (SEM) to identify and characterise the nanograting morphologies observed in 353 species (Additional file 1: Table S1 and MySQL relational database at https://snake-nanogratings.lanevol.org) spanning 19 of the 26 families of snakes. Second, we use confocal microscopy to unambiguously identify for the first time the shape and spatial organisation of oberhautchen cells. Third, we use a continuous-time reversible Markov model for the phylogenetic mapping of all characters investigated. Fourth, using phylogenetic generalised least square regression and phylogenetic generalised linear mixed model methods, we find that some of our defined characters are co-dependent (i.e., correlated), resulting in two major groups of nano-patterns: polygonal cells with regular cell borders versus elongated cells with cell border digitations. These analyses indicate that phylogeny constrains nanograting morphology whereas life habits (aquatic, terrestrial, fossorial and arboreal) do not significantly co-vary with any of the nanomorphological characters investigated, as illustrated by species of the same sub-family living in dramatically different environments but harbouring virtually the same structures. Our work aims at mapping scale nanostructures on the phylogeny of snakes, but also at guiding the identification of the developmental cellular mechanisms that generate diversity and complexity in surface nanograting structures in Squamates.
Confocal microscopy of oberhautchen cells
Figure 1 illustrates the characterisation of the nanostructures and oberhautchen cells in five species. Atropoides olmec (Viperidae, Fig. 1a) shows a ‘polygonal’ cell shape, ‘regular’ cell borders, a cell surface covered with ‘labyrinthine channels’, and ‘absent’ ridges. Chilabothrus strigilatus (Boidae, Fig. 1b) harbours a ‘polygonal’ cell shape (not shown), ‘sawteeth’ cell borders, a cell surface covered with ‘holes’, and ‘absent’ ridges. Dendroaspis jamesoni kaimosae (Elapidae, Fig. 1c) has an ‘unknown’ cell shape and ‘unknown’ cell surface (because of a dense network of elevations hides the cell borders and cell surface), unknown cell borders, and ‘absent’ ridges. Note that we did not formally identify whether the long and elevated protuberances correspond to modified cell border digitations or originate from the cell surface. Boaedon fuliginosus (Lamprophiidae, Fig. 1d) harbours a ‘wide’ cell shape, cell borders with ‘long digits’, a cell surface covered with ‘straight channels’, and ‘absent’ ridges. Finally, Philothamnus angolensis (Colubridae, Fig. 1e) possesses a ‘wide’ cell shape, ‘mild digits’ cell borders, a cell surface covered with ‘straight channels’, and ‘present’ ridges. The full character matrix with, for each of the 353 scored species, the state of each of the four morphological characters as well as a crude life habit classification is shown in Additional file 1: Table S1. The same character matrix, as well as one representative SEM image of each of the corresponding species, is available in a MySQL relational database at https://snake-nanogratings.lanevol.org.
We realise that some of the characters used in our study might not be independent. Note however that (i) we test below for the presence of such dependences, and (ii) if present, dependences among characters are not problematic in our analyses because we are not using these characters to infer phylogenies but to map characters on an established phylogeny of snakes to evaluate how these surface nanostructures evolved.
Model of character evolution and phylogenetic signal
Inference of proper hypotheses regarding trait evolution requires that a correct model of character evolution and a correct phylogenetic tree are used . Here, our analyses are conditioned upon the use of the time-calibrated tree of squamates (Additional file 3) provided by Tonini et al. . To validate the character evolution model, we performed a statistical comparison of so-called ‘Mk models’ , i.e., Markov-chain models that describe the evolutionary change of discrete characters across a phylogenetic tree using a rate transition matrix [21, 22]. For each of the nano-morphological characters and simple life habit classification, we use a maximum likelihood (ML) estimator to fit three models differing by their rate transition matrix: (i) the ER model where all rates are equal, (ii) the SYM model where the rates from state i to state j is equal to the reciprocal rate from j to i, and (iii) the ARD model where all rates can assume different values. Note that we additionally compared, using Akaike information criteria, if applying Pagel’s tree transformation coefficients (λ, δ, κ [21, 22]) significantly improves each model (in terms of information discrepancy). In a tree transformation perspective, Pagel’s λ compresses internal branches to measure the dependence of trait evolution among species. Using λ = 1 leaves the tree unchanged while branches are compressed for λ < 1. When λ = 0, all branches are collapsed such that the character investigated evolves independently among species [21, 23]. Pagel’s δ coefficient manipulates the rate matrix across the phylogenetic tree: when δ < 1 or δ > 1, transition rates slow down or accelerate, respectively, from the root to the tip branches. Finally, the κ coefficient raises all branch lengths by a power κ to evaluate if the trait under investigation evolves gradually or with a punctuated dynamic [21, 22, 24]. The goal of these analyses is to identify a suitable model for character mapping on the snake phylogeny. Much additional information is provided in the Materials and Methods.
Additional file 4: Table S2 indicates that the ARD model, without application of Pagel’s tree transformation coefficients, yields the best fitting scores (based on Akaike information criteria; see Material and Methods) for the ‘cell shape’ and ‘ridges’ characters. Note that for the ‘cell shape’ character, the ARD with λ or δ coefficients are only marginally less favoured. On the other hand, for the ‘cell borders’ and ‘cell surface’ characters, the SYM model is preferred, with tree transformation parameter λ = 0.966427 (p-value = 3.92∙10− 4, compared to SYM with no transformation) and λ = 0.889429 (p-value = 1.97∙10− 2, compared to SYM with no transformation), respectively. Hence, the tree was rescaled with the corresponding value of λ prior to the mapping of the ‘cell border’ and ‘cell surface’ characters (see below). Finally, the SYM Mk model without transformation is the best fit for the life habit character (Additional file 4: Table S2).
Given that the best fit was obtained without the transformation coefficient λ for two nanomorphological characters and the life habit character, as well as with a λ value close to one for the last two characters, we can safely consider for the presence of substantial phylogenetic signal for all the characters investigated. To more formally test for the presence of such a signal, we searched (separately for each character) for the value of λ that maximises the likelihood of the tree given the character state distribution and the model. The likelihood-ratio test (H0: λ = 0) significantly rejects, for each character, the independence of character evolution from the phylogenetic tree (conforming the presence of phylogenetic signal): ‘cell shape’, λ = 0.98 (p-value = 8.63 × 10− 32); ‘cell border’, λ = 0.97 (p-value = 1.66 × 10− 88); ‘cell surface’, λ = 0.89 (p-value = 2.84 × 10− 22); ‘ridges’, λ = 1 (p-value = 2.39 × 10− 21); and ‘life habit’, λ = 1.00 (p-value = 4.98 × 10− 105). The optimal value of the coefficient is very close to 1 for four of the five characters, whereas the ‘cell surface’ character presents a non-smooth likelihood profile (i.e., with oscillations), probably explaining the somewhat smaller optimal value of λ (slightly below 0.9).
Clearly, the ‘wide’ state (oberhautchen cells more than twice wider than long) is prevalent in snakes whereas the ‘polygonal’ state is dominant only in vipers (family Viperidae) and in three sub-families (Boinae, Erycinae and Ungaliophiinae) of boas whereas it also appears, albeit at low frequencies, in Colubrinae, Elapidae, Homalopsidae, and Uropeltidae. Note that our single representative of Pareidae also exhibits polygonal oberhautchen cells; additional samples are required to infer the character states frequencies in this small family of about 20 species. An important result of this mapping is that the cell shape state distribution across the phylogeny of snakes necessarily requires convergent substitutions. The most probable scenario involves two changes in deep nodes from ‘wide’ to ‘polygonal’ (black arrows in Fig. 5): once at the basal node of Viperidae and once within Boidae. Clearly, when considering the species-level phylogeny, additional homoplastic events occurred at shallower nodes: inferential mapping suggests a total of ten changes from ‘wide’ to ‘polygonal’, and six reversals from ‘polygonal’ to ‘wide’ (Additional file 5: Figure. S1). As there are many more species with the ‘wide’ state than the ‘polygonal’ state, these results suggest that the ‘polygonal’ to ‘wide’ reversal is easier than the ‘wide’ to ‘polygonal’ transition. This is supported by the estimated transition rate matrix (Fig. 10a) that shows a nearly three times faster rate from ‘polygonal’ to ‘wide’ than for the converse transition.
The distribution of shapes of the structures classified as ‘holes’ and ‘straight channels’ (Additional file 9: Figure. S5a) suggests the existence of a continuum between these two states and that channels simply evolved through the elongation of holes. This is supported by the inferred rate matrix of the discrete character model (Fig. 10c) where the ‘straight channels’ state can virtually only be reached via the ‘holes’ state. Although the classification of a depression as a ‘hole’ or a ‘straight channel’ is based on arbitrary (but objective) geometrical criteria (see Materials and Methods), the majority of species with these states harbour only ‘holes’ or only ‘straight channels’, rarely a mix of both (Additional file 10: Figure. S6). This observation supports the idea that, although the shape’s distribution is continuous, these states are distinct and can quickly switch from one to the other.
While the oberhautchen cell surfaces in Elapidae are nearly exclusively ‘smooth’ or covered in ‘straight channels’, it is worth noting that all Caenophidia species harbouring ‘smooth’ cell surfaces (marked by stars in Additional file 8: Figure. S4) actually possess very small depressions that do not pass the algorithm’s requirements to be classified as ‘holes’ (e.g., Fig. 4c, < 3% of image surface covered). Only the species of the three basal families of snakes (Leptotyphlopidae, Typhlopidae, and Anomalepididae) appear to have truly smooth cell surfaces (i.e., devoid of any dimple). Finally, note that ‘holes’ and ‘labyrinthine channels’ cannot be unambiguously differentiated in some species (Additional file 11: Figure. S7), possibly affecting the inference of correct rates between these states (Fig. 10c).
Although some ex-Caenophidia families (e.g., Uropeltidae and Pythonidae) also bare ‘digitations’ (Fig. 6, Additional file 6: Figure. S2), ridges are found exclusively in families that harbour ‘sharp digits’ (Additional file 7: Figure. S3a), suggesting that the evolution of ‘ridges’ was favoured by the alignment and overlap of sharp digits.
Correlations among characters
To investigate potential developmental or functional links among nano-morphological characters, we tested for the presence of correlations among the four oberhautchen cell characters and the simple life habit classification. Given the lack of consensus on which method to apply for discrete characters , we used both the phylogenetic generalised least squares regression (PGLS) approach  and the phylogenetic generalised linear mixed model (PGLMM) method  (see details in the Material and Methods). PGLS yields simple linear coefficients and deviations of the mean value of the variable response (considered here as pseudo-continuous), while PGLMM quantifies the influence of the probability (in log-odds ratio) of one state of a character as a function of a state of another character. Although the PGLMM approach allowed overall good convergence of the MCMC chain, it yielded poor significance levels of the coefficients in most cases, probably because of lack of knowledge on the prior probabilities  (here, we use the same prior for the variance of all characters).
Correlations among characters: Summary of PGLS and PGLMM statistics
The ‘cell border’ and ‘ridges’ characters are also clearly correlated (Table 1; Additional files 16 and 17): PGLS yields global and coefficient p-values 1.9 × 10− 2 and PGLMM gives one coefficient p-value of 4.2 × 10− 2, supporting our observation that ridges are exclusively found in families that harbour ‘digits’ (Figs. 6 and 8). Finally, our analyses did not uncover the presence of a correlation (Table 1; Additional files 16 and 17) between the simple life habit classification and any nanomorphological character. This result suggests that the distribution of the nanostructure diversity and complexity on snake oberhautchen cells is dominated by phylogenetic, and probably developmental, rather than functional, constrains.
The exact function(s) of oberhautchen cells nanogratings is (are) not fully characterised although it is very likely that they are involved in providing snakes (and some other species of squamates) with skin self-cleaning properties . Indeed, nanostructures are known to substantially increase surface hydrophobicity [9, 10, 11], a property that prevents most of the organic and inorganic material found in soil and vegetation to stick to the skin. Note that all of the nanostructures (e.g., holes, channels of various forms, digits, ridges) found on the skin of snakes are likely to provide some level of hydrophobicity. Hence, this could explain the lack of correlation between the type of oberhautchen cell nanostructures and the habitat of the corresponding species: as long as they generate hydrophobicity, the exact morphology of these nanostructures is irrelevant to fitness. For example, the presence of holes on the surface versus digits at the borders of oberhautchen cells in, respectively, the Jamaican Boa (Chilabothrus subflavus) and the corn snake (Pantherophis guttatus), each generates a substantial hydrophobic skin. Therefore, developmental mechanisms generating different nanostructures at the surface of the skin of different species might be somewhat equivalent in terms of adaptive value. Of course, this interpretation should be taken with caution as habitat and ecology includes far more nuance than we have captured with our simple characterisation of habitat (terrestrial, fossorial, arboreal, aquatic); i.e., different types of nanostructures might reflect adaptations we have not identified.
Note that some species (Xenopeltis unicolor exhibiting digits on cell borders, Boiga multomaculata exhibiting holes on cell surface) are characterised by particularly organised nanostructures, i.e., holes or digits (and digit rows) are highly regular in their spatial distributions (Additional file 14: Figure. S10). Whereas it is unknown if regularly-spaced nanostructures provide higher hydrophobicity than irregular ones, it is clear that the former generate structural iridescence (due to light interference) if the length scale of the distribution period is similar to visible light wavelengths [6, 7, 8]. Hence, it is conceivable that nanostructural traits are associated to intra- or inter-specific communication and/or sexual selection rather than to habitat types. A functional analysis, characterising the optical properties, level of hydrophobicity, and friction-modifying characteristics (with different types of substrates) of each combination of nanostructures found in the major lineages of snakes might provide answers to these questions.
Here, we use scanning electron microscopy to identify and characterise the diversity of oberhautchen cells nanogratings found across 353 species representing 19 of the 26 families of snakes (and 32% of all snake genera), i.e., spanning most of the Serpentes infra-order major lineages. We then use a continuous-time reversible Markov model to perform phylogenetic mapping and investigate how this diversity of morphologies was brought about.
These results suggest that the diversity and complexity of snake skin surface nano-morphology are dominated by phylogenetic, rather than habitat-specific, constraints. Obviously, this does not necessarily mean that there was no environmental effect on the evolution of the diversity observed and much additional, more detailed, ecological analyses are warranted.
Finally, a key question concerns the identification of the molecular developmental mechanisms generating these spectacular nanostructures. What are the cytoskeletal elements involved in the development of ‘holes’ and ‘digitations’? What are the self-organisational processes controlling the spatial organisation of these structures? It is likely that the development of the corn snake as a new model species [30, 31, 32] in squamates and the recent availability of its full genome sequence and reptilian transcriptome data [33, 34] will facilitate the proper investigation of these questions.
Sheds from 308 species of snakes were collected from museums, vivariums, private breeders and reptile shops. These samples were used for Scanning Electron Microscopy (SEM) imaging. The SEM images of 45 additional species from previous publications [15, 17] were also used. The 353 species scored for the present analysis represent around 32% of all snake genera (169 over 522 genera ) and 19 of the 26 snake families. Confocal microscopy was performed on skin biopsies obtained from African House Snake (Boaedon fuliginosus) embryos (day post-ovoposition 46) in the colony housed in Milinkovitch’s laboratory, University of Geneva, Switzerland. Maintenance of, and experiments on snakes were approved by the Geneva Canton ethical regulation authority (authorisations GE/82/14 and GE/73/16) and performed according to Swiss law. These guidelines meet international standards.
African house snake skin samples were laid flat face up on a Millipore nitrocellulose membrane (0.8mu AABP) to prevent curling and placed in an Eppendorf tube. Then, 250 μl of Gibco Dulbecco’s phosphate-buffered saline (+ CaCl2 and MgCl2) + 1 μl of 1000x Syto9 Green Fluorescent Nucleic Acid Stain (ThermoFisher Scientific S34854)  were added and the samples were incubated 1 h at 30 °C and 300 rpm in an Eppendorf Thermomix agitator. The skin was then peeled from the filter and mounted between a microscope slide and coverslip, which was glued on to prevent detachment in the inverted confocal microscope. As the Syto9 Green dye binds to nucleic acids, the nuclei appear very bright whereas the cytoplasm is less intensely labelled, allowing the visualisation of both the nuclei and the borders of the cells with high resolution. A Zeiss LSM 780 inverted microscope was used with a Zeiss Planapo 63X Oil 1.4NA lens and 1.518 refractive index oil. The Argon 488 nm excitation laser was used and the detector frame was set between 490 nm and 600 nm to capture the entire Syto9 emission spectrum.
Scanning Electron microscopy and character description
A dorsal sample from each moult was excised using a scalpel blade and placed on a cylindrical specimen mount covered with a carbon adhesive tab (Electron Microscopy Sciences). The samples were then coated with gold and SEM was performed with a Jeol JSM-6510LV microscope using a beam size of 40 nm and a voltage of 10 kV. Pictures were taken at 3000x and 9000x magnifications at the cranial tip, in the middle, and at the caudal tip of a scale from each sample. When the cells were bigger than the image frame at 3000x, a new image was taken at 1000x at the tip of the scale to allow visualisation of cell shape.
We defined four nanomorphological characters, each with a number of mutually exclusive states (Fig. 3). Each of the 353 studied species was categorised using the nanostructures found at the tip of the scales, which are the most developmentally-derived . The corresponding state matrix is provided in Additional file 1: Table S1 as well as in a MySQL relational database at https://snake-nanogratings.lanevol.org. Note that the latter also provides one representative SEM image of each of the corresponding species.
The characters and their states are (Fig. 3): ‘cell shape’ (‘wide’ = 0 or ‘polygonal’ = 1), ‘cell border’ (‘regular’ = 0;‘short digits’ = 1;‘mild digits’ = 2;‘long digits’ = 3 or ‘sawteeth’ = 4), ‘cell surface’ (‘smooth’ = 0; ‘holes’ = 1; ‘straight channels’ = 2 or ‘labyrinthine channels’ = 3), and ‘ridge’ (‘absence’ = 0 or ‘presence’ = 1). Ambiguity was prevented by grouping similar structures (holes and chains), by ignoring dim structures (pits, Fig. 4c), and by ignoring substantially larger microstructures (Fig. 4d,e). When a character’s state was unclear (e.g., cell shape or cell surface not visible), it was classified as ‘unknown’.
For the ‘cell shape’ to be categorised as ‘polygonal’, the ratio between the width (left-right axis) and length (cranial-caudal axis) of the cell must be ≤2 (Fig. 3a). Conversely, when the cell is more than twice wider than long, it is classified as ‘wide’. When a cell’s length varies a lot, as in ‘sawteeth’ cell border, an average is taken between the longest and shortest lengths (Fig. 3b).
For the ‘cell border’ character (Fig. 3b), the states ‘regular’ and ‘sawteeth’ are easily recognised while ‘digitations’ can show various lengths. The length of digits and the total cell length were measured in ImageJ for 10 cells and averaged. The digit length was normalised by the cell length. The states of digitations were defined as ‘short’, ‘mild’, and ‘long’ when the ratio of digit length versus cell length was < 0.3, between 0.3 and 0.5, and > 0.5, respectively. Note that the absolute size ranges of short, mild and long digitations overlap as they are 0.4–4.7 μm, 0.8-5 μm, and 1.2–5.5 μm, respectively. Digitations and sawteeth are easily differentiated because the former are small (< 5.5 μm), regular, cover the cranial border of the next cell, and are specifically oriented toward the caudal end of the scale. On the other hand, sawteeth are larger (5-10 μm), irregular, and the borders of adjacent cells imbricate with each other. Note that it is sometimes difficult to discern the cell’s borders as the previous clear layer’s imprint can leave marks on the current oberhautchen cell layer. Some species exhibit a dense network of ‘elevations’ (Fig. 1c) that could either be elevated border digitations or originate from the cell surface.
The ‘cell surface’ states were defined using functionalities from the OpenCV computer vision library : all contours (i.e., closed curves connecting sharp changes in pixel intensity) in each SEM image were detected. First, we applied Contrast Limited Adaptive Histogram Equalisation  to account for inhomogeneity of contrasts across the image while preventing the over-amplification of noise (Additional file 15: Figure. S11a-b). Next, a local thresholding method was applied as follows: images were divided into small tiles (100 × 100 px) and a k-means clustering method (k = 3) was applied to group the pixels intensities of each tile independently (Additional file 15: Figure. S11c). The black and white thresholding for each tile was then set to the average of the cluster centres with the lowest and middle intensities to differentiate the darkest pixels from those in the two other clusters (Additional file 15: Figure. S11d). Artefacts at the tile borders were removed by applying Gaussian filtering on the threshold values. Then image intensities were inverted and contours were detected. The contours were then classified into shapes based on the ASR criterion: A is the area of the contour (normalised by image area), S is its ‘solidity’ (i.e., the ratio of the contour surface and the surface of its bounding convex polygon), and R (≥ 1) is the ratio of the minimum bounding rectangle sides. A contour was classified as a ‘hole’ if S > 0.8, R < 6 and A is less than two standard deviations (SD) away from the mean area of detected holes on that image (blue contours in Additional file 15: Figure. S11e). This resulted in the grouping of similar structures that could be qualified as roundish and elongated holes (Fig. 4a-b). For a contour to be sorted as a ‘straight channel’ it required S > 0.5 and R ≥ 6, but no constraint on the area (A) as channels can be very long (red contours in Additional file 15: Figure. S11e). The value of R = 6 separating the two classes was selected arbitrarily as the distribution for the bounding rectangle length/width ratio is continuous and strongly decreasing (Additional file 9: Figure. S5a). The classification was improved using a training set consisting of contours (from manually-selected images) clearly belonging to a certain class. Contours that were not labeled as a ‘hole’ or a ‘straight channel’ based on the ASR criterion were still classified if a similar shape (calculated using Hu moments  implemented inside OpenCV ‘Match Shapes’ procedure) was found in the training set. To identify ‘labyrinthine channels’, we recognised all contours that could not be classified into ‘holes’ or ‘straight channels’ but could be subdivided into multiple straight channels based on their convexity defects (green contours in Additional file 15: Figure. S11e). To avoid miss-classification due to image quality or cell border detection, all images were manually corrected by deleting, redrawing and automatically sorting or subdividing already existing contours. Finally, we made sure to subdivide all ‘labyrinthine channels’ into ‘straight channels’ to facilitate the process of image classification (Additional file 15: Figure. S11f). At the end of the procedure, the percentage of surface covered by each type of shapes on a SEM image was calculated and the final image state was decided using a simple decision tree. If the total area of every shape was less than 3% of the total surface of the image, the state was set to ‘smooth’. Otherwise, we set the state to ‘holes’ if they had the largest covering surface. If ‘channels’ had the largest covering area, we measured their orientation and classified the image as ‘labyrinthine channels’ if the angle SD was > 25°, or as ‘straight channels’ otherwise. Given that the standard deviation of channel orientation (Additional file 9: Figure. S5b) presents a gap between 22° and 26°, we can objectively set a threshold at 25° for separating ‘straight channels’ (R ≥ 6) from ‘labyrinthine channels’ (channel orientation angles with SD > 25°). Despite these criteria, some species exhibited structures that could not be unambiguously classified as ‘holes’ or ‘labyrinthine channels’ (Additional file 11: Figure. S7).
The ‘ridge’ character refers to inverted-gutter deformations of the cell surface that run along the cranial-caudal axis (Fig. 3d).
First, we pruned the time-calibrated tree among 9754 species of squamates  to keep the 353 species for which we have obtained nanomorphological data. Note however that some of these species are represented by skin samples from multiple subspecies. Given that no subspecies information is provided in , we selected only one subspecies using the two following successive criteria: (1) if some, but not all, subspecies within a species could not be scored for one or several characters, remove the corresponding subspecies, and (2) if the different subspecies in a species exhibit different character state(s), keep the subspecies with the state(s) most frequently exhibited by the most closely-related other species. When more than one subspecies per species remained, we arbitrarily kept only one of them. This procedure removed 13 subspecies from the dataset prior to character mapping and subsequent analyses. The corresponding Newick tree file among 340 species is provided as a Additional file 3(Suppl_File_3_Tonini_SnakesCommon.nwk). As the pruning process did not remove all of the polytomies present in the original time-calibrated tree , and some of our analyses (e.g., models of character evolution) require a dichotomous tree, we resolved these ambiguous nodes with the function multi2di of the ape package  of the program R  while imposing the length of the newly-generated branches to an arbitrarily small value (10− 6 of the total height of the tree); this avoids problems generated by zero branch lengths in comparative analyses functions. Note that the reverse transformation can easily be performed with the di2multi function of the ape package . The 353 snake (sub) species in our dataset are listed in Additional file 1: Table S1 and in the MySQL relational database at https://snake-nanogratings.lanevol.org; in both cases, the 13 subspecies not used for phylogenetic mapping are indicated with an asterisk.
Models of character evolution
Without a priori information about the possible character evolutionary model, we fitted and compared different rate transition matrices of the Mk model with or without Pagel’s tree transformation coefficients λ, δ, and κ [21, 22]. The Mk model for discrete characters is based on the assumptions that (i) the trait value can evolve through a Markov process, i.e., the probability of character substitution depends only on the current state and not on previous states, and (ii) every state of a particular trait is equally likely to change to any other states. This model is the analogue to a Brownian motion (BM) model for continuous traits as it supposes that evolutionary changes are independent across lineages and that the rate of change is constant over time and along all branches [21, 22, 24]. We focused on three forms of the rate matrix: equal rates (ER) for all substitutions, different but time-reversible (i.e., symmetric) rates (SYM), or with all rates being different (ARD).
When using continuous characters, coefficients can be used to transform the variance-covariance matrix and fit the model to the corresponding tree. On the other hand, when using discrete characters, one cannot define a covariance matrix, such that coefficients can only be applied by transforming the tree while keeping a constant-rate Mk model [21, 22, 23, 24]. First, the λ coefficient compresses internal branches (without affecting tip branches) to measure the dependence of trait evolution among species. A value of λ = 1 leaves the tree unchanged, i.e., the model is assumed to maximally fit the original tree. On the other hand, a λ = 0 (making the tree a star-like phylogeny) indicates independence of evolution among species. Hence, comparing the fit of the model with various values of λ provides a good approach to test for the presence of ‘phylogenetic signal’ (see next subsection). Second, the coefficient δ constitutes a quantitative measure of how rates of substitution change across the tree (i.e., through time). If δ < 1, the evolutionary rate slows down through time, while δ > 1 indicates an acceleration of the rate. In terms of phylogenetic transformation, it affects the tree by modifying nodes’ heights. Third, the coefficient κ raises the branches length by a power κ. When κ = 1, the tree is left unchanged while all branch lengths are one when κ = 0. The parameter κ can be interpreted as character changes being more or less concentrated at speciation events, i.e., it tests if evolution of a trait was punctuated or gradual [21, 22, 26].
For each character, we fitted all combinations of models (ER, SYM, ARD) and coefficients (none, λ, δ, and κ) with the function fitDiscrete of the geiger package  using an internal optimisation procedure supported by the functions optim and subplex (i.e., general-purpose optimisation functions provided in R ). We used 104 starting points during the optimisation procedure and performed comparisons among all fitted models using the sample-size corrected Akaike information criteria scores (AICc and AICw)  provided by the fitDiscrete  and aic.w  functions, respectively. A summary of the scores obtained with all models is provided in Additional file 4: Table S2.
Heritable traits (whether continuous or discrete) observed in species related by a phylogenetic history do not represent data points that are statistically independent and identically-distributed [22, 23, 43]. Hence, specific statistical methods must be applied for comparative analysis of these traits (here, nanostructures). Many indices were developed to quantify the so-called ‘phylogenetic signal’ (i.e., how much closely-related species tend to resemble each other more than randomly-drawn species) with respect to trait evolution [43, 44]. Here, we use Pagel’s λ parameter [22, 23] because of its good statistical properties (low sensitivity to tree size and to errors in topology and branch lengths, high power of associated statistical tests) over broad types of tree transformations . As discussed above, with discrete traits, the λ coefficient is used in the Mk model as a scaling factor of branch lengths . The phytools package  provides the function phylosig that tests for the presence of phylogenetic signal through randomisation of the species in the tree [22, 23, 43, 44] but cannot be used with discrete data. Hence, we follow here the proposed alternative method of Revell : we optimise the Mk model likelihood under a given λ-transformation of the tree, making it an analogue of the BM model but for discrete traits, and we compare the associated estimator λML to the null hypothesis case (λ = 0, i.e., trait evolution and phylogeny are independent). The combination of a value of λ close to 1, and significantly better likelihood than with λ = 0, indicates the presence of significant phylogenetic signal.
Using the time-calibrated tree of squamates  and the best fitting model of character evolution (Additional file 4: Table S2), we performed ancestral state reconstruction and substitutional history estimation of the four nano-morphological characters and the simple life habit character discussed above. The phylogenetic mapping was determined with a continuous-time reversible Markov model implemented in the function make.simmap of the phytools  package in R . If a state could not be scored for a particular character, the corresponding species was removed from the analysis. We followed the procedure described in [42, 46, 47], and implemented in the function make.simmap: estimating the posterior probabilities of ancestral states by sampling the associated posterior distribution approximated by a continuous-time Markov chain. The parameters used in the Markov algorithm were: 104 simulations (nsim), a burnin of 103 steps and a sample frequency of 100. We used the default option of prior distribution estimation at the root by sampling the conditional scaled likelihood distribution. Then, conditioned on this prior distribution at the root and the observed states at tip branches, the mutational history of the character was simulated by sampling with the appropriate posterior distribution.
Correlations among characters
To detect potential linear relationships (i.e., correlation) between traits [19, 27, 48, 49], we performed a comparative analysis among the four nano-morphological traits and the simple classification of life habit (aquatic, terrestrial, fossorial or arboreal) using both the phylogenetic generalised least-squares regression (PGLS; ) and the phylogenetic generalised linear mixed model (PGLMM; ), two methods based on different statistical perspectives (frequentist and Bayesian inferences, respectively). Note that we do not aim to infer the best values of the model parameters but wish to identify the presence or absence of correlation. The analysis was performed on the entire 10 (5 × 4/2) pairs of characters. For each pair of characters, we pruned the global tree such that it only includes the species for which both traits have been scored.
The PGLS regression is based on a common generalised least squares (GLS) model assuming that the response variable Y (or any of its transformations) can be expressed by a linear relationship of explanatory variable X, which are independent of each other and of the error [27, 48], i.e., g(Y) = Xβ + ∈. However, the GLS takes into account dependence of the observations (here, the species) by down-weighting the estimator with the variance-covariance matrix. Consequently, the assumption of uncorrelated errors and homoscedasticity, i.e., constant variance, is relaxed. The error allows grouping within-species variation (also called measurement error), error due to unknown or incomplete phylogenetic relationships, and error due to the stochastic nature of the evolutionary process along the phylogeny . As we could not estimate the error due to unknown or incomplete phylogenetic relationships and to within-species variation, we assume that the error originates only from variation of evolutionary changes along a phylogeny. This error follows a multivariate normal distribution with an expectation of 0 and variance-covariance matrix σ2V, where V (NxN matrix, N number of species) is known and relates to the tree structure and branch lengths. For a Brownian model, the variance is defined by Vij = σ2tij, where tij is the distance on the phylogeny between the root and the most recent common ancestor of taxa i and j, and σ2 (that can be estimated) represents the evolutionary rate . One advantage of the PGLS method (function pgls in the caper package ) is that the matrix V can be adapted to multiple models of evolution by manipulating the branch lengths in the phylogenetic tree with one of Pagel’s coefficients. Note that we had to convert the discrete trait variable response into a pseudo-continuous variable in order to apply PGLS. It has been shown  that this approach exhibits good statistical performances and valid estimations. We used the pgls function with the R structure formula Var1 ~ Var2, where Var1, Var2 represents the tested pair of characters. Note that we fixed the parameters λ, δ, and κ to 1 as suggested by our statistical comparisons among all combinations of models (ER, SYM, ARD) and coefficients (none, λ, δ, and κ).
The PGLMM method, supported by the function MCMCglmm in the package of the same name , implements Bayesian inference with Markov Chain Monte Carlo (MCMC) approximation to fit the model. The generalised linear mixed model is an extension of the generalised linear model (with fixed X explanatory effects) but with the addition of random effects (denoted by the matrix Z) on the model. We can then write the model as l = [X Z]ϑ + ϵ, where l is the so-called latent variable (some transformation of the response variable Y), X and Z are the design matrices related to respectively fixed and random effects, ϑ ([β a]T) is the vector of parameters, and ϵ is the residuals’ vector. Location effects β (fixed), a (random) and residuals ϵ are assumed to be drawn from a multivariate normal distribution. While fixed effects are assumed a priori to be independently-distributed with mean β0 and variance σB2I, the random effects and residuals have zero mean as well variance σa2A and σe2I, respectively. In phylogenetic comparative analyses, A is equivalent to the V matrix defined in PGLS. Note that σ2 can also represent a matrix in the case of multiple responses models (i.e., with multi-state characters), such that the global variance matrix is the Kronecker product of σ2 with I or A.
The main advantage of PGLMM is to model multinomial logit (=log-odds) responses (a special case of multiple response models) with more than two states, hence, it is particularly adapted to our data . Multinomial logit models usually reduces the J states problem to a J-1 problem by taking one of the levels as a reference. While in PGLS the variable response corresponds exactly at the data value, multinomial PGLMM looks on log-odds ratio of the probability that a species i has a trait’s state j, i.e., lij = log (αij/αiref) where αij represents these probabilities and ref is the reference state. The MCMCglmm function input for the fixed effect is defined as Var1 ~ Var2 for multinomial response with 2 states (Cell shape, Ridge) and Var1 ~ trait Var2 + trait – 1 for responses with more than 2 states (Cell borders, Cell surface). The latter formulation enables to obtain specific intercepts and regression coefficients for each trait of Var1 (‘trait’ indexes the latent variables, i.e., the states of the characters) with respect to Var2. The (− 1) is imposed in order to remove global intercept and to get easier interpretable models . For the random effect, we define the input random ~ animal for 2 states latent variables and random ~ us(trait): animal for > 2 states latent variables. ‘Animal’ corresponds to the species identifier, while us() is an internal function that fits the variance of the random effects by assuming the complete σ2 (J-1) x(J-1) variance matrix of the trait, i.e., covariance between levels of the trait are permitted. Note that we use the rcov formula for the residual covariance structure. It is defined as rcov ~ us(trait): units for > 2 states response and random ~ units for 2 states, in order to estimate the fully parametrized covariance matrix. ‘Units’ indexes the rows of the multiple response data (here as there is only one state for each species, such that the ‘units’ index is equivalent to the ‘animal’ index).
In the context of a Bayesian framework, prior distributions should be defined. For the fixed effect, the MCMCglmm function assumes a multivariate normal prior  where the mean μ is set to 0 and the variance parameter V is set to (1.7 + π2/3) I, where I is the (J-1) xK identity matrix (J and K are the numbers of levels for Var1 and Var2, respectively). For the random effects and the residuals, the MCMCglmm function assumes an inverse-Whishart prior with the parameter V and ν. Here, for the residuals, we impose V = (I + J) / J, where J is the unit matrix and ν = J-1. Additionally, we arbitrarily fix the variance throughout the MCMC run . On the other hand, we can estimate the variance of the random effect during the run.
In order to increase the mixing of the MCMC, we use the parameter expansion methods , i.e., we add (in the prior definition of the random effects) the parameters αμ = 0 and α.V = 103I. Other parameters of the MCMC were: 107 MCMC iterations with a burnin of 106 and thinning interval of 104 iterations, and truncate latent variables when necessary (to prevent under/overflow). We used the autocorr.plot and gewecke.plot functions from the coda package  to evaluate Markov chain convergence .
E. Derivery and L. Reymond advised on the use of Syto9 and J. M. de Abreu Nunes on statistical approaches. We thank C. Bauer and J. Bosset (Life Science bioimaging facility, Geneva), N. Muthusubramanian and A. Bruno (Kavli Institute of Nanoscience, Delft University of Technology, The Netherlands), V. Mishra (Center for Nanoscience and Engineering, Bengaluru, India), as well as J. Martina, Varadharajan and Deepalakshmi Putchen (Indian Institute of Science, Bengaluru, India) for access to microscopes. Sheds were provided by S. Bruaux and O. Pauwels (Royal Belgian Institute of Natural Sciences, Brussels, Belgium), M. Jocque (Biodiversity Inventory for Conservation, Belgium), K. Shanker, S.P. Vijaykumar, and M. Thacker, A. Kumar Mallick (Center for Ecological Studies, Indian Institute of Science, Bengaluru, India) as well as R. Khot and V. Hegde (Bombay Natural History Society, Mumbai, India). Additional help was provided by R. Gonzalez (Vivarium de Meyrin, Geneva, Switzerland), M. Ansermet and C. Derwey (Vivarium Lausanne, Switzerland), S. Furrer (Zurich Zoo), J. Krebs (Henry Doorly Zoo), D. Pottier, M. DE BOER, M. Lautenbach, R. De Plecker, L. Gesche, J. Murray, T. Ningler, P. Pavesi, D. Puns, and F. Valable. H. Prasad, J. Gerard, and M. Govender provided technical help. F. Bossuyt, and K. Roelants provided advices and commented on the manuscript.
Consent to participate
Experiments, equipments and animal housing was supported by grants to MCM from the University of Geneva (Switzerland), the Swiss National Science Foundation (FNSNF grants CR32I3_162743 and 31003A_179431), the SystemsX.ch initiative (project EpiPhysX), and the International Human Frontier Science Program Organisation (HFSP RGP0019/2017). The funding bodies played no role in the design of the study, collection, analysis, and interpretation of data and in writing the manuscript.
Availability of data and materials
All data needed to evaluate the conclusions in the paper are present in the paper and supplementary material. Additional data related to this paper may be requested from the corresponding author.
MCM initiated the project and performed the initial SEM analyses with AD. MIA, AD, and ANS collected all subsequent SEM images. MIA performed confocal microscopy. AF and MIA performed image analyses. MIA defined characters and their states. LMDOV performed phylogenetic mapping and statistical analyses. MIA, LMDOV and MCM wrote the manuscript and produced the Figs. MCM supervised the whole study. All authors agreed on the interpretation of data and approved the final version of the manuscript.
Ethics approval and consent to participate
Maintenance of and experiments on reptilians were approved by the Geneva Canton ethical regulation authority (authorizations GE/82/14 and and GE/73/16) and performed according to Swiss law (www.blv.admin.ch/themen/tierschutz/00777/index.html?lang=en); these guidelines meet international standards.
Consent for publication
All authors approved the final version of the manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
- 2.Price RM. Systematic implications of dorsal snake scale microdermatoglyphics. Am Zool. 1979;19:181.Google Scholar
- 5.Price R. A unified microdermatoglyphic analysis of the genus Agkistrodon. Snake. 1989;21:90–100.Google Scholar
- 13.Price RM. Analysis of the ecological and taxonomic correspondence of dorsal snake scale microdermatoglyphics. PhD dissertation. New York University. 1981;164. https://elibrary.ru/item.asp?id=7324841.
- 15.Schmidt CV, Gorb SN. Snake scale microstructure: phylogenetic significance and functional adaptations. Zoologica. 2012;157:1–106. ISBN 978-3-510-55044.Google Scholar
- 18.The Reptile Database. July 2018. URL: http://reptile-database.reptarium.cz/. Accessed 20 July 2018.
- 19.Hernández CE, Rodríguez-Serrano E, Avaria-Llautureo J, Inostroza-Michael O, Morales-Pallero B, Boric-Bargetto D, Canales-Aguirre CB, Marquet PA, Meade A. Using phylogenetic information and the comparative method to evaluate hypotheses in macroecology. Methods Ecol Evol. 2013;4:401–15.CrossRefGoogle Scholar
- 21.Harmon LJ. Phylogenetic comparative methods – learning from trees; 2018.Google Scholar
- 24.Graber S. Phylogenetic comparative methods for discrete responses in evolutionary biology. Zurich: Master’s thesis: University of Zurich Faculty of Science; 2013. https://www.zora.uzh.ch/id/eprint/152452/.
- 27.Grafen A. The phylogenetic regression. Trans. R. Soc. Lond. B, biol. Sci. 1989;236:119–57.Google Scholar
- 28.Hadfield JD, Nakagawa S. General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters. J Evol Biol. 2010;23:494–508.Google Scholar
- 31.Milinkovitch MC, Tzika AC. A pragmatic approach for selecting Evo-devo model species in amniotes. J Exp Zool. 2008. https://doi.org/10.1017/CBO9780511541582.011.
- 32.Tzika AC, Milinkovitch MC. Evolving Pathways; Key Themes in Evolutionary Developmental Biology; 2008. p. 119–40.Google Scholar
- 35.Bradski G. OpenCv library. DR Dobb’s Journal of Software Tools. 2000;25:120–25.Google Scholar
- 36.Zuiderveld K. Contrast limited adaptive histogram equalization, Graphics gems IV, Academic Press Professional; 1994. p. 474–85.Google Scholar
- 37.Hu M. Visual pattern recognition by moment invariants. IRE transactions on information theory, IT-08; 1962. p. 179–87.Google Scholar
- 45.Revell L. Phylogenetic tools for comparative biology. July 2018. URL: http://blog.phytools.org/2018/. Accessed 20 July 2018.
- 50.Orme CDL, Freckleton RP, Thomas G, Petzoldt T, Fritz SA, Isaac N, Pearse W. caper: Comparative Analysis of Phylogenetics and Evolution in R. 2011. R package version 1.01. Available: http://CRAN.R-project.org/package=caper. Accessed 20 July 2018.
- 52.Hadfied JD. MCMCglmm course notes; 2012.Google Scholar
- 53.Plummer M, Best N, Cowles K, Vines K. CODA: Convergence diagnosis and output analysis for MCMC. R News. 2006;6:7–11.Google Scholar
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.