Large scale phenotypic characterisation of Hierophis viridiflavus (Squamata: Serpentes): climatic and environmental drivers suggest the role of evolutionary processes in a polymorphic species

Colour variability is largely widespread in the animal world as it is tightly associated with fitness and survivorship. Therefore, the drivers and implications of such variability have been of great interest for zoologists in the past decades. Reptiles are excellent models to investigate colour variations and expression under different conditions. Here, we focused on melanism occurrence in the two main lineages of Hierophis viridiflavus at the scale of the species distribution, by extracting available data from iNaturalist, a citizen science network, with the aim of detecting any pure effect of climate or local habitat on colour expression. Our analyses highlighted that habitat does not explain differences in phenotypes, whereas marked effects of geographic and climatic variables were detected. However, the observed climatic effects could be a proxy of the geographical distribution of the two groups, and thus the high occurrence of bright colourations in western populations of the eastern lineage could be addressed to an ongoing event of asymmetric gene flow in contact zones. The current distribution of phenotypes could be the outcome of the evolutionary history of the species combined with the geological history of the Mediterranean region. This investigation, though, is only preliminary and molecular analyses on highly variable regions of the genome are mandatory to address this issue.


Introduction
Body colourations can be extremely important for animal adaptation and fitness, as they are deeply involved in thermal behaviour, feeding ecology, crypsis and communication (Shine and Kearney 2001;Bittner et al. 2002;Tanaka 2007;Shawkey and D'Alba 2017;Storniolo et al. 2021). Indeed, body colourations are frequently the outcome of multiple interactions between a wide range of environmental factors (Endler 1981) and, therefore, it can be rather complicated to determine the drivers of their evolution.
Body colourations and ornamentations have been also broadly investigated with the aim of assessing their role in intra-and interspecific communication (Cuthill et al. 2017). In detail, conspicuousness in colourations is often the trade-off between the communicative efficiency of manifest colourations and the energetic cost of pigment-related metabolism, as well as the effect of the individual's detectability by potential predators on fitness (Hill and Montgomerie 1994). Consequently, the evolution of specific body colourations, for example cryptic and disruptive ones, can be tightly correlated with the need to compensate higher predation rates (Forsman and Shine 1995;Stuart-Fox et al. 2003). Moreover, dorsal colourations can be highly variable at the geographic and population scale (Ortega et al. 2015;While et al. 2015) and, also, in relation to sexual dimorphism (Sacchi et al. 2013).
Within the huge range of variability in dorsal colourations, the occurrence of darkcoloured polymorphisms in body colourations, namely melanism, is largely widespread in the animal kingdom (Alho et al. 2010;Majerus and Mundy 2003;Vidal et al. 2010). Melanism often occurs as part of geographic polymorphisms and may represent an adaptive response to certain environmental conditions (Gibson andFalls, 1988, Harris et al. 2012). As a matter of fact, it has been postulated to be of great relevance from the evolutionary point of view (Escudero et al. 2016), as it can significantly affect key aspects for a species' survivorship, namely thermal efficiency, reproductive output or predator avoidance (Castella et al. 2013;Cordes and Mouton, 1994).
Body colourations have been widely investigated from the genetic point of view: past research on American colubrids suggests that polymorphisms in colour and ornamentations can be inherited in a simple Mendelian way (Zweifel 1981) or, alternatively, that melanism can be inherited as a recessive character (King 2003). Nevertheless, colour polymorphisms including melanism can be the result of complex genetic processes such as gene flow via introgression among closely related lineages (Sasamori et al. 2017;McRobie et al. 2019). Therefore, melanism has been of great interest for zoologists to investigate the underlying mechanisms that drive species adaptation to specific environments (Muggleton et al. 1975;Portik et al. 2010).
In this field of research, reptiles have been intensely studied in the past decades, especially regarding the genetic backgrounds of colour expression and the ecological implications in polymorphic species (Rosenblum et al. 2004). In fact, colour polymorphisms are correlated to how the individual interacts with its environment from many points of view. For example, polymorphic lacertids can show inter-morphic responses to parasitic infections according to seasonal variations in immune system effectiveness (Huyghe et al. 2010). Moreover, melanism can play a significant role in thermoregulation, especially in medium or large sized reptiles (Clusella-Trullas et al. 2008), as climate can influence the spatial distribution of polymorphisms and habitat selection (Broennimann et al. 2014), according to different thermal efficiency rates of morphs (Strugariu and Zamfirescu 2011;Farquhar 2021).

3
The Western whip snake, Hierophis viridiflavus, is one of the most common snakes in Central and Southern Europe occurring over a wide region from the Pyrenees across France, part of Switzerland, all Italy and of Northern Balkan Peninsula (Sillero et al. 2014). Despite its broad distribution, it is not markedly variable in phenotypes as only two main morphs are currently recognized, corresponding to the two identified subspecies. H. viridiflavus viridiflavus occurs in Spain, France and Western Italy and H. v. carbonarius in Southern, Eastern and part of Northern Italy (Rato et al. 2009;Mezzasalma et al. 2015). However, the distinction between the two lineages has been a subject of debate in the last decade, and Mezzasalma et al. (2015) proposed to separate them in two distinct species according to mitochondrial haplotypes divergence in 16 S, ND4 and Cyt-b genes as well as in karyotype and skull shape. Currently this distinction is not fully accepted (Speybroeck et al. 2020), and a more conservative approach is suggested, considering them two distinct lineages within the same species. From the morphological point of view, the two groups are basically identical with the exceptions of the skull morphology and the body colour (Mezzasalma et al. 2015; Fig. 1): the western lineage is characterised by a disruptive pattern of evident yellow/green spots, that merge to form stripes on the tail, on a black background.  Mezzasalma et al. (2015). At the side of each lineage is shown the picture of the associated phenotype 1 3 The eastern lineage is normally completely black with the exception of the western portion of its range (Apennines and Po plain) where its phenotype becomes much brighter like the counterpart's phenotype (Vanni and Zuffi 2011). This evidence has led to hypothesize that the two groups are involved in an asymmetric gene flow event from the western lineage to the eastern one, causing a mismatch in phenotype expression in the latter. Recent research has tried to determine the genetic cause of colour expression in Southern Italy (Senczuk et al. 2020) but the matter is still unresolved. Consequently, since polymorphism in reptiles have been addressed under the perspective of thermal advantages (Clusella Trullas et al. 2008) as a response to different local climatic conditions also considering the drawbacks of more detectable phenotypes (Andren and Nilson 1981), such ecological constraints of the occurrence of polymorphisms in widespread species are an open field for investigation. Therefore, in this paper, we use citizen science data that has been validated by experts to reconstruct the geographical distribution of phenotypes in the two groups across the range of the species, with the aim to assess whether environmental factors, notably climate and environment, are able to explain their observed geographic distribution.

Data collection
Data were retrieved from the online database iNaturalist (https:// www. inatu ralist. org), a joint initiative of citizen science by the California Academy of Sciences and the National Geographic Society with the purpose to provide, anybody and anywhere, open access to data concerning any type of life on Earth in terms of species occurrence across space and time. From this source we collected 4124 georeferenced observations of the target species Hierophis viridiflavus. However, because 442 records lacked any photograph to identify the species and 763 observations consisted of shed skins or poor resolution pictures (and thus could not be univocally assigned to a phenotype), we were able to process a total amount of 2919 images. Eventually, 1100 individuals were juveniles based on general morphology and body colouration of the specimen and were excluded from the analyses. Thus, our final dataset consisted of 1819 adult whip snakes.
For each individual, latitude and longitude were obtained from iNaturalist itself, while altitude was extracted from SRTM digital elevation maps (https:// www. usgs. gov) with 1 arc-second resolution. We extracted temperature data from the observation sites from WorldClim bioclimatic maps (1 km resolution, https:// world clim. org); in particular, the effects of annual mean temperature (BIO1) and temperature seasonality (BIO4) were considered. Concerning habitat characterisation, we retrieved substrate information of each observation from the CORINE Land Cover database (2018), which is a Pan-European project, part of the Copernicus Land Monitoring Programme (CLMP; https:// www. coper nicus. eu), aimed at assessing land cover, vegetation, water cycle at a large geographic scale. In detail, for each observation, we extracted the associated habitat categories within a 500 m radius buffer and, secondly, among them, we selected the ones normally related to the ecology of the species and pooled them to define four variables to be used in the analyses: (i) meadows (cat. 10, 18, 26, 27); (ii) rocky outcrops (cat. 30, 31); (iii) woodlands (cat. 23, 24, 25, 29); (iv) cultivated land (cat. 12, 15, 17, 19) (see Supplementary Materials for the complete list of environmental categories and those used in the analyses). Habitat categories were pooled together to characterise those habitats that might be involved in selecting phenotypes for their effectiveness in disruption and camouflage with the background to minimize detection by avian predators.
Eventually, we associated each observation with the corresponding mitochondrial haplotype (16 S) according to the distribution maps by Mezzasalma et al. (2015). Each observation was assigned with one out of two alternative phenotypes, i.e. "viridiflavus" form (hereafter "striped" to distinguish it from the H. v. viridiflavus genotype) and "carbonarius" one (hereafter "melanic" to distinguish it from the H. v. carbonarius genotype). Images were visually classified according to the description by Zuffi (2008) by a single operator (F.S.) without taking into consideration the geographic position of each record. On the basis of a subsample of 1500 images, repeatability (3 replicates per individual, performed at 2 days intervals) was r = 0.981.

Statistical analyses
Overall, we carried out three Generalised Linear Models, each incorporating different sets of predictors, in order to investigate how the occurrence of melanism in the model species varies at geographic scale (model 1) and in response to climatic (model 2) or landscape variables (model 3). Consequently, the effects of temperature and landscape variables on the occurrence of melanic phenotypes were assessed through separate GLMs for the two subspecies.
The First model analysed geographical effects on phenotype expression by modelling the probability of observing the melanic phenotype through a GLM with binomial error distribution, where latitude, longitude and altitude entered the model as predictors. In detail, latitude and longitude were modelled using a 3rd grade polynomial function to account for non-linear correlation between geographical variables and phenotype. Additionally, all the two-way interactions between polynomial terms and genotype (mitochondrial lineage) and between altitude and genotype were added to account for differential geographic patterns between the two whip snake subspecies. Finally, we standardised to zero mean and standard deviation equal to one every geographical variable to eliminate any effect of variables measured at different scales.
The second model aimed to analyse the effects of the climatic variables (BIO1 and BIO4) on the melanism occurrence, and the GLM incorporated annual mean temperature and temperature seasonality as predictors; all variables were standardised to have same range and/or variance.
The third model specifically tested the effects of the landscape variables on the probability to observe melanic phenotypes; the GLM included the four standardized landscape variables (i.e., meadows, rocks, woods, and cultivated land) as predictors.
Given that the occurrence of melanism in the populations associated to viridiflavus subspecies is restricted mainly to Sardinia (see Results), the climatic and environmental models were carried out only on these records in contrast to all carbonarius records, while mainland populations of viridiflavus were excluded from the analyses.
Another analysis was performed to assess the pattern of melanism within population across the line of contact between the two species using a geographic cline approach (MacGregor et al. 2017). We selected eight linear transects (approx. 200 km in length) across the contact zone between the two lineages all along the Italian Peninsula (Fig. 1). We divided each transect in 20 equidistant points and for each point we selected the nearest 25 individuals according to geographic coordinates, and computed the frequency of melanic individuals. Consequently, the process was comparable to a moving average, in which every individual was used on average in 4.8 (range 2.4-6.3) successive points per transect. We fitted clines using the Metropolis-Hastings Markov chain Monte Carlo algorithm and used the inflexion point to map the line of transition from the striped to the melanic phenotype, and we mapped the transition zone using the 20% and 80% of melanism occurrence lines as predicted by the cline in each transect.

Results
Among the selected sample of 1819 individuals, 803 (44.1%) showed the melanic phenotype, while the remaining 1016 (55.9%) showed the striped one. When referring to the genotypes, we found a clear uneven distribution of the observations as 1002 of them (55%) belonged to the carbonarius group while 817 (45%) belonged to the viridiflavus group (Fig. 1). More in detail, we found that in the carbonarius group, 746 individuals out of 1002 (74.5%) showed the melanic pattern, and the remaining 256 individuals (25.5%) showed the striped phenotype. On the other hand, in the viridiflavus group, 760 snakes out of 817 (93.0%) had the striped pattern and the melanic phenotype occurred in only 57 (7.0%) cases.

Models
The first GLM showed significant geographical effects on the probability to detect the melanic phenotype in both groups, and indicated that the genotype is a strong predictor of the phenotype occurrence (Table 1). In detail, the polynomial functions of latitude and longitude were both highly significant, as well as the interactions between the latitude and longitude polynomial functions and genotype (Table 1). All above results suggested that the probability to observe the melanic phenotype had a strong geographic pattern, and this pattern sensibly differed in the two whip snake subspecies. In the viridiflavus lineage, the probability to observe a melanic snakes slightly lowered going Northwards an Eastwards (Fig. 2), reflecting the fact that the majority of melanic individuals (45.6%) occurred in Sardinia (Fig. 1). On the other hand, in the carbonarius lineage, melanic individuals occurred more frequently in the Northernmost and Southernmost portions of the species range, and sensibly increased going Eastwards (Fig. 2). The GLM also found a significant effect of elevation, again with a significant interaction with the lineage (Table 1). However, the effect was much less intense than that of latitude and longitude, and involved exclusively the viridiflavus lineage (Fig. 2). In this group, the model predicted the melanic snakes to be more frequent with increasing altitude.
Taking into consideration the second GLM for the Sardinian samples in the H. v. viridiflavus groups, the analysis on both mean temperature and temperature seasonality did not support any statistically significant effect on the probability to detect melanic phenotypes (Table 2). On the other hand, when considering the model performed on the carbonarius group, both climatic variables had a highly significantly effect on the melanic phenotype occurrence probability (Table 2). However, the effects were contrasting, as the melanic phenotype was more frequent with warmer (β ± SE: 0.392 ± 0.093, P < 0.001) and less seasonal (β ± SE: -0.335 ± 0.077, P < 0.001) climates (Fig. 3).
Lastly, the GLMs accounting for the effects of habitat variables did not find any statistically significant effect on the probability to observe melanic phenotypes in both lineages (Table 3), indicating that the occurrence melanic snakes seemed not be affected by habitat features (Fig. 4).

Phenotypic clines
For each of the eight hypothetical transects we constructed, we were able to estimate the clines of melanism occurrence in such populations. In detail, only two transects (RM-BA and SP-PD) covered the whole range of frequency of melanism with lowest scores at the westernmost limit of the transect and highest scores on the Adriatic coast (Fig. 5). On the other hand, in the populations across the transects from the Western Po plain (TO-SO and GE-SO) the probability of finding melanic snakes never exceeded 85-90% (Fig. 5). Similarly, two transects from Southern Italy (RM-PE and RM-SA) provided comparable probabilities (Fig. 5). It is, instead, interesting to mention that across the two transects from Central Italy (LI-RI and OR-AN), the probability to detect melanic individuals never exceeds 65-70%, neither on the Adriatic coast indicating that the populations from Central Italy from the carbonarius lineage are never entirely melanic. Indeed, these Adriatic populations fall within the line corresponding to 80% of probability of melanism (Fig. 5), which does not happen for marginal population of any other transect we identified. Fig. 3 Effects of standardised climatic variables BIO1 (Annual Mean Temperature) and BIO4 (Temperature Seasonality) (x-axis) on the probability to detect melanic phenotypes in Sardinian specimens (green area) and in carbonarius (grey area) (y-axis). Shaded areas correspond to 95% confidence intervals. Climatic variables were obtained from WorldClim database

Discussion
When taking into consideration the rough data description, it is evident that the relative frequencies of the striped and melanic phenotypes are not evenly distributed in the two lineages. Indeed, individuals that belonged to the eastern lineage showed a higher proportion of the melanic phenotype (25.5%), whereas in the western one only 7.0% of the observations showed the melanic colouration, suggesting that individuals belonging to Fig. 4 Effects of standardised habitat variables (x-axis) on the probability to detect melanic phenotypes in Sardinian specimens (green) and in carbonarius (grey) (y-axis). Shaded areas correspond to 95% confidence intervals. Habitat variables were obtained from CORINE Land Cover inventory 1 3 the H. v. carbonarius subspecies are phenotypically more variable than those belonging to the other (Vanni and Zuffi 2011;Senczuk et al. 2020). We showed that the gradient of phenotypic variability between the two morphs across their geographic distribution is not linear and follows very complex patterns for both longitude and latitude. Indeed, in H. v. viridiflavus the occurrence of the melanic phenotype is mainly restricted to Sardinia with only rare observations in the rest of its distribution. This finding is consistent with the current knowledge about the distribution of melanism in the species (Zuffi 2007(Zuffi , 2008 as most melanic colourations in the H. v. viridiflavus subspecies have been reported for the Tyrrhenian islands both in large ones (Sardinia and Corsica) as well as in smaller islets like in Tuscanian Archipelago. As a matter of fact, the evolution of body colourations is postulated to be determined by a broad variety of biotic and abiotic factors, among which predator avoidance plays a major role on their selection over time (Andrén and Nilson 1981;Stuart-Fox et al. 2008;Le et al. 2017). In the specific case of the Western whip snake, it is reasonable to expect that a full black colouration could be more easily detectable by avian predators against the terrain background, consequently suffering negative selection and thus becoming less and less frequent in comparison with more disruptive colourations over time (Andrén and Nilson 1981;Castella et al. 2013). However, given the high prevalence of melanic colourations in Sardinia, more than ¼ of individuals, such hypothesis seems unlikely. Alternatively, the occurrence of melanic colourations in this area could be potentially explained under the perspective of the thermal efficiency hypothesis for ectotherms (Clusella-Trullas, 2008), which predicts black individuals to be favoured in terms of thermoregulation compared with brightly coloured ones. Nevertheless, given that neither annual mean temperature nor temperature seasonality were correlated with the occurrence of melanic phenotypes in Sardinia, we hypothesize that such phenotypes are not favoured by specific climatic conditions and that their distribution in the area is the outcome of a past founder effect that has randomly maintained the occurrence of these two different patterns in a restricted locality across populations (Kolbe et al. 2012). From this point of view, the lack of any correlation between habitat variables and phenotype (apart from the marginal effect of wooded areas that should be investigated more in depth through field surveys) suggests that the two phenotypes undergo comparable environmental selective pressures and that, therefore, are not selected differently (Esquerré and Keogh 2016), keeping phenotypes' relative frequencies stable in the population. On the other hand, the scarcity of melanic phenotypes across the mainland distribution in this group could be ascribed to random mutations across populations that randomly generate melanic individuals, rather than to intrinsic genetic variability of these populations (Eales et al. 2008), however this hypothesis is improbable. Alternatively, it is possible that alleles for melanism occur at low frequencies across the species' range in the form of a retained ancestral polymorphism (Wakeland et al. 1990), making compelling to address deeply genetic variability across the species range. The case of the eastern lineage instead is to be considered differently, indeed in this case the occurrence of the striped phenotype is more abundant and is localized almost entirely to the western populations of this group, along the contact zone with the populations belonging to H. v. viridiflavus (Fig. 1). In this case, the output of the climatic model for the probability to observe melanic phenotypes is consistent with the geographic model (Table 3). Hence, we hypothesize that the effect of mean temperature and seasonality is to be interpreted as a proxy of the geographical distribution of the phenotypes. However, concerning medium-large sized reptiles, the thermal hypothesis for ectotherms, which postulates that melanic individuals are favoured in terms of thermoregulation, fitness and growth rate in colder environments compared to normally coloured ones (Andrén and Nilson 1981;Luiselli 1995), is often considered a potential explanation for the occurrence of melanic phenotypes. Nevertheless, in our case this hypothesis does not support the distribution of phenotypes we observed, because the probability to observe melanic snakes in H. v. carbonarius is higher in warmer areas (e.g., Southern Italy; Pinna, 2017) and lower in inland regions where temperature seasonality is more marked, contrasting the core assumption of this hypothesis (Clusella-Trullas et al. 2007, 2008.

Fig. 5
Map of the fitted phenotypic clines estimated for each of the eight transects identified. Solid lines correspond to the transects, the dashed line to 50% probability to detect melanic phenotypes, the grey area to 20% (West) and 80% (East) frequency of melanic phenotypes and dotted line to Mezzasalma et al. (2015) line separating the two main lineages. For each side plot, on the x axis is shown the distance (km) from the from the W-end to the E-end of each transect Finally, the significant and positive effect of the altitude on the expression of melanism in the western lineage, albeit weak, is consistent with both the thermal hypothesis and the 'protection against UV radiation' hypothesis as well. This hypothesis states that melanin protects effectively from incoming UV radiation by absorbing it (Cope et al. 2001), hence at higher altitudes, where UV radiation is more intense (Blumthaler et al. 1997), reptiles should be selected to become darker. However, evidence to support robustly such hypothesis is still partly lacking, apart from few restricted cases such as for Psammodromus algirus (Reguera et al. 2014); if confirmed, our investigation could be regarded as a second case in which this hypothesis is supported by data.
In summary, we believe that the evidence collected from this study points toward one direction: the marked differences in body colouration between the two groups are neither driven by specific habitat features, nor by climate on its own, but rather by the evolutionary history of the species, which is tightly linked to the geographic distribution of the two groups. From this point of view, climate acts as a proxy of the geographical regions where the species occurs. More in detail, the occurrence of the two phenotypes in the two lineages, which are separated by the Apennines, could be ascribed to past refugia during ice ages (Mezzasalma et al. 2015), which have fixed the two phenotypes in different areas via bottleneck events. Eventually, subsequent warmer periods could have favoured the spreading of these two groups in the Italian territory, in France and in the main islands too, which is the distributional pattern we can observe at present day.
Interestingly, the occurrence of striped phenotypes in the eastern lineage populations in Northern Italy and along the Apennines is complicated to address; however, recently it has been suggested that these two groups might undergo a genetic admixture event along a putative contact zone running along the Apennines and in the Po plain in Lombardy and Piedmont (Senczuk et al. 2020). In this scenario, concerning the current distribution of phenotypes across the transect we identified (Fig. 5), the transition line from the striped to the melanic phenotype does not match the mitochondrial border between the two subspecies. Indeed, this line is shifted eastwards, suggesting that the distribution of phenotypes, determined by nuclear genetic variability (Hánová et al. 2021), does not match the mitochondrial lineages' one and, specifically, that the eastern lineage populations tend to be phenotypically more variable than western ones. These considerations lead us to hypothesize that a complex genetic introgression process could explain thoroughly what was observed. Moreover, in accordance to our finding and to what we hypothesize for the whip snake, in the past years the Common wall lizard (Podarcis muralis) populations from the same geographic area were expected to undergo asymmetric introgression event mediated by morphological traits, resulting in discrepancies in phenotypes (While et al. 2015). Indeed, recent evidence has pointed out that gene flow among P. muralis lineages in this region is marked and asymmetrical and determines the occurrence of mismatching phenotypes we observe in the living populations (Yang et al. 2018(Yang et al. , 2020, similarly to what we hypothesize for H. viridiflavus. The occurrence in two separate squamate species of such complex interactions among lineages after a secondary contact in the same geographic area suggests that the Italian peninsula can be of great interest to assess the microevolutionary processes involving reptiles, and thus that such genetic event could be part of a broader scenario that is not restricted to whip snakes only.
Additional evidence about genetic introgression events in this region would remark that the Italian peninsula is an excellent model to investigate evolutionary processes at a wide scale and across time.