Impact of developmental temperatures on thermal plasticity and repeatability of metabolic rate

Phenotypic plasticity is an important mechanism that allows populations to adjust to changing environments. Early life experiences can have lasting impacts on how individuals respond to environmental variation later in life (i.e., individual reaction norms), altering the capacity for populations to respond to selection. Here, we incubated lizard embryos (Lampropholis delicata) at two fluctuating developmental temperatures (cold = 23 ºC + / − 3 ºC, hot = 29 ºC + / − 3 ºC, ncold = 26, nhot = 25) to understand how it affected metabolic plasticity to temperature later in life. We repeatedly measured individual reaction norms across six temperatures 10 times over ~ 3.5 months (nobs = 3,818) to estimate the repeatability of average metabolic rate (intercept) and thermal plasticity (slope). The intercept and the slope of the population-level reaction norm was not affected by developmental temperature. Repeatability of average metabolic rate was, on average, 10% lower in hot incubated lizards but stable across all temperatures. The slope of the thermal reaction norm was overall moderately repeatable (R = 0.44, 95% CI = 0.035 – 0.93) suggesting that individual metabolic rate changed consistently with short-term changes in temperature, although credible intervals were quite broad. Importantly, reaction norm repeatability did not depend on early developmental temperature. Identifying factors affecting among-individual variation in thermal plasticity will be increasingly more important for terrestrial ectotherms living in changing climate. Our work implies that thermal metabolic plasticity is robust to early developmental temperatures and has the capacity to evolve, despite there being less consistent variation in metabolic rate under hot environments.


Introduction
A substantial amount of variation in an individual's phenotype is determined by formative processes experienced throughout embryonic development. Environmental perturbations during this critical period can have persistent effects on an individual's physiology, morphology, behaviour and life history Eyck et al. 2019;O'Dea et al. 2019). Developmental shifts in phenotypes may be adaptative if it allows organisms to better cope in similar environments later in life (Beldade et al. 2011). However, environment-phenotype mismatches can occur when developmental cues fail to predict late life conditions (Auld et al. 2010;Bonamour et al. 2019). A multitude of traits throughout an animal's life are labile; such as metabolism (Bronikowski and Vleck 2010); behavioural traits (Mitchell and Biro 2017); growth (Rodgers et al. 2018) and hormone levels (Lendvai et al. 2014), that can reversibly respond to environmental change. Reversible plasticity in phenotypic traits allows individuals to adjust to immediate changes in their surroundings (Piersma and Drent 2003), and can broadly be classified into two categories, acclimation and phenotypic flexibility (Piersma and Drent 2003;Havird et al. 2020). Acclimation is generally a slower form of reversible plasticity that involves remodelling of physiological systems from chronic exposure to a particular environment (Seebacher 2005). Phenotypic flexibility, in contrast, describes changes in traits that are induced by short-term environmental exposure, such as changes in metabolic rate in response to ambient temperature (Piersma and Lindström 1997;Piersma and Drent 2003).
Reversible plasticity may be able to alleviate the costs associated with phenotype mismatches induced by early life environments (Angilletta et al. 2003;Ghalambor et al. 2007). When environments shift predictably, flexibility in the phenotype would be advantageous because individuals can compensate for the effects of prevailing conditions to avoid discrepancies between the environment and the phenotype (Botero et al. 2015). However, reversible plasticity in a labile trait later in life can change depending on early environmental conditions -altering how organisms respond to environmental variation (Beaman et al. 2016) (Fig. 1). The interaction between early-and late life  Via et al. 1995). Developmental environment can also influence (C) repeatability of plastic responses (consistent among individual variation in slopes), this is typically represented by the variation in the slope of the reaction norm, Via et al. 1995)" plasticity has been supported by a few studies that show developmental differences in plasticity for a variety of traits including mitochondrial function (Shama et al. 2014), metabolic rate ) and locomotor performance (Kazerouni et al. 2016). However, these studies solely focus on how developmental effects affect acclimation, whereas the influence on phenotypic flexibility and variability of plastic responses is poorly known.
It has long been recognised that individuals vary in their plasticity, with some responding more flexibly than others (Nussey et al. 2007;Dingemanse and Wolf 2013). Consistent among individual variation in plasticity (also known as repeatability in plasticity) may be heritable and provides the phenotypic substrate for selective forces to act upon (Nussey et al. 2007;Araya-Ajoy and Dingemanse 2017). Developmental environments can influence phenotypic variation available for selection (Sultan and Stearns 2005). For example, zebra finches (Taeniopygia guttata) that experience nutritional stress as nestlings weigh less and have reduced growth rates contributing to increases in the repeatability of metabolism and behavioural traits (Careau et al. 2014a). Repeatability of plasticity has also been reported in other labile traits including aggressiveness in great tits (Parsus major) (Araya-Ajoy and Dingemanse 2017), explorative behaviour in chickadees (Thompson et al. 2018) and metabolic rate in amphipods (Réveillon et al. 2019). Whether developmental environments affect the repeatability of plasticity per se is still not well understood (Fig. 1). Identifying the factors that impact repeatability is necessary for understanding the evolution of plasticity in changing environments.
Energy metabolism is a key fitness related trait that is both consistently different among individuals and highly labile within individuals (Nespolo and Franco 2007;Norin and Metcalfe 2019). All organisms require energy for growth, maintenance and reproduction (Careau et al. 2014c). Numerous studies have investigated the influence of various developmental environments, such as temperature (Gangloff et al. 2015;Noble et al. 2021), ultra-violet (UV) exposure (Kazerouni et al. 2016), and dietary restriction (Careau et al. 2014a) on metabolic rate, however, the impacts on plasticity of metabolic rate is not well established (but see . Developmental environments are expected to influence metabolic plasticity, possibly through modifications in metabolic enzymes or cellular membrane structure that influence their function in different environments (Angilletta 2016). Such changes imply that tolerance to environmental perturbations may be determined by the developmental environment a given cohort experiences. Furthermore, if repeatability of metabolic plasticity is also affected, then a population's capacity to respond to selection might also depend on early life conditions. Understanding how early life environments shape metabolic plasticity will be particularly important for organisms where metabolic rate is closely intertwined with prevailing environmental conditions.
Here we employed a 'reaction norm approach'(sensu Via et al. 1995) to examine the impact of developmental temperature on metabolic rate plasticity in an oviparous skink (Lampropholis delicata). Specifically, we were interested in testing whether developmental temperature affects the shape and repeatability of metabolic thermal reaction norms. Over 3.5 months, we repeatedly measured routine metabolic rate at six temperatures for lizards (n obs = 3,818) that hatched from two incubation treatments (cold = 23 ºC ± 3 ºC, hot = 29 ºC ± 3 ºC, n cold = 26, n hot = 25) to address the following key questions: (1) How does developmental temperature change the intercept and slope of the thermal reaction norm?; (2) Does the repeatability of metabolic plasticity (i.e. slope of the reaction norm) differ between the two developmental temperatures? (3) Do developmental temperature treatments differ in their repeatability of metabolic rate (intercept) at each temperature (i.e. temperature-specific repeatability)? Our experimental approach provides important insights into how developmental environments mediate the capacity for ectotherms to metabolically respond to thermal variation.

Lizard collection and husbandry
We established a breeding colony of adult L. delicata (n females = 144, n males = 50) using wild individuals collected across three sites throughout the Sydney region between 28 August and 8 September 2015 151.24;151.18,151.10). These three sample locations represent the same genetic lineage of Lampropholis delicata, it is therefore unlikely that our experiments would be impacted by collection at different sampling sites (Chapple et al. 2011). Furthermore, all adults were kept in common lab conditions for a year prior to the commencement of this study, therefore any environmental effects are likely to be minimal. Post-hoc analyses show that the population origin of dams and sires did not explain a significant amount of variation in metabolic rate (see below and Table S2). Three females were housed with a single male in opaque plastic enclosures measuring 35 × 25 × 15 cm (L × W × H). Enclosures were kept under UV lights on a 12 h light: 12 h dark cycle in a temperature-controlled room set to 24 ºC. Lizards had access to a heat lamp (IKEA Kvart fitted with a E14 2700 k reflector bulb 36.6 cm x 10 cm) that elevated temperatures on one side of the enclosure to 32 ºC. Each enclosure was lined with newspaper and lizards had constant access to water and tree bark was used as refuge. Adult lizards were fed medium sized crickets (Acheta domestica) ad libitum dusted with calcium powder and multi-vitamin every two days. From the beginning of the egg laying season (October of each year), we replaced the newspaper lining with garden potting mix and placed an opaque plastic box (12 × 17.5 × 4.3 cm) containing moistened vermiculite in each enclosure for females to oviposit their eggs. During this time, enclosures and vermiculite boxes were sprayed gently with water every other day to maintain a relatively humid environment. From October to November, vermiculite boxes were checked every day for eggs. Animal collection was approved by the New South Wales National Parks and Wildlife Service (SL101549). All procedures adhere to the 8th edition of the Australian Code for the Care and Use of Animals for Scientific Purposes and were approved by the Macquarie University Ethics committee (ARA 2015/015) and University of New South Wales Animal Care and Ethics committee (ACEC 15/51A).

Developmental temperature manipulations
Eggs were collected between October 2017-March 2018. When eggs were discovered, they were weighed using a digital scale to the nearest 0.01 g (Ohaus Scout SKX123). We also measured egg length (distance between the furthest points along the longest axis of the egg) and egg width (distance between the widest points along the axis perpendicular to the longest axis of the egg) using digital callipers to the nearest 0.01 mm in case any egg features differed between treatments. This is important because egg size can impact suites of offspring traits (Lancaster et al. 2010;Krist 2010). Preliminary analyses showed that none of the egg dimensions differed between treatments and egg dimensions were all positively correlated (Supplementary Materials Sect. 1, Table S1). Following measurements, each egg was placed in a plastic cup (80 ml) containing 3 g of vermiculite and 4 g of water and covered using cling wrap which was secured by an elastic band. We used a split-clutch design such that eggs from each clutch were pseudo-randomly assigned to one of two fluctuating incubation temperature treatments. We used two incubators to precisely control the temperature of eggs (LabWit, ZXSD-R1090). The 'hot' treatment was exposed to a mean temperature of 29 ºC whereas the 'cold' treatment was exposed to a mean temperature of 23 ºC. Both incubators fluctuated + / − 3 ºC the mean temperature over a 24-h period. These treatments represent the average thermal minimum and maximum of natural nest sites in Sydney populations of L. delicata (Cheetham et al. 2011). Egg cups were rotated within each incubator weekly to avoid uneven heat circulation within incubators. Incubators were also checked daily for hatchlings. On average, the incubation duration for the 'hot' treatment was 30 days (SD = 1.40, range = 27-33) days and 47.7 days (SD = 5.90, range = 25-53) for the 'cold' treatment.

Planned missing data and metabolic rate at different temperatures
Metabolic measurements commenced in April 2018 and continued until August 2018. At the beginning of measurements, hatchlings were on average 88.68 days old (SD = 23.75, range = 26-131). Due to the scale of our experiment, we opted for closed-system respirometry instead of flow-through respirometry. The former approach allowed us to obtain highthroughput measurements of MR across many individuals, whereas the latter approach provides a more in-depth characterisation of metabolic changes over time for a much smaller subset of individuals.
We quantified routine metabolic rate (hereafter referred to as metabolic rate [MR]) as our measurements likely included the energetic costs of random movements (Withers 1992;Mathot and Dingemanse 2015). MR was measured as the volume of CO 2 production per unit time ( V CO 2 mL min −1 ) as CO 2 production is less susceptible to fluctuations in water vapour and signals are more easily detected in smaller organisms (Lighton 2008;Tomlinson et al. 2018). CO 2 production was strongly correlated with O 2 consumption in any case (r = 0.81, p < 2.2e −16 ) with RQ values averaging 0.77 (SD = 0.41, n obs = 198).
Due to logistical constraints, lizards were randomly assigned to one of two blocks for MR measurements (block 1: n = 26, block 2: n = 25). Each block was measured two days apart. We sampled lizards once a week for two-weeks consecutively and then allowed them to rest for one week before the next set of measurements. Each measurement week was considered a sampling session (ten sampling sessions in total over the course of 14 weeks). We used the same incubators described above to precisely control the temperature at which MR measurements were taken (+ / − 1 ºC).
Metabolic rate was measured at six acute (i.e., short-term; < two hours) temperatures, 24, 26, 28, 30, 32 and 34 ºC in a randomised order. We employed a planned missing data design to minimise the stress of lizards during the experiment as the lizards were also part of an ongoing growth rate experiment. Metabolic rate data is particularly amenable to a planned missing data design because it is strongly correlated with temperature and body mass. For each block of lizards, we intentionally missed measurements at two randomly selected temperatures at each sampling session (Nakagawa 2015;Noble and Nakagawa 2021). For more details see Supplementary Materials (Fig. S1). Missing MR data temperatures were then recovered using data imputation techniques during analysis (see Statistical analyses).
At ~ 06:00, lizards were gently encouraged into an opaque respiratory chamber and then weighed. After which, chambers were placed inside preheated incubators set at the randomised temperature for 30 min to allow body temperatures to equilibrate. The lids of the chambers were left ajar during this time to minimise CO 2 build up. After 30 min, each chamber was flushed with fresh air and sealed. A 3 mL 'control' air sample was immediately taken via a two-way valve to account for any residual CO 2 that was not flushed from the chambers. The chambers were left in the incubator at the set temperature for lizards to respire for 90 min. After this time, two replicate air samples (3 mL) were taken from each chamber in order to estimate the change in CO 2 relative to the control sample . Two samples were taken so we could explicitly estimate measurement error (see Statistical analyses, Ponzi et al. 2018). Chambers were then reopened and flushed with fresh air before being placed back into the incubator for the second measurement temperature (2 temperatures / day) following the same procedure approximately two hours later. Overall, this sampling design enabled us to characterise the thermal reaction norm (four out of six temperatures for our planned missing data design) for each lizard 10 times while accounting for measurement error. This resulted in n = 4,080 measurements of MR ([2 air samples × 4 temperatures] × 10 sampling sessions = 80 samples per lizard). However, additional missing data from equipment malfunction or human error meant that our total sample size was n = 3,818.
All air samples were injected into the inlet line of a Sables System FMS (Las Vegas NV, USA) with the flow rate set to 200 mL min −1 to measure V CO 2 and V O 2 . Water vapour was scrubbed from the inlet air with Drierite. Output peaks were processed using the R package 'metabR' (https:// github. com/ danie l1nob le/ metabR). The rate of CO 2 produced by an individual was calculated following (Lighton 2008): where Δ%CO 2 is the maximum percentage of CO 2 in air sample above baseline, which was corrected by subtracting any 'residual' CO 2 from the initial flush from the larger of the two air samples; V chamber is the volume of the chamber (70 mL) and V lizard is the volume of the lizard. We used the mass of the lizard as a proxy for its volume (1 g = 1 ml) because of their high correlation and increased accuracy and precision in mass measurements (Friesen et al. 2017;Kar et al. 2021), and t is the duration of time in minutes after where the chamber had been sealed and the first air sample was taken (90 min).

Statistical analyses
We fitted Bayesian linear mixed effect models in R (Core Team 2013) using the package 'brm''s' (Bürkner 2017). Metabolic rate was log transformed then body mass was log transformed and then z-transformed. Age and temperature were z-transformed so parameter estimates of main effects and interaction terms were more interpretable (Schielzeth 2010). Posthoc analyses revealed that including dam ID, sire ID and population origin of either parents as random effects did not improve model performance. As such these were not included in our final analyses (Supplementary Materials Sect. 1, Table S2). Our planned missing data design resulted in random missingness across temperatures (36% missingness in MR and body mass). The 'brms' package is capable of performing model-based data imputation. As such, we performed imputation during model fitting in all of our analyses. Model-based imputation not only retains the hierarchical structure of the dataset but also increases statistical power and t precision by recovering missing data values (Nakagawa 2015;Noble and Nakagawa 2021). We found that models with imputed data resulted in similar conclusions to complete case analyses (Supplementary Materials 2-4, Table S5, S7 -S9, S11 -S14). We present results from the imputation analysis in the main text as parameter estimates were more precise (Supplementary Materials). For all models we used default priors which are presented in Table S3. We ran four Markov Chain Monte Carlo (MCMC) chains; taking 800 samples from the posterior distribution after discarding the first 1,500 iterations. This gave a total of 3,200 samples from the posterior distribution across all chains. We ensured chains were mixing by inspecting trace plots and checked that scale reduction factors were less than 1.01, which indicates that all chains had converged. Throughout we report posterior means and 95% credible intervals for all parameters. All data and code to reproduce our results are provided (see Data accessibility).
To test whether developmental temperatures changed the shape of reaction norms, we fitted a full model with MR as the response and temperature, treatment (developmental temperatures, 23 ºC as the baseline) and an interaction between treatment and temperature as predictors. The model also included a random intercept for lizard identity and sampling session. We wanted to account for measurement error in all our models as it may conflate parameter estimates (Ponzi et al., 2018). Using the two replicate air samples, we estimated measurement error variance by including a nested random effect of lizard identity, sampling session and temperature in all our models (e.g. ID001_session1_temp24). This nested random effect (hereafter referred to as measurement error) estimates the variance attributed to differences among replicates. While we show in a previous study that measurement error can vary by temperature (Kar et al. 2021), here we assumed that measurement error was constant across temperatures by fitting it as a random intercept as estimating a random slope resulted in model convergence issues. Heterogeneous residual variance across temperatures can also influence parameter estimates (Careau et al. 2014a). However, Watanabe-Akaike information criterion (WAIC) and expected log predictive density (ELPD) values indicated that a heterogeneous residual variance model was not well supported, therefore homogenous variance was used in all models (Table S4). Acclimation can influence metabolic plasticity and its effects can take place throughout the course of our study. Unfortunately, it was not possible to measure MR at hatching. However, we still tested whether there were treatment differences in thermal reaction norms in the first sampling session (~ 2.5 months of age) where acclimation effects were likely to have the weakest effect.
We estimated adjusted repeatability of the reaction norm slope (R slope ) in each developmental temperature treatment by fitting separate models for each treatment group. MR was fitted as the response and temperature, body mass and age (days since hatching) as predictors. We included lizard identity, measurement error and a nested random effect of individual identity and sampling session (hereafter referred to as series, Araya-Ajoy et al. 2015). Lizard identity estimates among individual variance, whereas series partitions variance within individuals across all sampling sessions which allows the estimation of slope repeatability. A random temperature slope was estimated for lizard identity and series. The repeatability of the slope is calculated as the proportion of total variance in slopes explained by among individual differences (Araya-Ajoy et al. 2015): where V I,slope is the among-individual variance in the temperature slope term and the V series,slope is the within-individual variance in the temperature slope.
We also estimated adjusted repeatability of average metabolic rate (i.e. intercept of the reaction norm) at each temperature by fitting separate models for each treatment group. Similar to above, MR was included as the response and temperature, body mass and age as predictors. We included lizard identity, sampling session and measurement error as random intercepts and temperature as a random slope for lizard identity. We calculated among individual variance in metabolic rate at each temperature I t following Schielzeth and Nakagawa (2020): where V I is the among individual variance in intercepts, t is the specific temperature at which repeatability is calculated for, V S is the among individual variance in slope and COV I,S is the covariance between the intercept and slope at the among individual level. Temperature specific repeatability ( R t ) is then calculated as follows: where V session is the variance due to sampling session and V e is residual variance.
We also wanted to estimate overall repeatability of average metabolic rate across all temperatures. We therefore fitted the same model as above for each treatment, but we omitted the random temperature slope for lizard identity, this estimates an average among individual variance across all temperatures. Similarly, we calculated repeatability as per the equation above but using just the single estimate of among individual variance.
In order to test for differences in repeatability among the two developmental temperatures, we calculated contrasts by subtracting the posterior distributions of repeatability estimates of the cold treatment from the hot treatment (Hot -Cold). To test whether the magnitude of differences among treatments were significant, we calculated probabilities of direction (pd) using the package 'bayestestR' (Makowski et al. 2019b). The probability of direction is calculated relative to the posterior median and ranges from 50 to 100%. The value of pd describes whether an effect is either positive or negative as it is always relative to the sign of the median (Makowski et al. 2019a). If the median is positive, then pd describes the proportion of the posterior distribution that is also positive (Makowski et al. 2019a). A pd value of 95% can be interpreted as the effect is positive with a probability of 95%.

Results
We found no evidence to suggest that metabolic rate or its response to short-term temperature changes was influenced by early developmental temperature (Fig. 2  This model used an imputed dataset of n obs = 6,000, 36% of observations were imputed. The intercept is the cold developmental temperature. MR was log transformed and mass, age and temperature were z-transformed. Credible intervals of bolded estimates do not overlap zero. Lower and upper bound of estimates represent 95% credible intervals. COV represents covariance. Main effects model is presented in Table S4 There is not particularly definition for the "significance" for bolded estimates as we are employing Bayesian statistics. These values are "significant" because the credible intervals does not overlap zero Materials, Sect. 3). A pd value of 53.5% indicates that there is roughly equal probability that the difference in R slope is positive or negative, indicating little difference among treatment groups.
Overall, temperature-specific repeatability was relatively low, with the cold developmental treatment tending to have higher repeatability estimates compared to the hot developmental treatment (Fig. 4, Fig S2, Supplementary Materials, Sect. 4 Table S12). Irrespective of temperature, repeatability of average metabolic rate was on average 10% higher in cold incubated lizards (pd = 95.7%, Fig. 4b, c). There was a 95.7% probability that the difference in overall repeatability was negative, indicating that lizards from the cold treatment are more likely to have higher repeatability. Higher repeatability in the cold treatment was associated with significant increases in among individual and residual variance (Fig. S3).

Discussion
Early developmental temperature did not change the intercept or slope of the population reaction norm in delicate skinks. Thermal plasticity of metabolic rate was unaffected by developmental temperature, however; variation in slope had relatively high repeatability (R > 0.4). However, temperature-specific repeatability of metabolic rate (i.e., intercept) (A) (B) Fig. 3 Thermal reaction norms of mass-adjusted metabolic rate for lizards reared at (A) 'hot' developmental temperatures (top, red lines, n lizards = 25) and (B) 'cold' developmental temperatures (bottom, blue lines, n lizards = 26) at session number one, five and ten. Each uniquely coloured line represents an individual reaction norm. A random subset of 10 individuals from each treatment are presented 1 3 was lower among lizards that were reared in hot developmental temperatures. Our results suggest that, while individuals displayed consistent variation in their plasticity, how metabolic rate responds to short-term temperature variation later in life was robust to simulated thermal minimum and maximum of natural nest sites. Developmental temperatures did not have an impact on average metabolic rate but rather it changed the amount of consistent individual variation in average metabolic rate. Below we discuss the implications of our results for the evolution of thermal reaction norms.

Thermal reaction norms of metabolic rate are robust to developmental temperature
Developmental environments that affect later life plasticity may affect how individuals and populations respond to environmental fluctuations (Beaman et al. 2016). Epigenetic modifications during development that influence the physiological system are likely responsible for shaping plastic responses in complex ways (Hu and Barrett 2017;McCaw et al. 2020). However, our results suggest instead that thermal reaction norms for metabolic rate were robust to changes to the incubation temperatures we selected which were based on the thermal minimal and maximal of natural nests found in Sydney (Range = 22.5 ºC -34.5 ºC,   Table S15 -16). Contrasts are presented in Table S12 mean = 27.4 ºC, SD = 2.66, N = 59, Cheetham et al. 2011). It should also be noted that our treatments were fluctuating, and at times, temperatures overlapped which may contribute to the lack of differences observed. Among the few studies that have investigated the effects of pre-and post-hatching temperature on the plasticity of metabolic rate, results have been mixed (Table 1, Beaman et al., 2016). For example, wild caught mosquitofish (Gambusia holbrooki) developing under more variable spring conditions exhibited steeper thermal reaction norms for metabolic scope compared to fish born in summer . In contrast, incubation temperature did not affect plasticity in metabolic rate of striped marsh frog tadpoles (Seebacher and Grigaltchik 2014). Given that our lizards were reared in a common environment the lack of difference we observed may be due to the waning effects of developmental environments on metabolism as individuals age and acclimate to post-hatching environments. Generally, acclimation of physiological function takes approximately three to four weeks to complete (Seebacher 2005). Given that we began the study when lizards were ~ 2.5 months old, it is possible that metabolic acclimation was an important factor as young lizards were able to behaviourally thermoregulate in their enclosures and may have aligned to their preferred temperatures (~ 26 ºC in adults, Goulet et al. 2016). Future studies should employ cross factorial designs where post-hatch environments are deliberately matched and mismatched with early environmental conditions to disassociate acclimation and thermal preference effects (Schnurr et al. 2014;Kazerouni et al. 2016).
Stable thermal reaction norms of metabolic rate across both developmental temperatures have key evolutionary implications. Our results imply that population reaction norms may be robust to temperature variation within the thermal range of natural nests (Cheetham et al., 2011). Past thermal regimes encountered by ancestors may have canalized population responses so that they are less sensitive to fluctuations in developmental temperature (Liefting et al. 2009). Canalization may reduce the costs of phenotypic plasticity during development if environmental variation is predictable across generations (Aubret and Shine 2010). In support of this, damselflies undergoing range expansions exhibit geographic variation in thermal reaction norms that align with past climatic conditions (Lancaster et al. 2015). Population comparisons across environmental gradients might reveal whether local adaptation shapes developmental plasticity of population reaction norms that lead to canalisation (Toftegaard et al. 2015). Developmental environments may play a stronger role in shaping population plastic responses in areas that experience greater thermal variability, such as those in temperate or high elevation regions (Rutschmann et al. 2016;While et al. 2018;Bonamour et al. 2019). While our incubation treatments represent the average thermal minimal and maximal of natural nest sites, their thermal variability (+ / − 3 °C) was the same. The lack of thermal variation in our treatments may be a possible explanation why the thermal reaction norms did not differ. Future studies should attempt to isolate the effects of thermal averages and variability during development to understand how thermal reaction norms are shaped in ectotherms.

Developmental temperatures and repeatable thermal reaction norms of metabolic rate
Repeatability of reaction norm slopes did not change with developmental temperature, but lizards reared in hot temperatures had reduced repeatability in metabolic rate (intercept). Variation in developmental time has important consequences on hatching condition and may contribute to differences in consistent variation in hatchling phenotypes such as MR.
Developmental time exhibits a negative nonlinear relationship with temperature, such that development times are considerably shorter at hotter temperatures Marshall et al. 2020). Consequently, eggs reared in warmer environments are expected to be more constrained in their developmental rates, thus hatching phenotypes are more likely to be less variable compared to eggs reared in cooler environments (Pettersen et al. 2019). Indeed, incubation duration was short and less variable in our hot developmental treatment (Hot: 30 days, SD = 1.40, range = 27-33; Cold: 47.7 days, SD = 5.90, range = 25-53). Shortened development times may restrict embryo yolk assimilation that is needed for growth and may have contributed to reduced among-individual variation in MR (Oufiero and Angilletta 2006;Storm and Angilletta 2007). Elevated mitochondrial proton leak at hot developmental temperatures may also lead to less efficient energy production and may explain why metabolic rate did not differ among treatments even though we detected differences in repeatability (Chamberlin 2004). While is not possible to rule out any collection site differences that may contribute to epigenetic differences among our two treatment groups these effects are likely to be minimal because (1) all sites represent the same genetic lineage (Chapple et al. 2011); (2) females were reared in the lab for a year prior to experiments, removing environmental effects on female reproduction and (3) we used a split-clutch design to ensure that all sites were represented in our treatments. Despite there being no developmental differences in the repeatability of metabolic plasticity, we show that it is possible for it to evolve under changing environments (Ghalambor et al., 2007).
We found that individuals consistently vary in metabolic plasticity in response to temperature. While several studies have reported significant among individual variation in thermal plasticity slopes (Careau et al. 2014b;Briga and Verhulst 2017), its repeatability is rarely estimated as it requires a study design that allows partitioning of within and between individual variance of slopes (Araya-Ajoy et al., 2015). Our repeatability estimates for reaction norm slopes were consistent with a previous study of the same species (R = 0.23, Kar et al. 2021). Similarly, moderate repeatability of thermal sensitivity of metabolic rate has also been observed in amphipods (R = 0.38, Réveillon et al. 2019). Assuming that repeatable reaction norm slopes have a heritable basis (Driessen et al. 2007), our work implies that thermal plasticity has the potential to be selected upon and evolve (Falconer 1952; but see Dohm 2002). Consistent individual differences in metabolic rate at a given temperature were also present and stable across all temperatures. This result demonstrates that MR is repeatable within the operable range of temperatures in L. delicata (Matthews et al. 2016). Overall, our estimates for the repeatability of MR ranged from 0.09-0.22. Our results are in line with a meta-analysis showing that repeatability decreases with time between repeated measurements (White et al. 2013). Specifically, the average repeatability of MR in ectotherms from studies that had a measurement interval that was equal or larger than our study ( ≥ 8.5 days) was R = 0.33 (SD = 0.21, n = 18). Interestingly, repeatability of average MR in wild caught adult L. delicata (R = 0.3 -0.5, Kar et al. 2021) was quite comparable or even larger. This is likely due to life stage differences that shape phenotypic among-individual variation. As individuals mature, their experiences in different microhabitats (diet, thermal preferences) can promote amongindividual variation in traits (Kruuk and Hadfield 2007). Such common (micro) environment effects could further increase repeatability and may explain differences between lab and wild studies (Auer et al. 2016). Furthermore, the repeatability in metabolic rate we observe may explain consistent variation in thermoregulation, behaviour and life history observed in this species (Goulet et al. 2017).