The accuracy and precision of age determination by dental cementum annuli in four northern cervids

Individual age is an important element in models of population demographics, but the limitations of the methods used for age determination are not always clear. We used known-age data from moose (Alces alces), red deer (Cervus elaphus), semi-domestic reindeer (Rangifer tarandus tarandus) and Svalbard reindeer (Rangifer tarandus platyrhynchus) to evaluate the accuracy and repeatability of age estimated by cementum annuli analysis of longitudinally sectioned permanent incisors. Four observers with varying experience performed blind duplicate age estimation of 37 specimens from each cervid. The relationship between known age and estimated age was linear, except for Svalbard reindeer where a quadratic model gave a slightly better fit. After correcting for observer ID and animal ID, the probability of assessing the correct age declined slightly with increasing age for moose, red deer and Svalbard reindeer. Across cervids and observers, estimated age equalled known age in 69% of all readings, while 95% of readings were within ± 1 year of known age. Predicted probability of correct age assessment for experienced observers was 93% for red deer, 89% for Svalbard reindeer, 84% for moose and 73% for semi-domestic reindeer. Regardless of observer experience and cervid, there was high agreement between repeated assessments of a given animal's tooth sections. The accuracy varied between cervids but was generally higher for observers with former ageing experience with a given cervid. We conclude that the accuracy of estimated age using longitudinally sectioned incisors is generally high, and even more so if performed by observers with former ageing experience of a given species. To ensure consistency over time, reference material from known-age individuals of each species analysed should be available for calibration and training of observers.


Introduction
In long-lived species, individual age is often a crucial component in analyses of population dynamics, phenotypic trait distribution, evolutionary ecology, and population management and conservation (Coulson et al. 2006; Festa-Bianchet et al. 2003; Gordon et al. 2004; Pelletier et al. 2012; Saether et al. 2013). Without the possibility to control for age, analysis of variables like body size, parental investment, reproductive success, survival rate, senescence and cohort effects can be meaningless (Berube et al. 1999; Douhard et al. 2016; Hewison and Gaillard 2001; Nussey et al. 2006; Pelletier et al. 2012; Weladji et al. 2002). Reliable methods for ageing of live and dead individuals are therefore essential in many aspects of wildlife research and management.
Teeth contain the most durable of all biological tissues (Lucas 2004). Because they conserve information related to key events in animals' life histories (e.g. Antoine et al. 2009; Risnes 1998), nutritional status or anthropogenic trace metal exposure (e.g. Kang et al. 2004; Lee et al. 1999), teeth have proved to be very useful as bioarchives and for determining animal age. The tooth root cementum is a mineralized tissue covering the root dentine surface and provides tooth attachment and maintenance of occlusal relationships as teeth wear (Bosshardt and Selvig 1997). Cementum is deposited continuously throughout the life of a permanent tooth and provides a longitudinal record of factors affecting its growth (Lieberman 1994). Variation in cementogenesis, the process of cementum formation, results in incremental bands that correlate with seasonal growth patterns in most species (see references in Lieberman 1994). Annual deposition is normally composed of a wide and translucent 'summer' layer (growth line) and a narrow, dark and hypermineralized 'winter' layer (rest line). The regularity of the deposition has, however, been found to be influenced by environmental factors (climatic variation, quantity and quality of food) or metabolic fluctuations following reproduction and varying feeding habits (Grue and Jensen 1979; Stallibrass 1982). Despite such 'noise' in cementum formation, the distinctiveness and regularity of seasonal lines provide a superior basis for age determination that outperforms alternative methods based on e.g. tooth wear assessment (Pérez-Barbería et al. 2014).
Registration of seasonal growth lines, often referred to as incremental lines or cementum annuli analysis (CAA), is currently widely used as a basis for individual age determination in numerous species (cervids: Hamlin et al. 2000; Rolandsen et al. 2008; Pérez-Barbería et al. 2014; terrestrial carnivores: Matson et al. 1993; Nakanishi et al. 2009; marine mammals: Christensen-Dalsgaard et al. 2010; Read and Hohn 2018). The universal principle involves counting the number of incremental lines and adding the average age at eruption of the tooth analysed (Stallibrass 1982). However, although the general guidelines are simple, several potential errors and biases may be introduced, reducing the accuracy and value of the results. Such errors may be both systematic (e.g. sex, age, species, population, observer) and related to random processes (e.g. nutritional conditions and individual variation), and preferably such limitations should be known and accounted for when age estimates are used in other analyses.
In Norway, mandibles are collected from a considerable proportion of the total number of harvested moose (Alces alces), red deer (Cervus elaphus) and wild reindeer (Rangifer tarandus tarandus) every year. This is either initiated as part of long-term monitoring programmes (Solberg et al. 2017), or as part of the regular local management practice. The mandibles are primarily used to age harvested animals. Calves and yearlings of all three species, and most of the 2-year-old reindeer and red deer are aged based on tooth eruption and wear pattern. Age determination of all other animals is based on readings of seasonal growth lines in incisors. Compared to using molars, production of decalcified tooth sections from incisors has proved to be less laborious. This is an important advantage when a large number of adults are routinely aged through e.g. monitoring programmes or other studies that require exact age.
The aim of this study was to evaluate the accuracy and repeatability of age determined by using decalcified, sectioned and stained incisors from four free-living cervids in Norway (moose, red deer, semi-domestic reindeer and Svalbard reindeer Rangifer tarandus platyrhynchus). Although semi-domestic reindeer and Svalbard reindeer represent subspecies of Rangifer, we hereafter refer to each ungulate as a single species. We used tooth sections from 37 known-age individuals of each species. Two-year-olds represented the lower age range for all species, whereas 12-, 13-, 14- and 26-year-old individuals represented the upper age range for semi-domestic reindeer, moose, Svalbard reindeer and red deer, respectively. The CAA was performed as blind duplicates by four observers with varying prior experience. We then explored to what extent accuracy was related to species, individual age and observer experience.

Study area
Within species, the material originated from a quite limited geographical area (Fig. 1), where we expected all individuals to have experienced similar environmental conditions. Table 1 gives a brief presentation of central habitat characteristics. Meteorological data were available through the Norwegian Meteorological Institute's eKlima service (eklima.met.no) and were downloaded as monthly temperature averages and monthly precipitation measurements.

Material
For all species, the known-age material originates from individuals marked during their first winter of life, typically at 8-10 months of age. Moose, red deer and Svalbard reindeer were marked as part of research projects, whereas the semi-domestic reindeer were marked as part of reindeer husbandry management practices. We assumed data from semi-domestic reindeer to be equivalent to similar data from wild reindeer.
Mandibles from known-age individuals of moose and red deer were collected by hunters during the annual hunting season (1 September-23 December). Mandibles from semi-domestic reindeer were collected during slaughtering in late January. Material from Svalbard reindeer originated from animals culled in February (N = 11) and April (N = 7), or animals dying from natural causes between January and May (N = 19). Mandibles from the latter were collected in the following summer. For animals dying from natural causes, the exact time of death is unknown.
Mandibles were boiled to allow easy extraction of incisors (I1). Other teeth were not extracted. For incisors from moose and red deer, the enamel-covered crown was cut off with a band saw to ease further processing. Incisors of reindeer were processed completely. Further processing followed the method developed by Reimers and Nordby (1968). Teeth were demineralized in 7.5% nitric acid (HNO3) for 48 h (moose) or 24 h (red deer and reindeer) before being washed in gently running tap water for 24 h. A freezing microtome was used to cut longitudinal sections of 30 μm (1 μm = 1/1000 mm). All sections covered the whole length of the pulp. Eight to ten sections from the middle part of the tooth were stained for 45 min with a solution of 1 g haematoxylin cryst., 0.2 g sodium iodate (NaIO3) and 50 g aluminium potassium sulphate (KAl(SO4)2) dissolved in 1 L of deionized water, to increase the contrast between rest lines and growth lines. The stained sections were then washed in gently running tap water for another 24 h before being mounted onto microscope slides using Kaiser's glycerol gelatine (C3H8O3) and sealed with a clear glass cover slip. Sections were then read using a light-transmitting microscope at ×4 or ×10 magnification or a microfiche reader.
Fig. 1 Map of Norway and Svalbard identifying the areas where the tooth material from each species was sampled and giving the location of the weather stations referred to in the study areas description
We used a similar number of tooth sections from all species, but due to data limitations, we could not attain a perfectly balanced distribution between age classes (Table 2). Only individuals 2 years and older were included.

Age determination
Annual cementum layers were counted in sectioned permanent incisors, which are normally fully erupted when animals are entering their second winter of life (red deer: Loe et al. 2004; moose: Markgren 1964; wild reindeer: Reimers and Nordby 1968). Dental cementum is formed continuously, but the exact period for the formation of growth and rest lines has been found to vary as a function of geographical location (Grue and Jensen 1979). In general, growth lines are expected to be formed between June and October while rest lines are formed between November and March. In southern Spain, growth line formation in red deer molars occurred between March and September (Azorit et al. 2002). In southern Norway, growth lines of wild reindeer form between June and September (Reimers and Nordby 1968; Takken Beijersbergen 2019), whereas in Greenland, the translucent growth layer formation was found to continue until November (Pasda 2006). We could not find equally detailed studies for European moose populations, but Boertje et al. (2015) examined the accuracy of age determination in Alaskan moose based on canines and incisors from known-age individuals, in a study that also included the seasonal transition period in spring and summer when annuli are formed. They found lower accuracy of age estimates for moose that died in April-August. This was likely because the rest line from last winter was not yet followed by a visible translucent growth line, causing a 1-year underestimation of age.
In our study, only rest lines that were followed by a translucent growth line were counted. Likewise, the rest line closest to the dentine had to be separated from the cementum-dentine junction by a translucent growth line to be recorded as a distinct rest line. To account for the period with deciduous dentition, individual age was determined as 1 + the number of rest lines (see Fig. 2).
In moose, incisors can erupt during the first winter of life, and the first detectable rest line will be formed during the individual's first winter. Such rest lines are referred to as juvenile lines (Grue and Jensen 1979;Rolandsen et al. 2008) and can be identified by the discontinuity around the root apex (Grue and Jensen 1979).  In addition, the translucent growth line between the juvenile line and the cementum-dentine junction is very narrow, and the juvenile rest line becomes thin and undulating towards the tooth crown. Juvenile rest lines were not counted. The rate of cementum apposition normally increases towards the apical end of incisors creating wider growth lines. Annuli are therefore most easily counted in the area around the end of the root.
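The counting rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_age`, the encoding of a tooth section as a list of "growth" and "rest" bands ordered from the cementum-dentine junction outwards, and the `has_juvenile_line` flag are all hypothetical and not part of the laboratory protocol.

```python
def estimate_age(bands, has_juvenile_line=False):
    """Estimate age in years from cementum bands (hypothetical encoding).

    bands: list of "growth" / "rest" strings, ordered from the
           cementum-dentine junction (CDJ) towards the tooth surface.
    has_juvenile_line: True if the innermost counted rest line was judged
           to be a juvenile line (mainly relevant for moose).
    """
    rest_lines = 0
    for i, band in enumerate(bands):
        if band != "rest":
            continue
        # The rest line must be separated from the CDJ (or the previous
        # rest line) by a translucent growth line ...
        preceded_by_growth = i > 0 and bands[i - 1] == "growth"
        # ... and must be followed by a translucent growth line.
        followed_by_growth = i + 1 < len(bands) and bands[i + 1] == "growth"
        if preceded_by_growth and followed_by_growth:
            rest_lines += 1
    # Juvenile rest lines are not counted.
    if has_juvenile_line and rest_lines > 0:
        rest_lines -= 1
    # Add 1 for the first winter, spent with deciduous incisors.
    return 1 + rest_lines
```

In this sketch, a peripheral rest line that is not yet followed by a growth line is ignored, which mirrors the rule that only completed rest lines are counted.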

Experimental setup
All tooth sections were presented to four observers with previous experience with CAA. All observers were aware that their readings would be used in a study evaluating the accuracy of age determination. All observers are also included as co-authors. Species name was given for each set of tooth sections, and observers were classified as 'experienced' if they had previous experience with age determination of the given species. For readings of tooth sections from species where they had no previous experience, they were classified as 'unexperienced'. All the material was given a random animal ID and ordered independent of age. To allow calculation of repeatability, each observer read all available sections twice. Time between readings was at least 1 week. During the second reading, results from the previous age determination were not available to the observer. Age was recorded as 1 + the number of dark rest lines, excluding the potential juvenile rest line. Observers also provided a certainty index for each reading based on their individual judgement of ageing accuracy: (A) ± 0 year, (B) ± 1 year, (C) ≥ ± 2 years.
Fig. 2 Panels showing the apical end of a decalcified and stained incisor root from a 6-year-old individual from each of our four study species, moose (Alces alces), red deer (Cervus elaphus), semi-domestic reindeer (Rangifer tarandus tarandus) and Svalbard reindeer (Rangifer tarandus platyrhynchus). The dentine (D), cementum-dentine junction (CDJ) and cellular cementum (C) are identified in each panel. The lighter and broader vertical bands in the cellular cementum are summer growth lines, whereas the darker and narrower bands are the winter rest lines. White lines with numbers identify each detected rest line referring to the number of experienced winters. To adjust for the first winter with deciduous incisors, 1 is added to the number of rest lines to get the correct age

Statistical analyses
To examine the accuracy of tooth sectioning as an age determination method, we first used linear mixed-effects models (LMM), fitting random intercepts for observer ID and animal ID, and linear and quadratic effects of known age as fixed-effect predictors. Alternative models were compared based on maximum likelihood estimates, and we used Akaike's Information Criterion corrected for small sample size (AICc) to guide model selection and help to assess to what extent there was support for a quadratic relationship. In cases where only the linear term was needed, we examined deviation from the expected relationship (i.e. slope of 1) by examining to what extent the 95% confidence interval (CI) for the term included 1. Since we were interested in testing the reliability of the method for each species, we constructed single-species models rather than a combined model for all four species. In addition, we estimated the proportion of the variance accounted for by the random terms as well as the repeatability (Nakagawa and Schielzeth 2010) across readings of the same animal by the same observer.
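The model comparison logic can be illustrated with a short Python sketch. It uses the standard AICc formula, the rule applied in the Results (the simpler model is preferred when ΔAICc < 2) and the check of whether a slope CI includes 1. The function names, the log-likelihood values and the parameter counts used in any call are hypothetical; the analyses themselves were run in R with lme4, not with this code.

```python
def aicc(log_likelihood, k, n):
    """AIC corrected for small sample size.

    log_likelihood: maximised log-likelihood of the model
    k: number of estimated parameters
    n: number of observations
    """
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def select_model(candidates, n, threshold=2.0):
    """Pick the model with the fewest parameters among those whose
    AICc lies within `threshold` of the lowest AICc.

    candidates: dict mapping model name -> (log_likelihood, k)
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
    best = min(scores.values())
    supported = [name for name, s in scores.items() if s - best < threshold]
    return min(supported, key=lambda name: candidates[name][1])


def slope_differs_from_one(ci_low, ci_high):
    """True if the 95% CI for the slope excludes the expected value of 1."""
    return not (ci_low <= 1.0 <= ci_high)
```

For example, `slope_differs_from_one(0.87, 0.95)` is true for the moose interval reported in the Results, whereas the red deer interval (0.98, 1.01) includes 1.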
Repeatability is synonymous with intraclass correlation, calculated as the proportion of the variance associated with the respective factor variable. We estimated repeatability for the null model with no fixed terms included ('agreement repeatability') (see Nakagawa and Schielzeth 2010). For estimating repeatability across samples and observers, we also created a new variable by combining the observer ID and animal ID. In our experiment, each observer assessed each sample twice.
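As a minimal sketch of this definition, repeatability can be computed from the variance components of a fitted null model as the share of the total variance attributed to the grouping factor. The function name and the variance values used in any call are hypothetical; they are not estimates from our models.

```python
def repeatability(group_variance, residual_variance, other_variances=()):
    """Intraclass correlation: variance attributed to the grouping factor
    divided by the total variance (all random terms plus residual)."""
    total = group_variance + residual_variance + sum(other_variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    return group_variance / total
```

With a hypothetical group variance of 9.8 and residual variance of 0.2, `repeatability(9.8, 0.2)` gives 0.98, meaning that nearly all variation among readings lies between samples rather than between repeated readings of the same sample.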
To model the probability of assessing the correct age (0 = incorrect, 1 = correct), we fitted generalized linear mixed-effects models (GLMM) with a logit link function and observer ID and animal ID as random variables. We then modelled the effect of species, observer experience and known age. LMM and GLMM analyses were performed using the R package lme4 (Bates et al. 2015). All analyses were performed with the statistical software R version 3.4.1 (R Core Team 2017).

Results
The relationship between known and observer-assessed age was linear for all species except for Svalbard reindeer, for which the relationship was quadratic (Fig. 3, Table 3). In cases of similar support (ΔAICc < 2), the linear model was preferred due to a lower number of parameters (Table 3). The estimated linear regression slope was less than 1 for moose (CI = 0.87, 0.95) and semi-domestic reindeer (CI = 0.81, 0.92), but not for red deer (CI = 0.98, 1.01). The model predictions showed a slight tendency to overestimate the age of younger individuals and underestimate the age of older individuals (Fig. 3). Across all ages and species, the bias between predicted and known age ranged from a maximum underestimation of −0.85 years for older individuals to a maximum overestimation of 0.54 years for younger individuals.
Fig. 3 Panels showing the relationship between known age and observer-assessed age for moose, red deer, semi-domestic reindeer and Svalbard reindeer. Red solid lines represent the predicted relationship between observer-assessed age and known age based on the most parsimonious model for each species (listed in Table 3). Black lines represent the 1:1 relationship between known age and observer-assessed age. Points are jittered to avoid complete overlap
When fitting separate models for each species, we found a significant decline in the probability of assessing the correct age with increasing age in both moose (slope on logit-scale: ...
Across species and observers, observer-assessed age was correct in 69% of all readings (Table 4), and 95% of all readings were within ± 1 year of known age. Age determination of red deer proved to be most accurate with 80% correct assessments, whereas semi-domestic reindeer proved to be the most difficult species to age correctly (Table 4).
When modelling the probability of correct age assessment as a function of species and observer experience, both variables were included in the two competing models (Table 5). Inclusion of an interaction term resulted in a marginally better model fit (ΔAICc = 0.66) than the model consisting of only additive terms (Table 5). Based on the fixed-effects parameters from the latter model (additive effect of species and experience on logit-scale), mean probability for correct age assessment in moose was 84% for experienced and 66% for unexperienced observers. Corresponding values were 93% vs 83% for red deer, 89% vs 74% for Svalbard reindeer and 73% vs 49% for semi-domestic reindeer.
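Because the additive model works on the logit scale, the effect of experience is expected to be roughly constant across species on that scale, even though the differences in percentage points vary. The Python sketch below back-transforms the rounded percentages given above to illustrate this. It is a rough check on published, rounded values and does not reproduce the fitted model coefficients; the function and variable names are illustrative only.

```python
import math


def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of the logit link: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Rounded predicted probabilities from the text:
# (experienced, unexperienced) observers.
reported = {
    "moose": (0.84, 0.66),
    "red deer": (0.93, 0.83),
    "Svalbard reindeer": (0.89, 0.74),
    "semi-domestic reindeer": (0.73, 0.49),
}

# Difference between experienced and unexperienced on the logit scale.
experience_effect = {
    species: logit(p_exp) - logit(p_unexp)
    for species, (p_exp, p_unexp) in reported.items()
}
```

Under these assumptions, the implied experience effect is close to 1 on the logit scale for all four species, as expected from a model without a species-by-experience interaction.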
Animal ID accounted for 28% (moose), 19% (red deer), 33% (Svalbard reindeer) and 37% (semi-domestic reindeer) of the model variance, with some individuals being consistently over- or underestimated. In contrast, there appeared to be little difference between observers in terms of systematic under- or overestimation (accounting for only 6, 0, 3 and 9% in moose, red deer, Svalbard reindeer and semi-domestic reindeer, respectively). Overall agreement repeatability was estimated at 99% in red deer and 98% in moose, Svalbard reindeer and semi-domestic reindeer. A high agreement repeatability indicates that the same observer is likely to assign the same age to the same sample when the sample is assessed multiple times.
The observers assigned a certainty index to each reading, indicating their presumed accuracy of the reading. Across species and observers, probability for correct age assessment was 0.82 when the observer reported high certainty (index A) and 0.58 when the observer reported lower expected certainty (index B or C).
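The summary measures used in this section (share of exact readings, share within ± 1 year, and share of exact readings per certainty index) can be computed from a table of readings as sketched below. The function name and the tuple layout of a reading are hypothetical and serve only to illustrate the calculation.

```python
from collections import defaultdict


def summarise_readings(readings):
    """Summarise ageing accuracy.

    readings: list of (known_age, estimated_age, certainty) tuples,
              where certainty is "A", "B" or "C".
    Returns proportions of exact readings, readings within +/- 1 year,
    and exact readings per certainty index.
    """
    if not readings:
        raise ValueError("no readings supplied")
    n = len(readings)
    exact = sum(1 for known, est, _ in readings if est == known)
    within_one = sum(1 for known, est, _ in readings if abs(est - known) <= 1)

    # certainty index -> [number of exact readings, number of readings]
    counts = defaultdict(lambda: [0, 0])
    for known, est, certainty in readings:
        counts[certainty][0] += int(est == known)
        counts[certainty][1] += 1

    return {
        "exact": exact / n,
        "within_one": within_one / n,
        "exact_by_certainty": {
            c: hits / total for c, (hits, total) in counts.items()
        },
    }
```

Note that this gives raw proportions only; it does not account for the random effects of observer ID and animal ID included in the mixed models.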

Discussion
In this study, we show that readings of demineralized and stained sections of incisors from four cervids are an overall reliable method for age assessment. The relationship between known age and observer-assessed age was linear, except for Svalbard reindeer where a quadratic function explained the relationship better. Across species and observers, 95% of all readings were within ± 1 year of known age. In general, the age of younger individuals was slightly overestimated, whereas the age of older individuals was slightly underestimated. The probability of correct age assessment was substantially improved when the observer had previous experience in ageing a given species. Despite the theoretical predictability related to the formation of incremental lines, there is still noticeable intra- and interspecific variation in the distinctness of annuli. Variation in mineralization and collagen orientation during the process of cementogenesis are the primary physiological factors causing the seasonal line appearance (Lieberman 1994). However, although these processes are well understood, their relation to other secondary factors like seasonal or annual differences in nutritional, biomechanical or hormonal factors is less clear.
Table 4 Overall correspondence between observer-assessed age and known age across four observers and four study species. 'Exact' indicates the percentage of all readings with a match between observer-assessed age and known age. 'Exact ± 1' indicates the percentage of all readings where the observer-assessed age equalled the known age ± 1 year
In several species, periods of high reproductive investment have been found to cause formation of hypermineralized cementum lines in both males (Mitchell 1967; Reimers and Nordby 1968) and females (Kagerer and Grupe 2001; Medill et al. 2009; Von Biela et al. 2008). Such lines can be misinterpreted as regular rest lines. Periods with cold stress have also been found to cause formation of hypermineralized cementum lines in great apes (Cipriano 2002). A common denominator for periods with high reproductive energy investments or other stressful conditions is that they may cause substantial nutritional stress. Subsequently, this can affect tissue growth rate and thus cementum band formation (Lieberman 1994).
Rest lines occasionally bifurcate. To avoid miscounting, thorough inspection along the longitudinal direction of the root, or of multiple sections from the same tooth, is necessary to reach a consistent conclusion. Another source of error, particularly evident in moose, is the so-called juvenile lines, which might develop when the permanent incisors erupt before or during an individual's first winter. Several distinct characteristics allow identification of such lines. Consistent evaluation of juvenile lines, as well as other regularly occurring variations in cementum annuli patterns, is of key importance to avoid observer-related errors when ageing animals by CAA (e.g. Christensen-Dalsgaard et al. 2010; Matson et al. 1993; Rolandsen et al. 2008).
Date of death is important information to consider when CAA is performed. As a rule of thumb, rest lines in northern ungulates are formed during winter and spring, but considerable variation in timing of line completion has been found in white-tailed deer (Sauer 1973). In a study of Alaskan moose, Boertje et al. (2015) found that the age of moose dying during the seasonal transition period associated with completing peripheral annuli formation, in July-August, was frequently underestimated by 1 year. Special attention should be given when ageing animals with a date of death falling within such a seasonal transition period. In our study, Svalbard reindeer were shot in February or April, or died from natural causes in winter. Even if the winter rest line was being formed, or was already complete, at the time of death, it would still be hard to detect unless followed by the lighter summer growth layer. Therefore, we do not expect that the difference in time of death causes systematic differences between the Svalbard reindeer results and the results from the other species included in our study.
Previous research has found a relationship between latitude and CAA accuracy, where more seasonal environments generate clearer distinctions between cementum growth lines and rest lines (Brokx 1972; Deyoung 1989; Hamlin et al. 2000; Rice 1980). Similarly, between-year variation in environmental conditions on a regional scale has also been found to generate variation in cementum layer distinctness (Asmus and Weckerly 2011; Jacobson and Reiner 1989). In our study, some individuals from all species were consistently over- or underestimated, causing animal ID to be responsible for 19-37% of the variation in modelled probability of correct age assessment. These relatively high individual-specific error rates for all species exemplify that overall error rates across multiple observers are highly dependent on the individual expression of incremental traits. Between-year variation in e.g. environmental conditions, or annual variation in reproductive costs, may cause individual variation in cementogenesis, ultimately generating individual differences in annuli distinctness. However, the relatively low sample size per species, and the large variation in the year of collection, made further investigation of potential year or environmental effects on CAA unfeasible. Variation in phenotype and environmental conditions is known to affect timing of tooth eruption (Loe et al. 2004). Given substantial differences in growth conditions experienced as juveniles, this might have affected the timing of tooth eruption and potentially also the occurrence of the first rest line. As previously described, such factors are mainly expected to be of potential influence for moose, where incisors regularly erupt during their first winter of life, potentially resulting in the formation of a juvenile line. The frequency of occurrence of such juvenile lines and the observers' ability to consistently identify such lines will have a potential effect on the overall ability to accurately determine age.
An alternative explanation is that minor variation in the laboratory treatment and processing during the production of stained tooth sections may have reduced the quality of the sections and in turn their readability (Matson et al. 1993). To reduce potential handling effects, a second incisor should always be collected so that a second set of sections can be produced if the first set of sections does not lead to an unequivocal conclusion. Potential interspecific differences in incisor eruption, and hence incremental line formation, might cause consistent biases in age determination when a common annuli counting method is used. In our study, we used the same age determination approach for all species, i.e. true age equals the number of rest lines + 1, which assumes that the first rest line is created during the second winter of life. No consistent bias between known and observer-assessed age was found for any species. This supports the view that the general assumption was valid for the species investigated in our study, given the environmental conditions experienced by the populations that were sampled.
At the species level, the geographical distribution of the data was small. Despite the considerable variation in age, individuals originating from one sampling area were expected to have experienced similar environmental conditions. Considerably more latitudinal and environmental difference existed between study sites for different species. Still, the environmental conditions in all sampling areas were characterized by a distinct difference between nutritionally rich summers and nutritionally stressful winters. Such conditions prevail in most of Norway and are likely to facilitate age determination by CAA. Nonetheless, experiences from more than 25 years of cervid population monitoring and CAA in all of Norway (Solberg et al. 2017) have revealed intraspecific variation in annuli distinctness among regions. This has been particularly evident in moose where the latitudinal distribution range stretches from 58 to 71°N (personal communication M. Heim). Because access to known-age material from areas of contrasting latitudinal origin has not been available for single species, these topics need to be addressed in later studies.
The general principles for the deposition of seasonal growth layers in the tooth cementum are valid for all permanent teeth. In our study, we copied the procedures used by the national monitoring programme for wild cervids in Norway (Solberg et al. 2017). This programme produces tooth sections from around 2000-3000 cervids shot during the autumn hunting season every year. Rational and consistent procedures are essential for the effective processing of such large sample sizes. For the workflow of the Norwegian monitoring programme, incisors have proved to be easier and quicker to extract and process compared to (pre)molars. Other studies have recommended using molars (e.g. Pérez-Barbería et al. 2014; Azorit et al. 2004) or canines (Boertje et al. 2015). The arguments given are alternative processing techniques for molars (e.g. Pérez-Barbería et al. 2014), and more distinct annuli and easier extraction for canines (Boertje et al. 2015). Moreover, Azorit et al. (2004) found higher ageing accuracy for molars (M1) versus incisors (I1), although only based on individuals up to c. 3.5 years old. Regardless of the chosen method or the teeth processed, the approach should be consistent over time and potential methodological changes should be properly documented.
The interspecific differences in precision of CAA are probably related to species-specific differences in the biological processes of cementogenesis. Similar interspecific differences were also identified in another comparative study of three North American deer species (Hamlin et al. 2000). In general, CAA provides a reliable technique for ageing a wide range of animals in seasonal environments. Still, it is important to be aware of existing interspecific differences, as well as intraspecific and habitat-related factors expected to influence CAA accuracy. CAA has been criticized for being expensive, labour-intensive and time-consuming. However, with rational processing procedures, efficient laboratories and experienced observers, costs and time use are minimized and the accuracy of estimated age is high. Previous research has shown that CAA outperforms alternative methods for age determination of cervids (Pérez-Barbería et al. 2014). It is also a clear advantage that CAA is normally performed by professional laboratories with experienced observers. However, to ensure consistency over time, we recommend that reference material from known-age individuals be available for calibration and training of new observers.