Introduction

Stable isotope analysis of bone and tooth collagen has been widely used by bioarcheologists for reconstructing breastfeeding and weaning practices in past societies (Cocozza et al. 2021; Eerkens and Bartelink 2013; Eerkens et al. 2011; B. T. Fuller et al. 2006a, b). After metabolic processes, stable isotopes remain in bone or tooth collagen incorporating signatures of consumed foods (Ambrose 1990; DeNiro 1985; Fernandes et al. 2015, 2012; Lee-Thorp 2008). Measured ratios especially of carbon and nitrogen discriminate against broad dietary categories (animal and marine protein, C3 and C4 plants) (Katzenberg and Waters-Rist 2018; Lee-Thorp 2008; Tykot 2004). Therefore, it is possible to determine the diet of ancient humans, bearing in mind that isotopic values may be altered by potential diagenetic effects due humic acids and additional contaminants (Ambrose 1990; DeNiro 1985; Guiry and Szpak 2020; van Klinken 1999).

Nitrogen isotopic ratios are known to increase by 5.5 ± 5‰ between consumer and consumed source (Fernandes et al. 2012), and a similar effect is observed in infants during exclusive breastfeeding. In particular, the δ15N values of exclusive breastfed infants are elevated by approximately 2–3‰ compared to those of their mothers’ based on studies on the fingernails of  modern mother-infant pairs (Fogel et al. 1989: ~ 2.4‰; Fuller et al. 2006a, b: ~ 2–3‰; Herrscher et al. 2017: ~ 2–2.8‰, (Fernández-Crespo et al. 2022). The consumption of solid foods and fluids at the onset of weaning causes the δ15N values to decrease throughout the weaning process due to change on the protein source, e.g. breast milk vs. animal milk (Herring et al. 1998). After the complete cessation of breast milk consumption (termed weaning age or age at completion of weaning) (Dettwyler 2017; Nitsch et al. 2011), δ15N values fall to a similar level as the adult population  (Fig. 1).

Fig. 1
figure 1

Schematic representation of δ15N values of individual profile based on three discreet dietary phases of infants (exclusive breastfeeding, weaning, post-weaning period)

Similar to nitrogen, carbon isotope ratios (δ13C) are expected to be elevated during exclusive breastfeeding and decrease throughout the weaning process between < 0.5 and 1‰ (Fuller, et al. 2006a, b; Herrscher et al. 2017). However, this decrease is not as straightforward as in δ15N values, as δ13C values are more influenced by different sources of protein. By extension, their values are directly influenced by the supplementary foods offered from the first stages of weaning and increase their variability (Fuller, et al. 2006a, b). Therefore, there is less agreement amongst studies about the link between δ13C values and breastfeeding.

Early research on breastfeeding and weaning duration in past populations was based on the measurement of δ15N in bone collagen (Dupras et al. 2001; Powell et al. 2014; Rutgers et al. 2009). Sub-adult individuals whose δ15N values are higher than those of the adult population were assessed as still weaning infants, while those whose δ15N values were similar to the adult population were considered as fully weaned (Halcrow et al. 2021). However, bone collagen integrates a temporal isotopic signature that may include several years before death, which makes this an not optimal approach to address infant feeding practices (Fahy et al. 2017; Hedges et al. 2007). Furthermore, the adult population consists of different subgroups (men, women, young adults, elders, pregnant women), whose diet is not a priori homogenous (Halcrow et al. 2021). Moreover, the assessment of dietary practices using data from children who died before reaching adulthood (i.e., the “Osteological Paradox”) produces further implications in drawing conclusions (DeWitte and Stojanowski 2015; Wood et al. 1992).

Yet, the new development of sequential isotopic analysis of dentine, which includes the segmentation of dentine, from crown to root into small sections, offers the potential to reconstruct diet in time-bound periods of life (Beaumont et al. 2013; Eerkens et al. 2011). As dentine forms in subsequent layers and is hardly influenced by dietary and environmental factors after its initial deposition, each section retains an isotopic signature reflecting its formation period (Dean 2017; Smith et al. 2006).

Although isotopic measurements from dentine collagen increments are far more time-accurate than bulk bone collagen measurements, assigning ages to increments is not a straightforward process. Dentine serial sectioning (DSS) includes the segmentation of dentine in horizontal sections of approximately 1-mm length (Beaumont et al. 2013). However, dentine is deposited in a sigmoid, almost conical pattern (Dean and Cole 2013). As a result, each section incorporates part of the preceding and part of the succeeding section, which widens the age range of each increment (time averaging). Recent research has shown that horizontal sectioning of dentine blurs the isotopic signal more than once thought (Tsutaya 2020). To address this, alternative sampling protocols have been developed (Czermak et al. 2020, 2018; Fernández-Crespo et al. 2018, 2020). These approaches aim to obtain minimum sample sizes, which include less dentine layers per sample either by using biopsy punches either by using image assisting sampling. These sampling methods are more sensitive to the formation of dentine, therefore provide better results in the age-at-increment assignment, and by extension describe dietary changes more accurately (Tsutaya 2020). The recent protocol of Curtis et al. (2022) offers even more promising results.

In parallel to the significant progress in sampling techniques, statistical modelling tries to address sources of uncertainty in stable isotope studies (Cocozza et al. 2021; Fernandes et al. 2015; Tsutaya 2020; Tsutaya and Yoneda 2013). OsteoBioR is a Bayesian model that employs approximate Bayesian computation (ABC) and suggests a flexible method for the age assignment of each dentine increment (Cocozza et al. 2021). Tsutaya (2020) presents a mathematical model for investigating the age range of the horizontal dentine sections (package MDSS, distributed in R) shedding light on the wider, than assumed, age range of each increment. WARN is an R package that uses ABC for estimating the beginning and the complete cessation of weaning (Tsutaya and Yoneda 2013). However, this method utilizes δ15N values only of sub-adult bone collagen (see limitations above).

The abovementioned studies pursue the aim of increasing the temporal resolution of the isotopic analysis and reconstructing intra-individual dietary changes across childhood. However, albeit infant feeding practices are often investigated using incremental dentine stable isotope analysis, the estimation of weaning onset and completion are performed visually (Ganiatsou et al. 2022; Lee et al. 2020; Scharlotta et al. 2018).

The visual evaluation of each isotopic plot requires skill and expertise, which by extension require time and labor investment when analyzing many isotopic profiles at a time. Furthermore, this procedure enfolds the subjective judgment and the inability to compare isotopic profiles produced by different protocols. To facilitate and standardize this process, we propose a streamlined procedure of weaning age estimation based on mathematical computation for which we developed the automatic tool WEAN. The tool estimates the weaning age mathematically and illustrates the results in visually appealing charts and user-friendly output files in PNG and CSV format.

Materials and methods

WEAN overview

WEAN is built in Python (code available on GitHub) and is freely distributed (see Code availability). The tool provides weaning age estimations and descriptives for one or many individuals. The user needs to provide a CSV file with the age-at-increment assignment of each δ15N measurement. The tool necessitates at least five measurements per individual based on the axiom of quartic polynomial equations of fourth degree but has no limit on the maximum number of measurements (Supplementary Material). The file can be easily imported, and the user can select the individual(s) they want to analyze through a drop-down list from a friendly user interface (Fig. 2). The results obtained from WEAN can be easily exported in CSV format and include: the weaning age estimation with the elbow method, the R-squared values showing the fitness of the regression model, and the derivative at the weaning age as well as the calculated values used for the trophic level effect assessment (for details see Table 1). Individual and dataset charts are automatically produced and can be downloaded in PNG format (Fig. 2).

Fig. 2
figure 2

Screenshot from WEAN interface showing the resulting plots. a Line chart for one individual showing δ15N values to assigned ages (dotted line, observed values; black line, regression line; pink line, weaning age). b Chart showing observed δ15N values of the whole dataset. i.e. one blue line per individual and pink line: mean weaning age of the population

Table 1 Description of columns included in WEAN output file

Mathematical background of WEAN

The mathematical base of WEAN lies upon the estimation of the elbow/knee point of a curve. In mathematical terms, the point of a curvature, where the curve visibly angles, is defined as the knee/elbow point. Bearing in mind that reference incremental dentine δ15N values associated with breastfeeding show a gradually declining pattern, which may stabilize or increase soon after breastfeeding is no longer practiced, the completion of weaning process is assessed as the point where the isotopic curvature visibly bends. The automatic weaning age estimation of WEAN is based on the calculation of the elbow point.

The elbow point estimation is broadly used in science, and there are many ways for its calculation (Satopää et al. 2011). In this study, the elbow is calculated by determining the point of the curve with the greatest perpendicular distance from a hypothetical line that connects the first and last observation of the curve (Fig. 3). We assumed that there is a line (M) that connects the first (A0) and the last observation (A1) of the curve with the following coordinates A0 (agemax_Nitrogen, max_Nitrogen) and A1 (agelast_Nitrogen, last_Nitrogen) (Fig. 3). Then, we calculated the perpendicular distance (d) of each measurement (n) belonging to the curve. Based on the above calculations, the elbow point has the maximal distance d (dmax, marked in a pink circle in Fig. 3).

Fig. 3
figure 3

Schematic representation of the elbow point estimation used in the current study. The curve represents individual δ15Ν values at specific age intervals (n). The dotted line M connects the first (A0) and the last (A1) δ15Ν measurements of the individual. Orange lines are the calculated perpendicular distances (d) to the line M, pink arrowed line (dmax) represents the maximal calculated distance, and the pink dot represents the elbow of the curve, i.e. the weaning age of the individual

Besides the application of the elbow point method, we have considered additional parameters for optimizing the accuracy of our estimations and the functionality of the tool.

  1. A.

    Trophic level effect related to breastfeeding

The tool calculates the decrease in δ15Ν between exclusive breastfeeding and the age at completion of weaning, to assess if this difference falls within one trophic level difference related to breastfeeding (see Introduction and Supplementary Material). The mathematical workflow of this step is schematically presented and explained in detail in the Supplementary Material. To facilitate the user, WEAN plots individual(s) according to the degree of decrease in δ15Ν values.

  1. B.

    Age-at-increment refinement

To account for the uncertainty of time averaging (Tsutaya 2020), WEAN primarily performs linear regression on the assigned, by the user, age at increment. This enables the refinement of the assigned ages for each individual. The regression is performed based on fourth-degree polynomial equations (Table S2, Supplementary Material).

  1. III.

    Evaluation of the δ15N pattern

Some individuals may show a stepwise increase instead of a decrease in their δ15N values (Scharlotta et al. 2018). Isotopic studies on malnourished children (Beaumont et al. 2016) indicate that individuals who suffer from malnutrition are characterized by increasing δ15N values and falling δ13C values (Beaumont and Montgomery 2016; Mekota et al. 2006). Although this pattern is not indicative of breastfeeding, it could be troublesome for the tool. To account for this variability, the tool calculates the derivative of the δ15N value of the weaning age to determine its inclination. The derivative of a function shows the slope of its tangent line at a specific point. For functions that have a decreasing tendency, the derivative takes negative values (f'(x) < 0); for functions with an increasing tendency, the derivative takes a positive value f'(x) > 0. In our model, negative derivatives indicate that δ15N values have a decreasing tendency at the weaning age point indicative of breastfeeding. In contrast, positive derivative values are indicative of no breastfeeding. This parameter identifies the presence or not of breastfeeding pattern and prohibits false assignments of weaning age (Table S2, Supplementary Material).

The mathematical workflow of this step is schematically presented in Fig. 4 and explained in detail in the Supplementary Material. To facilitate the user, WEAN plots individual(s) according to the degree of decrease in δ15Ν values (Fig. 4).

Fig. 4
figure 4

Left: schematic representations of the calculations performed to determine the difference in δ15N values between breastfeeding and weaning age for determining the trophic level decrease. Right: the plot for trophic level assessment using WEAN. Each individual is represented by a single dot with error bars of 1 Std.Dev. Dotted lines indicate the 2-3‰ range related to breastfeeding. The δ15Ν value dataset used for the visualization derives from Ganiatsou et al. (2022)

  1. IV.

    Number of δ15N measurements per individual

Given that the number of measurements per individual may vary depending on the tooth type, the root, and increment length (Tsutaya 2020), and this influences the regression modeling (Harrell 2017), we determined the number of δ15N measurements per individual that the tool requires for producing reliable estimations. Using a bootstrap-like method, we generated new sets of observations out of the existing measurements of the datasets. Specifically, we sampled 1000 sets, varying in the number of observations, using random δ15N measurements from each of the 130 individuals, e.g., 1000 iterations for five random measurements from one individual and 1000 iterations for six random measurements from the same individual, and repeated this process until there was no other combination We then calculated the R-squared and the mean R-squared value for all sets to determine the optimal number of δ15Ν measurements per individual for producing reliable results (Figure S4, Supplementary Material).

Evaluation of WEAN

The test datasets

To test the accuracy of our model, we re-estimated 130 weaning ages from published datasets (Table 2) and compared the results with the initial weaning age estimation. The initial estimations were performed visually based on isotopic profiles produced from deciduous first and second molars and first permanent molars. In the dataset of Scharlotta et al. (2018), there were four individuals with two weaning ages as the authors supported a possible multiphase weaning procedure. We used the earliest recorded age in this study, since Tsutaya’s study showed that weaning ages estimated from parallel dentine sections are usually later than the true ones (Tsutaya 2020). The purpose of this measure was to mitigate the artifact associated with time averaging in dentine increments.

Table 2 Datasets used in this study for evaluating the visually observed weaning age with the automated weaning age acquired with WEAN

We should note that we included all 130 individuals despite the fact that in some cases there was no weaning age estimated by the observers as the δ15Ν profiles were not indicative of breastfeeding (n = 14). Our aim was to test the accuracy of the tool in identifying these cases. A summary of the datasets can be found in Table 1.

Statistical analysis

For comparing the published and the newly estimated weaning ages, we calculated the root mean square error (RMSE), which shows their difference in years, e.g. if the weaning age was observed visually at 2 years of age and WEAN estimates it at 3 years, the RMSE would obtain a value of one. To calculate the RMSE, we used the following formula (1):

$$\sqrt{(\mathrm{wa}_\text{published}-\mathrm{wa}_\text{WEAN})^2}$$
(1)

Furthermore, we compared the two methods with the Bland–Altman plot (Altman and Bland 1983; Bland and Altman 2010). This chart shows the agreement between the measurements of the two methods by plotting their difference against their mean difference. The Bland–Altman method is more suited in cases of method comparison instead of correlation analysis or other inferential statistic models, i.e., t-test and ANOVA (Mansournia et al. 2021), and more informative than their direct comparison (Bland and Altman 2010).

Results

Evaluation of the tool

WEAN re-estimated weaning ages from 119 out of 130 individuals (results shown in Table S1). The application could not estimate the weaning age in cases that the δ15N profile was not indicative of breastfeeding, e.g., Az115 T2, Az115 T6, and METi_67 (Ganiatsou et al. 2022; King et al. 2018), and in cases that the δ15N observations did not initiate from birth, e.g., B_93.2, B_76, B_42.1 (Scharlotta et al. 2018), Changxinyuan_M074 (Miller et al. 2020), MLP25 I1, Cam9 T32c1, Az75 T56, and Az71 T315 (King et al. 2018). Identical observations were reported on the published datasets (Ganiatsou et al. 2022; King et al. 2018; Miller et al. 2020; Scharlotta et al. 2018).

Ninety-four out of 130 individuals had a decreasing trend in δ15N, thus acquiring a negative derivative and a mean RMSE of 0.88 (range 0–4.7 years). Two individuals (B_65, B4, Table S1) out of the 94 had the highest RMSE (4.6 and 4.7 respectively). The high RMSE value derives from the missing δ15Ν measurements at the beginning of their life, whereas the first δ15Ν measurement starts at the age of 2.4 years old (median age). The remaining 25 individuals with positive derivatives show flat isotopic profiles or their first δ15Ν measurement starts during the first year of life or later and not at birth.

Figure 5a shows the newly estimated vs. the published visually observed weaning ages (n = 119). The measurements lie close to the line of equality (diagonal line) implying the agreement between the two methods. The Bland–Altman plot (Figure 5b) shows a strong agreement between the two methods because the observations cluster close to zero and within the lines of agreement (Altman and Bland 1983; Mansournia et al. 2021). In both charts of Figure 5b, there are nine estimations (B_58.1, B_71, T030Ind5, T028Ind2, B_65, T010Ind2, B4, METi_119, METi_125; see Table S1) that differ significantly from the rest. All nine individuals obtained a positive derivative, apart from T030Ind5, T028Ind2, B_65, T010Ind2, and B_65, whose observations start from the age of two and will be discussed further.

Fig. 5
figure 5

a The estimated weaning ages of all examined individuals (x-axis) against their respective published weaning age (y-axis). b Bland–Altman plot showing the agreement of the two methods. The observations cluster close to zero and within the lines of agreement (blue lines). In both charts, there are five weaning ages that differ significantly from the rest. Five of them obtained positive derivatives and one has missing values at age 0-2.4

Comparison of tooth types

Deciduous second molars show the smallest RMSE results (0.82, Table 3), followed by permanent molars (0.97, Table 3). Deciduous first molars exhibit the highest RMSE (1.76, Table 3), which, however, decreases significantly (0.85) when excluding the estimations with positive derivatives (potential false results). The same improvement in RMSE is observed in all tooth types when the estimations with positive derivatives are excluded (Table 3). Furthermore, WEAN did not estimate the weaning age of seven individuals with measurements from permanent molars (B_42.1, B_76, B_93.2, Changxinyuan_M074, METi_67, MLP25 I1, MOR4_M1) and four individuals with measurements from deciduous first molars (Az115 T2, Az71 T315, Az75 T56, Cam9 T32c1).

Table 3 The mean value of the RMSE of published and estimated weaning ages with WEAN for different tooth types (M1, dm1, dm2) retrieved from the 130 individuals tested. NA line shows the number of individuals with no estimated weaning ages. The estimations with negative derivatives line includes only the individuals with negative derivatives (indicative of breastfeeding)

Discussion

The results show a strong correlation between the visual and the elbow method underlining that a mathematical framework can be accurately applied in weaning age estimation. The majority of examined individuals obtained a weaning age, which was calculated automatically and did not differ from the visual observation. This indicates that despite the observed variability in δ15Ν individual profiles, a statistical and automated approach can be established.

False weaning estimations indicated by positive derivatives were obtained from individuals with missing ages (e.g., B_42.1, B_76, B_93.2) and individuals with no variation in their δ15Ν measurements (flat isotopic profiles). The only exception is individual B_65 (Scharlota et al. 2018), for which the tool estimated a weaning age even though the initial ages are missing and did not predict the potential false estimation. Specifically, in this individual, initial dentine layers were missing due to wear; therefore, the δ15N measurements correspond from the age of 2.4 years onwards. However, the δ15N profile shows a decreasing pattern which led the authors to suggest that the weaning period would have ended around the age of two (Scharlotta et al. 2018). Considering that the individual’s isotopic pattern closely resembles a breastfeeding motif, WEAN estimated a weaning age at 6.55 years of age without predicting a false estimation. This is an excellent example of an outlier case based on which, we recommend the users to provide ages that initiate around birth, as decreasing trends in δ15N during later life are most likely linked to other factors such as dietary changes or physiological stress and not to weaning. Therefore, individuals with many missing values esp. the first ones should be treated with caution.

Overall, apart from the profiles that did not initiate from birth, some of the weaning age estimations with positive derivatives were individuals with flat δ15Ν profiles. Since the elbow method locates the point where the curve bends and flat isotopic profiles are characterized by barely noticeable changes, WEAN correctly could not assign a weaning age. Similarly, in the original publications, e.g. METi_125 (Ganiatsou et al. 2022) and Cam9 T32c1 (King et al. 2018), the authors did not assign a weaning age verifying the quality of our observations.

Based on the results, the estimations of WEAN are more accurate for isotopic profiles obtained from M1. Since WEAN uses the principle of the elbow method, it functions more efficiently with isotopic profiles that gradually decrease and stabilize. Since this pattern is more frequently observed in measurements from M1, this tooth type offers the most reliable weaning estimation with the proposed methodology (Results and Supplementary Material) and we suggest M1 as the optimal tooth type for breastfeeding reconstruction with WEAN. The estimation of weaning age from dm1 and dm2 with the elbow method is not as accurate as M1 despite the potential of dm1 to reveal the mother-infant dyad since collagen from dm1 captures dietary shifts in utero and early life.

In most cases, the estimated weaning ages with WEAN are earlier than the published ones (Table S1) because age data are smoothed using regression and the maximal reduction occurs at younger ages. Considering that the true ages of dentine increments are earlier than the calculated ones (Tsutaya 2020), this shows that performing regression on age data prior to weaning age estimation has the potential to provide more reliable results than raw calculations. Although Tsutaya’s study and this research differ in terms of scope and methodology, this line of evidence indicates that the tool’s estimates are appropriately reliable and that performing regression prior to any assessment is an efficient method to overcome time averaging uncertainties due to methodological issues in general.

We are aware that our approach has some limitations. First, WEAN presupposes that each δ15N measurement is assigned to a specific age. Therefore, the accuracy of the estimation is partially determined by the provided age-at-increment assignment, which by extension depends on the protocol of dentine sectioning. Tsutaya (2020) has shown that time averaging in increments obtained with different sampling protocols differs and supported that sampling protocols that isolate smaller samples (Czermak et al. 2020, 2018; Fernández-Crespo et al. 2018) are more efficient in weaning reconstructions compared to horizontal sectioning of dentine. WEAN can be applied independently of the sectioning protocol, as we have shown in the present study, and accounts for errors in time averaging of increments with regression. However, a carry-on error based on the sampling protocol should be additionally considered. The evaluation of weaning ages obtained with different protocols is not performed in this study; nonetheless, bearing in mind the results of Tsutaya, it is possible that microsampling protocols (Czermak et al. 2020, 2018; Fernández-Crespo et al. 2018, 2020) provide more accurate observations for WEAN.

Second, the midpoint age assigned to each δ15N measurement represents in reality an age range. The results obtained from WEAN are reported in absolute numbers; however, it is critical to realize that the estimations do not represent time-specific events in the life of individuals. It is not our intent to oversimplify the estimation of the weaning process but to refine and automate an existing method performed manually by researchers. Thus, the weaning age estimations of WEAN should not be considered as absolutes. WEAN can identify mathematically the age that breastmilk consumption can no longer be detected based on isotopic measurements. However, weaning is a process, during which several metabolic processes occur in order to convert dietary changes in isotopic ratios. The isotopic discrimination between the organism and the diet (i.e., the trophic shift) varies between individuals according to their metabolic state (Fuller et al. 2005; Mekota et al. 2006). This may be depicted as independent changes in δ13C and δ15N values or decreases in δ15N and occasionally increases in δ13C values (Fuller et al. 2005; Mekota et al. 2006). Physiological stress during infancy may also implicate interpretations of isotopic ratios. For example, the degree of decrease in δ15N between breastfeeding and weaning age was used to explore underlying nutritional stress in some studies (Beaumont and Montgomery 2016; Ganiatsou et al. 2022; King et al. 2018). However, particular attention has recently been paid to examining physiological stress through isotopic patterns of opposing covariance. In these isotopic profiles, δ15N values increase as a result of protein catabolism and δ13C values decrease as a result of lipid catabolism (Craig-Atkins et al. 2018; Kendall et al. 2021). Up to now, the decrease in δ15N and δ13C values, between 2 and 3‰ and 0.5 and 1‰ respectively (see Introduction), during early life is accepted to reflect the weaning process (Craig-Atkins et al. 2018; Teresa Fernández-Crespo et al. 2022). A potential future optimization of the tool is the estimation of the weaning ages reported in ranges and the utilization of parameters that differentiate dietary and physiological effects in isotopic measurements.

Conclusions

Stable isotope ratios of nitrogen have proven a reliable method to infer the weaning behavior of past populations. The accuracy of stable isotope ratios of incremental dentine collagen offers an aggregated estimation of dietary habits in early life. However, up to now, a standardized methodology for the weaning age estimation is not available. WEAN introduces an approach to estimate the weaning age accurately and efficiently which is now performed with the same computational method enabling the comparison of different isotopic profiles. Our research intends to facilitate the analysis of δ15N incremental data offering a less time-consuming and labor-intensive procedure through an easy-to-use operational environment.