CKD prevalence based on real-world data: continuous age-dependent lower reference limits of eGFR with CKD–EPI, FAS and EKFC algorithms

Purpose Several recent articles discuss the need for a definition of chronic kidney disease (CKD) that embarks age-dependency and its impact on the prevalence of CKD. The relevance is derived from the common knowledge that renal function declines with age. The aim of this study was to calculate age-dependent eGFR lower reference limits and to consider their impact on the prevalence of CKD. Methods A real-world data set from patients with inconspicuous urinalysis was used to establish two quantile regression models which were used to calculate continuous age-dependent lower reference limits of CKD–EPI, FAS and EKFC–eGFR based on either single eGFR determinations or eGFR values that are stable over a period of at least 3 months (± 10% eGFR). The derived lower reference limits were used to calculate the prevalence of CKD in a validation data set. Prevalence calculation was done once without and once with application of the chronicity criterion. Results Both models yielded age-dependent lower reference limits of eGFR that are comparable to previously published data. The model using patients with stable eGFR resulted in higher eGFR reference limits. By applying the chronicity criterion, a lower prevalence of CKD was calculated when compared to one-time eGFR measurements (CKD–EPI: 9.8% vs. 8.3%, FAS: 8.0% vs. 7.2%, EKFC: 9.0% vs. 7.1%). Conclusion The application of age-dependent lower reference intervals of eGFR together with the chronicity criterion result in a lower prevalence of CKD which supports the estimates of recently published work and the idea of introducing age-dependency into the definition of CKD.


Introduction
In recent years, ongoing discussions on the need for an agedependent definition of chronic kidney disease (CKD) have been ignited [1]. It is consensus that renal function declines with age, yet it has not been delineated whether age-related deterioration in renal function is a phenomenon predominantly linked with cell senescence and loss of nephron number versus occurrence due to underlying diseases that are more prevalent with advanced age [2]. Yet, the CKD definition in Kidney Disease-Improving Global Outcomes (KDIGO) guideline lacks an age-dependency hitherto [3]. To address this issue, Delanaye et al. proposed an age-dependent threestage model in which "normal range cut-offs" for estimated glomerular filtration rates (eGFR) are defined to separate impaired renal function in dependency of age (75 ml/min/1.73 m 2 (< 40 years), 60 ml/min/1.73 m 2 (40-65 years), 45 ml/ min/1.73 m 2 (> 65 years) [4]. This three-stage model is a simplification of their continuous model, which uses the 3rd percentile as a lower limit [4][5][6]. It is important to point out that reference intervals and clinical decision limits are mostly not congruent. While a reference interval is a statistically determined range between the 2.5th and the 97.5th percentiles of the measured values of a biomarker in an "apparently healthy" reference cohort, consisting of a lower and upper reference limit, clinical decision limits are determined from clinical studies dealing with specific diseases [7]. The current definition of CKD is based on clinical decision limits. Since it is known that eGFR decreases steadily beyond the age of about 40 years even in healthy individuals [4], the application of a definition that assumes CKD at an eGFR below 60 ml/min even in the absence of renal pathology, such as proteinuria or pathologic imaging, is likely to result in an over-diagnosis of CKD, as discussed by Glassock et al. [1]. In addition, excess mortality of people older than 65 years with an eGFR ranging from 45 to 59 ml/min does not appear to be significantly higher than that of people of the same age with an eGFR ranging between 75 and 89 ml/min [4,8]. These data indicate that the definition of CKD should be age-dependent to avoid over-diagnosis of CKD [1].
With age-dependent definition of CKD, the lower reference limits of eGFR not only have an impact on the diagnosis of impaired kidney function for each individual, but it also furthermore skews the overall prevalence rates of CKD. Another important influence on CKD prevalence is the presence or absence of chronicity of renal impairment. Many studies relied on one-time eGFR measurements and failed to meet the chronicity criterion put forward by the KDIGO definition of CKD with respect to impaired renal function and/or persistence of proteinuria [9]. The derived estimates conclude that the prevalence of CKD ranges between 10 and 16% [9]. However, recent studies suggest that the prevalence of CKD is lower following strict adherence to KDIGO criteria together with applying age-dependent eGFR thresholds and calculates numbers around 6% [9,10,16].
In this study, we establish two models for age-dependent lower reference limits of Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI)-eGFR [11], Full Age Spectrum (FAS)-eGFR [12] and European Kidney Function Consortium (EKFC)-eGFR [13], that are based on one-time eGFR determinations and eGFR values that are stable over a period of at least 3 months, respectively. The data are derived from a realworld data set as outlined in the data collection and modeling section. The two models are used to predict continuous agedependent lower reference limits of eGFR for all three eGFR formulas. The resulting lower reference limits are applied to calculate the prevalence of CKD in a validation data set, once without and once with appliance of the chronicity criterion, to obtain insights into the impact of the chronicity criterion for the calculated prevalence rates of CKD.

Data collection and modeling
For a better understanding, the workflow for creating the data sets and calculating the models and the prevalence of CKD is visualized in Fig. 1. The characteristics of the data sets and models are provided in Table 1.
In this study, we used a retrospective approach. All data were determined and collected at a regional privately organized laboratory specialized in providing blood testing for private practitioners (approximately 800 private practices and 8 hospitals) in an urban setting of a medium-sized city in central Germany (~ 250 k inhabitants).

Model data sets
The basic model data set was created by a data query from the laboratory information system (data query period: August 2017-December 2020) from patients aged 30-85 years with a enzymatic quantification of creatinine ("Creatinine plus ver.2" (CREP2), Roche Diagnostics GmbH, Mannheim, Germany) as well as proteinuria < 10.0 mg/dl, blood impurities of urine < 0.015 mg/dl, < 25 leukocytes per µl of urine and < 20.0 mg/dl of glucose in the urine (measured using the iChemVELOCITY automated urinalysis chemistry instrument by Beckman Coulter GmbH, Krefeld, Germany). Blood and urine samples have been sent together to the laboratory and analyzed the same day. For each enzymatic creatinine measurement, the CKD-EPI, FAS and EKFC-eGFR was calculated. Submissions from hospitals, nephrology practices and dialysis centers were excluded. With this source data, the following data sets were created: Data set for model 1 Only the first submission of each patient into the query period was extracted, resulting in a data set of 8916 singular creatinine measurements with their corresponding eGFR calculations. This data set was used to calculate model 1 for CKD-EPI, FAS and EKFC-eGFR. The characteristics of the data set are provided in Table 1.
Data sets for model 2 Only patients with stable kidney function were included to exclude patients with acute kidney injury despite unremarkable urinalysis. According to the definition of CKD, which requires impaired kidney function for at least 3 months, stable kidney function was defined as an eGFR difference of less than ± 10% over a period of at least 3 months. Among the basic model data set, all patients with two or more submissions of specimens into the query period were selected. The first and the last measured value of each of these patients were extracted. Of these, only those were selected that showed a minimum time interval of 3 months between both measurements and a maximum change in eGFR of ± 10%. Based on the calculation of eGFR using three different formulas, three data sets result, each containing the patients with stable kidney function for each eGFR formula. The characteristics of these three data sets are summarized in Table 1.

Modeling
As outlined, the lower reference limit of a reference interval is defined as the 2.5th percentile [7]. To calculate continuous age-dependent lower reference limits of eGFR based on the 2.5th percentile, first, the age of the patients was rounded down. For the data set for model 2, the age of the patient at the time of the later eGFR measurement was selected to separate the patients into annual age groups from 30 to 85 years. In the following, a quantile regression model was calculated from the data set for model 1 for each eGFR formula and for all three data sets for model 2 for each eGFR formula (the later eGFR measurement was selected), resulting in six models. Other variables such as gender were not adjusted in the models. The same meta-parameters were used for all models. The regression was based on the 2.5th percentile. Cubic splines were used to smooth the curve. Due to the known physiological decrease of kidney function from an age of approximately 40 years [4], a knot was inserted at 40 years of age. After the models were created, the CKD-EPI, FAS and EKFC-eGFR age-dependent continuous lower reference limits were predicted. An excerpt of these results is shown in Table 2.

Validation data sets
To estimate the prevalence of CKD based on the created models, we used a second basic data set. For the creation of this basic validation data set, all creatinine measurements submitted to the laboratory from 01.01.2020 to 31.12.2020 were queried, resulting in a data set of 443,548 creatinine measurements.
Validation data set 1 To build a validation data set without appliance of the chronicity criterion, we extracted only the first value of each patient, resulting in a data set of 197,550 creatinine measurements. For each creatinine measurement, the CKD-EPI, FAS and EKFC-eGFR were calculated. The characteristics of these validation data sets are summarized in Table 1.
Validation data set 2 To build a validation data set with appliance of the chronicity criterion, we extracted only patients from whom two measurements were available. Like model 2, only those were selected that showed a minimum time interval of 3 months between both measurements and a maximum change in CKD-EPI, FAS, or EKFC-eGFR of ± 10%. Based on the calculation of eGFR using three different formulas, three versions of validation data set 2 results, each containing the patients with stable kidney function for each eGFR formula. The characteristics of these three validation data sets are provided in Table 1.

Statistical software
Statistical analysis was performed using R statistic programming language (R Core Team (2021. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, URL: https:// www.Rproje ct. org/.). For calculation of quantile regression models, the additional software packages "quantreg" and "splines" were used [14,15]. Calculation of confidence intervals was done using the normal approximation.

Approval of the local Ethics Committee
The local ethics committee of the Land Saxony-Anhalt ruled that a consent from the patients from whom the specimens originated were not required, due to the retrospective and fully anonymized setting of data evaluation (approval No. 17/21).

Results
This study reveals two main findings: (i) Using eGFR estimates originating from patients with stable renal function (eGFR difference at most ± 10% within a period of at least 3 months), higher continuous age-dependent reference limits are obtained than with onetime eGFR estimates entered into modeling. An extract of the predicted continuous age-dependent lower reference limits of eGFR in 5-year increments is provided in Table 2. In model 1 (2.5th percentile, inconspicuous urinalysis, one-time eGFR measurements), CKD-EPI-eGFR decreased from 76 to 22 ml/min, FAS-eGFR from 74 to 22 ml/min and EKFC-eGFR from 72 to 21 ml/min as a function of age. In model 2 (2.5th percentile, inconspicuous urinalysis, stable kidney function), CKD-EPI-eGFR decreased from 84 to 32 ml/min, FAS-eGFR from 84 to 30 ml/min and EKFC-eGFR from 82 to 30 ml/min as a function of age. For both models, these results also show the comparability of the three different eGFR estimates. The visualization of the two models for each eGFR formula together with the validation data sets is provided in Fig. 2.
Prevalence rates depending on age groups according to the three-stage model of Delanaye et al. (< 40 years, 40-65 years, > 60 years) [4] are shown in Table 4.
The prevalence rates from model 2 with appliance of the chronicity criterion for the validation data sets are in the range of published prevalence rates when the chronicity criterion together with age-dependent lower reference limits of eGFR are applied to the study cohort (8.3/7.2/7.1% vs. 6%, [9, 10] and 8.3/7.2/7.1% vs. 5.1%, [16]) and lower than the estimated prevalence rates with one-time eGFR estimations without age-dependency in the definition of CKD (8.3/7.2/7.1% vs. 10-16% [9]). Table 3 Comparison of calculated prevalence rates between the two models, one without and one with appliance of the chronicity criterion

Discussion
Our findings confirm two important observations suggested by previous studies. First, the age-dependency of eGFR is remarkable with all three eGFR formulas. Second, using continuous age-dependent reference limits of eGFR together with the application of the chronicity criterion results in a remarkably lower CKD prevalence rate. Nevertheless, the results have to be discussed critically due to the retrospective study design. Lower reference limits: When comparing the predicted lower reference limits from model 1 and model 2, higher reference limits in all three eGFR formulas are calculated with model 2. The underlying reason for this could be a share of patients with flares of acute kidney injury (e.g., deteriorations in fluid homeostasis with exsiccosis due to diuretics intake, impaired thirst, impaired renal function without proteinuria, leukocyturia, erythrocyturia and/or glucosuria, variable or impaired heart function) in the data sets for model 1, all of which may have an influence on the eGFR thresholds. By comparing the predicted lower reference limits to the ones from Delanaye's work, as shown in Table 2, it becomes obvious that the reference limits from model 2 are still lower than the ones described by Delanaye and Pottel [4][5][6]. This may be explained by the methodology underlying the data collection. Although the 2.5th percentile was used as the lower limit in all three models (model 1, model 2 and Delanaye`s model), the collectives studied are likely to differ. A major limitation of the data underlying this study is that no further information on patients' health status was available due to retrospective analysis of data. Thus, information on comorbidities, such as hypertension or diabetes (no glucosuria at the time of measurement) or medication, that may affect renal function (e.g., RAS inhibitors, diuretics, or SGLT2 inhibitors) is missing and patients with stable but impaired eGFR may also be present in the data set despite inconspicuous urinalysis. This may result in a lower calculated 2.5th percentile than in the general population affecting the CKD prevalence calculated in this study. Another limitation may be that especially patients with advanced age may have been increasingly under medical surveillance due to a "reduced" kidney function and therefore healthy older individuals may be less likely to be entered into the data sets, which could lead to a higher prevalence of CKD in the cohort studied. Lastly, it should be remembered that patients with CKD but unstable renal function (e.g., additional AKI) were removed from the data set with the chosen study design. Thus, it must be concluded that the underlying data set is confounded by specific patient cohort analyzed by this laboratory and therefore is not fully representative of the general population. In addition, there are some minor overlaps of patients analyzed within the modeling and validation cohorts [e.g., for model 1 5662 patients from the modeling cohort were also included in the validation cohort (2.9% of the validation cohort)]. This overlap arises from the two data queries, which also overlapped in time to ensure a sufficiently high number of patients for reliable calculation of the models. The influence of this small proportion on the results of this study may be considered small, however has to be taken into account.
CKD prevalence It has been assumed that the CKD prevalence rate is higher when entering a one-time eGFR estimate compared to predictions with appliance of the KDIGO chronicity criterion [6,9,10]. Our results confirm this assumption (prevalence estimated from validation data set 1 vs. validation data set 2 as shown in Table 3 and 4). As a limitation, it must be noted that within the validation data sets, only the patients´ eGFR values were used to define CKD. Information about possible co-existing proteinuria was not available.
Furthermore, it could be shown that age-dependent lower reference limits based on a cohort with stable renal function lead to a higher CKD prevalence (prevalence estimated from model 1 vs. model 2 as shown in Table 3 and 4). This effect is most likely due to the higher 2.5th percentiles in model 2, since patients with stable kidney function were enrolled here, whereas patients with acute kidney injury were excluded from the data set.
Looking at the age-dependent prevalence rates, this study confirms that the prevalence of CKD increases with age. When results from Tables 3 and 4 are compared it is noticeable that for the age group 66-85 years a reversed pattern is present for the prevalence rates determined for validation data sets 1 and 2. While the age-independent prevalences were lower when using validation data set 2, higher prevalence rates were calculated in the 66-85 age group when using validation data set 2. Looking at the visualization of the models in Fig. 2, it is noticeable that for all three eGFR formulas, a lower eGFR drop-off is seen in model 2. This largely explains the discrepancy outlined above.
The strengths of this study are the size of the data sets, the measurement of creatinine as the basis for the eGFR calculation in the same laboratory and with the same methodology, the standardized and automated reading of the urine test strips for urinalysis, and the application of the chronicity criterion for estimation of the CKD prevalence rate. In addition, when using the 2.5th percentile as the definition of the lower limit of eGFR overdiagnosis of CKD in older patients is prevent. Furthermore, underdiagnosis of CKD in younger patients with an eGFR above 60 ml/min/1.73 m 2 but below the age-specific 2.5th percentile of eGFR is precluded, as similarly shown in the work of Beghanem Gharbi et al. and Delanaye et al. [6,16].
In summary, the findings set the basis for "real-life" prevalence rates of CKD when continuous age-dependent lower reference intervals of eGFR together with the chronicity criterion are applied. Further research should verify the established lower reference limits of model 2 in a carefully selected cohort in which the eGFR and other pathological conditions such as proteinuria are tested at the same time.

Author contributions None.
Funding Open Access funding enabled and organized by Projekt DEAL.

Conflicts of interest None.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visithttp:// creat iveco mmons. org/ licen ses/ by/4. 0/.