Comparison of serum and urinary biomarker panels with albumin/creatinine ratio in the prediction of renal function decline in type 1 diabetes

Aims/hypothesis We examined whether candidate biomarkers in serum or urine can improve the prediction of renal disease progression in type 1 diabetes beyond prior eGFR, comparing their performance with urinary albumin/creatinine ratio (ACR). Methods From the population-representative Scottish Diabetes Research Network Type 1 Bioresource (SDRNT1BIO) we sampled 50% and 25% of those with starting eGFR below and above 75 ml min−1 [1.73 m]−2, respectively (N = 1629), and with median 5.1 years of follow-up. Multiplexed ELISAs and single molecule array technology were used to measure nine serum biomarkers and 13 urine biomarkers based on our and others’ prior work using large discovery and candidate studies. Associations with final eGFR and with progression to <30 ml min−1 [1.73] m−2, both adjusted for baseline eGFR, were tested using linear and logistic regression models. Parsimonious biomarker panels were identified using a penalised Bayesian approach, and their performance was evaluated through tenfold cross-validation and compared with using urinary ACR and other clinical record data. Results Seven serum and seven urine biomarkers were strongly associated with either final eGFR or progression to <30 ml min−1 [1.73 m]−2, adjusting for baseline eGFR and other covariates (all at p<2.3 × 10−3). Of these, associations of four serum biomarkers were independent of ACR for both outcomes. The strongest associations with both final eGFR and progression to <30 ml min−1 [1.73 m]−2 were for serum TNF receptor 1, kidney injury molecule 1, CD27 antigen, α-1-microglobulin and syndecan-1. These serum associations were also significant in normoalbuminuric participants for both outcomes. On top of baseline covariates, the r2 for prediction of final eGFR increased from 0.702 to 0.743 for serum biomarkers, and from 0.702 to 0.721 for ACR alone. The area under the receiver operating characteristic curve for progression to <30 ml min−1 [1.73 m]−2 increased from 0.876 to 0.953 for serum biomarkers, and to 0.911 for ACR alone. Other urinary biomarkers did not outperform ACR. Conclusions/interpretation A parsimonious panel of serum biomarkers easily measurable along with serum creatinine may outperform ACR for predicting renal disease progression in type 1 diabetes, potentially obviating the need for urine testing. Electronic supplementary material The online version of this article (10.1007/s00125-019-05081-8) contains peer-reviewed but unedited supplementary material, which is available to authorised users.


Introduction
To facilitate the early identification of those most at risk of decline in future renal function, most clinical guidelines recommend the use of regular serum creatinine measurement and equations such as the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) or Modification of Diet in Renal Disease (MDRD) equations for estimation of GFR [1]. The best predictor of future end-stage renal disease (ESRD) is the current eGFR and past eGFR trajectory [2]. As we recently reviewed [3], while serum cystatin C-based eGFR equations have also been proposed, at present creatininebased equations remain the most widely used for estimating GFR [4][5][6][7][8]. Guidelines also advocate regular urinary testing for albumin/creatinine ratio (ACR) since there is extensive evidence that albuminuria is a strong risk factor for progression of diabetic kidney disease (DKD). However, it has been widely discussed that ACR lacks specificity and sensitivity for progressive decline in eGFR. For example, a poor positive predictive value was reported, with only about a third of those with microalbuminuria having progressive renal function decline [9]. Albumin excretion also had low sensitivity in that only about half of those with progressive renal function decline were albuminuric [9]. A further practical limitation of ACR is that obtaining urine samples for ACR adds to clinic workload and can be inconvenient and difficult for people with diabetes, resulting in suboptimal uptake. In Scotland, for example, while 86% of those with type 1 diabetes had eGFR in the past year, only 62% had ACR measured [10].
An important question is therefore whether any other prognostic biomarkers can improve on ACR or even replace it for predicting renal function decline. In a recent paper [11] we reported a large-scale discovery experiment where we assessed 297 biomarkers in two type 1 diabetes cohorts with starting eGFRs of 75 ml min −1 [1.73 m] −2 or less. We found many highly statistically significant biomarker associations with future eGFR, and that prediction of future eGFR using ACR and baseline eGFR could be improved with biomarkers. However, most of the improvement could be achieved by measuring just serum kidney injury molecule 1 (KIM-1) and CD27 antigen (CD27), a member of the TNF superfamily.
Following on from that study, we extended the evaluation to (1) include study participants with higher starting eGFRs; (2) measure the best biomarkers from our first study i.e. KIM-1 and CD27, but also measure new candidate biomarkers chosen from recent reports from large well-conducted studies; and (3) evaluate a more sensitive serum KIM-1 assay using single molecule array (SIMOA) technology, hypothesising that it would be detectable before microalbuminuria in those progressing to renal function decline. The focus of our analysis was to assess how well these biomarkers predict future eGFR and whether they could outperform or add to the prediction gained with ACR across a wide range of initial eGFRs.

Methods
Participants The Scottish Diabetes Research Network Type 1 Bioresource (SDRNT1BIO) [12] is a prospective cohort study comprising 6127 people with a clinical diagnosis of type 1 diabetes mellitus, representing 25% of all adults with type 1 diabetes in Scotland, recruited between December 2010 and November 2013. On the day of recruitment (which we refer to as the study day or biosample date), clinical measurements and blood and urine samples were obtained in which serum creatinine and urinary ACR were measured. From electronic healthcare records we extracted routine health-related data retrospective and prospective to study day, as described [12].
For this study we selected 1629 individuals with eGFR 30 ml min −1 [1.73 m] −2 or above at study biosample date and with at least three prospective eGFR determinations over a period of at least 2 years or incident ESRD. These were a random sample of 50% and 25% of those with starting eGFR below and above 75 ml min −1 [1.73 m] −2 , respectively.
The study was performed in accordance with the Declaration of Helsinki; all participants gave their written consent and the study protocol was approved by the local ethics and data governance committee.
Renal outcomes and covariate data eGFR was calculated with the CKD-EPI equation [13] using serum creatinine values directly measured and retrieved retrospectively and prospectively from medical records. These excluded readings concurrent with hospital admissions. A summary measure of the historical eGFR was obtained by computing a weighted average of all retrospective eGFR records for each person, with weights inversely related to the amount of time leading to the biosample date. Participants with no retrospective eGFR data had their historical eGFR imputed to study day eGFR. Final eGFR was defined as the median eGFR reading of the last 6 months of follow-up. Initiation of renal replacement therapy was considered to indicate an achieved eGFR of 10 ml min −1 [1.73 m] −2 and all subsequent readings were censored. The decline of renal function was estimated by fitting a simple linear regression model to the serial prospective eGFR determinations of each person. We also defined binary clinically significant progression categories of progression to <30 and <45 ml min −1 [1.73 m] −2 .
ACR was measured in paired urine samples with the first taken at study day and the second several days later using the ADVIA 2400 immunoturbidimetric method for albumin and the ADVIA 2400 enzymatic method for creatinine (Siemens, Munich, Germany). These data were used for adjusting for ACR in the analyses here. In addition, longitudinal urinary ACR data were captured from the routine clinical laboratory data. Clinical record data close to study day were highly correlated (r = 0.73) with the direct measurement. At any time point, albuminuric status was defined based on the most recent available albuminuria measurement provided there was no contradictory record of that stage in the preceding or subsequent 90 days; i.e. someone who transited from normo-to microalbuminuria but then had another normoalbuminuria measurement within 90 days was assigned as having been normoalbuminuric across that period, such that transient changes in albuminuria readings were ignored. Participants were classified as normo-, micro-or macroalbuminuric according to their ACR falling in the intervals 0-3.39, 3.39-33.9 or above 33.9 mg/mmol, based on two out of three consecutive measurements before baseline.
Retrospective and prospective covariate data including drug exposures were obtained from the electronic healthcare record SCI-Diabetes as described previously [12].
Biomarkers measured and analysed Serum and urine biomarkers were measured on samples stored at −80°C with no prior freeze/thaw cycles. The assay methods used and the quality control performance are summarised in electronic supplementary material (ESM) Table 1. The candidate biomarkers were chosen either because we had already shown these proteins to be predictive of eGFR decline at chronic kidney disease (CKD) stage 3 or worse (serum KIM-1, CD27, α-1-microglobulin, thrombomodulin) or because they were reported from other well-conducted studies as predictive of renal disease progression (EGF and its ratio to monocyte chemotactic protein 1 [MCP-1], EGF receptor, cystatin C, macrophage inhibitory protein, matrix metalloproteinase 8, TNF receptor 1 [TNFR1]) or as strong candidates from known biology of DKD (syndecan-1), or because they are routinely measured on the same multiplex panel as a candidate (the remainder).
We used a combination of assays using (1) the Luminex platform (Austin, TX, USA) at the Clinical Laboratory Improvement Amendments (CLIA)-certified Myriad RBM laboratory (Austin, TX, USA); (2) a high-sensitivity SIMOA assay for KIM-1 at Myriad RBM that we had found detected KIM-1 in samples KIM-1 negative on their standard Luminex assay; (3) R&D Systems (Minneapolis, MN, USA) Luminex assay at the Immunoassay Biomarker Core Laboratory, University of Dundee (Dundee, UK).
Intraclass correlations were computed over 48 blinded duplicate aliquots to evaluate the reproducibility of the measurements obtained. Biomarkers were excluded from the analysis if their intraclass correlation was less than 0.4, or if over 99% of the readings in the dataset were identical due to falling below the detection threshold. Accordingly, nine serum and 13 urinary biomarkers were included in the final analyses.
Values below the detection limit were imputed to half the detection threshold. All urine biomarkers were normalised to urinary creatinine.
Univariate analysis To test for associations with renal outcomes, biomarkers were evaluated independently in linear models for final eGFR and logistic regression models for progression adjusted for age, sex, diabetes duration, study day eGFR and length of follow-up (basic covariates). To allow comparison with ACR, we reran the basic and biomarker models including ACR at biosample date. We further adjusted models for BMI, systolic BP, diastolic BP, HbA 1c , HDL-cholesterol, total cholesterol, smoking status and a weighted summary of prior eGFR from retrospective records (full covariates). Prior to fitting models, all continuous covariates and biomarkers were Gaussianised and standardised to zero mean and unit standard deviation. Associations were declared significant at Bonferroni-corrected p = 0.05/22 = 2.3 × 10 −3 .
Construction of parsimonious panels of biomarkers Urine and serum biomarker sets were modelled independently from each other. As previously described [11], we adopted a Bayesian modelling approach based on hierarchical shrinkage priors, in which the clinical covariates used to control for confounding in the models were assigned a weakly informative Gaussian prior (which induces some regularisation as in ridge regression), while biomarkers were penalised through the regularised horseshoe prior (which heavily shrinks regression coefficients toward zero unless they are informative) to promote sparsity [14]. A similar approach was also adopted elsewhere for biomarker selection in the context of type 2 diabetes mellitus [15]. The hierarchical shrinkage approach was implemented using the Stan Bayesian inference framework [16], which uses Markov chain Monte Carlo (MCMC) to sample the posterior distribution of the parameters given the data and the model. The specific models implemented are available in the R package hsstan (version 0.6: https:// CRAN.R-project.org/package=hsstan).
We evaluated the predictive performance of models on withdrawn data using tenfold cross-validation. For each set of baseline covariates used, we reported the difference in log-likelihood (computed on the observations withdrawn for testing from tenfold cross-validation) between the model with baseline covariates and biomarkers and the model including only the clinical covariates. For models of achieved eGFR we computed the r 2 as the squared Pearson correlation coefficient between observed and predicted outcome on the test folds. For models of progression, we reported the area under the receiver operating characteristic curve (AUC) and the expected information for discrimination, Λ, expressed in bits [17]. This measure quantifies the gain in information that a set of biomarkers provides over and beyond the baseline clinical covariates. The expected information for discrimination was computed with the R package wevid (version 0.6.2: https:// CRAN.R-project.org/package=wevid).
To recover a parsimonious model, we then applied projection predictive variable selection [18]. This approach is based on projecting the high-dimensional draws from the posterior of the model containing all biomarkers (full model) onto lower-dimensional subspaces corresponding to sparse candidate submodels [19,20]. Predictions made by each candidate submodel are compared with those obtained by the full model: their discrepancy is evaluated using the Kullback-Leibler divergence, which measures the information lost when a smaller submodel is used to approximate a more complex one. By recursively choosing in a forward-selection fashion the submodel that minimises the Kullback-Leibler divergence from the full model, one can construct a series of candidate models of growing complexity. We evaluated each candidate model in terms of its relative contribution towards the performance of the full model, and plotted the relative explanatory power obtained by biomarker panels of different sizes. Table 1 reports the summary characteristics of those included stratified by CKD stage. The median (interquartile range [IQR]) of follow-up was 5.1 (4.4, 5.7) years. For the evaluation of progression status there were a median (IQR) of 9 (6, 14) measurements of eGFR prospectively. The median change in eGFR was a fall of 0.8 ml min −1 [1.73 m] −2 per year. Overall, 41 and 74 participants, respectively, progressed to <30 and <45 ml min −1 [1.73 m] −2 (corresponding to 2.5% and 4.7% of participants not already below these thresholds). Among the 1404 normoalbuminuric individuals at baseline, 23.1% developed albuminuria during follow-up.

Participant characteristics
Biomarkers explored ESM Table 1 shows the median, IQR and range of the biomarkers analysed in this study. All candidate biomarkers showed good intraclass correlation coefficients (≥0.75) when measured on blinded duplicate samples. ESM Table 2 lists the biomarkers that were excluded from the analysis.
The correlation matrix for the biomarkers ordered by hierarchical clustering is shown in Fig. 1. Four distinct clusters are seen; for example, we see that cystatin C is moderately correlated with α-1-microglobulin and clusterin, but only weakly correlated with other biomarkers. TNFR1 and CD27 were very strongly correlated with each other and moderately correlated with KIM-1. Univariate associations with final eGFR Seven serum and six urine biomarkers, including the ratio of urinary EGF to MCP-1, were strongly associated with final eGFR in models adjusted for basic covariates including baseline eGFR (all at p<2.3 × 10 −3 ; Table 2). Of these, nine remained significantly associated with final eGFR after further adjustment for ACR at biosample date. The most strongly associated biomarkers, TNFR1, KIM-1, CD27, syndecan-1, α-1-microglobulin and urinary EGF/ MCP-1 ratio, were also significantly associated with final achieved eGFR when restricted to those normoalbuminuric at biosample date and even when adjusted for ACR in this normoalbuminuric range (data not shown). Furthermore, other than α-1-microglobulin, these biomarkers were also highly significantly associated with final eGFR at follow-up adjusted for ACR at baseline when restricted to those with eGFR ≥90 ml min −1 [1.73 m] −2 at baseline (data not shown). We confirmed that the associations were similar when adjusted for use of ACE inhibitors or angiotensin receptor blockers.
Using progression status to <30 or <45 ml min −1 [1.73 m] −2 as the outcome, a very similar pattern was seen for serum biomarkers (ESM Tables 3 and 4). For example, adjusted for basic covariates and ACR, the odds of progression to <30 ml min −1 [1.73 m] −2 were 5.80-fold for every 1 SD in Gaussianised TNFR1, and 2.05-fold per SD of KIM-1 (ESM Table 3). Among the urinary biomarkers, EGF/MCP-1 ratio was associated with progression, but when adjusted for ACR the association remained significant for progression to <45 but not <30 ml min −1 [1.73 m] −2 . When restricted to those with normo-or microalbuminuria at baseline, a very similar pattern of association with progression was seen. Among those with macroalbuminuria at baseline, significant associations with outcomes were not found, but the sample size was very small for testing this. Table 3 summarises the cross-validated performance of the linear regression models for prediction of final eGFR using all of the serum or urine biomarkers (in models with hierarchical shrinkage priors) on top of clinical covariates, with and without further inclusion of ACR at biosample date. Starting from basic covariates, the r 2 for prediction of final eGFR increased from 0.702 to 0.743 for serum and to 0.721 for urine biomarkers, compared with an increase to 0.722 for ACR alone. Thus, the serum biomarkers alone outperform ACR alone. As shown in Table 3, the model including serum biomarkers with ACR is better than ACR alone.

Panels of biomarkers for predicting eGFR
Similarly, from Table 4 the AUC for progression to <30 ml min −1 [1.73 m] −2 was 0.911 using ACR with basic covariates, but was 0.953 with serum biomarkers and basic covariates, and did not increase further after adding ACR to serum biomarkers. Using the expected information for discrimination, Λ, Table 4 shows that serum biomarkers contain almost one extra bit of information for the prediction of progression to <30 ml min −1 [1.73 m] −2 than does ACR (4.06 vs 3.23 bits).   Fig. 1 Correlation matrix ordered by hierarchical clustering. A1Micro, α-1-microglobulin; Cys-C, cystatin C; GF, growth factor; HB, heparin-binding; MMP8, matrix metalloproteinase-8; PLGF, placenta growth factor; SDC1, syndecan 1 The same conclusion is reached using progression to <45 ml min −1 [1.73 m] −2 , and also when restricted to those with normo-or microalbuminuria at baseline. Among those with macroalbuminuria, however, there was no evidence of serum biomarkers outperforming or improving prediction beyond ACR. There was no evidence of urinary biomarkers outperforming ACR for progression in any stratum.
By applying the projection predictive variable selection approach, we determined a ranking of the biomarkers within each platform. Figure 2 displays the ranking of the first few   Basic and full clinical covariates are listed in the footnotes to Table 2 ΔLoglik, difference in test log-likelihood (natural logarithm) with respect to the model containing only clinical covariates; PI, posterior uncertainty interval serum biomarkers when predicting final eGFR or progression to <30 ml min −1 [1.73 m] −2 , starting from the basic set of covariates. We also treated baseline ACR as another biomarker (Fig. 2b,d). For final eGFR, TNFR1 and KIM-1 together account for more than 75% of the information contained in the whole set of serum biomarkers independently of ACR, with clusterin and α-1-microglobulin making a small contribution beyond this (Fig. 2a,b). For progression to <30 ml min −1 [1.73 m] −2 , KIM-1 and CD27 (which is very strongly correlated with TNFR1) contain most of the predictive information (Fig. 2c). KIM-1, CD27 and syndecan 1 also contribute to the total prediction on top of ACR (Fig. 2d).

Discussion
The key findings from this study across the range of CKD stages were that serum biomarkers improve the prediction of future eGFR and progression to <30 ml min −1 [1.73 m] −2 beyond baseline eGFR. As in our past studies [11,21,22], a large number of the biomarkers evaluated showed highly significant associations with eGFR and its decline. However, almost all of the predictive information could be obtained using just a few of these intercorrelated biomarkers. Here, we found KIM-1 measured by SIMOA assay and either CD27 or TNFR1 to contain most of the predictive information. CD27 and TNFR1 are very highly correlated (r = 0.80) and are both members of the TNF superfamily and therefore likely interchangeable as predictive biomarkers. Beyond these, there is some predictive information in syndecan-1, clusterin or α-1-microglobulin.
An important emphasis of the study reported here was a comparison of biomarkers with the performance of the widely accepted measurement ACR. In this regard, we found that a parsimonious panel of serum biomarkers performed better than ACR for predicting eGFR and eGFR progression to Table 4 Cross-validated performance of models for prediction of final eGFR being <30 or <45 ml min 73 m] −2 , suggesting that they rise before ACR does and may be more sensitive to early renal damage. Of note, the urinary biomarkers did not consistently outperform ACR. These data challenge the place of ACR in our clinical management of people with diabetes. Most clinical guidelines now recommend annual measurement of ACR in people with diabetes. Our data suggest that measuring just KIM-1 and TNFR1 or CD27 in the same sample might obviate the need for doing a urine sample collection and then measuring ACR. Collecting urine samples is time intensive in outpatient clinics and many people miss out on annual screening because they do not bring or cannot produce a urine sample. As we note above, while in Scotland we achieve high rates of annual eGFR measurement, ACR capture rates are much lower [10]. Additional differences in reporting rates for ACR than eGFR exist elsewhere, e.g. in the National Diabetes Audit in England and Wales and in reports from Denmark [23,24]. Clearly, further validation is required before advocating replacing ACR. Longitudinal analyses of repeat measurements of these biomarkers, for example, are needed to establish their variability and thresholds for specific risk levels warranting clinical action (akin to the microalbuminuria thresholds).
Replication in different cohorts across the range of eGFR, using a variety of assay methods, and testing for prediction of ESRD are also needed. Moreover, at the present time these biomarkers are much more expensive to assay than ACR and are not routinely available on many commonly used clinically certified assay platforms. Some may argue that ACR is also measured because it is a biomarker of more widespread vascular disease and mortality independently of renal function: this should also be shown for candidate replacement serum biomarkers. Nonetheless, at the very least our data suggest that further studies to evaluate replacing urinary ACR with serum biomarkers are warranted, given the logistic and predictive advantages they may offer. Contribution of biomarker sets to prediction of final eGFR (a, b) or final eGFR <30 (c, d) when starting from a model containing basic covariates (age, sex, diabetes duration, study day eGFR and length of followup). Variable selection was based on: serum biomarkers (a, c); and serum biomarkers and ACR (b, d). A1Micro, α-1-microglobulin; CLU, clusterin; Cys-C, cystatin C; MMP8, matrix metalloproteinase-8; SDC1, syndecan 1 revision of the manuscript. SB, JRP, FG, SMacR, JAMcK, JC and AC contributed to data generation and reviewed and edited the manuscript. PMM contributed to study design, analysis and drafting of the manuscript. HMC conceived and designed the study, and contributed to analysis and data interpretation and drafting of the manuscript. All authors approved the manuscript for publication. HMC is the guarantor of this work.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.