Introduction

Acute kidney injury (AKI) is a significant health issue in premature infants as it correlates with increased morbidity and mortality and hence places a significant burden on the health system [1,2,3]. Currently, AKI is defined by Kidney Disease: Improving Global Outcomes (KDIGO) criteria as an increase in serum creatinine (sCr) of at least 0.3 mg/dL or 50% from baseline and/or a urine output of less than 0.5 mL/kg/h for at least 6 h [4]. However, these are imprecise parameters of AKI as they are influenced by factors such as age, sex and weight [5]. Changes in sCr can indicate late consequences of injury because there is a 48 to 72 h delay from the initial renal injury [6, 7]. Significant kidney injury can exist despite minimal changes to sCr levels due to compensatory mechanisms [8, 9]. Neonatal sCr concentrations vary greatly depending on the severity of prematurity and reflect maternal levels up to 72 h post-birth [7, 10, 11]. In neonates born at < 32 weeks gestational age, sCr increases after birth, peaks around day of life 4 and then slowly decreases over the first few weeks of life [12,13,14]. Therefore, sCr is unable to detect AKI in the first few days post-birth. AKI diagnosis based on urine output is also problematic as urine output is often clinically difficult to monitor in neonates, and non-oliguric renal failure is common in premature infants [7, 15].

An estimated 15 million babies are born prematurely worldwide every year and the rate of premature births is increasing [16]. In the first week of life, the incidence of AKI in premature infants (defined as gestational age < 37 weeks by the World Health Organization) is reported to be between 12.5 and 39.8% [17,18,19]. Premature infants are at high risk of AKI as they have under-developed and therefore poorly functioning kidneys. This is typically further complicated by sepsis, nephrotoxic medications and perinatal asphyxia [20,21,22]. As nephron formation is not completed until 36 weeks of gestation [23], prematurely born infants have a low nephron endowment at birth with subsequent extrauterine nephrogenesis. There is evidence to suggest that perinatal AKI and extrauterine exposures, such as nephrotoxic medications, infections and haemodynamic instability, are detrimental to optimal nephrogenesis [24, 25]. Several studies have demonstrated an increased risk for the development of chronic kidney disease (CKD) following an AKI episode in children born prematurely, despite recovery of renal function to pre-morbid estimated glomerular filtration rate (eGFR) levels [26,27,28]. Three to five years after an initial AKI episode, over 50% of children had at least one sign of CKD [26]. These children are at high risk of progressive CKD and death, and so periodic evaluation following the initial insult is warranted [26, 28].

As traditional markers of AKI are unreliable, several candidates for early detection have recently been investigated. These could facilitate early intervention and validation of therapeutic strategies in clinical practice. Previous reviews regarding the use of urinary and serum biomarkers to diagnose AKI have mainly focused on adult populations [29,30,31,32]. While some reviews have included a paediatric population [30, 33,34,35,36], none to date have focussed on premature infants whose kidneys would still be undergoing nephrogenesis and functional maturation. Early AKI diagnosis could enable appropriate early management and potentially improve long-term outcomes of this vulnerable population. The current systematic review and meta-analysis aimed to elucidate the diagnostic accuracy of urine and serum biomarkers not currently used in routine clinical practice to predict AKI in premature infants.

Methods

Search strategy

This systematic review protocol was registered with the international prospective register of systematic reviews (PROSPERO) as CRD42019122046. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy Studies (PRISMA-DTA) statement was followed [37]. A comprehensive, systematic search was undertaken of the following 5 computerised databases: PubMed, EMBASE, CINAHL, Scopus and the Cochrane Library from inception until October 2019. An updated search in August 2020 yielded 1 additional article that fulfilled inclusion criteria [38]. The search strategy included medical subject heading (MeSH) terms and synonymous terms covering the age group of interest (e.g. infant/paediatric/neonate/newborn), AKI and biomarker (see Online Resource 1 for full search strategy). Reference lists of included studies and on-topic review articles were manually searched for additional studies not identified by the search strategy. Conference abstracts were excluded but cross-referenced with full publications by the same authors. After duplicate articles were removed, the title and abstract of all retrieved articles were screened for eligibility. Full publications were independently reviewed against the inclusion/exclusion criteria by two authors (JK and LA). Conflicting viewpoints were resolved via discussion until a consensus was reached.

Study selection criteria

The a priori inclusion criteria were: (1) infants (< 1 year old); (2) premature birth (gestational age < 37 weeks); (3) biomarker measurement in urine, serum, blood or plasma; (4) diagnostic accuracy measures of a biomarker; (5) clear definition and outcome of AKI; (6) human studies; and (7) published in English. Criteria for exclusion were: (1) no reported diagnostic accuracy measures for any biomarkers; (2) no reported AKI; and (3) conference abstract, PhD dissertation, review article or other editorial.

Study quality assessment

Methodological quality and validity of studies were assessed using a modified 10-item checklist of the Standards for Reporting Diagnostic Accuracy (STARD) criteria adapted from Coca et al. [29] (see Online Resource 2). Articles with a quality assessment score ≤ 6 were excluded.

Data extraction and synthesis

Information on the study setting, biomarkers (including type, measurement/timing and threshold for diagnosis), number and gestational age of participants, and criteria for confirming AKI diagnosis (see Online Resource 3 for definitions), were extracted and summarised for the included studies by JK. For each biomarker, data on sensitivity/specificity, area under the receiver operating characteristic curve (AUROC), and likelihood ratios (LRs) were extracted. Biomarkers with AUROC values > 0.70 are defined as a good measure of diagnostic accuracy [39]. Biomarker cut-offs for AKI diagnosis were extracted. An attempt was made to contact authors for raw data if information was not provided.
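As a minimal sketch of how the extracted measures relate to one another, the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity can be computed as follows. The function name and the example values are illustrative assumptions and are not data from any included study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as proportions.

    Specificity must lie strictly between 0 and 1, otherwise one of the
    ratios is undefined (division by zero).
    """
    if not 0 <= sensitivity <= 1:
        raise ValueError("sensitivity must be between 0 and 1")
    if not 0 < specificity < 1:
        raise ValueError("specificity must be strictly between 0 and 1")
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical biomarker with 80% sensitivity and 90% specificity
lr_pos, lr_neg = likelihood_ratios(0.80, 0.90)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

An LR+ well above 1 raises the probability of AKI after a positive test, whereas an LR− close to 0 lowers it after a negative test.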

Meta-analysis

Meta-analysis was conducted using Stata 16 (StataCorp LLC, College Station, TX) and the user-written Stata metandi package for analysis of diagnostic accuracy using hierarchical logistic regression [40, 41]. Due to limitations of metandi, a meta-analysis could only be completed for biomarkers with four or more included studies (i.e. urine neutrophil gelatinase-associated lipocalin [uNGAL]). Contingency tables (2 × 2) were estimated using published numbers of participants, sensitivity/specificity and/or AUROC data. Where studies published multiple sets of data for one biomarker at varied time points, only one dataset from each paper was included in the meta-analysis. This was pre-determined to be data from 1 day prior to AKI diagnosis or day of life 1 for consistency between studies. Summary sensitivities, specificities, diagnostic odds ratios, LRs and a summary receiver operating characteristic (SROC) curve with 95% confidence and prediction intervals for the diagnostic accuracy of uNGAL were determined. The corresponding area under the SROC curve (AUC-SROC) was estimated. A Fagan nomogram (Bayes nomogram) is a clinically useful tool that objectively quantifies the post-test probability of a patient having a condition, in this case, the probability of an infant having an AKI after a uNGAL test. This nomogram was constructed with the Stata midas package. A pre-test probability of 25% was used, which is the average reported prevalence of AKI in premature infants [19, 42, 43].
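The exact procedure used to estimate the 2 × 2 tables is not detailed here, so the following is only one plausible sketch of the step, assuming that the numbers of AKI and no-AKI participants and the sensitivity and specificity are all reported. The function name and the example values are hypothetical, and estimation from AUROC data alone is not covered.

```python
def contingency_table(n_aki, n_no_aki, sensitivity, specificity):
    """Estimate 2 x 2 cell counts from group sizes and accuracy measures.

    Counts are rounded to whole participants, so the table reproduces the
    published sensitivity and specificity only approximately.
    """
    tp = round(sensitivity * n_aki)        # AKI infants with a positive test
    fn = n_aki - tp                        # AKI infants with a negative test
    tn = round(specificity * n_no_aki)     # no-AKI infants with a negative test
    fp = n_no_aki - tn                     # no-AKI infants with a positive test
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Hypothetical study: 20 AKI and 40 no-AKI infants,
# sensitivity 80%, specificity 75%
table = contingency_table(n_aki=20, n_no_aki=40,
                          sensitivity=0.80, specificity=0.75)
print(table)
```

Deriving the false negatives and false positives by subtraction keeps the row totals equal to the published group sizes even after rounding.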

Results

Search results

The search identified 2118 articles (Fig. 1). After duplicates were removed, the title and abstracts of 1024 citations were reviewed against the selection criteria. Once 841 studies were excluded, the full texts of 183 citations were screened against inclusion/exclusion criteria, with 15 studies meeting inclusion criteria (Fig. 1) [38, 44,45,46,47,48,49,50,51,52,53,54,55,56,57]. The major reasons for exclusion were study populations of children older than 1 year, non-premature infants, no AKI outcome, no biomarker diagnosis of AKI, and non-English publications. Eight studies [38, 44, 47, 48, 50,51,52, 57] investigated uNGAL but only 6 of these [38, 44, 47, 48, 50, 51] provided sufficient data to construct 2 × 2 contingency tables to be included in the meta-analysis. One study [58] reporting on uNGAL was not included as the data presented were essentially a repeat of the data included from Tabel et al. [48].

Fig. 1 Flow diagram of the literature search strategy and study selection process (based on the PRISMA-DTA statement [37]). See text for a full description of exclusion criteria. The original search was completed in October 2019, with an updated search in August 2020. uNGAL = urinary neutrophil gelatinase-associated lipocalin

Assessment of study quality

No studies were excluded, as all had a score of ≥ 7 out of a possible score of 11 (see Online Resource 4). While participant sampling, data collection and rationale for reference standards and biomarkers were similar in all studies, only 8 studies fully stated the specifications of biomarker measurements and analysis [46, 47, 49, 51,52,53,54, 57]. Reference standards for clinical AKI diagnosis were based on sCr only and were used to interpret index test results. Only 2 studies [47, 49] reported blinding of readers to the index test and reference standard. None stated whether they had a representative distribution of AKI severity, with most considering all neonates with AKI as one group. Some studies did not have participants representing the full spectrum of severity (i.e. mild, moderate or severe).

Description of study characteristics

Study characteristics are presented in Table 1, including the number of participants, gestational age, criteria used for AKI diagnosis, biomarker measurement method, determination of biomarker threshold and quality assessment score. All studies investigated biomarkers prospectively and were case–control, cross-sectional or cohort studies, with patient eligibility criteria clearly pre-defined. Studies were published between 2011 and 2020 from 7 countries (Egypt, USA, Turkey, Serbia, Greece, South Korea and Germany). All were single-centre studies except one [52], in which participants were recruited from two hospitals. Only premature infants were enrolled but many had various medical conditions including respiratory distress syndrome, perinatal asphyxia, very low birth weight or patent ductus arteriosus. In some studies, there were differences in gestational age [48, 52, 56] and weight [44, 48, 51,52,53] but not gender between AKI and no-AKI participants. All studies were conducted in neonatal intensive care units (NICU) except one [55], which did not specify the setting. A total of 27 biomarkers were investigated (Table 1): 5 serum and 22 urinary biomarkers, with the most common being uNGAL (8 studies). Urinary cystatin C (uCysC), kidney injury molecule 1, beta-2 microglobulin and osteopontin, and serum cystatin C (sCysC), were each investigated in three studies. Urinary albumin and epidermal growth factor were investigated in two studies each. The remaining biomarkers were each investigated in one study. All samples were stored at − 70 °C prior to biomarker measurement using an assortment of assays (see Table 1 for details). AKI diagnostic criteria varied between studies but were relatively consistent in terms of sCr thresholds (Table 1). Eleven studies determined biomarker cut-off values for AKI diagnosis.

Table 1 Characteristics and participant details of included studies

Data synthesis

Data extracted from studies are detailed in Table 2, including the timing of measurements, biomarker threshold for diagnosis, sensitivity, specificity, AUROC, and positive and negative likelihood ratios. Specific information for serum and urinary biomarkers is summarised below.

Table 2 Diagnostic methods and sensitivity/specificity of tested biomarkers for predicting acute kidney injury (AKI)

Serum biomarkers

sCysC values were significantly raised in AKI compared to no-AKI patients in all 3 studies, with AUROC values ranging from 0.88 to 0.97 [45, 53, 56], suggesting sCysC is a very good to excellent measure of diagnostic accuracy. Serum NGAL (sNGAL) was significantly elevated in AKI compared with no-AKI participants on day of life 1 at 2, 4 and 6 h after NICU admission. The combination of serum paraoxonase 1, total oxidant status and total antioxidant status showed good diagnostic accuracy on day of life 1 at 4 h and 6 h and day of life 7. However, these biomarkers were each only investigated in a single study (Table 2).

Urinary biomarkers

uNGAL, a marker of renal tubular inflammation, was significantly increased in AKI compared to no-AKI controls in all but one study (Table 2) [38, 44, 47, 48, 50,51,52, 57]. The only exception was Waldherr et al. [57] in which uNGAL AUROC values ranged from 0.13 to 0.39, with AUROC values increasing after indomethacin administration.

Urinary osteopontin, epidermal growth factor and uromodulin appeared to be of very good diagnostic value, with the majority of studies reporting AUROC > 0.8. uCysC was significantly elevated in AKI compared to no-AKI patients in all three studies where it was measured, with AUROC ranging from 0.65 to 0.82 (Table 2) [44, 50, 51]. [TIMP-2].[IGFBP7] (tissue inhibitor of metalloproteinases-2 and insulin-like growth factor binding protein 7) was only explored in one study but appeared to be of significant diagnostic value at 6 h and 12 h after indomethacin administration, with AUROCs of 0.80 and 1.00, respectively. Annexin A5 (AUROC = 0.88) and protein S100-P (AUROC = 0.75) also demonstrated good diagnostic accuracy but were only investigated in one study [38]. The metabotypes investigated by Mercier et al. [55], especially hippurate and homovanillate, showed significant differences between AKI and no-AKI groups on day of life 2 (AUROC = 0.93). The diagnostic accuracy of urinary kidney injury molecule 1 was unclear, although concentrations appeared to be significantly higher in AKI compared to no-AKI from days of life 1 to 3 and 12 h after indomethacin treatment. Urinary albumin, beta-2 microglobulin, interleukin 18, α-glutathione S-transferase, clusterin, calprotectin, galectin 3 and 6-phosphogluconolactonase did not appear to be of any significant value in diagnosing AKI (AUROC values < 0.7), although data were derived from a single study for most of these biomarkers (Table 2).

Hanna et al. [51] differentiated the AKI population into 2 groups according to neonatal KDIGO criteria (stage I and stage II/III). uNGAL (0.92 vs 0.91), uCysC (0.82 vs 0.79), urinary uromodulin (0.85 vs 0.87) and urinary osteopontin (0.84 vs 0.80) had relatively similar AUROC for stage II/III AKI and stage I AKI. However, AUROC for urinary beta-2 microglobulin (0.49 vs 0.6), albumin (0.59 vs 0.66), and epidermal growth factor (0.86 vs 0.97) were lower in stage II/III AKI than stage I AKI.

Meta-analysis of uNGAL diagnostic accuracy for detecting AKI

The meta-analysis included 288 participants [no-AKI (n = 195) and AKI (n = 93)] from six studies (Fig. 2A). uNGAL had a summary sensitivity of 77% (95% CI 58–89%), specificity of 76% (95% CI 57–88%) and diagnostic odds ratio of 11 (95% CI 4–28) for AKI diagnosis (Fig. 2B). The SROC curve showed an AUROC of 0.83 (95% CI 0.80–0.86), which suggests that uNGAL has a good diagnostic accuracy for AKI in premature infants (Fig. 2B). Positive and negative LRs were 3.2 (95% CI 1.8–5.8) and 0.30 (95% CI 0.16–0.57). Fagan’s nomogram is a tool that illustrates the post-test probability of a premature infant having AKI. When the pre-test probability of AKI is 25%, Fagan’s nomogram shows that a high uNGAL result increases the post-test probability of AKI to 52% (95% CI 37–66%) while a low uNGAL result decreases the post-test probability to 9% (95% CI 5–16%) (Fig. 2C).
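The post-test probabilities read from a Fagan nomogram follow from Bayes' theorem in odds form: post-test odds = pre-test odds × likelihood ratio. The sketch below applies this to the summary LRs and the 25% pre-test probability reported above. The function name is an illustrative assumption, and the calculation uses point estimates only, so it does not reproduce the confidence intervals.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Reported values: pre-test probability 25%, LR+ 3.2, LR- 0.30
p_high_ungal = post_test_probability(0.25, 3.2)
p_low_ungal = post_test_probability(0.25, 0.30)

print(f"High uNGAL result: {p_high_ungal:.0%}")  # about 52%
print(f"Low uNGAL result:  {p_low_ungal:.0%}")   # about 9%
```

These point estimates agree with the 52% and 9% post-test probabilities shown in Fig. 2C.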

Fig. 2 Meta-analysis of diagnostic accuracy of urinary neutrophil gelatinase-associated lipocalin (uNGAL) in predicting AKI in premature infants. Analysis was conducted using six included studies measuring uNGAL [38, 44, 47, 48, 50, 51]. (A) 2 × 2 contingency table data used for each of the studies included in the meta-analysis. AUC: area under the curve; Dx: diagnosis; FN: false negative; FP: false positive; TN: true negative; TP: true positive. (B) Hierarchical summary receiver operating characteristic (HSROC) curve with 95% confidence and prediction regions, calculated in Stata using the metandi and midas commands. Grey circles represent each of the six studies included in the analysis, with size indicating the sample size. The red square represents the overall estimate of sensitivity (SENS) and specificity (SPEC). (C) Fagan plot (Bayes nomogram) created using the midas command within Stata. a: Estimated from measurement at 1 day prior to stage I acute kidney injury (AKI) diagnosis. Data for contingency table obtained from Figure 1e in Hanna et al. [51]. Biomarker cut-off as per Parravicini et al. [52]. b: Estimated from measurement at 1 day prior to AKI diagnosis (stage not specified). c: Estimated from measurement at day of life (DOL) 1

Discussion

This systematic review and meta-analysis of biomarkers of AKI in premature babies suggests that sCysC and uNGAL have the highest diagnostic accuracy in this clinical setting. Our review highlights that a range of urinary and serum biomarkers have been tested but only in a limited number of small studies involving premature babies.

Of all the serum biomarkers investigated, sCysC had the highest diagnostic accuracy, with AUROC values between 0.88 and 0.97. Cystatin C is a cysteine protease inhibitor which is produced at a constant rate and exclusively excreted with no reabsorption by the renal tubules [59]. Unlike serum creatinine, sCysC is not influenced by sex, age or muscle mass and does not cross the placenta, so concentrations in the early postnatal period are reflective of early neonatal kidney function, rather than maternal concentrations [59,60,61,62]. Serum NGAL and the combination of serum paraoxonase 1, total oxidant status and total antioxidant status were only of significant diagnostic value at certain time-points. While sCysC performed the best in terms of diagnostic accuracy, and two studies reported high sensitivity (92.3%, 100%), one study reported a sensitivity of only 16% for predicting AKI. Measurement of sCysC also relies on blood sampling during the neonatal period, which can be problematic [63]. Although microsampling techniques such as dried blood spots can be used with ultrasensitive analytical methods such as liquid chromatography with tandem mass spectrometry, thereby dramatically reducing the blood volume required, urinary biomarkers are still generally considered more attractive in this clinical setting because of their non-invasive nature.

This systematic review found that urinary NGAL, osteopontin, epidermal growth factor, uromodulin, CysC, [TIMP-2].[IGFBP7], annexin A5, protein S100-P and metabotypes generally provided good diagnostic value with AUROC > 0.7. All other urinary biomarkers had AUROCs below 0.7 and would be less likely to reliably diagnose AKI in premature infants, although most were only investigated in one or two studies. From the meta-analysis, uNGAL showed good diagnostic accuracy for AKI in premature infants with a high summary sensitivity, specificity and AUROC. In the SROC, the confidence and prediction regions were large, which could be attributed to the small number of included studies and their small sample sizes. Other factors that may influence this include variable gestational age ranges, measurement methods and timing, and clinical settings.

Limitations

Although studies included in this systematic review were generally of high quality in terms of reporting, many biomarkers were only investigated in one or two studies or had small numbers of infants developing AKI (i.e. < 20 in 10 studies). Waldherr et al. [57] in particular only had 3 of 14 participants developing AKI, with potential implications for power and statistical significance of the results. From the quality assessment, the lack of blinding of readers to AKI diagnosis or a representative distribution of disorder severity may increase the risk of bias. In a few studies, there were statistically significant differences in gestational age [48, 52, 56] and weight [44, 48, 51,52,53] between infants with and without AKI, which may influence biomarker concentrations and skew diagnostic accuracy.

While the reference standard for AKI diagnosis was based on serum creatinine for all studies, there were variations in criteria between the studies. AKI criteria used were pRIFLE, KDIGO, modified KDIGO/new proposed neonatal AKI [11] or study-specific definitions adapted from pre-existing criteria. Additionally, studies had different timings for collection of urine for biomarker measurements. For example, biomarkers may have been measured daily until a certain time point, in relation to AKI diagnosis (e.g. 1 day prior), or in relation to medication administration. AKI and non-AKI groups were then determined retrospectively based on serum creatinine. The exact timing of kidney injury is unclear, and this poses a limitation in determining time-related diagnostic performance.

Several studies did not report cut-off values for biomarkers. For example, only 6 out of 8 studies investigating uNGAL reported diagnostic cut-offs and only one study had a pre-defined cut-off of 50 ng/mL. Cut-off thresholds for uNGAL varied, with one study reporting an unexpectedly high threshold of 450 ng/mL (compared to 19–50 ng/mL in other studies). Studies likely used cut-offs that optimised AUROC values, which may have resulted in an overestimation of diagnostic accuracy. All these factors are likely to result in discrepancies between studies. Future studies should assess these promising biomarkers taking these factors into account, and should consider standardising cut-off values and time points to validate their utility as a good measure of diagnostic accuracy. Although all studies report AUROC, inclusion of 2 × 2 tables and sensitivity and specificity values would also be useful for consistency and ease of interpretation. Given that 2 × 2 tables were not provided by the included studies, these had to be calculated from published data in order to conduct the meta-analysis. Additionally, we were unable to determine an ideal diagnostic cut-off for uNGAL, as the raw data were not provided and could not be obtained from study authors.

No studies reported serial measurements of these biomarkers following AKI diagnosis using sCr. These biomarkers could possibly be utilized clinically to monitor renal function and AKI progression. There are also limited data on these biomarkers beyond the first week of life in premature neonates. Thus, clinical utility after this period remains uncertain.

Conclusion

As there are several limitations of current AKI diagnostic criteria based on sCr in the neonatal period, biomarkers with greater diagnostic accuracy in this patient demographic are needed to improve patient outcomes. This systematic review and meta-analysis suggests that there are several putative biomarkers for the prediction of AKI in premature infants. In particular, sCysC and uNGAL both show good diagnostic accuracy in this clinical setting. However, evidence is currently only available from a limited number of studies. Of note, > 60% of the studies were conducted in the last 5 years suggesting increased interest in the field, potentially encouraged by enhanced availability of suitable diagnostic tests. To date, many of these biomarkers have only been analysed in a research setting and would need to be scaled for high-throughput analysis before they can be routinely adopted for clinical use. For any of these to be used as a ‘gold standard’ biomarker, they ideally need to have the ability to predict and diagnose AKI and identify the cause, location and type of injury. There is a clinical need to diagnose AKI in premature babies early as this might lead to more appropriate clinical management of these infants with the potential to prevent long-term consequences of AKI [64, 65], especially progression to CKD. Future prospective studies with larger cohorts of participants and appropriate methods to determine adequate diagnostic cut-offs (e.g. Youden’s J statistic) are needed to validate these biomarkers before they can be utilized in clinical practice.
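For illustration, Youden's J statistic (J = sensitivity + specificity − 1) can be used to select a cut-off by evaluating every observed biomarker value as a candidate threshold and keeping the one with the largest J. The sketch below assumes that a result at or above the cut-off counts as test-positive. The function name and the concentrations are hypothetical and are not taken from any included study.

```python
def youden_cutoff(values, has_aki):
    """Return (cut-off, J) maximising Youden's J = sensitivity + specificity - 1.

    values  : biomarker concentrations, one per infant
    has_aki : booleans, True if the infant met the reference AKI criteria
    A result >= cut-off is treated as test-positive.
    """
    n_aki = sum(has_aki)
    n_no_aki = len(has_aki) - n_aki
    if n_aki == 0 or n_no_aki == 0:
        raise ValueError("both AKI and no-AKI infants are required")

    best_cutoff, best_j = None, None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, d in zip(values, has_aki) if d and v >= cutoff)
        tn = sum(1 for v, d in zip(values, has_aki) if not d and v < cutoff)
        j = tp / n_aki + tn / n_no_aki - 1
        # Strict comparison keeps the lowest cut-off when several tie
        if best_j is None or j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical concentrations (ng/mL) and reference AKI status
values = [10, 15, 20, 30, 45, 60, 80, 120]
has_aki = [False, False, False, True, False, True, True, True]
cutoff, j = youden_cutoff(values, has_aki)
print(cutoff, j)
```

Because J weights sensitivity and specificity equally, a cut-off chosen this way may still need adjusting where missing a case of AKI is considered more costly than a false alarm.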