Women of reproductive age with chronic kidney disease (CKD) frequently wish to consider pregnancy. Early-stage CKD is estimated to be present in 3 in 100 pregnancies, and later-stage CKD in 1 in 750 pregnancies [1]. CKD is associated with an increased risk of adverse maternal and neonatal outcomes, including pre-eclampsia, small-for-gestational-age infants, preterm birth, and accelerated decline in kidney function postpartum; these risks rise as kidney function falls [2]. There is no reliable contemporary prediction tool for pregnancy and kidney outcomes in women with CKD, despite requests from women with CKD and their health care professionals for more resources and support [2]. This study aims to develop and validate two prediction models that estimate (i) the likelihood of a ≥ 25% reduction in estimated glomerular filtration rate (eGFR) or initiation of renal replacement therapy between 6 weeks and 12 months postpartum and (ii) the likelihood of a small-for-gestational-age (SGA < 3rd percentile) infant and/or preterm birth < 34 weeks’ gestation.

The models will be developed and validated using seven international datasets. Model development will use the National Registry of Rare Kidney Diseases (RaDaR) from the UK Renal Registry (UKRR), a United Kingdom (UK)-wide dataset linking RaDaR, UKRR, Hospital Episode Statistics (HES) and the Maternity Services Data Set (MSDS). This cohort includes any woman with a diagnosis of kidney disease in the UK with maternity records from 1997 onwards, including those with ICD-10 codes for CKD in HES, those who consented to participate in RaDaR, and those with data reported to UKRR.

External validation will be performed in the following datasets:

  i. West and East Kent integrated maternal and laboratory data from the University of Kent. This includes routinely collected data from hospitals in Kent (UK).

  ii. Stockholm Creatinine Measurement (SCREAM). This is an observational dataset of laboratory results for individuals residing in or accessing healthcare in the region of Stockholm (Sweden). SCREAM has previously been described in detail [3].

  iii. The Ontario Renal Network Pregnancy Cohort. This is a population-based cohort of women in the province of Ontario (Canada) with a baby born between 2007 and 2020. Administrative health databases linked using unique identifiers at the Institute for Clinical Evaluative Sciences (ICES) are used to capture all hospital births in Ontario and outpatient laboratory testing.

  iv. Cohort data from three obstetric studies within the UK collected between 2010 and 2018. These studies have previously been described in detail [2, 4, 5].

Model development and validation will follow the recommendations of the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement [6]. Ethical approval was given by the Health Research Authority (London Bloomsbury Research Ethics Committee, Ref: 23/LO/0258). The study is registered on ClinicalTrials.gov (Ref: NCT05793346).

Women with an eGFR less than 90 mL/min/1.73 m², calculated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI 2009) equation [7] without ethnicity adjustment, within 24 months prior to conception will be included. Women established on dialysis at the time of conception, those with multi-fetal pregnancies, those whose eGFR measurement is known to have been taken as an inpatient, and those with no preconception eGFR within 24 months will be excluded.
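The eGFR inclusion criterion can be illustrated with a short sketch of the CKD-EPI 2009 creatinine equation with the ethnicity coefficient omitted. The function names, the assumption that creatinine is recorded in µmol/L, and the conversion factor of 88.4 are illustrative choices and are not taken from the study's analysis code.

```python
def egfr_ckd_epi_2009(creatinine_umol_l: float, age_years: float,
                      female: bool = True) -> float:
    """CKD-EPI 2009 creatinine equation without the ethnicity coefficient.

    Returns eGFR in mL/min/1.73 m^2. Assumes serum creatinine in umol/L.
    """
    scr_mg_dl = creatinine_umol_l / 88.4      # convert umol/L to mg/dL
    kappa = 0.7 if female else 0.9            # sex-specific creatinine constant
    alpha = -0.329 if female else -0.411      # sex-specific exponent
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018                         # sex coefficient for women
    return egfr


def meets_egfr_criterion(egfr: float) -> bool:
    """Hypothetical helper: the eGFR threshold for inclusion (< 90)."""
    return egfr < 90.0


# Example: a 30-year-old woman with a serum creatinine of 120 umol/L.
example_egfr = egfr_ckd_epi_2009(120.0, 30.0)
print(round(example_egfr, 1), meets_egfr_criterion(example_egfr))
```

The sketch covers only the eGFR threshold; the timing requirement (a measurement within 24 months before conception) and the exclusion criteria would need to be applied separately.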

Candidate model predictors are based on previous studies reporting risk factors for pregnancy and kidney outcomes in women with CKD and on availability within the cohorts (Table 1). All predictors are collected during routine care for women with CKD in pregnancy, which supports the acceptability of implementation and the generalisability of the models.

Table 1 Candidate predictors for model development

There are two binary outcome measures. The first is a ≥ 25% reduction in eGFR or initiation of renal replacement therapy between 6 weeks and 12 months postpartum. This timeframe is applied because serum creatinine levels may peak within the first few weeks postpartum but typically return to pre-pregnancy concentrations [8]. This outcome was chosen on the basis of surveys of 90 women with CKD and 73 healthcare professionals. The second is a composite outcome of preterm birth, defined as less than 34 weeks’ gestation, and/or SGA < 3rd percentile (INTERGROWTH-21st).
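The two outcome definitions can be written as simple rules. The following is a minimal sketch; the function and argument names are hypothetical, and it assumes that a single representative postpartum eGFR (measured between 6 weeks and 12 months postpartum) and a birthweight centile have already been derived for each pregnancy. How the postpartum value is selected when several measurements exist is not specified here and is left to the analyst.

```python
def kidney_outcome(preconception_egfr: float, postpartum_egfr: float,
                   started_rrt: bool) -> bool:
    """True if eGFR fell by >= 25% from the preconception value, or renal
    replacement therapy was started, within the postpartum window."""
    if started_rrt:
        return True
    relative_fall = (preconception_egfr - postpartum_egfr) / preconception_egfr
    return relative_fall >= 0.25


def pregnancy_outcome(gestational_age_weeks: float,
                      birthweight_centile: float) -> bool:
    """True if birth occurred before 34 weeks' gestation and/or the infant's
    birthweight was below the 3rd centile."""
    return gestational_age_weeks < 34 or birthweight_centile < 3


# Example: eGFR falls from 80 to 60 (a 25% reduction); birth at 36 weeks
# on the 10th centile.
print(kidney_outcome(80.0, 60.0, started_rrt=False))   # True
print(pregnancy_outcome(36.0, 10.0))                   # False
```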

This is a secondary analysis of pre-existing data and the sample size is therefore fixed; however, using the pmsampsize package, a sample size of at least 494 pregnancies is estimated to be adequate to avoid overfitting [9]. Characteristics of the cohorts for model development and validation will be described, including missing data. Listwise deletion will be applied for model development, and a sensitivity analysis will be performed using multiple imputation.
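To give a sense of how a minimum sample size can be derived, the sketch below implements one commonly used criterion: the sample size needed to keep expected shrinkage of the predictor effects at or above a target (here 0.9). It is an illustration only, not the package's code. Such calculations typically combine several criteria and take the largest result, and the inputs used here (number of parameters and anticipated Cox-Snell R²) are hypothetical values that do not reproduce the figure of 494.

```python
import math


def min_sample_size_for_shrinkage(n_parameters: int, r2_cox_snell: float,
                                  shrinkage: float = 0.9) -> int:
    """Smallest n such that the expected uniform shrinkage factor is at least
    `shrinkage`, given the number of candidate predictor parameters and the
    anticipated Cox-Snell R-squared of the model."""
    if not 0.0 < r2_cox_snell < shrinkage < 1.0:
        raise ValueError("require 0 < r2_cox_snell < shrinkage < 1")
    n = n_parameters / ((shrinkage - 1.0)
                        * math.log(1.0 - r2_cox_snell / shrinkage))
    return math.ceil(n)


# Hypothetical inputs: 24 predictor parameters, anticipated R^2 of 0.288.
print(min_sample_size_for_shrinkage(24, 0.288))
```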

The development of both models will follow the same analysis, as both outcomes are dichotomous. Univariable logistic regression models will first be fitted for each candidate predictor to assess its crude association with the outcome, and we will then fit a multivariable model containing all predictors. Backward elimination will be used to successively remove the least significant predictors from the model, guided by Akaike’s Information Criterion and statistical significance. A liberal p value of 0.10 will be applied for retention. We will also consider predictors based on clinical knowledge and previous research findings. Variance inflation factors will be calculated prior to fitting the final model to identify any collinearity between the predictor variables. The final predictors will be included in a multivariable logistic regression model. The final models will be assessed for overall performance (model fit), calibration and discrimination. Clinical performance will be assessed through positive predictive value, negative predictive value, sensitivity and specificity.
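The clinical performance measures named above all derive from a two-by-two table of predicted against observed outcomes at a chosen risk threshold. A minimal sketch follows; the function name, the threshold of 0.5 and the example data are illustrative assumptions, not values from the study.

```python
def clinical_performance(predicted_risks, outcomes, threshold=0.5):
    """Sensitivity, specificity, PPV and NPV when a predicted risk at or
    above `threshold` is treated as a positive prediction.
    `outcomes` holds 1 for an observed event and 0 otherwise."""
    tp = fp = tn = fn = 0
    for risk, observed in zip(predicted_risks, outcomes):
        positive = risk >= threshold
        if positive and observed:
            tp += 1
        elif positive and not observed:
            fp += 1
        elif not positive and observed:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined (NaN) when a row or column of the table is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Illustrative data only: six pregnancies, three with the outcome.
risks = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1]
observed = [1, 1, 1, 0, 0, 0]
print(clinical_performance(risks, observed))
```

Because all four measures depend on the threshold, and the predictive values also depend on outcome prevalence, they would normally be reported at one or more prespecified cut-offs.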

The final models will undergo internal and external validation to assess their performance. Internal validation using bootstrapping will enable us to examine potential overfitting of the developed models. We will externally validate the final models in the international cohorts. The overall predictive performance, clinical performance, discrimination and calibration will be evaluated in each cohort.
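As an illustration of what evaluating performance in a validation cohort can involve, the sketch below computes three common summary measures from predicted risks and observed outcomes: the Brier score (overall performance), the c-statistic (discrimination) and the observed-to-expected ratio (a simple summary of calibration). The choice of these particular measures, the function names and the example data are assumptions made for illustration.

```python
def brier_score(predicted_risks, outcomes):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2
               for p, y in zip(predicted_risks, outcomes)) / len(outcomes)


def c_statistic(predicted_risks, outcomes):
    """Proportion of event/non-event pairs in which the event received the
    higher predicted risk; ties count as half."""
    events = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    score = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5
    return score / (len(events) * len(non_events))


def observed_expected_ratio(predicted_risks, outcomes):
    """Observed number of events divided by the sum of predicted risks."""
    return sum(outcomes) / sum(predicted_risks)


# Illustrative data only: six pregnancies in a validation cohort.
risks = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1]
observed = [1, 1, 1, 0, 0, 0]
print(brier_score(risks, observed))
print(c_statistic(risks, observed))
print(observed_expected_ratio(risks, observed))
```

An observed-to-expected ratio near 1 suggests that predicted risks are, on average, in line with observed event rates; it does not replace a calibration plot or calibration slope.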

The purpose of this study is to develop and validate two prediction models estimating the likelihood of: (i) a 25% or greater decline in kidney function or initiation of renal replacement therapy, and (ii) delivery before 34 weeks or an SGA infant. These models will help to facilitate informed decision-making among women and their partners, and the risk information they provide will support clinicians in offering personalised maternity counselling and care.