Background

Lithium is the most effective maintenance treatment for bipolar disorder and is first-line in all international clinical practice guidelines [1]. However, its use has been declining globally [2]. Reasons for this include the required monitoring due to its narrow therapeutic window and concerns about adverse effects, particularly irreversible kidney failure [3]. In fact, kidney failure is rare [4]: end-stage kidney failure occurs at similar rates in people treated with lithium and in those treated with other mood stabilisers [5], although more commonly than in the general population (0.23% vs 0.11% [6]). Bipolar disorder itself appears to be associated with increased risk of kidney failure independent of lithium exposure [7]. The existing literature is inconsistent about the association of kidney failure with lithium treatment duration and with episodes of lithium toxicity [8].

Being able to identify individuals at risk of compromised kidney function would have high clinical utility; it would encourage the use of this effective treatment in those at low risk and so improve outcomes for people with bipolar disorder. In the general population, established risk factors for CKD include age, sex (increased in women), ethnicity (increased in Black, Asian and Minority Ethnic (BAME) populations), family history of kidney disease, smoking, obesity, hypertension, diabetes mellitus, excessive alcohol consumption and acute kidney injury [9]. Prediction models have been developed for end-stage kidney failure in groups with a range of underlying risk [10,11,12,13,14]. These tend to include a small number of core features including age, gender, ethnicity, eGFR and albuminuria. Models then vary in terms of additional features such as glucose, blood pressure, haemoglobin, lipids, calcium and phosphate. It is unclear if features related to mental health are useful in predicting CKD risk at the point of lithium initiation. It is also likely that risk factors for CKD cluster differently in patients with bipolar disorder prescribed lithium [8], so we cannot assume that risk prediction models for CKD that are of value in the general population would apply to people with bipolar disorder receiving lithium. Because CKD requiring clinical intervention (CKD stage 4 or more severe – eGFR < 30 mL/min/1.73 m2) is a rare and late-stage outcome, we aimed to develop a model to classify individuals into high-risk and low-risk trajectories of kidney function following lithium treatment initiation.

Methods

Population

This study used patient data from the Clinical Practice Research Datalink (CPRD) Gold and Aurum databases between 1 January 2000 and 31 December 2018. CPRD contains electronic health records (EHRs) from general practices across the UK. Combined, these databases include 42 million patient records from over 1800 primary care practices (www.cprd.com). Both databases contain coded and anonymised data including demographic details, symptoms, diagnoses, prescribed medication, laboratory tests and referrals. CPRD Gold contains data contributed by practices using Vision software and CPRD Aurum contains data contributed by practices using EMIS Web software [15, 16]. Contributing practices have different geographical distributions; CPRD Gold contains patients from the whole of the UK, whereas Aurum contains only practices from England and Northern Ireland. Therefore, there are some differences in population structures. We used data from the Aurum database for the development of our prediction model and data from the Gold database for external validation. Ethical approval for this study was obtained from the Independent Scientific Advisory Committee of CPRD (protocol no. 18_316). Informed consent was waived because data are anonymised for research purposes. In line with ethical guidance subgroups containing fewer than 5 people are censored in the results section.

Cohort definition

The cohort comprised any patient who: was aged 16 or over; had ever received a diagnosis of bipolar disorder in their clinical record; was prescribed lithium (defined as receiving two or more concurrent prescriptions); had at least a year of follow-up before their first lithium prescription and no previous record of being prescribed lithium (to capture patients’ first exposure to lithium); had at least three estimated glomerular filtration rate (eGFR) measures after lithium initiation; and had a baseline measure of eGFR ≥ 60 mL/min/1.73 m2 before starting lithium (normal or close to normal kidney function).
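The inclusion criteria above can be read as a single conjunction of conditions. The following is a minimal sketch only: the `Patient` record and all of its field names are hypothetical (CPRD data are not supplied in this form), and it assumes that age is evaluated at lithium initiation and that the baseline eGFR is one pre-lithium value, neither of which is specified in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical, pre-derived fields; not a CPRD schema.
    age_at_initiation: float             # assumed to be age at first lithium prescription
    has_bipolar_diagnosis: bool          # ever recorded in the clinical record
    lithium_prescriptions: int           # prescriptions counting towards the "two or more" rule
    prior_follow_up_years: float         # follow-up before the first lithium prescription
    previously_prescribed_lithium: bool  # any earlier lithium record
    baseline_egfr: Optional[float]       # mL/min/1.73 m2, before starting lithium
    egfr_tests_after_initiation: int


def is_eligible(p: Patient) -> bool:
    """Return True only if every inclusion criterion listed in the text is met."""
    return (
        p.age_at_initiation >= 16
        and p.has_bipolar_diagnosis
        and p.lithium_prescriptions >= 2
        and p.prior_follow_up_years >= 1
        and not p.previously_prescribed_lithium
        and p.egfr_tests_after_initiation >= 3
        and p.baseline_egfr is not None
        and p.baseline_egfr >= 60
    )
```

A patient with no recorded baseline eGFR is treated as ineligible here, which is one possible reading of the baseline requirement.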

Kidney function trajectories

eGFR values were calculated from recorded creatinine blood tests using the CKD-EPI equation [17]. Using the eGFR, and the date the blood test was performed relative to lithium initiation, we conducted group-based trajectory modelling to identify latent subgroups within the cohort [18]. We included lithium exposure as a time-varying covariate, as the rate of change in eGFR may differ between the lithium-exposed period and the period following lithium cessation. In the process of determining the number of trajectory groups, we initially used a cubic polynomial function for all groups. The final number of groups was determined based on the Bayesian Information Criterion (BIC), the similarity of trajectory shapes, and the proportion of cohort members in each class [19]. We started with a 2-group model and increased the number of groups until the BIC was minimised, subject to no group containing less than 10% of the total cohort. After identifying the optimal number of groups, the order of the polynomial function for each group was reduced stepwise from cubic towards zero-order until the BIC was minimised. With this final model, each participant was assigned to one of the subgroups based on maximum posterior probability. We were primarily interested in the group predicted to have the most rapidly declining eGFR trajectory, referred to as the high-risk group.
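To make the eGFR calculation concrete, the sketch below implements the 2009 CKD-EPI creatinine equation. It is an assumption that this is the version cited as [17]; the text does not say whether the ethnicity coefficient was applied, so it is exposed as an optional argument, and the function name is illustrative. Creatinine is taken in µmol/L, as typically reported by UK laboratories, and converted to mg/dL.

```python
def egfr_ckd_epi_2009(creatinine_umol_l: float, age_years: float,
                      female: bool, black: bool = False) -> float:
    """Estimated GFR in mL/min/1.73 m2 from the 2009 CKD-EPI creatinine equation."""
    if creatinine_umol_l <= 0:
        raise ValueError("creatinine must be positive")
    scr = creatinine_umol_l / 88.4           # convert µmol/L to mg/dL
    kappa = 0.7 if female else 0.9           # sex-specific creatinine scaling
    alpha = -0.329 if female else -0.411     # exponent used below the kappa threshold
    ratio = scr / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159                        # optional coefficient; use is an assumption
    return egfr
```

For a fixed age and sex, higher serum creatinine yields a lower estimated GFR, which is why serial creatinine tests can be converted into the eGFR trajectories modelled here.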

Prediction model features

We identified features present in a patient’s record before they commenced lithium treatment as potential predictors of being in the high-risk group. These included predictors of eGFR decline in the general population and features related to mental health and its treatment that have been previously identified [20] (code lists available on request):

Sociodemographics

Age, sex, ethnicity (as BAME vs White), relationship status (single vs. in a relationship).

Mental health characteristics

Illness duration before lithium initiation, presentations for depression, mania, anxiety (including diagnosis and symptoms of generalised anxiety, phobic anxiety and obsessive-compulsive disorder), psychosis (including affective and non-affective psychotic episodes), stress (including adjustment disorders and symptoms of stress), self-harm (including intentional overdose and non-accidental self-injury), disturbed sleep (including insomnias and hypersomnias).

Physical health characteristics

Hyper/hypocalcaemia, hypo/hyperthyroidism, high LDL cholesterol, low HDL cholesterol, hypertension, coronary heart disease, a measure of eGFR < 60 mL/min/1.73 m2 any time before lithium initiation, type II diabetes mellitus, asthma, weight loss, peptic ulcer, iron deficiency anaemia, liver disease, chronic pulmonary disease, and neurological disorders.

Health behaviours

Smoking status (never smoked, current smoker, ex-smoker), body mass index group (underweight, healthy weight, overweight, obese), cannabis use, other substance misuse, alcohol misuse.

Other drug treatment

Antipsychotic prescription, other mood stabiliser prescription, antidepressant prescription.

Interactions

Baseline eGFR with sex and age, sex with age and body mass index group.

Statistical analysis

We described differences in prevalence of binary covariates and medians of continuous covariates between high-risk and low-risk groups using p values from chi-squared tests. We used probit elastic net regression with 10-fold cross-validation to perform variable selection and penalisation of coefficients to generate the prediction model in the Aurum cohort. Elastic net is a regularisation method for regression and classification models which combines the Least Absolute Shrinkage and Selection Operator (LASSO) penalty (L1) and the ridge penalty (L2) [21]. The LASSO (L1) penalty performs variable selection and dimension reduction by shrinking coefficients, some of them to exactly zero, whilst the ridge (L2) penalty shrinks the coefficients of correlated variables toward their average. The overall elastic net penalty is a function of parameters λ and α (0 ≤ α ≤ 1), where λ sets the overall level of penalty, α is the weight of the L1 penalty and (1 − α) is the weight of the L2 penalty. We reported the receiver operating characteristic (ROC) area with its 95% confidence interval (CI), the sensitivity and specificity at the empirical optimal cut-off point using Youden’s index, and the predictive accuracy. We compared the derived full model with predictions from the 3-variable 5-year kidney failure risk equation (KFRE), which includes age, sex and eGFR, and with an elastic net model containing only these 3 variables [14]. We chose the 3-variable KFRE because albumin-to-creatinine ratio was poorly recorded before lithium initiation and the 3-variable model performed well in previous validation studies (ROC area 0.79) [22].
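For reference, one common way of writing the penalised probit objective described above is given below. This is a sketch under stated assumptions rather than the exact objective fitted in the study: the 1/n scaling of the log-likelihood and the factor of 1/2 on the ridge term are software conventions that vary between packages. Here y_i indicates high-risk group membership, x_i is the feature vector for patient i, and Φ is the standard normal cumulative distribution function.

```latex
\hat{\beta}
  = \arg\min_{\beta_0,\,\beta}\;
    -\frac{1}{n}\sum_{i=1}^{n}
      \Big[\, y_i \log \Phi(\eta_i) + (1 - y_i)\log\big(1 - \Phi(\eta_i)\big) \Big]
    \;+\; \lambda \sum_{j=1}^{p}
      \Big[\, \alpha\,\lvert\beta_j\rvert + \frac{1-\alpha}{2}\,\beta_j^{2} \Big],
\qquad
\eta_i = \beta_0 + x_i^{\top}\beta .
```

With α = 1 the ridge term vanishes and the penalty is the LASSO alone; with α = 0 only the ridge term remains.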

External validation

We used patient data from CPRD Gold for external validation of the model generated in the Aurum cohort. To categorise individuals at high risk of a rapid decline in eGFR, we ran group-based trajectory models of the eGFRs independently of the Aurum patients’ trajectory model. We compared trajectory group membership with the predicted group membership from the Aurum model. We reported the ROC area, sensitivity and specificity at the cut-off point defined in the development data, Brier score, predictive accuracy, calibration belt (a graphical approach designed to evaluate the goodness of fit of binary outcome models) [23] and decision curve analysis. We examined how well the model could predict CKD stage 3b or more severe (eGFR < 45 mL/min/1.73 m2) during follow-up. We also compared the full model with simple models: the 3-variable KFRE and 3-variable elastic net.
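Of the validation measures listed, the Brier score is the simplest to state: it is the mean squared difference between the predicted probability and the observed binary outcome, so lower values indicate better predictions. A minimal sketch follows; the function name is illustrative and is not taken from the study's Stata code.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(p_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)
```

A model that predicts 0.5 for everyone scores 0.25, and a perfect model scores 0, which gives a rough scale for interpreting a reported value.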

Post hoc supplementary analysis

We combined data from the Aurum and Gold datasets for patients who initiated lithium treatment with a baseline eGFR ≥ 90 mL/min/1.73 m2. We adopted the same approach in this smaller cohort: we identified a high-risk trajectory group using group-based trajectory modelling and then used the full model, 3-variable KFRE and 3-variable elastic net to predict group membership. We also examined how well this model could predict CKD stage 3a or more severe (eGFR < 60 mL/min/1.73 m2) during follow-up. This analysis was completed to address issues arising from the strong association between baseline eGFR and future eGFR measurements in the initial model. All analysis was completed using Stata 16 [24].

Results

We identified 1609 patients in the development sample (Aurum cohort), with a median of 14 (IQR 7–26) eGFR test results each. The median length of lithium treatment was 1.42 years (IQR 0.53–3.58). Of these patients, 401 (24.92%) developed CKD stage 3a or more severe (eGFR < 60 mL/min/1.73 m2) and 38 (2.36%) developed CKD stage 3b (eGFR < 45 mL/min/1.73 m2), but none developed CKD stage 4. In total, 158 (9.82%) died during follow-up.
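The CKD outcome bands used throughout are defined purely by eGFR thresholds stated in the text (< 60, < 45 and < 30 mL/min/1.73 m2). As a small illustration, with a hypothetical function name and labels:

```python
def ckd_stage_band(egfr: float) -> str:
    """Map an eGFR value (mL/min/1.73 m2) to the outcome bands used in the text."""
    if egfr < 30:
        return "stage 4 or more severe"
    if egfr < 45:
        return "stage 3b"
    if egfr < 60:
        return "stage 3a"
    return "eGFR >= 60"


def percent(count: int, total: int) -> float:
    """Percentage rounded to two decimal places, as reported in the results."""
    return round(100 * count / total, 2)
```

Applied to the counts above, `percent(401, 1609)` and `percent(38, 1609)` reproduce the reported 24.92% and 2.36%.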

To categorise risk groups based on eGFR trajectories we chose a 5-group model, all groups with cubic trajectories (BIC = 3566.99). This defined 11.87% of the cohort as high risk. Models with 6 groups had lower BIC but included one group with less than 10% of the cohort.
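The selection rule applied here (lowest BIC among solutions in which every group holds at least 10% of the cohort) can be sketched as follows. The function name is hypothetical, it assumes the "lower BIC is better" convention used in the text, and the numbers in the usage note are made up for illustration; they are not the study's fitted values.

```python
from typing import Dict, List, Optional, Tuple


def choose_n_groups(fits: Dict[int, Tuple[float, List[float]]],
                    min_share: float = 0.10) -> Optional[int]:
    """Pick the number of groups with the lowest BIC among admissible solutions.

    `fits` maps a number of groups to (BIC, list of group proportions).
    A solution is admissible if its smallest group holds at least `min_share`.
    """
    admissible = {k: bic for k, (bic, shares) in fits.items()
                  if min(shares) >= min_share}
    if not admissible:
        return None
    return min(admissible, key=admissible.get)
```

For example, with illustrative fits `{5: (3600.0, [0.12, 0.20, 0.20, 0.24, 0.24]), 6: (3500.0, [0.08, 0.12, 0.20, 0.20, 0.20, 0.20])}`, the 6-group solution has the lower BIC but is rejected because one group falls below 10%, so 5 groups are chosen.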

Trajectories of the high-risk group vs the other groups (groups 2–5 combined) are shown in Fig. 1 and described in Table 1. Of those in the high-risk group, 168 (87.96%) developed CKD stage 3a or more severe and 25 (13.09%) developed stage 3b or more severe, compared to 233 (16.43%) and 13 (0.92%) respectively in the low-risk group.

Fig. 1 High-risk and low-risk eGFR trajectory in relation to end of lithium exposure in Aurum

Table 1 Characteristics of lithium prescribed patients by risk group in Aurum

Patients in the high-risk group were more likely to be female, younger at lithium initiation, have a lower eGFR before starting lithium and be obese. They were more likely to have a pre-existing diagnosis of migraine. They were more likely to have a record of high LDL cholesterol. They were less likely to have had an eGFR < 60 mL/min/1.73 m2 any time before baseline.

There was no statistical evidence of a difference in duration of lithium treatment and incidence of lithium toxicity (> 1.5 mmol/L) between groups. Those in the high-risk group were less likely to die during follow-up and had fewer eGFR tests in total.

We used 44 features known to the clinician prior to lithium initiation to generate a prediction model for being in the high-risk group. Elastic net with 10-fold cross-validation fitted a model with λ = 0.014 and α = 1.00. The ROC area was 0.868 (95%CI 0.844–0.891) (Fig. 2). The empirical optimal cut-point was 0.134, with a sensitivity of 0.86 (95%CI 0.78–0.94) and a specificity of 0.73 (95%CI 0.63–0.84). The Youden index was 0.589. This gave a prediction accuracy of 74.54% (Table 2).
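Youden's index is J = sensitivity + specificity − 1, and the empirical optimal cut-point is the threshold that maximises J. With the rounded values above, 0.86 + 0.73 − 1 = 0.59, in line with the reported 0.589. The sketch below shows one simple way to find such a cut-point; it assumes a patient is classified as high risk when the predicted probability is at or above the threshold, and the function name is illustrative.

```python
from typing import Sequence, Tuple


def youden_cut_point(y_true: Sequence[int],
                     scores: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (cut, sensitivity, specificity, J) for the cut that maximises J."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative outcome")
    best = None
    for cut in sorted(set(scores)):          # every observed score is a candidate
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cut)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < cut)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[3]:      # ties keep the lowest cut
            best = (cut, sens, spec, j)
    return best
```

Fixing this cut-point in the development data and reusing it unchanged in the validation data, as described in the Methods, avoids re-optimising the threshold on the data used to assess it.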

Fig. 2 Sensitivity vs specificity of the high-risk trajectory prediction model in Aurum

Table 2 Prediction of high-risk group membership

Features retained in the model were (in order of coefficient size): baseline eGFR, sex, sex by BMI group interaction, baseline eGFR by age interaction, hypothyroidism, migraine, BMI group, SSRI exposure, high LDL cholesterol, BAME, hyperthyroidism, smoking status, type 2 diabetes mellitus, and self-harm. The 3-variable KFRE and the 3-variable elastic net model performed similarly well to the full model: ROC area = 0.828 (95%CI 0.801–0.855) and ROC area = 0.852 (95%CI 0.827–0.876), respectively (Table 2).

External validation

The external validation data set (Gold cohort) included 934 individuals. We developed new trajectory groups independently for these patients. BIC in the group-based trajectory model was minimised by a 5-group solution, with cubic or quadratic polynomials fitted for each group trajectory; 3, 2, 2, 3, 3 respectively from “highest risk” to “lowest risk” groups (BIC = 1919.07). Of the total Gold cohort, 14.67% (n = 137) were in the high-risk group. Of the total cohort, 229 (24.52%) developed CKD stage 3a or more severe and 14 (1.50%) CKD stage 3b or more severe.

Patient characteristics by risk group are described in Table 3, and trajectories relative to end of lithium exposure are shown in Fig. 3. Of those in the high-risk group, 117 (85.40%) developed CKD stage 3a or more severe and 14 (10.22%) developed stage 3b or more severe, compared to 112 (14.05%) and < 5 respectively in the low-risk group.

Table 3 Characteristics of lithium prescribed patients by risk group in Gold

Fig. 3 High-risk and low-risk eGFR trajectory in relation to stopping lithium in Gold

As with the Aurum cohort, patients in the high-risk group were more likely to be female, be younger and have a lower eGFR before starting lithium, and less likely to have a prior record of eGFR < 60 mL/min/1.73 m2. High-risk individuals were also more likely to experience migraine. Unlike the Aurum cohort, the high-risk group were more likely to have anaemia and less likely to have hypertension. There was no between-group difference in lithium duration, and lithium toxicity was potentially more common in the low-risk group.

We predicted high-risk group membership using the model generated in the Aurum data set. The ROC area was 0.879 (95%CI 0.853–0.904) (Table 2, Fig. 4). At the empirical optimal cut-point defined in the development dataset, the model had a sensitivity of 0.91 (95%CI 0.84–0.97) and a specificity of 0.74 (95%CI 0.67–0.82). The Brier score was 0.0967. This gave a predictive accuracy of 76.55%. However, the simpler models also predicted high-risk group membership similarly well: 3-variable KFRE ROC area = 0.870 (95%CI 0.841–0.898), 3-variable elastic net ROC area = 0.888 (95%CI 0.864–0.912) (Eq. 1). The calibration plot suggested that the model performs well up to a probability of 0.60 at the 95% confidence level; the calibration slope was 1.29 and calibration-in-the-large was 0.41 (Fig. 5).

Fig. 4 Sensitivity vs specificity of the high-risk trajectory prediction model in Gold

Fig. 5 Calibration plot for Gold data set (20 groups across risk spectrum)

Eq. 1 3-variable elastic net model (equation not reproduced)

We also predicted CKD 3b or more severe using these models: ROC area = 0.849 (95%CI 0.792–0.905), ROC area = 0.865 (95%CI 0.808–0.922), ROC area = 0.858 (95%CI 0.792–0.922) using the full model, 3-variable elastic net and 3-variable KFRE respectively (Table 4).

Table 4 Prediction of CKD stage 3b or more severe

The decision curve analysis showed that all 3 of these models were superior to classifying everyone as high risk or low risk between a threshold probability of 0.10 and 0.80 and there was little difference between them (Fig. 6).
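Decision curve analysis compares models on net benefit, which at a threshold probability p_t is TP/n − (FP/n) × p_t/(1 − p_t), against the two default strategies of classifying everyone as high risk or no one as high risk (net benefit 0). A minimal sketch of these quantities is given below; the function names are illustrative and the implementation is not taken from the study.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], p_pred: Sequence[float],
                threshold: float) -> float:
    """Net benefit of classifying as high risk when predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of classifying everyone as high risk at the given threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the comparison summarised by the decision curve.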

Fig. 6 Decision curve analysis for Gold data set

Post hoc supplementary analysis

In 668 patients with a baseline eGFR ≥ 90 mL/min/1.73 m2 a two-group cubic trajectory model minimised the BIC (642.39) with 120 patients (17.96%) in the high-risk group (Table 5, Fig. 7). CKD stage 3a or more severe and stage 3b or more severe were more common in the high-risk group. Individuals in the high-risk group were again more likely to be female, be younger and have a lower eGFR before starting lithium. They were more likely to be current smokers. We did not observe any of the other between group differences present in the Aurum or Gold trajectory groups.

Table 5 Characteristics of lithium prescribed patients by risk group in patients with baseline eGFR ≥ 90 mL/min/1.73 m2

Fig. 7 High-risk and low-risk eGFR trajectory in relation to stopping lithium in patients with baseline eGFR ≥ 90 mL/min/1.73 m2

In this reduced dataset, our full model was better at predicting high-risk group membership than the 3-variable KFRE model (ROC area = 0.725; 95%CI 0.675–0.776 vs ROC area = 0.667; 95%CI 0.617–0.716; p value for equality = 0.0018), but not the 3-variable elastic net model (ROC area = 0.729; 95%CI 0.679–0.779; p value for equality 0.5846) (Table 6, Fig. 8).

Table 6 Prediction in individuals with baseline eGFR ≥ 90 mL/min/1.73 m2

Fig. 8 Comparison of ROC areas between full model, 3-variable model and 3-variable KFRE model

Discussion

As far as we are aware, this is the first model developed to predict high risk of future eGFR decline in people with bipolar disorder prescribed lithium. We used a large representative sample of people with bipolar disorder initiated on lithium and followed up for a median of 7.10 years (IQR 3.85–11.36). It is also the first study to use the two CPRD datasets, covering a large, representative sample of the UK population, to develop a prediction model and provide external validation.

Because of the rarity of kidney failure, and the varied follow-up time and eGFR recording frequency in EHRs, we sought to identify approximately 10% of individuals prescribed lithium who were at highest risk of deteriorating kidney function, defined by the trajectory of their serial eGFR measurements. Using group-based trajectory analysis we identified high-risk groups independently in the Aurum and Gold cohorts. In both cases approximately 85% of those categorised as high risk developed CKD stage 3a or more severe compared to approximately 15% in the low-risk groups. Approximately 10% of those identified as high risk developed CKD stage 3b or more severe, compared to < 1% in the low-risk group. A number of features differed between the high-risk and low-risk groups in both cohorts. Those in the high-risk groups were more likely to be female, younger, more likely to have a lower eGFR before starting lithium, more likely to experience migraine and less likely to have a prior record of eGFR < 60 mL/min/1.73 m2. These CKD risk factors have been previously identified. CKD is more common in women [25], and this has been shown in lithium users specifically [26]. Younger women appear to be at particular risk [26]. Low baseline eGFR increases risk of CKD in the general population [27]. Migraine is not commonly thought of as a risk factor for CKD, but has been identified as such in one study, especially in younger age groups [28]. Migraine is highly comorbid with bipolar disorder [29] and it may also be a proxy for medication use which could impair kidney function. In both cohorts, there was no association between duration of lithium treatment or lithium toxicity (which was rare) and high-risk group membership.

Our model, developed in CPRD Aurum to predict whether individuals were in a high-risk group for eGFR decline during treatment with lithium for bipolar disorder, had excellent discrimination in the CPRD Gold cohort (ROC area = 0.879). However, simple models only including sex, age and baseline eGFR performed similarly well (3-variable KFRE ROC area = 0.870, 3-variable elastic net ROC area = 0.888), all with similar levels of accuracy (> 75%). In the external validation data set, our model designed to predict high-risk trajectory predicted CKD stage 3b or more severe (eGFR < 45 mL/min/1.73 m2) with a ROC area = 0.849. Again, simple models performed well, with the 3-variable KFRE having the highest accuracy (78%).

When we reduced our cohort to those starting lithium with an eGFR ≥ 90 mL/min/1.73 m2 our full model and 3-variable elastic net model performed better than the KFRE. However, our sample was too small to complete external validation of these new models.

Given these findings, simple risk calculators should be used in clinical practice at the decision to commence lithium and when eGFR is regularly measured. This could be the KFRE or our 3-variable elastic net model (Eq. 1), which performs better than the 3-variable KFRE when eGFR ≥ 90 mL/min/1.73 m2. We did not find predictors of eGFR decline that were specific to lithium-treated patients.

Strengths and limitations

Our large, population-based longitudinal sample avoided selection bias and is generalisable and representative. Our group-based trajectory approach avoided issues with differential follow-up time and potential surveillance bias. Our use of elastic net allowed us to build a parsimonious prediction model from a large number of potential features.

The study has a number of limitations. Instead of using precise eGFR values to define outcome, we split individuals into those with a high-risk and low-risk trajectory. We forced the group-based trajectory model to identify > 10% of individuals prescribed lithium who were at the highest risk of eGFR decline. A more useful model clinically would be to predict true kidney failure requiring intervention; however, this was too rare in our cohort (2% developed CKD stage 3b or more severe), suggesting it is uncommon in modern clinical practice. We may also have been limited by the relatively short duration of lithium prescribing for many individuals included in the study. We included a large number of potential predictors. However, we may not have included all important features in our elastic net model. Some variables of interest, such as proteinuria, were poorly recorded and biased by diabetes diagnosis. We did not include the broad range of drugs for physical health problems that may influence eGFR, but we did include many physical health conditions for which these drugs would be indicated.

It is possible that we failed to identify people previously exposed to lithium, but we attempted to limit this by ensuring patients had at least a year of follow-up at the same primary care practice before their first identified lithium prescription. In most cases this would also include the uploading of historical records to the EHR. Patients could also be misclassified in terms of different features included in the model. However, our intention was to build a model based on what is already known about the patient from the EHR. Misclassification may be more likely for some features (for example, in a relationship) than others (for example, chronic obstructive pulmonary disease).

We initially planned to develop a model for individuals with essentially normal kidney function (eGFR ≥ 60 mL/min/1.73 m2). Although the discrimination and calibration of the model was good, a simple model based on baseline eGFR, age and sex performed just as well.

Conclusion

We developed a model for predicting, at lithium initiation, individuals at high risk of a poor trajectory of kidney function using serial eGFR measurements. We externally validated this model, which had excellent discrimination and good calibration. CKD stage 3b or more severe occurred in 2% of the population across the two cohorts. Whilst this is worrying, it means that the vast majority of patients treated with lithium do not develop kidney failure, and those at risk can be identified prior to initiating lithium using their age, sex and baseline eGFR.