Introduction

The prevalence of chronic kidney disease (CKD) in the Japanese adult population is approximately 13%, with most of these individuals in stage 3 [1]. Progression of CKD is associated with increased healthcare costs, decreased patient quality of life, and greater risk for cardiovascular (CV) and all-cause death [2,3,4,5,6].

Anemia of CKD affects approximately 26% of Japanese patients with stage 3 CKD, increasing to greater than half of patients with stage 5 CKD [7]. Unfortunately, because anemia of CKD is frequently asymptomatic, patients often fail to seek medical care, especially during the early stages of CKD [8,9,10,11,12,13]. Undertreatment of anemia of CKD, particularly in patients with CV diseases and/or DM, is associated with increased rates of blood transfusions, hospitalization, and death [14,15,16]. Recognition and treatment of anemia of CKD in patients who progress to stage 3 CKD can improve clinical outcomes, including delaying the need for renal replacement therapy [12, 13, 17,18,19].

Anemia of CKD can be transient in nature, resolving with improvement in CKD management [20]. A post-hoc analysis of the Chronic Renal Insufficiency Cohort (CRIC) study, using marginal structural modeling (MSM) to account for time-dependent confounding, identified an increased risk of incident end-stage kidney disease (ESKD) and death with anemia in mild and moderate CKD in a United States (US) population [21]. However, most prior studies have evaluated the risk of anemia based on the anemia status at baseline [16, 22,23,24,25]. The purpose of this study was to determine the causal effect of time-dependent anemia status on renal and CV outcomes and mortality in community-dwelling subjects in Japan at the beginning of impaired renal function. Because anemia status can change over time, and a prior anemia status can affect the potential confounders of the subsequent anemia status (e.g., anemia treatment), the effect of time-varying anemia status was estimated using a counterfactual modeling approach [26]. This model hypothesizes that the risk of adverse outcomes would differ between a subject who was anemic for the entire follow-up period and a subject who was not anemic at any time during follow-up.

Materials and methods

Study design and data source

This was a retrospective cohort study using annual health checkup data linked to inpatient and outpatient medical claims and pharmacy claims data from the JMDC [27]. The JMDC has over 3 million unique beneficiaries (and their dependents), aged 18–74 years, who were enrolled in one of over 100 Japanese insurance unions. This study obtained approval from the Astellas Medical Affairs Japan Regional Protocol Review Committee (MAJ-PRC) under a unique identifier code or international study number (ISN): 1517-MA-3316.

Study population and sub-cohorts

Data between January 2005 and June 2019 from subjects with at least two serum creatinine (SCr) measurements were extracted. Estimated glomerular filtration rates (eGFRs) were calculated for each subject using a validated formula for the Japanese population [28]. We identified the first consecutive pair of eGFRs for a subject, within a 2-year timeframe, in which an eGFR ≥ 60 mL/min/1.73 m2 was followed by an eGFR < 60 mL/min/1.73 m2. Each subject’s first date with an eGFR < 60 mL/min/1.73 m2 was defined as the index date. Subjects aged ≥ 18 years as of the index date who had ≥ 1 year of a pre-index lookback period, ≥ 2 years of a follow-up period (unless deceased), and a hemoglobin (Hb) value and dipstick proteinuria examination result at the index date were eligible for inclusion. Subjects were excluded if at least one of these criteria was met: first available eGFR in the database was already < 60 mL/min/1.73 m2; all available eGFRs ≥ 60 mL/min/1.73 m2; history of chronic dialysis, kidney transplantation, or eGFR < 6 mL/min/1.73 m2 at or before the index date; or no eGFR within 38 months after the index date. Subjects were followed up to 30 June 2019 or disenrollment from the JMDC, whichever occurred first.
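The index-date rule described above can be illustrated with a minimal Python sketch. It assumes that the cited formula is the widely used Japanese three-variable equation (194 × SCr^−1.094 × age^−0.287, multiplied by 0.739 for females) and that the 2-year timeframe corresponds to 730 days; the function names `egfr_japan` and `find_index_date` are hypothetical and are not taken from the study code.

```python
from datetime import date

def egfr_japan(scr_mg_dl, age_years, female):
    """eGFR (mL/min/1.73 m2) from the Japanese three-variable equation.

    Assumption: this is the formula referred to in the text.
    """
    egfr = 194.0 * scr_mg_dl ** -1.094 * age_years ** -0.287
    return egfr * 0.739 if female else egfr

def find_index_date(measurements, max_gap_days=730):
    """Return the index date for one subject, or None if not eligible.

    measurements: list of (date, eGFR) tuples for the subject.
    The index date is the date of the first eGFR < 60 that directly
    follows an eGFR >= 60 within max_gap_days (assumed 2 years).
    """
    ordered = sorted(measurements)
    if not ordered or ordered[0][1] < 60:
        # First available eGFR already < 60: excluded per the text.
        return None
    for (d_prev, e_prev), (d_next, e_next) in zip(ordered, ordered[1:]):
        gap = (d_next - d_prev).days
        if e_prev >= 60 and e_next < 60 and gap <= max_gap_days:
            return d_next
    return None
```

A subject whose eGFRs all remain ≥ 60 mL/min/1.73 m2 never produces a qualifying pair, so the function returns None, in line with the exclusion criteria. The remaining eligibility checks (lookback period, follow-up, Hb and proteinuria at index) are not covered by this sketch.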

In addition to the full cohort, two sub-cohorts were evaluated: subjects with a history of CV disease (CV sub-cohort) and subjects who had DM at baseline (DM sub-cohort). A history of CV disease was defined as having previous myocardial infarction, (congestive) heart failure, peripheral vascular disorders, or cerebrovascular disorders, according to International Classification of Diseases, Tenth Revision (ICD-10) algorithms for Charlson Comorbidity Index (CCI) [29, 30]. A history of DM was defined according to the CCI ICD-10 algorithms for diagnosis codes of “diabetes, uncomplicated” or “diabetes, complicated,” and/or with antidiabetic treatment/prescription during the 3 months prior to enrollment, or an HbA1c value ≥ 6.5%. The relevant code lists are available in Supplemental Table 1 and Supplemental Table 2.

Exposure

Anemia was defined by the age-sex specific Hb value according to the “2015 Japanese Society for Dialysis Therapy: Guidelines for Renal Anemia in Chronic Kidney Disease” (Table 1) [31]. In the baseline model, anemia status was determined using the Hb value at index date. In MSM, anemia status was updated using the annual health checkup data.

Table 1 Hemoglobin value (g/dL) criteria for anemia definition

Outcomes

The composite renal outcome was ≥ 30% reduction of eGFR over 3 years from the baseline, SCr doubling, progression to chronic dialysis, receipt of kidney transplantation, or eGFR < 15 mL/min/1.73 m2 [5, 32,33,34]. The composite CV outcome was fatal and non-fatal unstable angina, myocardial infarction (MI), heart failure, or cerebrovascular event (Supplemental Table 1). All-cause death was defined either as a reason for withdrawal from a health insurance plan or recorded as an outcome of the condition.
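As a rough illustration of how the components of the composite renal outcome combine, the following Python sketch flags an event when any one criterion is met at a follow-up visit. It is a simplification under stated assumptions: the 3-year window for the eGFR reduction and the claims-based identification of dialysis and transplantation are not modeled, and the function name `renal_outcome` and its parameters are hypothetical.

```python
def renal_outcome(baseline_egfr, baseline_scr, egfr, scr,
                  dialysis=False, transplant=False):
    """Return True if any component of the composite renal outcome is met.

    Assumption: egfr and scr are values at a follow-up visit that falls
    inside the relevant observation window.
    """
    egfr_drop = (baseline_egfr - egfr) / baseline_egfr
    return (
        egfr_drop >= 0.30              # >= 30% reduction of eGFR from baseline
        or scr >= 2.0 * baseline_scr   # SCr doubling
        or dialysis                    # progression to chronic dialysis
        or transplant                  # receipt of kidney transplantation
        or egfr < 15.0                 # eGFR < 15 mL/min/1.73 m2
    )
```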

Statistical methods

Two models were applied. As a reference model, we assessed the association between baseline anemia status and each outcome of interest (baseline risk model). Subjects were allocated into anemic vs. non-anemic groups, and crude incidence rates for each outcome were calculated. Imbalances between the groups were adjusted using stabilized inverse probability weights (IPWs) [35]. IPWs were estimated separately for the CV and DM sub-cohorts. Additional details regarding IPW estimation are provided in the Supplemental Methods. The distribution of IPWs is provided in Supplemental Fig. 1. For subjects with missing covariates, simple imputation was performed as described in the Supplemental Methods. Unweighted and weighted Kaplan–Meier (KM) estimates were descriptively compared between the anemia and non-anemia groups, and adjusted hazard ratios (aHRs) were estimated using the Cox proportional hazards model.

Subsequently, we developed an MSM to incorporate the dynamic change in anemia status and factors influencing Hb values during the follow-up. The stabilized IPW was estimated using a pooled logistic model [36] as follows:

$$SW_{iK} = \prod_{k=0}^{K} \frac{P\left(A_{k} = a_{k,i} \mid \overline{A}_{k-1} = \overline{a}_{k-1,i}\right)}{P\left(A_{k} = a_{k,i} \mid \overline{A}_{k-1} = \overline{a}_{k-1,i},\ \overline{C}_{k} = \overline{c}_{k,i}\right)},$$
(1)

where i: ith subject.

k: kth day from index date (k = 0, …, K).

\(\overline{a}_{k-1,i}\): Exposure history up to time k-1 of subject i.

\(\overline{c}_{k,i}\): History of time-varying covariates up to time k of subject i.

\(\overline{A}_{-1}\) was defined to be 0. Note that in the special case in which k = 0 (at the index date), baseline covariates alone were used to estimate the IPW.

The time-varying intercept was estimated using a smooth function of the time since the index date using natural cubic splines with five knots. To do this, we added three terms as regressors that are specific polynomial functions of time (calculated with the cubic splines SAS macro RCSPLINE in survrisk.pak, by Frank Harrell, which is publicly available at http://jse.stat.ncsu.edu/70/1s/software/sas). The variables used to estimate the IPW are listed in Supplemental Table 3. We fit the marginal structural Cox model using the IPW estimated with Eq. (1). The robust sandwich variance estimator was used to obtain variance estimates by accounting for the induced correlation among weighted observations [37]. aHRs (anemia vs. no anemia) were estimated using an MSM, and survival curves were developed based on the Breslow estimator [38]. Survival probabilities with 95% confidence intervals (CIs) were derived at each year based on the corresponding survival curves. The IPW was estimated separately for the CV and DM sub-cohorts, and aHRs and survival curves were estimated using the same approach.
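The cumulative product in Eq. (1) can be sketched in Python as follows. The sketch assumes that the two pooled logistic models have already been fitted and that their predicted probabilities of anemia at each time point are available for one subject; the function name `stabilized_weights` and its argument names are hypothetical, and model fitting itself is outside the scope of the example.

```python
def stabilized_weights(exposure, p_num, p_den):
    """Cumulative stabilized inverse probability weights for one subject.

    exposure: observed anemia status (0/1) at each time point k = 0..K.
    p_num[k]: predicted P(A_k = 1 | exposure history), numerator model.
    p_den[k]: predicted P(A_k = 1 | exposure and covariate history),
              denominator model.
    Returns the running product of the ratio up to each time point.
    """
    weights = []
    sw = 1.0
    for a, pn, pd in zip(exposure, p_num, p_den):
        # Probability of the exposure level that was actually observed.
        num = pn if a == 1 else 1.0 - pn
        den = pd if a == 1 else 1.0 - pd
        sw *= num / den
        weights.append(sw)
    return weights
```

When the covariates carry no information about exposure beyond the exposure history, the numerator and denominator coincide and every weight equals 1, which is why stabilized weights tend to be centered near 1.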

The slope of eGFR was calculated by fitting a simple linear regression over time. Greedy nearest neighbor one-to-one propensity score matching was performed within a 0.25 caliper. The logit of the propensity score was used in computing differences between pairs of observations. Propensity scores were estimated using logistic regression models. Additional details are provided in the Supplemental Methods.
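The two steps in this paragraph, a per-subject least-squares slope and greedy one-to-one matching on the logit of the propensity score, can be sketched in Python as below. Several details are assumptions: the caliper of 0.25 is applied directly on the logit scale (if it was defined in units of the standard deviation of the logit, the threshold would need to be scaled accordingly), subjects with anemia are processed in the order given, and the function names are hypothetical.

```python
import math

def egfr_slope(times_years, egfr_values):
    """Least-squares slope of eGFR over time (simple linear regression)."""
    n = len(times_years)
    mean_t = sum(times_years) / n
    mean_y = sum(egfr_values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_years, egfr_values))
    return sxy / sxx

def logit(p):
    return math.log(p / (1.0 - p))

def greedy_match(ps_anemic, ps_non_anemic, caliper=0.25):
    """Greedy nearest-neighbor 1:1 matching on the logit propensity score.

    ps_anemic, ps_non_anemic: dicts mapping subject id -> propensity score.
    Returns a list of (anemic_id, non_anemic_id) pairs; each control is
    used at most once, and pairs farther apart than the caliper are dropped.
    """
    available = {cid: logit(p) for cid, p in ps_non_anemic.items()}
    pairs = []
    for tid, p in ps_anemic.items():
        if not available:
            break
        target = logit(p)
        best = min(available, key=lambda c: abs(available[c] - target))
        if abs(available[best] - target) <= caliper:
            pairs.append((tid, best))
            del available[best]  # matching without replacement
    return pairs
```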

Sensitivity analyses

Two sets of sensitivity analyses were performed for the aHR estimation: (1) treating death as a competing risk for the renal outcomes by applying Fine and Gray’s sub-distribution hazard model [39], and (2) replacing weights > 99th percentile with the 99th percentile weight and weights < 1st percentile with the 1st percentile weight to assess the impact of extreme IPWs. Another sensitivity analysis was performed for estimation of the slope of eGFR by restricting the analysis to subjects whose first two post-index eGFR values were < 60 mL/min/1.73 m2, as a proxy for stage 3 CKD or worse. Finally, in addition to the MSM using binary anemia status as a time-dependent exposure, another MSM using quintile categories of baseline Hb level was developed by sex to assess the causal effect of categorical Hb levels on the renal outcomes. A multinomial logistic regression model was fitted to estimate weights. The fifth (highest) Hb level category was set as the reference group.
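The weight truncation in sensitivity analysis (2) can be illustrated with a short Python sketch. The percentile definition is an assumption (linear interpolation between order statistics; statistical packages offer several definitions that can give slightly different cut-offs), and the function names are hypothetical.

```python
def percentile(sorted_values, q):
    """Percentile q (0-100) of an ascending list, by linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

def truncate_weights(weights, lower_q=1.0, upper_q=99.0):
    """Replace weights below/above the given percentiles with the cut-offs."""
    ordered = sorted(weights)
    lower = percentile(ordered, lower_q)
    upper = percentile(ordered, upper_q)
    return [min(max(w, lower), upper) for w in weights]
```

Truncation trades a small amount of bias for reduced variance, so comparing aHRs with and without it indicates how strongly a few extreme weights drive the estimates.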

Results

Subject characteristics

Of the 32,870 subjects included in the study, 4,527 and 5,585 comprised the CV and DM sub-cohorts, respectively (Supplemental Fig. 2). Subjects were excluded because they started chronic dialysis before enrollment (n = 2), experienced kidney transplantation before enrollment (n = 2), had eGFR < 6 mL/min/1.73 m2 at enrollment (n = 5), or had no SCr record within 38 months from the index date (n = 900). The earliest index year was 2008. Subject follow-up time is provided in Supplemental Table 4.

The median age was 52 years, and < 20% of subjects were aged ≥ 60 years (Table 2). Median eGFRs at pre-index and index were 65 and 58 mL/min/1.73 m2, respectively. Underlying conditions associated with anemia development (e.g., malignancy and/or chemotherapy) were rare at baseline. Median Hb at index was 14.8 g/dL, and 4.2% of subjects had anemia at baseline. Additional baseline characteristics are provided in Supplemental Table 5.

Table 2 Demographics in total population

Compared with the total population, subjects in the CV and DM sub-cohorts were numerically older, more frequently male, more commonly had anemia, and had higher BMIs (Supplemental Table 6). In these sub-cohorts, eGFRs at index were similar but subjects had more proteinuria.

Baseline characteristics by baseline anemia status

Unweighted and weighted subject characteristics are presented in Table 3. Considerable differences in subjects with anemia at baseline in the unweighted population, such as age, eGFR, proteinuria, HbA1c, smoking status, and CCI score, were balanced after weighting. Anemia was infrequently treated in both groups before and after weighting. Approximately half of the subjects with anemia had an Hb value less than 12 g/dL.

Table 3 Weighted and unweighted subject characteristics by status of baseline anemia in total population

The mean (SD) follow-up for the total study cohort was 4.1 (1.85) years. Data are presented through the sixth year because by the third, fourth, and fifth years, the number of subjects with available eGFR and/or Hb values was reduced to one-half, one-third, and one-fifth of the original size, respectively. Mean eGFRs remained at approximately 60 mL/min/1.73 m2 throughout the follow-up period regardless of baseline anemia status (Fig. 1A). Hb values in male subjects with anemia at baseline were consistently lower than those in non-anemic subjects (Fig. 1B). Hb values in female subjects with anemia at baseline gradually increased, and the difference from subjects without anemia became smaller over time (Fig. 1C).

Fig. 1 Change in eGFR (a) and hemoglobin values in males (b) and females (c) by anemia status at baseline. eGFR estimated glomerular filtration rate, Hb hemoglobin

Incidence rate of, and adjusted hazard ratios for, renal and cardiovascular outcomes and mortality

Subjects with anemia at baseline had higher incidence/1000 patient years of mortality and renal and CV outcomes (Table 4). Similar trends were also observed in the CV and DM sub-cohorts (Table 4). The most frequently observed renal outcome was a ≥ 30% decline in eGFR (n = 191, 0.58%) (Supplemental Table 7).

Table 4 Incidence/1000 patient-years of renal, cardiovascular, and mortality outcomes

Anemia at baseline and time-dependent anemia status were each independently associated with a higher risk of renal and CV outcomes and mortality (Table 5). The MSM analyses showed clearer associations than the baseline risk models. In the DM sub-cohort, anemia was not associated with elevated risk for CV outcomes. In the CV sub-cohort, the risk of renal outcomes became marginal when the competing risk of death was taken into consideration.

Table 5 Adjusted hazard ratios for renal and cardiovascular outcomes and mortality

Unweighted and weighted KM curves from the baseline risk model and survival curves from the MSM in the entire cohort are shown in Fig. 2. Table 6 shows KM estimates and Breslow estimators at Years 1, 3, and 6. Differences between subjects with and without anemia that existed for all three outcomes in the unweighted baseline risk models were minimal and became less apparent after balancing between the groups by weighting. In the MSM, differences between subjects with and without anemia were observed for all outcomes, but the absolute differences were also small.

Fig. 2 Survival curves for outcomes by anemia at baseline: renal (a) unweighted KM, (b) weighted KM, (c) survival curves from MSM; cardiovascular (d) unweighted KM, (e) weighted KM, (f) survival curves from MSM; mortality (g) unweighted KM, (h) weighted KM, (i) survival curves from MSM. Anemic group: blue lines; non-anemic group: red lines. KM Kaplan–Meier, MSM marginal structural model

Table 6 Survival estimates for renal and cardiovascular outcomes and mortality at 1, 3, and 6 years from the baseline by anemia status

For renal outcomes, proteinuria was the most significant risk factor (aHR 7.102, 95% CI 5.216–9.670). Baseline anemia, current smoking status, baseline HbA1c, and CCI score were also positively associated with renal outcomes, and females had lower risk compared with males. For CV outcomes, CV history was the most significant risk factor (aHR 2.503, 95% CI 2.164–2.895). Baseline anemia, proteinuria, CCI score, HbA1c, and current smoking status were also positively correlated with CV outcomes, and females had lower risk compared with males. For mortality, baseline anemia was the most significant risk factor (aHR 3.440, 95% CI 2.389–4.955). Proteinuria, current smoking, and comorbidity score were also positively associated with mortality (Supplemental Fig. 3). When we used quintile categories of baseline Hb level as a time-dependent exposure instead of binary anemia status, a similar causal association was observed for renal outcomes (Supplemental Fig. 4). In males, the lower the Hb, the higher the risk of renal outcomes. In females, when the fifth (highest) Hb level category was set as a reference category, females in all the other categories tended to have a higher risk of renal outcomes (Supplemental Table 8).

The mean slope difference between subjects with and without anemia at baseline was -0.7 mL/min/1.73 m2/year in the baseline analysis and -0.9 mL/min/1.73 m2/year in the sensitivity analysis; the slope of eGFR, based on anemia status at baseline, is presented in Supplemental Table 9.

Discussion

Both anemia at baseline and time-varying anemia status were independent risk factors for renal and CV outcomes and mortality in community-dwelling subjects in Japan at the beginning of impaired renal function. Because a time-varying exposure (i.e., anemia) and its confounders can be affected by other time-varying variables (e.g., prior treatment) and can subsequently mediate the effect on the outcomes of interest, traditional methods for controlling this confounding may not be adequate [40]. MSM allows exposure status to change over time (rather than fixing anemia status at baseline) and accounts for confounding variables that change over time throughout a longitudinal study, better reflecting the actual clinical setting.

In this study, renal outcomes were defined as a composite of ≥ 30% reduction of eGFR over 3 years from the baseline, SCr doubling, progression to chronic dialysis, receipt of kidney transplantation, or eGFR < 15 mL/min/1.73 m2. Most (91%) of the subjects with renal outcomes experienced eGFR decline. This is a validated surrogate endpoint in both early- and late-stage CKD [5, 32,33,34], which has been associated with renal replacement therapy use and can be measured in a shorter follow-up period. We confirmed that anemia, defined as a binary variable and along a five-category continuum, was associated with this endpoint in community-dwelling subjects in Japan at the very early stage of renal impairment using real-world data. Additionally, eGFR slope appears to be a reasonable surrogate for clinical endpoints in CKD, which was observed in this study and in a recent meta-analysis [41]. In this study, the mean slope difference between subjects with and without anemia at baseline was −0.7 (or −0.9 in the sensitivity analysis). A meta-analysis suggested that a 0.75 mL/min/1.73 m2/year greater treatment effect on the total eGFR slope was associated with an average 27% lower hazard for the clinical endpoint (95% Bayesian credible interval, 20% to 34%). A −0.7 (or −0.9) mL/min/1.73 m2/year eGFR slope can be translated to an average 34% (or 43%) higher hazard for the clinical endpoint. Therefore, the observed difference in the current study may indicate a clinically meaningful acceleration in eGFR decline that is influenced by baseline anemia. Prior studies have observed an association between baseline anemia and CKD progression [21, 42,43,44,45]. Saraf et al. also used MSM and observed a strong association between anemia and incident ESKD with a longer median follow-up time (7.8 years), corroborating our assertion that anemia causally and negatively impacts renal outcomes [21].
However, the difference in survival probability (Breslow estimator) was merely 1.1% at Year 6 (98.2% vs. 99.3%), suggesting the clinical impact of this difference remains uncertain. Proteinuria was the most significant risk factor for renal outcomes, though baseline anemia was also positively associated, which is supported by a community-based cohort study [46]. Because the mechanisms by which anemia affects renal outcomes remain predominately theoretical, these hypothesis-generating findings and interventions to mitigate these negative effects are areas for additional research.

The relative risk of time-varying anemia status on CV outcomes was not as pronounced when compared with mortality or renal outcomes, even in the CV and DM sub-cohorts. This aligns with previous studies showing associations between baseline anemia and increased risk of coronary heart disease, stroke, and all-cause death [4, 47, 48]. The difference in Breslow estimators between anemic and non-anemic groups became larger over time with no difference observed for KM estimates, suggesting that longer periods of being anemic increased the likelihood of CV events.

The higher risk of death observed in subjects with anemia corroborated findings from a previous study in US CKD subjects [45] and a study of almost 63,000 Japanese subjects with varying degrees of renal function [49], but not those of a previous study using MSM [21]. The lack of association in that study may have occurred because its subjects had higher-stage CKD than those in this study, predominately received anemia treatment, and were at greater risk for mortality from other causes, which may have resulted in residual and unmeasured confounding.

The current study has potential limitations. Because the censoring and disenrollment from the insurance program may have resulted from renal dysfunction development requiring dialysis initiation and the inability for some subjects to continue to work, the risk of renal outcomes may be underestimated. However, in this cohort, the most frequently observed renal outcome was a ≥ 30% decline in eGFR, and initiation of chronic dialysis was limited, which suggests minimal to no effect from disenrollment because of this scenario. Although missing annual health checkup data may not have occurred completely at random and imputation methods cannot truly compensate for missing values, minimal differences were found when data were compared before and after imputation. Additionally, this study used a validated case definition for myocardial infarction but not for the other CV outcomes. The current study combined diagnosis with specific prescriptions or procedures and diagnosis records in inpatient claims in a thorough and reasonable manner though the validity of individual case definitions is unknown.

Conclusions

Time-varying anemia status was associated with increased risk of renal and CV outcomes and higher mortality. Because anemia status may be transient and can spontaneously recover at the very early stage of CKD, early detection and treatment of anemia may help delay further decline in renal function and reduce the development of other negative sequelae.