White matter brain age as a biomarker of cerebrovascular burden in the ageing brain

As the brain ages, it almost invariably accumulates vascular pathology, which differentially affects the white matter. The microstructure of the white matter may therefore reveal a brain age reflecting cerebrovascular disease burden and a relationship to vascular risk factors. In this study, a white matter specific brain age was developed from diffusion weighted imaging (DWI) using a three-dimensional convolutional neural network (3D-CNN) deep learning model in both cross-sectional data from UK biobank participants (n = 37327) and a longitudinal subset (n = 1409) with an average of 2.25 years follow up. We achieved a mean absolute error (MAE) of white matter brain age prediction of 2.84 years and a Pearson's r of 0.902 with chronological age in the test participants. The average white matter brain age gap (WMBAG) of the baseline 1409 participants with repeated scans were 0.36 {+/-} 0.11 years younger than that of other participants in the baseline test sample with single time-point MRI scan (n = 9759). Individual vascular risk factors and the cumulative vascular risk score were significantly correlated with greater WMBAG. Obesity was observed to be correlated with WMBAG only in the male participants. Participants with one, two, and three or more vascular risk factors, compared to those without any, showed an elevated WMBAG of 0.54, 1.23, and 1.94 years, respectively. Baseline WMBAG was also associated significantly with processing speed, executive and global cognition after Bonferroni correction. The significant associations of diabetes and hypertension with poor processing speed and executive function were found to be mediated through the WMBAG. However, the vascular risk factors did not associate with the two-year change in WMBAG in the longitudinal dataset. Our analysis suggests that tissue-specific brain age can be successfully targeted for the examination of the most relevant risk factors and cognition, although longer-term longitudinal data are needed to demonstrate its dynamic characteristics. The results suggest an intriguing possibility that a white matter brain age gap can serve as a potential neuroimaging biomarker for an individual's cerebrovascular ageing process.


INTRODUCTION
Brain ageing is a complex biological process in middle-to-late aged individuals, accompanied by changes at all levels from molecules and morphology to advanced brain functions 1 .
Exposure to different hazardous vascular risk factors such as hypertension, diabetes, and hypercholesterolemia aggravate the vascular burden and accelerate the cerebrovascular ageing trajectory.Given that different organs or systems usually demonstrate heterogeneous ageing rates, an individual might have multiple underlying bodily ages 2 , such as bone age, renal age, and lung age, in addition to their chronological age.Brain age is a special case in this context, and it arguably reflects the brain's biological status.Previous studies have investigated brain age using high-dimensional neuroimaging data in a machine learning framework for healthy populations 3,4 or people with specific brain diseases 5,6 .Brain age gap (BAG) is the difference calculated by subtracting the chronological age from predicted brain age, which describes how one's brain health deviates from what would be expected for the chronological age.Positive brain age gap (i.e., an older predicted brain age than chronological age) was usually reported to be associated with brain disorders such as Alzheimer's Disease (AD) 7 , multiple sclerosis 8 , and schizophrenia 6 .
Different anatomical locations and brain tissue types of the brain show different vulnerability to vascular risk factors 9,10 .Due to the 'outside in' vascularisation pattern 11 with key regulating vessels outside the brain parenchyma, the number of arterioles supplying grey matter are considered eight times more than that in white matter 12 .As a result, white matter, particularly deep white matter, is more susceptible to ischaemia than grey matter.Brain lesions observed in cerebrovascular disorders are more relevant to white matter, such as white matter hyperintensity (WMH), lacunes, microbleeds and enlarged perivascular spaces 13 .Therefore, to investigate the cerebrovascular burden on the white matter, tissuespecific brain ages should be established.To our knowledge, most brain age studies thus far have generated a single brain age using T1-weighted imaging scans.Tissue-specific brain ages, especially white matter brain age, have not been fully investigated.A recent study 14 proposed a grey matter brain age, which was suggested as a biomarker of the risk of dementia.Benson Mwangi et al. 15 developed a diffusion weighted imaging (DWI) derived brain age, but the main aim of the study was to prove the concept and demonstrate the validity of establishing brain age using DWI scans rather than the usual T1-weighted scans.Further, they did not explore specific relationships between cerebrovascular burden and this DWI-derived brain age.In a more recent study 16 , over 1000 neuroimaging phenotypes derived from multimodality MRI scans including DWI were used in a linear regression model to estimate brain age.So far, there has been few explorations of the relationship between vascular risks and its cerebrovascular consequences in white matter by using white matter brain age.
The primary objective of this study was to investigate whether vascular risk factors would be associated with an accelerated brain ageing process measured by 'white matter specific brain age' in community-dwelling, middle to older aged adults drawn from a UK Biobank cohort cross-sectionally and a longitudinal subset.DWI is a sensitive and reliable technique for monitoring white matter microstructural impairment.Using deep learning techniques, we developed a DWI-derived white matter brain age as a biomarker to characterise the microstructural changes in relation to vascular risk factors and cognition.The individual and accumulative effects of vascular risk factors and their sex stratifications on the white matter brain age gap (WMBAG) and cognition were examined.We hypothesised that the DWIderived white matter brain age would reflect the cumulative cerebrovascular burden and thereby enhance the understanding of how vascular risk factors may accelerate the biological ageing of the white matter.

Participants
Data for this study were drawn from UK Biobank, a large-scale ongoing prospective population-based cohort study 17 .A flowchart of the selection of participants can be found in Figure 1.Briefly, after visual inspection of 37327 eligible DWI scans, 98 participants with poor image quality were removed, leaving 37229 at baseline to be included in this study.
After excluding 3399 participants with severe self-reported brain related disorders (Field ID  The ethics of this study has been approved by the North West Multi-centre Research Ethics Committee (MREC) and written informed consent was obtained from all participants.

MRI acquisition and imaging processing
DWI scans were acquired from three imaging centres (Cheadle Greater Manchester, Newcastle and Reading, UK), and each centre used a 3T Siemens Skyra scanner with a standard Siemens 32-channel head coil and same parameters.Details of imaging protocols can be found in the online UK Biobank brain imaging documentation (https://biobank.ctsu.ox.ac.uk/crystal/crystal/docs/brain_mri.pdf).The original DWI data had been pre-processed with eddy currents and head motion correction by UK Biobank using the FMRIB Software Library (FSL) toolkit 18 .The diffusion-tensor-imaging fitting tool (DTIFIT) was used to generate the following DWI-derived maps in native space: FA (fractional anisotropy), MD (mean diffusivity), AxD (axial diffusivity), RD (radial diffusivity) and MO (tensor mode).The individual maps were nonlinearly warped to a 2 × 2 × 2 mm 3 MNI-152 standard space using FNIRT (FMRIB's Nonlinear Image Registration Tool) 19 .The standardspace DWI-derived maps for each individual were finally used as the input for the deep learning model.All DWI-derived maps were visually inspected before being used as input for the deep learning networks.

White matter brain age computation
A three-dimensional convolutional neural network (3D-CNN) deep learning model was used to establish white matter brain age, the structure of which is illustrated in Figure 2.

Convolutional neural network architecture for each DWI map
A 3D-CNN architecture was applied to estimate white matter brain age using DWI maps.The architecture follows the Simple Fully Convolutional Network (SFCN) proposed by Peng Han et al. 20 due to its simplicity, which is based on VGGNet 21 with fully convolutional structures 22 .
Briefly, the network received a 91 × 109 × 91 3D image and the corresponding sex and scanner of a participant as input, and the output was the predicted age at the last layer.The network consisted of eight blocks, as shown in Figure 2, and each of the first five blocks contained a 3D convolutional layer with kernel size 3 × 3 × 3, a 3D batch normalisation layer, a 3D max-pooling layer, and a ReLU 23 activation layer.The sixth block had a 1 × 1 × 1 3D convolutional layer, a batch normalisation layer, and a ReLU activation layer.The seventh block contained a dropout layer (activated only during the training process by randomly dropping 50% of the elements) and a fully connected layer.The spatial dimension was reduced to 2 × 3 × 2 after the sixth block.The flatten operator was applied to resize the tensor to a vector before applying the seventh block.Instead of going through the 3D-CNN network as images, the extra information such as sex and scanner was incorporated into the feature map by concatenation before the eighth block.Finally, linear regression was employed in the eighth block for fusing the image features and the information of sex and scanner, with the output being a scalar for the predicted white matter brain age.The channel numbers used in the first six 3D convolution layers were [32, 64, 128, 256, 256, 64].
The internal process of the model can be summarised into three stages: (1) Nonlinear feature extraction: The first six blocks extracted feature maps from each input image; (2) Tensor to vector: The seventh block smoothly transformed the 3D tensor to a vector for downstream age prediction; and (3) Linear regression: The eighth block incorporated the extra sex and scanner information, and the output was the predicted age.

Network architecture for fusing all five diffusion maps
To increase the white matter brain age prediction accuracy, the five resultant DWI-derived maps (FA, MD, AxD, RD and MO) were incorporated to generate a composite metric.Five 3D CNN networks with the same structure as discussed above were applied, with each network modelling each feature map separately.In terms of feature fusion, we adopted a simple concatenation to fuse the five feature maps after the seventh block, as well as the covariates (i.e., sex and scanner).The two covariates were applied to all five feature maps.
The resultant feature map was a vector with (100*5 +2) entries, namely 100 entries for each feature map and 2 entries for two covariates.Similarly, linear regression was employed in the eighth block for mapping the fused features into the final predicted age.
To reduce the computational cost, instead of retraining five 3D CNN networks simultaneously from scratch, we reused the intermediate feature maps learned during the analysis of each of the five DWI-derived maps and only trained the eighth block accordingly.

Bias correction
The predicted ages normally suffer from the issue of underfitting due to regression dilution and non-Gaussian age distribution, which means older participants will be estimated with a younger brain age while younger participants will be estimated with an older brain age.As reported by Smith et al. 24 , bias correction is an essential postprocessing technique in most brain-age prediction studies.Using the techniques proposed by Smith et al 24 , we applied the linear bias correction method to the predicated age before the subsequent investigations with clinical measurements, where ! and !" denote the chronological age and predicted age, respectively.We can fit a linear regression !" = $! + & on the left-out validation set with known chronological age.Applying the learned coefficients ($, &), the corrected predicted age !" !" for test set can be estimated by where we assume the coefficients ($, &) can be generalised to the test set.

Model performance
Model performance was evaluated by two predominant measures in this study.Mean absolute error (MAE) was defined as Pearson's correlation coefficient (Pearson's r) was applied to characterise the correlation between chronological age and predicted age.

Vascular risk score (VRS) and Apolipoprotein E (APOE) ε4 carrier status
While a variety of vascular risk factors have been reported in previous decades, we incorporated five essential vascular risk factors, i.e.: (1) hypertension; (2) diabetes; (3)   hypercholesterolemia; (4) obesity; and (5) smoking.An Omron HEM-7015IT device was used to automatically evaluate the seated blood pressure twice; the mean blood pressure for each individual was computed by averaging these two measurements.Participants with hypertension were defined with blood pressure over 140/90 mmHg or using lowering blood pressure medication (Field IDs 6177 and 6153).Participants with diabetes were defined according to the doctor's diagnosis (Field ID 2443) and anti-diabetes medication (Field IDs 6177 and 6153).Hypercholesterolemia was identified according to the medication information (Field IDs 6177 and 6153).Obesity was defined as body-mass-index (BMI) ≥ 30, which was constructed from ratio of height and weight measured during the initial assessment (Field ID 21001).Smoking was defined as current or previous smoking history (Field ID 20116).Each vascular risk factor was binarized with 1 indicating presence of that factor and 0 otherwise.A composite vascular risk factor score (VRS) was generated to evaluate the overall cerebrovascular burden by calculating the total numbers of vascular risk factors using a method applied similarly in other studies 25 .Given that there were a very small number of subjects who scored 4 (n = 408, 3.7%) or 5 (n = 82, 0.7%), the VRS was categorised into 0, 1, 2 and ≥ 3.
DNA from a blood sample of first recruited participants (approximately 50,000) were genotyped in the UK biobank using Affymetrix UK BiLEVE Axiom array; the rest were genotyped with Affymetrix UK Biobank Axiom array 26 .Two APOE coding single nucleotide polymorphisms (SNPs) rs7412 and rs429358 downloaded from the genotyped data were used to determine the APOE genotype 27 .APOE was considered as the major genetic AD risk factor, however, it was also reported to be associated with cerebrovascular lesions.As a result, APOE ε4 carrier status was also included in this study as a covariate and was classified into three categories based on the number of ε4 alleles, i.e., non-carriers; carriers with one ε4 allele; carriers with two ε4 alleles.

Cognitive tests
Seven Global cognition was computed by averaging the scores across all three cognitive domains and again standardising against the healthy subsample.Further details of the standardisation procedure can be found in our previous work using the UK Biobank data 28 .

Statistical analysis
Statistical analyses were conducted using SPSS (IBM corporation, USA) version 26.0 and R version 3.6.The association between WMBAG and cognition at baseline was first examined.Then mediation analysis with WMBAG as a mediator and cognition as outcome, was carried out among baseline participants with VRS and individual risk factors as predictors.Baseline chronological age, sex, scanner, APOE and education were controlled.Bonferroni correction was applied for these analyses with four cognitive outcomes (corrected alpha level = 0.05/4 = 0.0125).Mediation analysis was performed using the 'mediation' package 29

Data and code availability statement
The UK Biobank data can be accessed by online application (https://www.ukbiobank.ac.uk/).
Codes for the 3D-CNN deep learning model in this study can be shared from the authors upon request.

Sample characteristics including demographics and vascular risk factors of test data are
shown in Table 1.Cross-sectional test data included 7769 healthy participants and 3399 unhealthy participants.Among them, 1409 participants with both the baseline and follow-up scans were used for longitudinal analysis.

White matter brain age prediction
The white matter brain age predictions before and after bias correction for the whole test set are shown in Figure 3

Associations between risk factors and WMBAG
In model 1a, after controlling for chronological age, sex, scanner and APOE status, participants with one, two, and three or more vascular risk factors had an increased WMBAG of 0.54, 1.23, and 1.94 years older, respectively, than those without vascular risk factors (Table 4; also see Figure 4A).In model 1b, significant interaction between sex and VRS on its association with WMBAG was found (p = 0.015; Table 4).Among participants with three or more vascular risk factors, the WMBAG of males was significantly larger than that of females (mean difference = 0.617 years, p = 0.001).No significant difference of WMBAG between males and females was found for those with 0, 1 or 2 risk factors (Figure 4B).
Apart from the composite VRS, we also observed significant unique contributions of each individual risk factor (except obesity) to the WMBAG when controlling for other risk factors and covariates, in model 2a (Table 5).Having diabetes was more strongly associated with WMBAG (1.39 years, p = 0.002), relative to other risk factors.Interestingly, in models 2b-f, we found that all interaction terms were not significant, except for the interaction between obesity and sex (Table 5).We found that female participants with obesity did not have larger WMBAG, while males with obesity had significant larger WMBAG than those without obesity (Figure 4C).Different APOE ε4 carrier status was not associated with the WMBAG in any of the models (all p values > 0.05).

Association between cognition and WMBAG
After controlling for age, sex, scanner, APOE and education, WMBAG was found to be

Longitudinal analysis
The demographics of 1409 participants with baseline and follow-up scans are shown in Table 1.Estimated white matter brain age and WMBAG for both timepoints are presented in Table 2. Generally, participants underwent an average 2.25 ± 0.12 years of follow up (ranging from 2.01 to 2.67 years).One thousand three hundred and fourteen (93.26%) participants had increased white matter brain age with an average of 2.57 ± 1.48 years (dependent t-test, p < 0.001) between baseline and follow-up scans (Figure 5).Second, we found that, cross-sectionally, cerebrovascular risk factors, both individually and collectively, were significantly associated with WMBAG, such that participants free of vascular risk factor had a younger brain (WMBAG = -0.56);higher WMBAG was associated with poorer cognitive performance, especially processing speed and executive function.
Third, we demonstrated that WMBAG played a mediation role between vascular risk factors, namely, hypertension and diabetes, and declined cognition, especially with a slower processing speed and worse executive function.
Our trained 3D-CNN model showed a better MAE in age prediction compared with many previous brain age studies 30 .Technically, this model was trained in a large population sample of healthy community-dwelling participants drawn from the UK Biobank, which enabled strong power for model estimation.The combined information extracted from all five DWI maps also improved the model accuracy; as shown in Table 3, the information from the fusion of five DWI maps resulted in a lower MAE and higher Pearson's coefficient.FA, MD, AxD, RD and MO maps are widely recognized DWI feature maps and have been shown to be significantly correlated with vascular risk factors 31,32  Interestingly, the WMBAG computed in this study correlated with the cerebrovascular burden but not with the neurodegenerative risk factor, APOE genotype.Neurodegenerative and vascular risk factors are both associated with brain ageing 34,35 .In our study, except for obesity, all vascular risk factors were significantly correlated with WMBAG with diabetes and hypertension having the highest correlations.Our results suggested that the diabetic participants on average had a WMBAG of 1.39-years older than that of the non-diabetic participants; similarly, the brain of a hypertensive participant would have a WMBAG of 0.871 years older than those without hypertension.Collectively, the accumulation of vascular risk factors led to a larger WMBAG, suggesting that WMBAG is a sensitive biomarker for monitoring the vascular burden.This finding was consistent with previous observations that hypertension and diabetes are the most widely recognized vascular risk factors that are highly associated with morbidity and mortality of cerebrovascular diseases (CVD) 16,[36][37][38] .One .
previous study also reported that DWI based brain age gap was correlated with blood pressure and smoking 39 .However, no significant association was found between the APOE ε4 allele(s) status and WMBAG.APOE ε4 allele(s) has been recognized as the strongest genetic risk factor for sporadic Alzheimer's Disease 40 .APOE genotypes with one or two ε4 allele(s) lead to a three to 10 fold risk for AD, respectively 41 .While some studies also reported that AOPE was correlated with subcortical lesions such as WMH 42 and microbleeds 43 , the findings were not always consistent.Most previous brain age studies aimed at capturing the overall changes for the whole brain, therefore unable to differentiate cerebrovascular burden from neurodegenerative burden.Although increasing evidence has suggested cerebrovascular disease and neurodegenerative disease share multiple risk factors and have overlapping neuropathologies 44,45 , there is a general difference in their MRI manifestations.AD patients usually start grey matter atrophy at premorbid stage and the atrophy progresses with the advance of AD, while those with cerebrovascular disease usually suffer more from the subcortical lesions such as WMH, lacunes, microbleeds and enlarged perivascular spaces 13 .Our study was based on this hypothesis, and we used DWI for white matter brain age computation, given its sensitivity to the microstructural integrity and pathology of subcortical white matter.
Sex dimorphism was observed when we considered the interactive effect between sex and vascular risk factors on the prediction of the WMBAG.Males showed higher WMBAG than females when they had three or more vascular risk factors.Obesity was the only vascular risk factor that showed interactive effect with sex on the WMBAG.Only males with obesity had a significantly greater WMBAG, suggesting that obesity was detrimental to brain ageing in men but not in women.This finding was in line with the finding of our previous study which investigated sex difference in WMH 46 ; males with higher BMI showed significantly greater deep WMH volume compared to women.Although the potential mechanisms underlying this sex difference have not been fully understood, other studies have also reported this interesting dimorphism, suggesting that females might be more resilient to the detrimental effect of obesity on the brain than males 47 .A cross-sectional study conducted in the UK Biobank cohort also found that obese men had a steeper grey matter volume decline than women 48 .One hypothesis posits that the distribution of adipose tissues in males and females is different -males tend to accrue more visceral fat, which heightens the vascular burden; conversely women usually accrue more fat in the subcutaneous depot, which is an independent predictor of lower cardiovascular and diabetes-related mortality 49 .
Significant associations between WMBAG processing speed, executive function and global cognition after Bonferroni correction were observed cross-sectionally.Processing speed and executive function were considered to be the most vulnerable cognitive domains in CVD 50 .In comparison with AD patients, patients with CVD usually show less pronounced memory deficits 51 , although the memory dysfunction may also appear progressively during the later course of the disease.Consistent with the clinical differentiations between AD and CVD, we did not find a significant association between the WMBAG and memory loss after Bonferroni correction, which further demonstrated both specificity and reliability of our white matter brain age model in relation to the cerebrovascular disease burden, and that our model may have clinical utility.
Using mediation analyses in baseline participants, we found that among the five cerebrovascular risk factors, only hypertension and diabetes were associated with processing speed, executive function, and global cognition through the mediation of WMBAG.These findings validated the underlying pathway that the vascular risk factors would contribute to .the pathological changes in the white matter and then lead to the cognitive dysfunction.
However, obesity was the only vascular risk factor in our study that contributed directly to the cognitive decline but not through the WMBAG.Although some studies have observed a mediation effect of white matter changes on the relationship between obesity and cognition in healthy adults 52 , our result suggested otherwise.These modifiable vascular risk factors were also reported to be linked to late-life progression of brain disorders in longitudinal studies 53,54 , but few studies have conducted the mediation analysis in a longitudinal cohort with a relatively large sample size.However, our longitudinal analysis did not yield a significant mediation effect of WMBAG change between any vascular risk factor and cognitive decline.
This may be partly due to the short period of time between baseline and follow-up (i.e., about 2 years), where significant changes in WMBAG might be too subtle to be detected.
Moreover, many participants had better cognition at follow-up than baseline due perhaps to practice effects (data not shown).
We believe that future work should be carried out to further investigate the relationship between vascular risk factors and white matter brain age.Our stratification for the level of risk factors was based on the number of vascular risk factors, regardless of the type of vascular risk factors a participant had or the specific contribution of each risk factor.For example, a participant with diabetes only would be grouped with anyone with just one of the vascular risk factors we investigated regardless of the type, i.e., any of one of hypertension, diabetes, hypercholesterolemia, obesity or smoking.In this study, we 'binarized' our participants into 'presence' or 'absence' of a vascular risk factor.Comparisons were therefore limited to 'yes' or 'no' as to whether the participant had that particular vascular risk factor or not, with a lack of more nuanced investigations of the disease stage or disease severity dependent effects of clinical measurements on these risk factors.Additionally, although we have a longitudinal subset with a large sample size from UK Biobank, the follow-up time might be too short to uncover significant brain structural and cognitive changes.Some cerebrovascular and neurodegenerative pathologies may coexist in the brain ageing process, and it is difficult to differentiate the effect of these pathologies on white matter and grey matter distinctively.

ACKNOWLEDGEMENTS
This research was conducted under the Application ID 45262 and 37103; This research was undertaken with the assistance of resources and services from the National Computational Infrastructure (NCI), which is supported by the Australian Government.Ivor W Tsang and Yuangang Pan were supported by the A*STAR Centre for Frontier AI Research.We also thank Angie Russell for her assistance in the preparation of the manuscript.

DECLARATION OF COMPETING INTEREST
The authors declare no conflict of interest.

STUDY FUNDINGS
This study was supported by the National Health and Medical Research Council (NHMRC) of Australia Program Grants ID350833, ID568969, ID1093083 and ID630593; Jiyang Jiang was supported by John Holden Family Foundation.    .

.
training, 60% (n = 19526) of the remaining participants were randomly selected to the training set.Twenty percent (n = 6515) were used as the validation set which provided an unbiased evaluation of a model fit on the training dataset while selecting the model's structures (e.g., the type of loss function and optimiser for training a neural network).The remaining 20% were combined with the unhealthy participants as identified above (n = 11168) for use in the test set.In this test sample, 1409 participants had both baseline and follow-up scans that were used for longitudinal analysis.
neuropsychological tests were included in this study: Reaction Time, Trail Making Test A and Symbol Digit Substitution for assessing processing speed; Numeric Memory and Pairs Matching for assessing memory; Trail Making Test B and Fluid Intelligence for assessing executive function.Raw scores for these tests were standardised by transforming the raw cognitive scores into z-scores using the healthy reference subsample of UK Biobank at baseline.Specific cognitive domain scores were computed by averaging the corresponding cognitive test scores within a domain and then standardising against the healthy subsample.
in R. Direct and indirect effects were estimated via bootstrapping with 5,000 samples.For longitudinal analysis, a dependent t-test was conducted to examine change in WMBAG between baseline and follow-up.To explore the prospective effects in the longitudinal subset, we first conducted multiple linear regression to examine the relationships between baseline vascular risk factors and change in WMBAG (calculated as the difference between follow-up and baseline scores), and that between WMBAG change and cognition change.Mediation analysis was then conducted to examine the direct and indirect effects of vascular risk factors on change in cognition, through change in WMBAG.
VRS was not associated with the WMBAG change, no individual vascular risk factors contributed significantly to the WMBAG change expect for obesity (Supplementary Table e-3).While the presence of obesity contributed to decrease in WMBAG (unstandardised b = -0.371,p = 0.008), the direction of which was not within our expectation.No significant associations between WMBAG change and cognition change were observed (Supplementary

Figure 1
Figure 1 Flowchart of participant selection.

Figure 2
Figure 2 Overview of study design.The left panel shows the 3D convolution neural network (3D-CNN) architecture; the right panel shows the clinical analyses between risk factors and WMBAG and cognition.Inputs of the model are pre-processed 3D maps, Abbreviations: convolution; Batchnorm = batch normalization; ReLU = rectified linear unit; WM = white matter; WMBAG = white matter brain age gap; FA = fractional anisotropy; MD = mean diffusivity; AxD = axial diffusivity; RD = radial diffusivity; MO = anisotropy mode.

Figure 3
Figure 3 Bias correction.Association between chronological age and uncorrected WMBAG (A); Association between chronological age and bias-corrected WMBAG (B).All cross-sectional test participants were included in this bias correction analysis (n= 11168).This bias correction was conducted using the predicted age after fusion of five DWI-derived maps.MAE and the correlation coefficient (r) were listed in the upper left corner of each sub-plot.Colour bar indicates the sample density.WMBAG = White matter brain age -Chronological age.Abbreviations: WMBAG = white matter brain age gap; Spearman r = coefficient for Spearman correlation; MAE = mean absolute error; WM = white matter; DWI = diffusion weighted maps.

Figure 4
Figure 4 Difference of WMBAG across different VRS groups (A for all participants; B for males and females separately) and WMBAG difference for different obesity status by sex (C).Each dot indicated the mean value for the WMBAG, error bar indicated the 95% CI.Abbreviations: WMBAG = white matter brain age gap; VRS = vascular risk score; WM = white matter.CI = confidence interval.

Figure 5
Figure 5 Longitudinal change of white matter brain age for each participant.Each participant has two time-point white matter brain ages (shown as two dots), connected with either a short blue line indicating white matter brain age increase (93.26%), or a red line indicating white matter brain age decrease.The bold blue line was a mean line fitted using white matter brain ages from all participants at both two timepoints.

Table e
-1) to ensure a relatively healthy sample for deep learning Eight regression models were used in this analysis for analysing the vascular risk factors -see their mathematical expressions in Supplementary methods.For model 1a, we investigated the main effect of VRS on the white matter brain age gap.For model 1b, interaction terms (sex times different VRS levels) were added to the model to examine if the effects of VRS differ by sex.In model 2a, to determine the specific contribution of each risk factor, the VRS was replaced with all five vascular risk factors.In model 2b-f, we investigated the moderation effect of sex on the relationship between each vascular risk factor and WMBAG.Chronological age, sex, scanner and APOE status were controlled for all models.
1. Two-tailed p ˂ 0.05 was considered statistically significant.The difference of WMBAG between healthy and unhealthy participants in the baseline test set was compared using Analysis of Covariance (ANCOVA) adjusting for chronological age, sex, scanner and APOE status.Comparison of baseline age, sex, education and risk factors between test participants who had only one time-point brain scan and 1409 participants who had repeat (follow-up) scans was conducted using independent-t test and c 2 analysis, while comparison of white matter brain age measures and cognition was performed by ANCOVA controlling for baseline chronological age, sex, and scanner and APOE.Multiple linear regression models were conducted to investigate the associations between vascular risk factors and WMBAG at baseline.VRS was dummy coded with the 0 category .as reference in all analyses.
(A and B).Spearman correlation coefficient between WMBAG and chronological age for the whole cross-sectional test participants was reduced from -0.54 before bias correction to 0.04 after bias correction, with a slight increase of MAE from 2.57 to 2.84.Pearson's r between white matter brain age and chronological age is 0.902.Cross-sectional white matter brain age which was computed using our 3D-CNN model and WMBAG are summarised in Table2.Interestingly, participants with none of the vascular risk factors had a negative mean WMBAG of 0.56, which suggested that they had a brain 0.56 years younger on average than their chronological age.Performances for all DWI maps and fusion results are listed in Table3.The MAE for the fused white matter brain age was smaller than that of any other single DWI derived map.MAE measured on the healthy test data was 2.75 years (Table3and Supplementary Figuree-1A) with Pearson's r between chronological and predicted brain age of 0.908 (p < 0.001).For unhealthy test data (Table3and Supplementary Figure e-1B), the MAE was 3.03 with Pearson's r = 0.892 (p < 0.001).Due to the best performance of the fusion of all five DWI maps, we conducted the subsequent clinical analysis using the fused predicted age.The mean WMBAG for unhealthy test participants was 0.51 ± 0.08 years older than healthy test participants (p < 0.001, 95% CI = 0.348 -0.668).Comparison of baseline characteristics between 9759 baseline participants with single time-point scans and 1409 participants who had repeat scans were listed in Supplementary Tablee2; at baseline, the WMBAG of the participants who had follow-up brain scans were 0.36 ± 0.11 years lower than that of the rest baseline participants.The cognitive scores such as processing speed, executive function, memory, and global cognition of the participants with follow-up scans were significantly higher than those of the rest of the test sample participants at baseline (see supplementary Tablee2).
Table e-4).This study had three main findings.First, we successfully developed a deep learning model using convolutional neural network (CNN) for estimating brain age based on cerebral white matter only.Our model produced robust and accurate white matter brain age prediction in healthy test subjects with MAE of 2.75 and Pearson's r of 0.908 with chronological age. .
. Each of them taps into distinct physiological properties of the white matter microstructure, from which the deep learning model extracted essential information for white matter brain age prediction.We conducted the bias correction for the original white matter brain age to remove the negative bias towards the chronological age.Taking all these technical steps into consideration, our deep learning model for white matter age predication was well established.The mean WMBAG for unhealthy subjects was shown to be 0.51 ± 0.08 years greater than that of those healthy subjects.The 1409 participants who had repeat brain MRI scans did not only have better cognition, but also had a 0.36 ± 0.11 years younger brain age than the participants who had only a single time-point scan (p < 0.001).This might suggest the potential recruiting bias, as it was reported that the response rate to the repeat MRI scan was 65% 33 .Our method's ability to detect the subtle brain age difference between the two groups of participants from the same cohort with only sample recruitment difference shows that our brain age estimation was accurate and reliable.Moreover, we also validated the model's performance in the subsample with two time-point scans.After approximately an average of 2.25 years of follow up, 93.26% of the 1409 participants had an increased white matter brain age compared with baseline, which further demonstrated that our deep learning model was robust and able to extract useful information from the DWI maps for white matter brain age prediction.

Table 1 Characteristics of test samples.
Instance 2/3 means the first or second imaging assessment.Due to some missing values of the risk factors, the valid percentage of the risk factors were calculated for the remaining participants.The numbers for cross-sectional analysis are: college, n = 11071; hypertension, n = 11151; diabetes, n = 11103; hypercholesterolemia, n = 11060; obesity, n = 10880; smoking, n = 11073; VRS, n = 10749; APOE status, n = 9328.Abbreviations: SD = standard deviation; VRS = vascular risk score; APOE = Apolipoprotein E.

Table 3 Model performance for different DWI maps.
This table shows the model performance in healthy and unhealthy test data before and after bias correction.Abbreviations: DWI = diffusion weighted imaging; FA = fractional anisotropy; MD = mean diffusivity; AxD = axial diffusivity; RD = radial diffusivity; MO = anisotropy mode; MAE = mean absolute error; fusion = fusion of all five diffusion weighted maps; Pearson's r = Pearson's correlation coefficient.

Table 4 Association between VRS and WMBAG.
Main effect of VRS on WMBAG was analysed by recoding VRS into dummy variables as independent variables.Interaction effects were analysed by adding their corresponding interaction terms to the model.Abbreviations: WMBAG = white matter brain age gap; VRS = vascular risk score; APOE = Apolipoprotein E; CI = confidence interval, VRS_1 is the dummy variable indicating participants with only 1 vascular risk factor; VRS_2 indicates participants with 2 vascular risk factors; VRS_2 indicates participants with 3 or more vascular risk factors.

Table 5 Associations between different vascular risk factors and WMBAG.
Independent main effects of vascular risk factors on WMBAG were analysed by adding all vascular risk factors into the regression model.Interaction effects were analysed by adding each vascular risk factor and its corresponding interaction term to the model.Abbreviations: WMBAG = white matter brain age gap; APOE = Apolipoprotein E; CI = confidence interval.