Study Data and Inclusion and Diagnostic Criteria
The data used in this study were derived from two large multicentre cohorts, the AddNeuroMed and ADNI cohorts. The AddNeuroMed study is an integrated project funded by the European Union Sixth Framework Program and aims to establish and validate novel biomarkers of disease and treatment based upon in vitro and in vivo human and animal models of AD. Data were collected from six participating sites across Europe: University of Kuopio, Finland; University of Perugia, Italy; Aristotle University of Thessaloniki, Greece; King’s College London, United Kingdom; University of Lodz, Poland; and University of Toulouse, France (Lovestone et al. 2009; Simmons et al. 2009, 2011).
Data from the ADNI study were downloaded from the ADNI at the LONI website (www.loni.ucla.edu/ADNI, PI Michael M. Weiner). The initiative was launched in 2003 by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, the Food and Drug Administration, private pharmaceutical companies and non-profit organisations, as a 5-year public–private partnership. The primary goal of ADNI has been to test whether MRI, positron emission tomography (PET), and other biological markers are useful in clinical trials of MCI and early AD. Subjects aged 55–90 from over 50 sites across the U.S. and Canada participated in the research, and imaging, clinical, and biological samples were collected at multiple time points (Jack et al. 2008). A detailed description of the inclusion criteria for the study can be found on its webpage (http://www.adni-info.org/scientists/aboutADNI.aspx#).
A total of 1,069 subjects were included in this study (AD = 291, MCI = 447, HC = 331). The demographics of the cohorts are given in Table 1. Of the 447 MCI subjects in our whole cohort, 90 converted to an AD diagnosis (MCI converters) at 12 months.
Table 1 Demographic, clinical and neuropsychological data in AD, MCI converters, stable MCI, and control subjects
For the AddNeuroMed cohort, subjects were patients who attended local memory clinics and received a diagnosis of MCI, while HC subjects were recruited from non-related members of the patients’ families, caregiver relatives, and social centres for the elderly or General Practitioner (GP) surgeries. Informed consent was obtained for all subjects and the study was approved by the ethical review boards of each participating country. The general inclusion and exclusion criteria were as follows.
AD
(1) diagnosis established by National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA) and Diagnostic and Statistical Manual of Mental Disorders IV (DSM IV) criteria, (2) MMSE score ranged from 12 to 28. Subjects were excluded from the study if any psychiatric or neurological illness other than AD was present, and if subjects presented with a systemic illness or signs of organ failure.
MCI
(1) subjects had MMSE scores between 24 and 30, (2) subjective memory complaint with preserved activities of daily living, (3) Clinical Dementia Rating (CDR) score of 0.5, (4) Geriatric Depression Scale score less than or equal to 5, (5) absence of dementia in accordance with NINCDS-ADRDA criteria. A 12-month follow-up was used to determine whether MCI subjects converted to AD (MCI converters) or remained clinically stable (stable MCI).
HC
(1) MMSE scores between 24 and 30, (2) CDR score of 0, (3) no presence of neurological or psychiatric illness, and non-demented.
MMSE, CDR, and the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD) cognitive battery were assessed for each subject. The CERAD cognitive battery was replaced with the Alzheimer’s disease assessment scale for AD subjects in AddNeuroMed. The CERAD battery employs the same 10-word recall task as the Alzheimer’s disease assessment scale, but the scoring is inverted. Therefore, the mean number of words not recalled in the CERAD word list task was calculated in order to obtain comparable measures of memory for all diagnostic groups. This revised cognitive parameter was named ADAS-1, corresponding to the first subtest of the Alzheimer’s disease assessment scale.
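The score inversion described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study: the function name and the per-trial input format are hypothetical, and the number of recall trials averaged over is left to the caller because the text does not state it.

```python
from statistics import mean

WORD_LIST_LENGTH = 10  # the text describes a 10-word recall task


def adas1_from_cerad(recalled_per_trial):
    """Return the mean number of words NOT recalled across the given trials.

    `recalled_per_trial` is a hypothetical input: one count of correctly
    recalled words (0 to 10) per trial.
    """
    if not recalled_per_trial:
        raise ValueError("at least one trial is required")
    for n in recalled_per_trial:
        if not 0 <= n <= WORD_LIST_LENGTH:
            raise ValueError(f"recall count out of range: {n}")
    # Invert each trial (words missed = list length - words recalled),
    # then average, so that higher values mean worse memory.
    return mean(WORD_LIST_LENGTH - n for n in recalled_per_trial)
```

For example, a subject recalling 4, 6, and 8 words on three trials misses 6, 4, and 2 words, giving a score of 4.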
MRI Acquisition
Standardized MRI data acquisition techniques were in place for AddNeuroMed and ADNI to ensure homogeneity across data acquisition sites. A detailed description of the ADNI data acquisition protocol can be found at www.loni.ucla.edu/ADNI/research/Cores/index.shtml. The imaging protocol included 1.5T high-resolution T1-weighted sagittal 3D MP-RAGE volumes (voxel size 1.1 × 1.1 × 1.2 mm³) and axial proton density with T2-weighted fast spin echo images. A comprehensive quality control procedure was carried out on all MR images according to the AddNeuroMed quality control framework (Simmons et al. 2009, 2011).
Hippocampal Subfield Segmentation
Image analysis was carried out using the FreeSurfer image analysis pipeline (version 5.1.0). These procedures have been described in detail in previous publications (Dale et al. 1999; Fischl et al. 2002; Ségonne et al. 2004; Fischl et al. 2004). Initially, volumetric segmentation involved the removal of non-brain tissue using a hybrid watershed/surface deformation procedure (Ségonne et al. 2004), automated Talairach transformation, and segmentation of the subcortical white matter and deep grey matter volumetric structures (Fischl et al. 2004).
Automated segmentation of the hippocampus was performed to define anatomical subfield labels using a Bayesian modelling approach and a computational model of the areas surrounding the hippocampus. An atlas mesh had previously been built and validated from manual delineations in ultra-high resolution MRI scans of 10 individuals (Van Leemput et al. 2009). These delineations include the fimbria, presubiculum, subiculum, CA1, CA2/3, and CA4-DG subfields as well as the hippocampal fissure. Figure 1 illustrates the delineations made to define the different subfields of the hippocampus. For more details about this technique and the borders used to define the different subfields, see Van Leemput et al. (2009).
All subfield measures were normalised by the subject’s intracranial volume (ICV) derived from FreeSurfer using the following formula: volume_norm = volume_raw × 1,000 / ICV, with ICV in cm³ (Westman et al. 2013). This automated segmentation approach has been recently applied to a small group of MCI subjects (Hanseeuw et al. 2011).
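As a minimal sketch of this normalisation (in Python, for illustration only; the function and argument names are hypothetical, and the unit of the raw subfield volume is not stated in the text, so it is passed through unchanged):

```python
def normalise_volume(volume_raw, icv_cm3):
    """Scale a raw subfield volume by intracranial volume (ICV in cm^3).

    Implements volume_norm = volume_raw * 1000 / ICV as given in the text.
    """
    if icv_cm3 <= 0:
        raise ValueError("ICV must be positive")
    return volume_raw * 1000.0 / icv_cm3
```

With a raw volume of 300 and an ICV of 1,500 cm³, the normalised value is 200. Dividing by ICV makes volumes comparable across subjects with different head sizes.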
Statistical Analysis
Statistical analysis was conducted using PASW Statistics (Version 17.0; SPSS Inc., USA). Categorical variables were inspected using the chi-square test, while continuous variables were tested using ANOVA with Bonferroni post hoc comparisons. Hippocampal subfield volumes were first analysed using MANCOVA with Bonferroni correction by adopting a general linear model procedure, adjusting for age, gender, education, and APOE ε4 genotype as covariates. Bonferroni pairwise comparisons were performed to inspect subfield volume differences between the groups.
Multiple regression analyses were conducted in R version 2.15.2 using the lm function from the R stats package and Bonferroni correction for multiple comparisons. Patterns of subfield volume loss were tested in relation to the effects of age, gender, education, APOE ε4 genotype, and neuropsychological test scores from MMSE and ADAS-1. In this step, all subfield measures were tested as dependent variables by disease group (AD, MCI converters, stable MCI, and HC) as a whole. Age, gender, years of education, APOE ε4 genotype, and neuropsychological scores from MMSE and ADAS-1 tests were treated as independent variables for identifying subfield-specific effects. Ten-fold cross-validation was performed by fitting linear regression models to the data with one tenth of the data held out in each fold, and then using the fitted model to predict the held-out data.
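The cross-validation scheme can be illustrated with the sketch below. It is written in Python rather than R and is not the study's code. As simplifying assumptions, it uses a single predictor instead of the full covariate set, assigns folds by a seeded random shuffle, and uses hypothetical function names.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def k_fold_predictions(xs, ys, k=10, seed=0):
    """Return one out-of-fold prediction per observation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)          # random fold assignment
    folds = [idx[i::k] for i in range(k)]     # k roughly equal folds
    preds = [None] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the other k-1 folds only.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Predict the observations that were excluded from the fit.
        for i in held_out:
            preds[i] = a + b * xs[i]
    return preds
```

Because every prediction comes from a model that never saw the corresponding observation, comparing `preds` with `ys` gives an estimate of out-of-sample error.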
Hippocampal subfields were subsequently analysed using Orthogonal Partial Least Squares (OPLS) (Wiklund et al. 2008; Trygg and Wold 2002), a supervised multivariate data analysis method included in the software package SIMCA (Umetrics AB, Umeå, Sweden). All 14 variables (left and right subfields) were used for OPLS analysis. Classification models were created for distinguishing between AD and HC subjects at baseline. The AD versus HC models were subsequently treated as classifiers to investigate how well the hippocampal subfields could predict future MCI conversion to AD at 12 months follow-up. Seven-fold cross-validation was used for all models. Using this approach we created 4 OPLS models: 2 for the total hippocampus and 2 for the combination of subfield volumes. The first model for each region comprised the AddNeuroMed cohort and the second model comprised the ADNI cohort. To further validate the models, the AddNeuroMed cohort was used as the training set and the ADNI cohort as a test set (and vice versa) to see how well the models could predict new and unseen data. The combined ADNI and AddNeuroMed cohort from the AD versus HC comparison was used as a classifier to investigate the reliability of predicting MCI conversion to AD at 12 months. This OPLS classification approach has been extensively validated (Bylesjo et al. 2006; Wiklund et al. 2008; Westman et al. 2011c) and applied to several biomarker discovery studies in AD (Mangialasche et al. 2010; Westman et al. 2011a, 2012; Spulber et al. 2013).
Sensitivity and specificity were calculated from the cross-validated prediction values of the OPLS models. The positive and negative likelihood ratios (LR+ = sensitivity/(100−specificity) and LR− = (100−sensitivity)/specificity, with sensitivity and specificity expressed as percentages) were determined. A positive likelihood ratio between 5 and 10 or a negative likelihood ratio between 0.1 and 0.2 increases the diagnostic value in a moderate way, while a value above 10 or below 0.1 significantly increases the diagnostic value of the test.
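A short worked sketch of the two ratios, in Python for illustration (the function name is hypothetical; inputs are percentages, as in the formulas above):

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given in percent."""
    if not (0 <= sensitivity <= 100 and 0 <= specificity <= 100):
        raise ValueError("sensitivity and specificity must be in [0, 100]")
    # A zero denominator is reported as infinity here (an illustrative choice).
    lr_pos = (sensitivity / (100 - specificity)
              if specificity < 100 else float("inf"))
    lr_neg = ((100 - sensitivity) / specificity
              if specificity > 0 else float("inf"))
    return lr_pos, lr_neg
```

For a hypothetical test with 90% sensitivity and 80% specificity, LR+ = 90/20 = 4.5 and LR− = 10/80 = 0.125. Under the thresholds given above, the negative ratio falls in the moderate range and the positive ratio falls just below it.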
Receiver operating characteristic (ROC) curves were calculated for the individual subfield volume models using the ROCR library (version 2.1) in R. ROC curves provide a graphical means to interpret the quality of separation and are created by plotting the true positive rate (sensitivity) versus the false positive rate (1−specificity) for various thresholds. The discriminant value of the corresponding ROC curve can be obtained by calculating the area under the curve (AUC). AUC values range from 0.5 (random discrimination, no better than chance) to 1.0 (perfect discrimination). The pROC package (version 1.5.4) (Robin et al. 2011) in R was used to perform statistical comparisons of AUC between the combined subfield and total hippocampal volume models in the AD vs. HC and MCI converter vs. MCI non-converter models.
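To make the ROC construction concrete, the sketch below sweeps a threshold over classifier scores and integrates the resulting curve with the trapezoidal rule. It is a plain Python illustration with hypothetical function names; it does not reproduce the ROCR or pROC implementations and assumes binary labels coded 1 (positive) and 0 (negative), with at least one of each.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    Each distinct score is used once as the decision threshold, from the
    strictest to the most lenient; a case is called positive if its
    score is at or above the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoidal area under a curve given as ordered (fpr, tpr) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

For scores [0.9, 0.8, 0.7, 0.6, 0.55, 0.4] with labels [1, 1, 0, 1, 0, 0], eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9 (about 0.89).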