Background

Neurofibrillary tangles (NFT) and neuritic plaques (NP) are the primary neuropathologic markers of Alzheimer's disease (AD), although they are also highly prevalent in normal brain aging [14].

Many reports have described smaller differences in neuropathologic lesions between AD patients and control subjects aged 80 years and older than between younger persons with AD and controls [5, 6]. While there are dramatic differences in neuropathologic lesion counts between middle-aged AD cases and controls, the difference in lesion counts between older adult AD cases and controls, while significant, is of lesser magnitude [5].

Advanced age at death is associated with somewhat less severe dementia and fewer senile plaques and neurofibrillary tangles [6].

At present there is no consensus on whether NFT constitute a specific effect of the disease or result, in part, from a non-specific age-related process.

In fact, some investigators [7] have suggested that, since the NFT are very prevalent in the brains of non-demented older adults, the presence of NFT in the brain is not, by itself, diagnostic of AD, and that NFT should be viewed as a later occurrence in the pathological progression of the disease.

Overall, the exact role of NFT in AD, aging, and dementia remains unclear. Even universally accepted neuropathological criteria for Alzheimer's disease differ on the diagnostic role of NFT.

The current approach of determining different cut-off points for NFT and NP density and regional distribution does not allow 100% sensitivity and specificity in discriminating between AD brains and control subjects with normal cognitive function.

Recent studies further suggest that NFT have a stronger correlation to cognitive function than NP, not only in AD but also in normal aging and mild cognitive impairment [1, 3, 8]. The degree of cognitive impairment is a function of the distribution of NFT within the brain [7]. In particular, the presence of high NFT density in entorhinal and hippocampal neurons is strongly correlated with reduced cognitive performance in normal aging, whereas NFT formation in neocortical areas is associated with clinically overt AD [2–4, 9].

Neuropathologic studies [2–4, 9] have shown that the distribution of NFT in the human brain follows, in general, a predictable and hierarchical pattern, whereas the distribution of NP varies among individuals. Neurofibrillary pathology is initially limited to the hippocampus and the entorhinal cortex [3, 9]. As the number of NFT increases in these areas, neurofibrillary pathology extends into the temporal cortex. Finally, tangles emerge and spread to the neocortical areas of the brain.

In a previous study [10] we showed that Artificial Neural Network (ANN) analysis applied to demographic, clinical and genotype descriptors allowed a better prediction of the number of NFT in the neocortex and hippocampus than of the number of NP in the same areas. These results indicate that a non-linear analysis of complex data is a valid approach for highlighting the role of NP and NFT in the development of a degenerative process leading to AD. This supports the concept that the presence of NFT in aging may represent one of its earliest pathological substrates and play a significant role in the initial stages of memory impairment, confirming the findings of other authors [3, 9].

An important way to challenge this hypothesis is to evaluate the predictive role of NFT and NP in two critical brain regions, i.e. neocortex and hippocampus, in distinguishing between normal subjects and those with AD.

The aim of this study is to discover the hidden and non-linear associations among Alzheimer's disease pathognomonic brain lesions and the clinical diagnosis of Alzheimer's disease in participants in the Nun Study.

Methods

Subjects

Subjects in the study were selected from a cohort of 117 participants in the Nun Study who had donated their brains [10]. The Nun Study was approved by the University of Kentucky's Institutional Review Board. In order to select control subjects with normal cognitive function we excluded non-demented subjects with an MMSE score equal to or less than 24 and/or the concomitant presence of mild cognitive impairment of the amnesic type [11].

Thirty-six subjects matched these criteria. Six of them were ApoE4 positive (16.6%).

The selection criteria for pure AD patients were the presence of clinical dementia and values of NFT and NP in the neocortex and hippocampus above the following cut-offs:

Neurofibrillary Tangles in Neocortex: average value of neocortical NFT per mm² > 1.0;

Neurofibrillary Tangles in Hippocampus: average value of hippocampal NFT per mm² > 10;

Neuritic Plaques in Neocortex: maximum number of NP in the neocortex > 1.5;

Neuritic Plaques in Hippocampus: maximum number of NP in the hippocampus > 1.5.

These cut-offs derive from a mathematical validation of the distribution of neuropathological values observed in a previous study [10].

Twenty-six patients fulfilled these criteria and constitute the AD cases in this analysis. Nine of them were ApoE4 positive (34.6%).
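The selection rule for AD cases can be illustrated with a minimal sketch. The four thresholds are the ones listed above; the `Subject` record layout and the assumption that all four cut-offs must be exceeded together with clinical dementia are illustrative choices, not a description of the Nun Study database.

```python
from dataclasses import dataclass

# Cut-offs as listed in the text (values must be strictly above them).
NFT_NEOCORTEX_MIN = 1.0     # average neocortical NFT per mm2
NFT_HIPPOCAMPUS_MIN = 10.0  # average hippocampal NFT per mm2
NP_NEOCORTEX_MIN = 1.5      # maximum number of NP in the neocortex
NP_HIPPOCAMPUS_MIN = 1.5    # maximum number of NP in the hippocampus


@dataclass
class Subject:
    """Hypothetical record layout, used only for this illustration."""
    demented: bool
    nft_neocortex: float
    nft_hippocampus: float
    np_neocortex: float
    np_hippocampus: float


def is_pure_ad(s: Subject) -> bool:
    """Assumed rule: clinical dementia AND all four values above cut-off."""
    return (
        s.demented
        and s.nft_neocortex > NFT_NEOCORTEX_MIN
        and s.nft_hippocampus > NFT_HIPPOCAMPUS_MIN
        and s.np_neocortex > NP_NEOCORTEX_MIN
        and s.np_hippocampus > NP_HIPPOCAMPUS_MIN
    )


# Made-up example values, not study data.
example = Subject(True, 2.0, 12.0, 3.0, 2.0)
print(is_pure_ad(example))
```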

Artificial neural networks analysis

ANNs structure and architecture

ANN models were constructed using non-commercial programs developed by Semeion Research Center [12–17]. In this experiment several ANN architectures with different learning rules were assessed, all of them sharing the following structure: an input layer with a number of nodes equal to the number of independent variables, a single layer of hidden units, and an output layer with two nodes corresponding to the two different outcomes (AD cases vs normal controls).

ANNs with the Back Propagation learning rule were employed, with the inner layer consisting of four hidden units.

The results obtained with these neural networks were compared with those of a linear statistical model, Linear Discriminant Analysis (LDA; SPSS® software), using the same training and testing subsets.
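As a rough illustration of the layer sizes described above (four input nodes for the four neuropathological variables, four hidden units, two output nodes), the following sketch runs a single forward pass through such a network. The sigmoid activation, the bias terms, the random initial weights and the input values are all assumptions made for the example; this is not the Semeion software, and no training step is shown.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in: int, n_out: int, rng: random.Random) -> list:
    # One weight row per unit; the last entry of each row is the bias.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def layer_forward(weights: list, inputs: list) -> list:
    outputs = []
    for row in weights:
        z = row[-1] + sum(w * x for w, x in zip(row[:-1], inputs))
        outputs.append(sigmoid(z))
    return outputs


def forward(hidden_w: list, output_w: list, inputs: list) -> list:
    # Input layer -> four hidden units -> two output nodes.
    return layer_forward(output_w, layer_forward(hidden_w, inputs))


rng = random.Random(0)
hidden_w = init_layer(4, 4, rng)   # 4 inputs -> 4 hidden units
output_w = init_layer(4, 2, rng)   # 4 hidden units -> 2 outputs

# Hypothetical, already-scaled input vector (not study data).
scores = forward(hidden_w, output_w, [0.8, 0.6, 0.4, 0.2])
predicted = "AD" if scores[0] > scores[1] else "control"
print(scores, predicted)
```

With untrained weights the prediction is arbitrary; the sketch is only meant to make the 4-4-2 topology concrete.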

During the training phase the input relevance of each variable was assessed. The so-called "input relevance" is a parameter expressing the magnitude of the activation of a given node during the training phase. This magnitude is expressed on an arbitrary scale that ranges from zero to infinity.

In technical terms, the "Input Relevance" is the Fan-out of every input when the ANN is trained:

$$R_i = \frac{1}{K} \cdot \sum_{c=1}^{K} \sum_{j=1}^{N} w_{c,j,i};$$

where:

$R_i$ is the mean relevance of the i-th input variable of the dataset;

K is the number of classifiers used in the training phase;

N is the number of hidden units of the K classifiers trained;

$w_{c,j,i}$ is the trained weight of the c-th classifier, connecting the i-th input to the j-th hidden unit.
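The fan-out definition above can be sketched in a few lines. The nested-list layout `weights[c][j][i]` and the toy weight values are assumptions made for illustration. The formula is implemented as written, summing the trained weights directly; the optional `absolute` flag, which sums weight magnitudes instead, is an added assumption suggested by the statement that relevance ranges from zero to infinity.

```python
def input_relevance(weights: list, absolute: bool = False) -> list:
    """Mean fan-out of each input over K trained classifiers.

    weights[c][j][i] is the weight of classifier c connecting
    input i to hidden unit j (hypothetical layout).
    """
    k = len(weights)
    n_inputs = len(weights[0][0])
    relevance = [0.0] * n_inputs
    for classifier in weights:          # sum over c = 1..K
        for hidden_unit in classifier:  # sum over j = 1..N
            for i, w in enumerate(hidden_unit):
                relevance[i] += abs(w) if absolute else w
    return [r / k for r in relevance]   # divide by K


# Toy example: K = 2 classifiers, N = 2 hidden units, 2 inputs.
weights = [
    [[0.5, 0.1], [0.3, 0.2]],
    [[0.7, 0.0], [0.5, 0.1]],
]
print(input_relevance(weights))
```

In this toy case the first input has the larger summed fan-out, so it would be ranked as the more relevant variable.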

The Validation Protocol

The validation protocol is a fundamental procedure to verify the models' ability to generalize the results reached in the Testing phase of each model. The application of a fixed protocol measures the level of performance that a model can produce on data that are not present in the Testing and/or Training sample. Different types of protocol exist in the literature, each presenting advantages and disadvantages.

The protocol, from the point of view of a general procedure, consists of the following steps:

  1. subdivide the database at random into two subsamples, Subsets A and B;

  2. train an ANN on Subset A; in this phase the ANN learns to associate the input variables with those that are indicated as targets;

  3. at the end of the training phase, save and freeze the weight matrix produced by the ANN together with all the other parameters used for the training;

  4. with the weight matrix frozen, present Subset B, which the ANN has not seen before, so that for each case the ANN expresses an evaluation based on the previous training; this operation takes place for each input vector, and the corresponding result (output vector) is not communicated to the ANN; the ANN is in this way evaluated only on the generalization ability that it acquired during the Training phase;

  5. construct a new ANN with an architecture identical to the previous one and repeat the procedure from point 1, but this time train the ANN on Subset B and test it blindly on Subset A.

This general training plan was further articulated with the aim of increasing the reliability of the models' generalization. Specifically, we employed the so-called 5×2 cross-validation protocol [13]. In this procedure the study sample is randomly divided five times into two subsamples, always different but containing a similar distribution of cases and controls: the training one (containing the dependent variable) and the testing one. During the training phase the ANN learns a model of the data distribution and then, on the basis of this model, classifies the subjects in the testing set in a blind way. The training and testing sets are then reversed, so that 10 analyses are conducted for every model employed. To compare its performance with that of the ANNs, the Linear Discriminant Analysis (LDA) was validated with the same protocol and the same data distribution.
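The splitting scheme can be sketched as follows, assuming the usual reading of 5×2 cross-validation: five random halvings of the sample, each half used once for training and once for testing, giving ten analyses. The function name `five_by_two_splits`, the stratification by class label and the fixed random seed are illustrative assumptions; the group sizes (26 AD cases, 36 controls) are those reported in the Subjects section.

```python
import random


def five_by_two_splits(labels: list, seed: int = 0, repeats: int = 5):
    """Yield (train_idx, test_idx) pairs for a 5x2 protocol.

    Each repeat halves every class at random, so both subsets keep a
    similar distribution of cases and controls, then yields the pair
    twice with the roles of training and testing reversed.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    for _ in range(repeats):
        subset_a, subset_b = [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            half = len(shuffled) // 2
            subset_a.extend(shuffled[:half])
            subset_b.extend(shuffled[half:])
        yield sorted(subset_a), sorted(subset_b)  # train on A, test on B
        yield sorted(subset_b), sorted(subset_a)  # train on B, test on A


labels = ["AD"] * 26 + ["control"] * 36
splits = list(five_by_two_splits(labels))
print(len(splits))
```

Any classifier, whether an ANN or LDA, can then be trained and blindly tested on each of the resulting pairs, which keeps the comparison between models on identical subsets.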

Results

Table 1 shows the descriptive variables of the subjects included in this study according to the above criteria.

Table 1 Characteristics of the sample under evaluation

As one can see, even if the average difference between the neuropathological lesion load in the two groups was substantial, a marked overlap of values was present for NFT in hippocampus, NP in neocortex, and NP in hippocampus.

A good linear relationship was present between each of the 4 selected input variables and the target of the study (AD cases/normal controls): r-squared = 0.50 for Neurofibrillary Tangles in Neocortex, 0.50 for Neurofibrillary Tangles in Hippocampus, 0.50 for Neuritic Plaques in Neocortex, and 0.32 for Neuritic Plaques in Hippocampus.

Taking into account all four recorded neuropathological features, the overall predictive capability of ANNs in distinguishing AD cases from normal controls was consistently 100% (Table 2).

Table 2 Performance of the ANNs in discriminating AD cases from normal controls. The analysis was carried out on all 4 neuropathologic variables registered in the original database of patients in ten separate experiments with different training and testing subsets. Linear Discriminant Analysis [LDA] results on the same subsets are shown for comparison.

These results were consistently obtained in ten separate experiments performed on different training and testing subsets. The corresponding results obtained with LDA were good but not excellent; the mean accuracy rate was 92.30%.

Since some AD patients had severe cognitive impairment, in further experiments, we excluded from the analysis AD patients with MMSE score below 4.

A subset of 13 AD patients was obtained with a mean MMSE equal to 15.

The average values of pathological markers did not differ between these two subgroups, with the exception of NFT in neocortex (Table 3). We repeated the same predictive experiments on a new data set composed of these 13 mild AD patients and the same 36 controls, obtaining identical results.

Table 3 Comparison between severe and non-severe AD patients.

In order to assess the relative importance of the four neuropathological AD markers in developing the model built by ANNs, in the ten experiments we evaluated the so-called "input relevance" of each marker during the training phase of the neural network.

Figure 1 shows the average input relevance of each variable in the ten independent training sessions. As one can see, NFT Neocortex accounted for the highest input relevance followed by NFT Hippocampus, NP Neocortex, and lastly by Max NP Hippocampus.

Figure 1 Mean input relevance* of neuropathological markers in ANNs experiments. * Input relevance refers to the ranking of each variable in terms of relative importance within the model created by artificial neural networks. The higher the value, the higher the importance of the variable.

Discussion

Artificial neural networks have shown optimal performance on various medical applications because of their capacity to learn how to identify complex relationships among data.

At variance with statistical linear methods, ANNs are able to reproduce the dynamic interaction of multiple factors simultaneously, allowing the study of complexity; they can also draw conclusions on an individual basis and not as average trends.

In a previous paper [10] we have shown that ANNs can be used to predict the results of post-mortem brain evaluations from cognitive performance data among 117 participants in the Nun Study.

That is, we determined how demographic data and cognitive and functional variables of each subject during the last year of her life could predict: a) the presence of brain pathology expressed as Braak stages of AD pathology, NFT and NP count in the neocortex and hippocampus; and b) brain atrophy, a highly prevalent neuropathologic feature of AD.

In this study our goal was to understand what constitutes the relevant neuropathological pattern differentiating AD from normal control subjects, an issue which, so far, has never been solved.

Thanks to the ANNs analysis we succeeded in reaching a perfect distinction between the two groups which remained unchanged even when we analyzed only the clinically mild and moderate AD patients. Input relevance analysis confirmed the relative dominance of NFT in the neocortex in discriminating between normal controls and AD cases and indicated the low importance played by NP in hippocampus.

Input relevance is a practical way to open the so-called "black box" of ANNs, allowing one to discover the role played by each variable in developing the data model during the training phase. The numerical value of this parameter is proportionally related to the "weight" of a given variable in the model.

Another major challenge in comparing the prevalence of AD lesions in old individuals with AD and non-demented control subjects is the selection of appropriate criteria for excluding mild dementia in the controls. In fact, for non-demented people most studies rely on an interview with a knowledgeable informant after the subject's death, rather than on direct observation of the control subject according to the same protocol used to assess AD patients. One example is the study published by Berg and co-workers in 1998 [5], in which experienced nurses or physicians interviewed informants and reviewed the records of previous clinical assessments to define the Clinical Dementia Rating score of controls. In addition, some controls were excluded because of neocortical senile plaque densities that met neuropathological criteria for AD, thereby introducing circular reasoning.

A possible limitation of our analysis is linked to the relatively small sample size. This issue can be considered at two different levels: statistical and epidemiological.

From a purely statistical point of view, the small number of variables considered guarantees a balanced ratio between variables and records. In addition, the use of a rigorous validation protocol with many training and testing procedures should protect against statistical imbalances.

From an epidemiological point of view we cannot regard the 26 patients in this study as a representative population of AD patients. Therefore the results presented in the paper are only valid for this particular setting and cannot be generalized. One should nevertheless consider the extreme scarcity in the general literature of autopsy data in groups of aged people with a substantial proportion of individuals without dementia symptoms.

Another potential limitation of our paper is that the markers that might best correlate with cognitive status (i.e. synaptic markers) are not included in the dataset; nonetheless, we think that the information carried by NFT and NP is sufficiently specific to make a considerable contribution to the understanding of the relation between pathology and clinical status.

Conclusion

In conclusion, the results of this study confirm that the neuropathologic profile of AD subjects is complex but specific and that, thanks to ANNs, it can be conveniently differentiated from that of normal subjects. Cortical NFT represent the key variable and are more likely than NP to be related to the pathophysiology of the disease.