
1 Introduction

Dementia is a progressive condition which affects over 55 million people worldwide, with nearly 10 million new cases every year [1]. The term “dementia” indicates not a single disease, but rather a spectrum of conditions with different clinical phenotypes, which can be caused by a multitude of pathologies that alter the structure and chemistry of the brain. While the most common cause of dementia-related symptoms is a neurodegenerative disease, other causes do exist (e.g., chronic inflammatory disease, alcoholism, …). The exact pathological cascade of events that causes the development of symptoms is still unknown, but overall it is thought that a combination of genetic and environmental factors results in the abnormal accumulation of misfolded, toxic proteins in the brain, which then triggers both chemical imbalance and neuronal loss (a process called atrophy), ultimately leading to the hallmark clinical symptoms that eventually impair the daily functioning of affected individuals. An important distinction is between “dementia”, understood as a collection of clinical syndromes and their qualitative and quantitative clinical expressions, and “disease”, the underlying pathophysiological processes that give rise to those syndromes.

Thanks to the increased insight into disease pathophysiology, the clinical diagnostic criteria have been revised, moving from considering only the observable clinical signs and symptoms (and implying a close and consistent correspondence between clinical symptoms and the underlying pathology) to including biomarkers of the underlying disease state in the clinical diagnosis. For example, the 1984 NINCDS-ADRDA criteria were the benchmark for a clinical diagnosis of Alzheimer’s disease, which was defined as “a progressive, dementing disorder, usually of middle or late life” [2]. These criteria were revised in 2011 [3] to include biomarkers supporting the clinical diagnosis and to account for the “pre-dementia” stages and the slow pathological changes occurring over many years before the manifestation of clinical symptoms [4].

Despite different pathological origins, many forms of dementia can have similar symptoms, which typically include memory loss, language difficulties, disorientation, and behavioral changes. However, at an individual level, the symptoms can vary with regard to their nature, presentation, rate of progression, and severity. Such heterogeneity between and within forms of dementia is typically related to the area (or areas) of the brain affected by the underlying pathology and by the etiological cause of the disease itself.

1.1 Alzheimer’s Disease (AD)

AD is the most common form of dementia, accounting for 60–65% of all cases. It typically presents in individuals aged 65 or older, with memory loss as the initial and most prominent cognitive deficit, accompanied by additional impairments in language, visuospatial, and executive functions [3]. The distinguishing feature of AD is the buildup of amyloid-β plaques and neurofibrillary tangles of tau protein. The amyloid plaques tend to be diffuse throughout the brain, while tau pathology tends to start in the mediotemporal lobe, in particular in the hippocampus and entorhinal cortex, and spread to the prefrontal and temporoparietal cortices in the moderate stages of the disease. There are numerous genetic factors that confer different levels of risk and have different prevalences in the population. The greatest risk comes from the nearly fully penetrant autosomal dominant mutations in the amyloid precursor protein (APP), presenilin 1 (PSEN1), or presenilin 2 (PSEN2) genes. However, the prevalence of these mutations is extremely low, comprising less than 0.5% of all AD cases. The age at onset of autosomal dominant AD is relatively similar between generations [5] and within individual mutations [6], typically resulting in an early-onset form of AD (below the age of 60 years). The most prominent risk factor gene in terms of both hazard and prevalence is apolipoprotein E (APOE). Carriers of a single copy of the ε4 allele (roughly 25% of the population) are roughly two to three times more likely to develop AD [7], and they tend to have an earlier age of disease onset. Homozygous ε4 carriers represent 2–3% of the general population, with a dose-dependent increase in risk. There have also been suggestions that carrying an ε2 form of APOE can confer some protection compared to the most common ε3 allele [7, 8].

Besides the typical presentation of AD in which episodic memory deficits are prominent, there are other variants with atypical presentations. Posterior cortical atrophy (PCA) [9] is characterized by visual and spatial impairments, with memory and language abilities preserved in the early stages and atrophy localized in the parietal and occipital lobes. The logopenic variant of primary progressive aphasia (lvPPA), also called logopenic progressive aphasia (LPA), is characterized by impairments in the language domain (i.e., word-finding difficulty, impaired repetition of sentences and phrases) and atrophy in the left temporoparietal junction [10]. Despite presenting with different symptoms and neuroanatomical features, both PCA and lvPPA typically share the same underlying pathology, amyloid plaques and neurofibrillary tangles, with the typical forms of AD. Besides these pathological hallmarks, accumulation of the TAR DNA-binding protein 43 (TDP-43) [11] is another form of pathology often observed in AD, particularly in cases with older onset of symptoms, and is associated with increased rates of atrophy. Limbic-predominant age-related TDP-43 encephalopathy (LATE) is a related condition found in the oldest adults (above 80 years of age), presenting with slowly progressive amnestic symptoms and hippocampal sclerosis.

1.2 Vascular Dementia (VaD)

As the second most common form of dementia (accounting for 10–15% of all dementia cases), VaD is an umbrella term for a number of syndromes with a clear primary cause: decreased blood flow due to damage to the blood supply (large or small vessels), which leads to brain tissue damage. The vascular origin is clearly seen on magnetic resonance imaging (MRI) as the presence of extensive periventricular white matter lesions, or of multiple lacunes in the basal ganglia and/or white matter [12]. Symptoms tend to accumulate in a step-wise fashion, rather than gradually worsening, and they vary greatly depending on which vessel is involved, ranging from memory loss and difficulties in executive functions to language and motor impairments. Different syndromes include multi-infarct dementia (or vascular cognitive impairment), when a series of small strokes damages multiple areas of the brain, typically in the cortex; strategic infarct dementia, when symptoms are caused by a focal ischemic lesion; subcortical vascular dementia (or subcortical leukoencephalopathy), caused by occlusions in small vessels resulting in multiple lacunes in the subcortical structures; and mixed dementia, when symptoms of both vascular dementia and AD are present.

Generally, strategic infarct dementia and multi-infarct dementia involving the cortex are due to occlusion of one of the major cerebral arteries, so the insult usually affects a large area of the brain; they have a definable time of onset and specific deficits related to the region affected. When the occlusion involves small vessels, the dementia symptoms have a more insidious onset and less well-defined deficits in the executive function domain.

Risk factors typically include age, hypertension, high cholesterol, obesity, smoking, and other cardiovascular diseases, as well as a family history of stroke, heart disease, or diabetes. Mutations in the Notch3 gene have been associated with cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL), a genetic disorder characterized by recurrent strokes resulting in lacunar infarcts [13].

1.3 Frontotemporal Dementia (FTD)

FTD describes a very heterogeneous group of neurodegenerative disorders with multiple genetic and pathological causes. However, there is sufficient overlap in both clinical (behavioral and/or language symptoms) and anatomical presentation (frontal and temporal lobe atrophy and hypometabolism) that these conditions are commonly considered together as one group. While representing 5–10% of all dementia cases, the FTD disorders constitute a more common cause of early-onset dementia, approximately equal in frequency to AD in people under the age of 65. The only confirmed risk factors are genetic, and about 30–50% of cases are due to an autosomal dominant mutation, primarily in the microtubule-associated protein tau (MAPT), progranulin (GRN), or chromosome 9 open reading frame 72 (C9orf72) genes [14]. The age at onset is extremely variable within and between genetic forms, including within families, and is therefore hard to predict [15].

Clinically, behavioral variant FTD (bvFTD) is the most common presentation, with impaired social conduct and personality changes, often misdiagnosed as psychiatric illness at onset [16]. It can be caused by tau, TDP-43, or fused-in-sarcoma pathology [17, 18] and is associated with extremely variable patterns of atrophy between patients, with predominant involvement of the frontal and temporal cortex (often asymmetrical), but also of the insula, anterior cingulate, and subcortical structures [19,20,21].

Less frequently, patients present with a progressive decline in speech and language functions, a collection of disorders and variants referred to as primary progressive aphasia (PPA). There are multiple variants of PPA, with different language deficits and brain regions involved: the semantic variant (svPPA), with a breakdown of semantic memory and associated atrophy in the left antero-inferior temporal lobe [22, 23]; the non-fluent variant (nfvPPA), characterized by agrammatism and apraxia of speech, with associated atrophy in the left inferior frontal, superior temporal, and insular cortex [22]; and lvPPA, which, as mentioned previously, is more often linked with AD pathology [10].

Around 15% of people on the FTD spectrum can also develop motor features consistent with either amyotrophic lateral sclerosis (ALS) (or motor neurone disease, MND) or parkinsonism (including progressive supranuclear palsy, PSP, or corticobasal syndrome, CBS) [24].

There is a distinct differential pattern of brain involvement across the genetic forms of FTD, evident up to 15 years before the estimated symptom onset [25, 26]: MAPT mutations cause focal symmetric atrophy in the anterior temporal and orbitofrontal cortex, including the hippocampus and amygdala; GRN mutations usually cause asymmetric atrophy in the temporal, inferior frontal, and inferior parietal lobes and the striatum; while C9orf72 repeat expansions show wider symmetric atrophy, predominantly involving the dorsolateral and medial frontal and orbitofrontal cortex, as well as the thalamus and cerebellum [25, 27, 28]. Despite these common patterns, there is still large variability even within the same genetic group, potentially due to the specific mutations, clinical presentations, or other genetic and environmental factors [29, 30].

1.4 Dementia with Lewy Bodies (DLB)

Around 10–15% of dementia cases have a diagnosis of DLB [31]. Symptoms tend to have an insidious onset, usually at the age of 65 years or older, and disease duration averages 5 to 8 years from diagnosis but can range from 2 to 20 years. Symptoms vary greatly from person to person but typically include fluctuating cognition, pronounced alterations in attention, alertness, and executive functions, visual hallucinations, and motor features of parkinsonism. Early signs also include rapid eye movement (REM) sleep behavior disorder, while memory and hippocampal volume are relatively preserved in the initial stages, becoming impaired later during the course of the disease. Alongside relatively preserved mediotemporal lobe volumes, typical biomarkers are reduced dopamine transporter (DAT) uptake in the basal ganglia on single-photon emission computed tomography (SPECT) or positron emission tomography (PET) imaging, and polysomnographic recordings showing REM sleep without atonia.

DLB is considered a sporadic disease; however, mutations in genes encoding α-synuclein (SNCA) and β-synuclein (SNCB) proteins have been associated with DLB [32].

Pathologically, DLB is characterized by the presence of α-synuclein proteins which abnormally aggregate in the brain to form Lewy bodies.

Lewy bodies are also found in the brain of individuals affected by Parkinson’s disease and Parkinson’s disease dementia (PDD). DLB and PDD are often difficult to distinguish, and the “1-year rule” is used for differential diagnosis: if the parkinsonian motor symptoms are experienced for a year or more before the onset of the cognitive impairments, then the condition of PDD is diagnosed, while if the cognitive problems start before or within 1 year after the movement difficulties, then a diagnosis of DLB is likely to be given.

Box 1: Different Diseases Causing Dementia

Dementia is not a disease but a spectrum of disorders defined by different pathologies, the most common being the following.

  • Alzheimer’s disease is the most prevalent, with hallmark pathologies of amyloid-β plaques and neurofibrillary tau tangles. Memory impairment is the most common symptom, but there are visual, language, and behavioral variants.

  • Vascular dementia is caused by various types of vascular insults.

  • Frontotemporal dementia is more common in those with younger ages of onset and is more associated with behavioral and language forms.

  • Dementia with Lewy bodies (DLB) shares pathology with parkinsonian disorders and often presents with cognitive fluctuations and visual hallucinations.

2 Features and Markers of Dementia

As previously mentioned, the most prevalent forms of dementia are multifactorial processes that typically occur over a very long time period, from the silent buildup of pathology through to the onset and progression of the clinical syndrome. As such, there are numerous types of assessment that can help identify individuals at risk, the underlying pathology burden, and the severity of the disease. These range from classic clinical workups and cognitive assessments of memory and other brain functions to fluid-based biomarkers and medical imaging-based assessments. The utility and sensitivity of these investigations depend highly on the stage of the disease that the patient is experiencing.

2.1 Genetic Markers

Despite the numerous forms of pathology that can ultimately lead to dementia, many genetic risk factors have been identified for AD and related disorders; the overall heritability of AD has been estimated to be between 60 and 80% [33]. This heritable risk, however, is spread across a wide range of loci that vary in terms of prevalence and impact. Identifying genetic risk factors, and the pathways that these genes are involved in, has led to a better understanding of various forms of dementia [34].

As mentioned in Subheading 1, the genetic variants with the strongest penetrance are the autosomal dominant forms of dementia. What these rare autosomal dominant forms provide is an opportunity to study a “purer” form of dementia, as the age of disease onset tends to be in the 30s through 50s, when there should be a far lower likelihood of comorbidities. They also provide a chance to study pre-symptomatic changes in individuals who are nearly certain to become affected by the disease. Thus, these cohorts are an ideal population for clinical trials of new therapies, in part to prove that target engagement is successful and to provide evidence that supports the underlying hypotheses around how the disease starts and spreads.

Outside of the autosomal dominant mutations, the gene most strongly linked with risk for AD is APOE. No gene equivalent to APOE in terms of risk and prevalence has yet been discovered for other forms of dementia, in part because these forms of dementia are rarer and it is thus more difficult to recruit the number of subjects needed for a well-powered genome-wide association study (GWAS). However, there are some candidates, such as the TREM2 variant in FTD [35].

Rather than trying to identify single target genes and their associated risks, many researchers have sought to generate a polygenic risk score, i.e., a sum of the risks conferred by each associated variant across the genome. Polygenic risk scores (PRS) have been developed for multiple diseases to better account for the amalgamated risk conferred by an individual’s entire genetic profile [36]. For AD, however, APOE confers a far greater risk than any other locus, with PRS able to slightly improve predictive accuracy and explain additional risk beyond APOE [37, 38].
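To make the idea concrete, the sketch below computes a PRS as a dosage-weighted sum of per-variant effect sizes. It is a minimal illustration only: the variant names, effect sizes, and dosages are hypothetical, and real PRS pipelines additionally handle quality control, linkage disequilibrium, and standardization against a reference population.

```python
import numpy as np

# Hypothetical per-variant GWAS effect sizes (log odds ratios) and one
# individual's risk-allele dosages (0, 1, or 2 copies); values are
# illustrative only, not real AD associations.
effect_sizes = {"rs_variant_1": 0.15, "rs_variant_2": -0.08, "rs_variant_3": 0.22}
dosages = {"rs_variant_1": 2, "rs_variant_2": 1, "rs_variant_3": 0}

# A polygenic risk score is the dosage-weighted sum of the effect sizes.
prs = sum(effect_sizes[v] * dosages[v] for v in effect_sizes)
print(f"Polygenic risk score: {prs:.3f}")
```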

2.2 Clinical and Cognitive Assessment

Given that various forms of dementia have historically been defined by their clinical phenotype, and that clinical and cognitive assessments tend to be the cheapest and most widely available, they are often paramount in the initial diagnostic workup of an individual, as well as in subsequent patient management.

Clinical vital signs, such as blood pressure [39, 40] and body mass index (BMI) [41], may suggest causes of cognitive impairment other than a neurodegenerative disorder, or indicate an at-risk profile that would result in a more aggressive disease. The Clinical Dementia Rating (CDR) [42] is a semi-structured interview that examines several aspects of physical and mental well-being, summarized under six subdomains. For each subdomain, a score of 0 (no impairment), 0.5 (mild/questionable impairment), 1, 2, or 3 is given. Both the sum of these subdomain scores, referred to as the CDR Sum of Boxes (CDR-SB), and a global summary score are often used, with the CDR-SB now commonly used as a primary endpoint in trials. Other clinical workups may help identify non-memory symptoms that would not be picked up by cognitive assessments, such as anxiety, depression, and difficulties with daily activities.
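As a small illustration of how the CDR-SB is formed, the snippet below sums six hypothetical box scores; the subdomain names follow the standard CDR structure, and the values are invented for the example.

```python
# Illustrative CDR subdomain (box) scores; each box is rated 0, 0.5, 1, 2, or 3.
cdr_boxes = {
    "memory": 1.0,
    "orientation": 0.5,
    "judgment_and_problem_solving": 1.0,
    "community_affairs": 0.5,
    "home_and_hobbies": 0.5,
    "personal_care": 0.0,
}

# The CDR Sum of Boxes is simply the sum of the six box scores (range 0-18).
cdr_sb = sum(cdr_boxes.values())
print(f"CDR-SB = {cdr_sb}")  # 3.5 for this hypothetical participant
```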

Cognitive assessments cover numerous domains of brain function, including executive function, language, visuospatial functions, and behavior. However, given that memory is the most common primary complaint of individuals with AD, assessment of various aspects of an individual’s memory is one of the most important components and is typically included in both clinical and research settings. Numerous tests have been developed and validated for use in the clinic as well, and they often serve as primary outcome measures in clinical trials of subjects with mild to moderate AD. Standard clinical and cognitive assessments include the Mini-Mental State Examination (MMSE) [43], the Alzheimer’s Disease Assessment Scale-Cognitive Subscale (ADAS-Cog) [44], and the Montreal Cognitive Assessment (MoCA) [45].

While cheap and readily available, these assessments do come with some disadvantages. They are often pencil-and-paper tests administered and scored by a trained rater. As such, there is a level of subjectivity in many of these assessments that tends to result in high variability. Often these tests repeat the same questions and tasks, which leads to practice effects. It is also often difficult to design these assessments such that their dynamic range simultaneously covers both the early subtle signs of dementia pre-symptomatically and the full decline once individuals have become symptomatic. This results in some tests having substantial ceiling effects (i.e., being easy enough that there is limited distinction between healthy individuals and those experiencing subtle initial symptoms) or floor effects (i.e., being so difficult that many with a cognitive impairment cannot perform them). There are also cultural and linguistic factors that may introduce bias when translating one of these tests from one language to another. As a result, there is a trend toward formulating cognitive assessments in a more objective, computerized format to reduce issues around subjectivity, language differences, and learning effects. Such assessments may reduce variability compared to standard paper-and-pencil tests, which may be of key benefit in assessing therapeutic effects in clinical trials [46, 47]. This includes running multiple trials of a test during a single assessment and collecting dense information about the task in addition to summary metrics such as the number of items correct or the mean reaction time. The resulting rich set of detailed, repeated measures is ideal for further exploration with machine learning algorithms.

2.3 Biofluid Assessments

The most widely studied fluid-based biomarkers in AD and related disorders come from samples of cerebrospinal fluid extracted by lumbar puncture. Measures of primary AD-related pathology (Aβ1−42, tau, p-tau) can be obtained from these samples, as well as information on downstream mechanisms, such as neuroinflammation, synaptic dysfunction, and neuronal injury [48]. Fluid-based biomarkers are very effective as “state” biomarkers, i.e., indicating whether an individual has a normal or abnormal level. They are often much cheaper than imaging assessments in providing this status and are thus more likely to be used for screening of individuals at risk for dementia. At the same time, their ability to track change in the disease over time is currently limited. They are in general noisier measurements, likely due to a number of factors including the consistency of extraction, storage, and analysis methods [49]. Even when these have been held extremely consistent, their variability in measuring change over time is still much higher than that of cognitive and imaging measures [50]. Despite the procedure being very safe and continuing to improve, there is still a subset of individuals who will not wish to participate in studies involving these assessments. A far less invasive and cheaper alternative is to extract similar measures from plasma. While plasma-based biomarkers have been actively pursued for a long time, it is only very recently that they have reached the level of accuracy and precision needed to compete with other established measurements [51, 52]. There have been plasma-based assessments of amyloid-β, different tau isoforms, and nonspecific markers of neurodegeneration (such as levels of the neurofilament light chain, NfL) which show promise for detecting changes in the preclinical stage of AD [53].

2.4 Imaging Dementia

The primary use of brain imaging in clinical settings is to exclude non-neurodegenerative causes, such as normal pressure hydrocephalus, tumors, and chronic hemorrhages, and to assess the presence or absence of atrophy, all features that can be visualized on T1-weighted MRI or computerized tomography (CT) scans. Nevertheless, three-dimensional tomographic medical imaging modalities, particularly PET and MR imaging, provide high-precision measurements of spatiotemporal patterns of disease burden that have proven extremely valuable for research and now also contribute to a positive diagnosis.

These modalities have been employed primarily in clinical research settings, where longer advanced imaging protocols and novel radiotracers can be implemented. Due to the cost, time, and availability of imaging resources, they have been slow to translate to the clinical setting itself, but they are beginning to make an impact there as well.

2.4.1 Imaging Primary Pathology

Radiotracers that preferentially bind to the primary pathologies associated with AD allow the detection and tracking of the slow progressive buildup of amyloid plaques and neurofibrillary tangles, which can occur decades before the onset of symptoms [54, 55]. The original amyloid tracer was 11C-Pittsburgh Compound B (PiB), and it identified individuals who were amyloid positive but showed no symptoms [56]. These individuals tended to progress to mild cognitive impairment, and subsequently to AD, at much higher rates than those who were amyloid negative [57]. Since the introduction of PiB, numerous 18F tracers have been developed and approved for use in humans [58, 59]. As 18F-based tracers have a longer half-life than 11C, a much larger group of research centers has access to this technology. Tracers specifically targeting tau pathology have come much later. The most widely used has been flortaucipir, with second-generation tracers now available that have overcome some of the challenges of imaging with the early tau tracers [60]. Findings from tau PET studies suggest that the landmark postmortem staging of tau pathology seeding and spread according to Braak [61] is the most common spatiotemporal pattern observed in individuals [62, 63]. However, other subtypes with different distributions have been observed [64, 65]. Elevated tau PET uptake often occurs much later than elevation of amyloid PET [66], especially in autosomal dominant cases of AD [67, 68]. Tau PET is also far more strongly linked regionally with subsequent evidence of neurodegeneration, while amyloid PET tends to elevate in a similar manner across multiple regions at the same time [69]. Examples of amyloid and tau PET images from patients with various forms of AD and from controls can be seen in Fig. 1. Despite many forms of FTD being tauopathies, the available PET tracers have been primarily optimized for the specific form of tau pathology observed in AD, namely, the mix of 3-repeat/4-repeat species found in neurofibrillary tangles. Since there are many different forms of tau pathology within FTD, the level of tau PET uptake in these individuals is varied [70,71,72]. In other forms of dementia, amyloid PET can be used to rule out AD pathology if an individual with symptoms has an amyloid-negative scan, and tau PET has now been approved to estimate the density and distribution of neurofibrillary tangles in individuals.

Fig. 1

Example amyloid (left column) and tau (right column) PET scans from controls and patients with different forms of dementia (AD and PCA). PET images are presented as standardized uptake value ratio (SUVR) images: the amyloid PET images have been normalized to subcortical WM, an area of high nonspecific binding (mainly in myelin) in both healthy controls and patients with Alzheimer’s disease, while the tau PET images have been normalized to inferior cerebellar gray matter. While amyloid PET tends to show diffuse cortical uptake across the brain, tau PET tends to be more focal in the areas where neurodegeneration is occurring
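As a rough illustration of the SUVR computation used in these images, the sketch below divides a PET image by the mean uptake in a reference-region mask. The file names are hypothetical, nibabel is assumed to be available, and a real pipeline would first co-register the PET image and the mask and apply motion and partial-volume corrections as appropriate.

```python
import nibabel as nib  # assumed available for reading/writing NIfTI images
import numpy as np

# Hypothetical file names for a PET image and a reference-region mask
# (e.g., subcortical WM for amyloid, inferior cerebellar GM for tau).
pet_img = nib.load("pet_image.nii.gz")
pet = pet_img.get_fdata()
ref_mask = nib.load("reference_region_mask.nii.gz").get_fdata() > 0

# SUVR: voxel-wise uptake divided by the mean uptake in the reference region.
suvr = pet / pet[ref_mask].mean()

nib.save(nib.Nifti1Image(suvr, pet_img.affine), "pet_suvr.nii.gz")
```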

2.4.2 Imaging Neurodegeneration

As pathology continues to build over time during the pre-symptomatic period, it often leads to an insidious process of neuronal dysfunction and ultimately to degeneration in all forms of dementia. This is evidenced by atrophy visible on structural T1-weighted MRI scans (Fig. 2) and decreased metabolism on fluorodeoxyglucose (FDG)-PET (Fig. 3). These forms of imaging start to become altered around the time when tau pathology is present and then track closely with disease severity as symptoms become apparent. These modalities tend to be the most widely available imaging techniques within research settings, with MRI tending to be less costly than PET. Structural imaging, due to its high resolution (1 mm), signal-to-noise ratio, and contrast between tissues, lends itself to high-precision measurements of change over time. The spatial pattern of neurodegeneration, whether hypometabolism or atrophy, can provide useful information for differential diagnosis between different dementias [73,74,75]. In parallel with neurodegeneration, changes in the white matter of individuals with dementia also show evidence of disease-related insult. White matter lesions, suggestive of damage due to vascular insult/insufficiency or demyelination, are visible as hypointensities on T1-weighted imaging, while they appear hyperintense on other forms of structural MRI known as T2-weighted or fluid-attenuated inversion recovery (FLAIR) imaging (Fig. 4). Other white matter changes observed in dementia include microbleeds, lacunes, and perivascular spaces [76]. Whether these are separate processes or linked to the underlying disease cascade is actively being researched [77, 78]. There is evidence that they contribute equally and additively in individuals where no obvious impairment is present. In addition, individuals with a heavy white matter lesion burden tend to have more aggressive forms of the disease than those with limited or no signal of change in the white matter.

Fig. 2

Example T1-weighted MRI scans from a healthy control and from individuals with different variants of Alzheimer’s disease. For each variant, atrophy can be observed in the areas of cortical GM whose damage is known to cause the cognitive deficits typically linked to the clinical phenotype (see white arrows)

Fig. 3

Example FDG PET scans from a healthy control and from multiple variants of Alzheimer’s disease. For each variant, hypometabolism (denoted by cooler colors) can be observed in the areas of cortical GM whose dysfunction is known to cause the cognitive deficits typically linked to the clinical phenotype. Red arrows have been added to highlight focal areas of hypometabolism in each variant

Fig. 4

Example T1-weighted (left column) and FLAIR (right column) images with minimal, moderate, and prevalent lesions in the white matter (WMH volumes of approximately 0.6, 19.7, and 32.0 mL, respectively), often referred to as white matter hyperintensities (WMH), as they appear bright on FLAIR acquisitions. These lesions also often appear hypointense on T1-weighted scans, but FLAIR tends to be more sensitive and provide more contrast, particularly around deep gray matter areas

Advanced forms of MRI acquisition are leading to a better understanding of the diseases at different scales, from inferences about the underlying tissue microstructure to how these forms of dementia disrupt the natural networks of the brain. Diffusion-weighted imaging (DWI) provides measurements of both the magnitude and direction of the movement of water within a voxel. In white matter, the tissue consists of long fiber bundles that restrict this motion primarily to the direction of the fibers. In dementia, these white matter bundles, whether through demyelination or some other form of neuronal dysfunction, tend to become less restrictive of water crossing fiber boundaries, suggesting a loss of microstructural integrity [79,80,81,82].
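One widely used scalar summary of this restriction is fractional anisotropy (FA), computed from the eigenvalues of the diffusion tensor fitted at each voxel. The sketch below implements the standard FA formula on two illustrative sets of eigenvalues; the values are invented, and the tensor fitting itself (done by dedicated diffusion toolboxes) is not shown.

```python
import numpy as np

def fractional_anisotropy(eigenvalues):
    """Fractional anisotropy from the three eigenvalues of a diffusion tensor."""
    lam = np.asarray(eigenvalues, dtype=float)
    md = lam.mean()  # mean diffusivity
    num = np.sqrt(((lam - md) ** 2).sum())
    den = np.sqrt((lam ** 2).sum())
    return float(np.sqrt(1.5) * num / den) if den > 0 else 0.0

# Illustrative eigenvalues (mm^2/s): a coherent white matter voxel versus a
# voxel with degraded fiber integrity and more isotropic diffusion.
print(fractional_anisotropy([1.7e-3, 0.3e-3, 0.3e-3]))  # ~0.80
print(fractional_anisotropy([1.0e-3, 0.8e-3, 0.7e-3]))  # ~0.18
```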

On a larger scale, connected brain networks can be identified with these techniques, either by tracing diffusion profiles from one gray matter region to another using diffusion-weighted imaging or by observing correlated patterns of hemoglobin deoxygenation across brain regions using functional MRI (fMRI) as a proxy for brain activity. These networks can become disrupted, primarily with regard to within-network communication [83]. The direction of this disruption may depend on the stage of the disease. There is more evidence that the later stages of disease cause reduced connectivity and a disconnection of key seed regions from other areas. However, there may be an earlier stage in which subjects compensate for increased pathology burden with hyperconnectivity [84, 85].
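A minimal sketch of the fMRI side of this idea is shown below: a functional connectivity matrix is formed from pairwise Pearson correlations between ROI time series. The data are synthetic stand-ins for mean BOLD signals extracted from a parcellation; real analyses would follow preprocessing steps such as motion correction and nuisance regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for fMRI data: 200 time points for 10 regions of interest.
n_timepoints, n_rois = 200, 10
timeseries = rng.standard_normal((n_timepoints, n_rois))

# Functional connectivity as the ROI-by-ROI Pearson correlation matrix.
connectivity = np.corrcoef(timeseries, rowvar=False)

# A simple network summary: the mean off-diagonal connectivity, which studies
# compare between groups or disease stages.
off_diagonal = connectivity[~np.eye(n_rois, dtype=bool)]
print(f"Mean functional connectivity: {off_diagonal.mean():.3f}")
```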

2.5 Advances in Novel Biomarkers

Novel assessments and biomarkers for all forms of dementia are highly active areas of research, from new fluid-based biomarkers, to better computerized psychometric batteries, to new imaging tracers and MR sequences that track additional aspects of the disease. While big data for machine learning in dementia has often meant assessing a large number of individuals, each with a small handful of measures, there are new forms of data collection that provide a rich set of data on single individuals. This includes not only new molecular markers, such as single-cell RNA sequencing [86, 87], but also wearable devices that produce large amounts of data about individuals’ daily activities and spatial navigation.

3 Challenges for Machine Learning

Researchers nowadays focus on two main aspects when it comes to AD and other forms of dementia. First, they aim to gain a better understanding of the disease process, including why the same underlying primary pathology affects different areas of the brain in different individuals, thus producing different clinical presentations. This is currently being investigated using many approaches, ranging from molecular biology studies in wet labs to large epidemiological studies involving several thousands of participants. Second, they are developing tools to better assist clinicians with treatment management at the level of the individual. This includes, for example, the design of effective computational pipelines dedicated to patient diagnosis and prognosis.

Any research relying on machine learning methodologies must address the specific challenges presented by the disease. The first challenge comes from the large variability between diseases, which makes them difficult to differentiate, especially in the early stages. Additionally, mixed dementia, where patients have several diseases, is quite common. Indeed, the AD phenotype often coexists with vascular dementia or DLB. To partially address this issue, data from individuals with autosomal dominant forms of these diseases are collected by international multicenter studies, as they often have earlier onset and usually “purer” forms of the disease. The Dominantly Inherited Alzheimer Network (DIAN) and the Genetic Frontotemporal Dementia Initiative (GENFI) are two studies collecting data from patients and relatives with familial AD and genetic FTD, respectively. While these studies have many benefits (see Subheading 2.1), there can be substantial differences between the genetic and the more widespread sporadic forms of these diseases, the most notable being the younger disease onset and fewer comorbidities. Thus, there is a crucial need for ML methods that can disentangle the full complexity of sporadic forms of dementia. Finally, the variability comes not only from the presentation of the disease and comorbidities but also from the age at which the disease starts and the pace at which it progresses.

The second challenge is related to the duration of the disease, which often spans two decades and includes a years-long prodromal phase. This makes it difficult to acquire data from individual patients that cover the full disease duration, especially as it is extremely challenging to identify with certainty who in the general population will develop the disease. While the previously mentioned studies of autosomal dominant forms of dementia can address this issue, it is not yet clear how far their findings can be translated to the far more common sporadic forms of these diseases. The Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a large multicenter study focusing on the acquisition of data from elderly individuals, consisting of those who are cognitively normal, those labeled as having mild cognitive impairment (MCI), and individuals diagnosed with probable AD [88]. These individuals are followed over several years, providing extremely valuable information for researchers. UK Biobank is another relevant initiative, as it aims to acquire in-depth phenotyping of half a million UK participants. Given the high prevalence of AD and related diseases in the population, it is anticipated that many of these participants are in the pre-symptomatic phase of the diseases.

As mentioned in the previous section, many markers of dementia are used to track the diseases, some being more relevant than others at specific times in the illness progression. For example, while amyloid PET-derived imaging biomarkers are valuable in the early stages of the disease, they are unable to quantify progression toward the final stages. Conversely, clinical assessments, while being ineffective prior to symptom onset, enable monitoring of symptomatic evolution over time. This is a challenge, as it requires using the most relevant marker for the appropriate stage. In practice, this often leads to the use of complex models to handle large amounts of multimodal data (imaging, clinical, genetic, demographic, …). Additionally, each marker often suffers from its own variability, which can be intra-patient, inter-patient, and inter-center. For example, clinical assessments may differ based on the rater, while MRI acquisitions will differ due to pulse sequence properties or scanner characteristics, as well as normal physiological variation such as hydration and caffeine intake, among others.

4 Machine Learning Developments

Machine learning has been used in multiple applications related to Alzheimer’s disease and related dementias. As mentioned in the previous section, dementia research has leveraged worldwide, multicenter studies in order to obtain enough data to characterize early changes and heterogeneity within the disease process. This has, in turn, propelled the development of dementia-focused ML applications. In this section, we review four main tasks in which extensive ML research has been performed.

Biomarker extraction from imaging data was originally done with manual assessments, which were time-consuming and subject to high inter-rater variability. Developing machine learning approaches that recreate these measurements with reduced time and variability has been a large effort that has served not only AD and related dementias (ADRD) but many other neurological disorders and neurodegenerative diseases. Given the numerous measurements that are now available in these datasets and the different aspects of the phenotype that they reflect, disease classification and prediction techniques have been used to identify consistent multivariate signatures between healthy and disease groups, for differential diagnosis, and for predicting the future state of patients. Disease progression models have been developed to determine the order in which markers go from normal to abnormal and to reconstruct the trajectories followed by these biomarkers, leading to advances in disease understanding and prognostication. Finally, variation caused by changes in scanning equipment and software across sites must be accounted for through data harmonization in order to obtain more accurate estimates of the biological changes.

4.1 Machine Learning-Derived Biomarkers

The largest area of machine learning research in relation to Alzheimer’s disease and related disorders is the extraction of measurements from the different datasets. These biomarkers tend to reflect an aspect of an individual’s function or integrity that helps gain a better understanding of the disease. Changes in these biomarkers from normal values to abnormal provide a proxy for disease progression. Note that much of this research is also useful for other brain disorders.

Medical imaging provides valuable insights into an individual’s brain and is key to noninvasively assessing phenotypic changes due to the neurodegeneration process. Structural imaging, especially T1-weighted MRI, is commonly acquired in neurodegenerative studies, as it enables the quantification of key information such as atrophy of particular brain regions, thinning of the cortex, or localized brain lesions. These features are often used as imaging biomarkers of disease progression, and many approaches relying on machine learning have been developed over the last three decades to extract them.

Brain segmentation and parcellation refer, respectively, to the classification of voxels into tissue types (e.g., gray matter, white matter, CSF) and to the delineation of identified brain regions (e.g., whole brain, hippocampus, …).

The most popular open-source implementations for tissue segmentation (FSL [89], SPM [90]) rely on Gaussian mixture models optimized using expectation maximization [91, 92]. They classify voxels based on their intensity while also accommodating intensity inhomogeneity and noise, via explicit modeling of the intensity bias field [91] and Markov random field regularization [92], respectively. With the advance of deep learning in the last decade, many techniques using convolutional neural networks have been proposed. Kumar et al. presented a U-Net-based approach achieving an average Dice score coefficient close to 90% for the segmentation of gray matter, white matter, and CSF on their dataset [93].
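The sketch below illustrates the core of such intensity-based tissue segmentation: a three-class Gaussian mixture fitted by expectation maximization to synthetic voxel intensities. It is a toy example only; unlike the FSL and SPM implementations cited above, it models neither the bias field nor spatial regularization via Markov random fields.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic T1-weighted intensities for brain voxels drawn from three tissue
# classes (CSF, gray matter, white matter); real pipelines work on
# skull-stripped, bias-corrected images.
intensities = np.concatenate([
    rng.normal(30, 5, 5000),    # CSF-like
    rng.normal(70, 8, 20000),   # GM-like
    rng.normal(110, 8, 25000),  # WM-like
]).reshape(-1, 1)

# Three-component Gaussian mixture fitted with expectation maximization.
gmm = GaussianMixture(n_components=3, random_state=0).fit(intensities)
labels = gmm.predict(intensities)         # hard tissue label per voxel
posteriors = gmm.predict_proba(intensities)  # soft tissue memberships
print(np.round(np.sort(gmm.means_.ravel()), 1))  # recovered class means
```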

Classical approaches for brain parcellation rely on the concepts of segmentation propagation and label fusion. In short, a set of template images, consisting of original images and associated labels, is aligned to a new image through medical image registration. The template labels are then warped into the shape of the new image’s brain and fused into a consensus segmentation. Popular approaches include HAMMER [94], FreeSurfer [95], and Geodesic Information Flows (GIF) [96], among others. Using neural networks, de Brébisson et al. proposed a dedicated architecture that concurrently uses 2D and 3D patches and is applied iteratively to refine the results [97]. More recently, FastSurfer [98] was introduced, a deep learning-based method that aims to reproduce FreeSurfer’s results while considerably reducing processing time. While it is rapidly attracting users, further validation is needed to ensure that it is as robust as FreeSurfer across all disease types and severities. Finally, a current trend is to train on vast amounts of synthetic data, with the hope of easing generalization to other sequences and/or resolutions. This is, for example, the approach taken in SynthSeg [99].
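The label fusion step of these classical parcellation pipelines can be sketched as a per-voxel majority vote across template labels that have already been warped into the target space, as below. The registration step is omitted and the label volumes are random placeholders; real methods typically use more sophisticated, locally weighted fusion strategies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder labels from five templates already warped into the target image
# space (registration omitted); labels 0-3 over a small toy volume.
n_templates, n_labels, shape = 5, 4, (4, 4, 4)
warped_labels = rng.integers(0, n_labels, size=(n_templates, *shape))

def majority_vote(labels, n_labels):
    """Per-voxel majority vote across warped template label volumes."""
    counts = np.stack([(labels == k).sum(axis=0) for k in range(n_labels)])
    return counts.argmax(axis=0)

consensus = majority_vote(warped_labels, n_labels)
print(consensus.shape)  # (4, 4, 4) consensus parcellation
```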

Other approaches have been proposed to segment individual regions of the brain that are particularly relevant to the study of dementia. For example, hippocampal segmentation has been an active area of research, with many studies including extensive validation in AD [100,101,102,103,104]. In recent years, the focus has turned to the segmentation of hippocampal subfields rather than the whole hippocampus [105, 106]. In particular, Manjón et al. used a U-Net combined with deep supervision during training, in which the loss function optimizes segmentation accuracy at multiple image scales [107].

The identification of abnormalities (in particular those of vascular origin) is also a key step in the study of dementia, particularly for differential diagnosis. These abnormalities include hyper- or hypo-intense lesions, microbleeds, perivascular spaces, and lacunes. Various approaches to segment T2/FLAIR white matter hyperintensities have been proposed (see [108] for a comparison of seven of them), while there have been fewer works on microbleeds or lacunes. Sudre et al. proposed a Gaussian mixture model approach with automated detection of the number of classes to accurately segment brain tissue classes as well as abnormalities [109]. More recently, deep learning approaches have also been developed. For example, Boutinaud et al. used a U-Net, whose parameters were pre-trained using an autoencoder, to automatically segment perivascular spaces from T1-weighted MRI scans [110]. Wu et al. also used a U-Net architecture to segment hyperintensities from T1-weighted and FLAIR MR images [111].

4.2 Disease Classification and Prediction

Machine learning is a powerful tool when it comes to disease diagnosis and prognosis. As a result, many approaches have been proposed for disease classification, either to identify the current stage of a disease in an individual or to predict their future state (e.g., the transition to dementia in patients with MCI).

For over a decade, dozens (if not hundreds) of papers have proposed classification techniques to distinguish patients diagnosed with AD from age-matched controls (e.g., [112,113,114,115,116,117,118,119]) or patients with mild cognitive impairment who remain stable over time from those who will progress to a diagnosis of AD (e.g., [115, 117,118,119,120,121,122,123,124]). The latter task can contribute to prognosis which, when it comes to dementia-related diseases, often consists of distinguishing patients who are likely to convert from mild cognitive impairment to symptomatic AD within a given time interval, typically 3 years, from those who are going to remain stable. Several literature reviews on dementia classification and prediction have been published [125,126,127,128,129,130]. In particular, recent reviews by Jo et al. [126] and Ansart et al. [130] have covered the topic of prognosis.

The proposed methodologies have been extremely varied, not only in terms of ML algorithms but also of input modalities and extracted features. Initial ML methods often used support vector machines (e.g., [112, 114]), while subsequent works used more recent techniques such as random forests [117] and Gaussian processes [121, 131]. Many recent works have used deep learning classification techniques [127, 129]. However, so far, deep learning has not outperformed classical ML for AD classification and prediction [129, 130, 132]. Furthermore, a review of convolutional neural networks for AD diagnosis from T1-weighted MRI [129] identified that more than half of these deep learning studies may have been contaminated by data leakage, which is particularly worrisome. In terms of features, some studies use the whole brain as input, either using the raw image directly or computing voxel-wise (or vertex-wise, when considering the cortical surface) measures [125]. Others parcellate the brain into regions of interest, within which features are computed. In particular, researchers have combined segmentation approaches with disease classification techniques. This has the advantage of limiting the search space of the machine learning approach through the use of prior knowledge. For example, Coupé et al. used a patch-based approach to classify voxels from the hippocampus, as it is known to be a vulnerable structure in patients with dementia [116]. The input modalities have also been extremely varied. While earlier works often focused on T1-weighted MRI only [112,113,114], subsequent studies have included other imaging modalities, in particular FDG-PET [121, 123]. Other researchers have combined tailored features extracted from images with non-imaging features such as fluid biomarkers [121], cognitive tests [124, 133], APOE genotype [121], or genome-wide genotyping data [134, 135]. Through deep learning, researchers can avoid the need to craft features and directly infer disease status from raw data, whether imaging or non-imaging. However, to date, there has been less interest in this area than in biomarker extraction, and thus fewer innovative solutions have been proposed. Popular architectures include convolutional neural networks, autoencoders, and recurrent neural networks, among others [127]. Training strategies have mostly relied on supervised approaches, with some groups relying on pre-trained networks to compensate for relatively small training databases. However, as mentioned above, it has not been demonstrated so far that deep learning outperforms conventional ML for dementia classification.
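For illustration, the sketch below shows a classical ML pipeline of the kind used in many of the studies above: a linear support vector machine with feature scaling, evaluated by cross-validation. The features are synthetic stand-ins for regional volumes; real studies would use measures extracted from MRI, PET, fluid biomarkers, or cognitive tests.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic regional volumes for 100 controls and 100 patients; the patient
# group is simulated with slightly lower volumes (mild atrophy).
controls = rng.normal(loc=1.0, scale=0.1, size=(100, 20))
patients = rng.normal(loc=0.9, scale=0.1, size=(100, 20))
X = np.vstack([controls, patients])
y = np.array([0] * 100 + [1] * 100)

# Standard pipeline: feature scaling followed by a linear support vector machine.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```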

A large portion of the literature has relied on the ADNI database. While having such a rich and large database has propelled the development of algorithms, one can wonder whether the proposed methods will generalize well to other datasets, a problem which has less often been addressed. Another worrisome aspect is that many of the papers based on ADNI are difficult to reproduce, because they lack a description of the subjects used and because the code is often not available [136]. They are also difficult to compare. In particular, since different preprocessing tools are used, it is often difficult to know whether improvements in performance come from the innovation in ML or from the preprocessing. Standardized datasets have been created within ADNI [137] to address the former issue. Whenever possible, one of these datasets should be used, or authors should provide a list of the subjects/scans included in the study for the purpose of reproducibility.

Researchers have organized challenges that can provide an objective comparison of algorithms. One can cite in particular the CADDementia challenge for classification [138] and the TADPOLE challenge for prognosis [132]. Such challenges provide very important and useful information on the respective merits of different approaches. However, more challenges, in particular using more diverse data, would be needed.

To a lesser extent, differential diagnosis has also been addressed. Earlier works focused on classifying patients diagnosed with AD versus patients diagnosed with frontotemporal dementia [112, 139]. More recent studies have considered classifying between various types of dementia [140, 141].

Overall, there has been considerable research in AD classification and prediction. The easiest task, the classification of patients diagnosed with AD versus normal controls, can be considered solved, with accuracy typically above 90%, at least when using data whose quality is comparable to that of research studies. However, this task has little clinical utility. For the more interesting task of predicting progression to AD in MCI patients, performance has increased over the years, with AUC now above 80% [130]. Interestingly, it has been shown that studies which include cognitive tests and FDG-PET tend to have better results than those using T1-weighted MRI only [130]. It is particularly noteworthy that cognitive tests tend to be overlooked, given that they are relatively cheap to perform and widely available. This probably reflects the fact that much of this work has arisen in the medical image computing community. Finally, other tasks, such as the development of a multi-pathology differential diagnostic classifier, are still short of becoming clinically useful. This is possibly due to the lack of tailored methodologies leveraging all available information. This will be further discussed in the conclusions of this chapter.

4.3 Disease Progression Modeling

Following the publication of a hypothetical model of disease progression by Jack et al. [142], researchers have tried to create data-driven models that accurately describe the various components of the underlying disease process. These so-called disease progression models have been developed to understand the ordering of phenotypic events (such as a marker becoming abnormal), to model the variability of orderings or trajectories within a population, and to distinguish between different disease subtypes. The reader may refer to Chap. 17 for a detailed description of the methodology underlying data-driven disease progression models. Event-based models (EBM) [143] learn, from a curated population and across different modalities, the different hidden states of a disease. EBMs are able to order all input features from the one most likely to become abnormal first to the one that becomes abnormal last [143]. The application of EBMs in dementia has been very successful, including studies in familial AD [143, 144], sporadic AD [145, 146], posterior cortical atrophy [147], and genetic FTD [148]. An extended approach called Subtype and Stage Inference (SuStaIn) was developed by Young et al., which incorporates clustering to characterize disease subtypes. In particular, it allowed uncovering the symptomatic profiles of different variants of genetic FTD as well as AD subtypes [29]. EBMs have the advantage of being applicable to cross-sectional data, but they only provide an ordering of events, with no temporal scale as to when they become abnormal. Most EBMs also assume a monotonic biomarker trajectory, an assumption which has been questioned for the early stages of AD [149]. Other works have leveraged longitudinal data to build continuous trajectories. Jedynak et al. used a disease progression model to derive a progression score on a linear scale for every individual in AD [150]. Schiratti et al. proposed a general nonlinear mixed-effects model that can handle not only scalar biomarker data but also images or shapes [151]. Applied to AD, the approach uncovered trajectories of progression for different variables, including cognitive tests, PET-derived hypometabolism, and local hippocampal atrophy [152]. In 2021, Wijeratne and Alexander proposed an approach that can, using longitudinal data, infer both a discrete event ordering (as in EBMs) and continuous trajectories [153]. Lastly, Abi Nader et al. proposed SimulAD, which enables the setup of in silico interventional trials, where patient prognosis can be assessed against several possible therapies (drug type or timing of intervention) [154].
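To give a flavor of the event-based model, the sketch below evaluates the likelihood of a single candidate event ordering, assuming that per-subject probabilities of each biomarker being abnormal have already been estimated (e.g., from per-biomarker mixture models) and marginalizing over an unknown disease stage with a uniform prior. Real EBM implementations additionally search over orderings (e.g., with MCMC) and estimate uncertainty; the data here are random placeholders.

```python
import numpy as np

def ebm_ordering_log_likelihood(p_abnormal, ordering):
    """Log-likelihood of one event ordering under an event-based model.

    p_abnormal: (n_subjects, n_events) probabilities that each biomarker is
                abnormal in each subject.
    ordering:   permutation of event indices, earliest event first.
    Each subject is assumed to sit at an unknown stage k, with the first k
    events abnormal and the remainder still normal; stages are marginalized
    with a uniform prior.
    """
    p = p_abnormal[:, ordering]          # reorder biomarkers by the candidate ordering
    n_subjects, n_events = p.shape
    stage_liks = []
    for k in range(n_events + 1):        # stage k: events 0..k-1 abnormal
        lik_k = p[:, :k].prod(axis=1) * (1 - p[:, k:]).prod(axis=1)
        stage_liks.append(lik_k)
    subject_lik = np.mean(stage_liks, axis=0)  # uniform prior over stages
    return float(np.log(subject_lik + 1e-12).sum())

# Toy example with three biomarkers and an assumed ordering [0, 1, 2].
rng = np.random.default_rng(0)
p_abn = rng.uniform(size=(50, 3))
print(ebm_ordering_log_likelihood(p_abn, [0, 1, 2]))
```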

4.4 Data Harmonization

Data heterogeneity inducing bias arises from many sources, including differences in acquisition protocols, acquisition devices, or the populations under study. This is true even in large multicenter research studies such as ADNI, DIAN, and GENFI, although these studies use harmonized protocols for data acquisition and perform strict quality control. The problem becomes even worse when using clinical routine data, which are not acquired with harmonized protocols and can be of extremely variable quality.

Data heterogeneity may have various impacts on ML algorithms. For example, for any classification task, one wants to ensure that the model focuses on disease-related features rather than on differences caused by the acquisition sites. The ability to harmonize data from any origin is also critical for translating any analytical tool into clinical practice, where acquisition protocols are rarely standardized.

This is especially true for imaging biomarkers, where scans often contain so-called scanner signatures. As a result, images are often preprocessed prior to being used with machine learning algorithms. The preprocessing steps involve intensity normalization, in the form of image filtering for denoising or intensity histogram normalization. For example, Erus et al. [155] proposed a framework to achieve consistent segmentation of brain structures across multiple sites. Their approach relies on the creation of site-specific atlases while ensuring consistency between all available atlases. Their evaluation shows that it reduces the site-related variability of volumetric measurements derived from structural images, which are key to tracking the process of brain atrophy. Another example is the work of Jog et al. [156], who used image synthesis via contrast learning to harmonize images acquired with different pulse sequences. The Removal of Artificial Voxel Effect by Linear regression (RAVEL) approach by Fortin et al. [157] is another exemplar application of data harmonization. It consists of a voxel-wise intensity normalization technique, in which singular value decomposition (SVD) of the control voxels is applied to estimate factors of unwanted variation. The control voxels are those unaffected by the pathology, such as those in the cerebrospinal fluid (CSF). The unwanted factors are then estimated using linear regression for every voxel of the brain, and the residuals are taken as the RAVEL-corrected intensities. This model has since been further extended [158] to include the modeling of site-specific scaling factors on summary measures derived from the images. Using empirical Bayes to improve the estimation of the site parameters, this model (known as ComBat) can be used to correct several imaging modalities while adjusting for relevant clinical and demographic information. It was originally developed to correct gene expression microarray data [159] and was later extended to correct DTI maps [158], cortical thickness measurements [160], and structural MRI [161]. Additional extensions handle longitudinal data [162], site effects in the covariance [163], and nonlinear trajectories over the life span through a generalized additive model [164]. Prado et al. [165] proposed Dementia ConnEEGtom to harmonize neurophysiological data. They propose a whole analytical pipeline involving many steps, including denoising, artifact removal, and spatial normalization, to promote standardized processing of EEG data, thereby enabling their use in machine learning while minimizing the bias that could be induced by variability in data handling across sites.
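The simplest version of this kind of harmonization can be sketched as a regression that removes additive site effects from a derived measure while preserving covariates of interest, as below. The data are synthetic, and the approach is deliberately reduced: the full ComBat model additionally estimates site-specific scaling and applies empirical Bayes shrinkage across features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic hippocampal volumes (mL) from three sites with additive site
# offsets plus a genuine effect of age; values are illustrative only.
n = 300
site = rng.integers(0, 3, n)
age = rng.uniform(55, 85, n)
volume = 4.0 - 0.02 * (age - 55) + np.array([0.0, 0.3, -0.2])[site] + rng.normal(0, 0.1, n)

# Indicator variables for sites 1 and 2 (site 0 is the reference); fitting
# them jointly with age preserves the covariate of interest while the site
# coefficients capture scanner/protocol offsets.
dummies = (site[:, None] == np.array([1, 2])).astype(float)
X = np.column_stack([dummies, age])
model = LinearRegression().fit(X, volume)

# Remove only the estimated site contribution, aligning all sites to the reference.
harmonized = volume - dummies @ model.coef_[:2]

print(np.round([volume[site == s].mean() for s in range(3)], 2))      # site-biased means
print(np.round([harmonized[site == s].mean() for s in range(3)], 2))  # harmonized means
```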

Acquired clinical scales also need to be standardized and harmonized between centers, especially when used jointly in a single machine learning approach. Costa et al. [166] provide recommendations on how this should be achieved and on which scales should be acquired for the neuropsychological assessment of neurodegenerative diseases. Even when the same scales are used across different multicenter studies, the provenance and contextual information of each study must be considered, since these can introduce bias into the training of the models [167].
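As a deliberately simplistic illustration of scale harmonization, and not the procedure recommended by Costa et al. [166], one could express each scale as a z-score relative to the cognitively normal participants of the same center before pooling data across centers. The column and label names below are hypothetical.

```python
# Express each clinical scale as a within-center z-score relative to the
# cognitively normal ("CN") participants of that center, so that scores
# become roughly comparable across centers before model training.
import pandas as pd

def center_wise_zscore(df, scale_cols, center_col="center",
                       group_col="diagnosis", control_label="CN"):
    out = df.copy()
    out[scale_cols] = out[scale_cols].astype(float)
    for center, idx in df.groupby(center_col).groups.items():
        sub = df.loc[idx]
        controls = sub[sub[group_col] == control_label]
        mu, sd = controls[scale_cols].mean(), controls[scale_cols].std()
        out.loc[idx, scale_cols] = (sub[scale_cols] - mu) / sd
    return out
```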

5 What Is Next?

This chapter has illustrated that dementia is a complex, multifactorial, heterogeneous set of pathologies and syndromes, sometimes occurring in parallel. As a result, a wide variety of clinical, genetic, cognitive, imaging, and biofluid data have been collected to characterize these disease processes across a large number of different cohorts, both from individuals suffering from various forms of dementia and from those at risk of developing a form of dementia due to their genetic, environmental, or pathophysiological risk profile. Despite the wealth of dementia data available to machine learning researchers, there are still limitations, both in the data themselves and in the current thinking about how to apply machine learning, that must be addressed for machine learning to reach its true potential in terms of making an impact on these conditions. Key to this effort will be open science initiatives that allow for reproducibility and replication of results, so that their value can be demonstrated in independent cohorts.

Careful thought must be given to how to incorporate the myriad different data types that are available to researchers. Despite the wealth of multimodal information and big data available, much of the machine learning-based research in dementia has considered only a subset of the available information. This is the case even within a single community, such as imaging, where each modality or pulse sequence is most often analyzed individually. Machine learning approaches are, however, able to extract highly nonlinear relationships, which should enable the development of truly multi-data frameworks capable of capturing the complexity of these diseases. At the same time, when multiple features across different data acquisition domains are combined into a single analysis, particularly in individuals who already show evidence of impairment, there is a higher likelihood of missing or corrupted information. The simplest, and often the most widely implemented, solution to the problem of missing data is to perform complete-case analyses, discarding any observation for which any variable is missing. This practice squanders the time donated by patients, carers, and volunteers who support research through extensive and often onerous data collection, and it prevents the full potential of the resulting data from being realized. In some large studies, as much as 30% of the data may be discarded in this way, so further research in this direction could be hugely beneficial. Multiple imputation techniques, for example, could be used to address this issue and ensure that all available data are used.
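As a sketch of one such alternative to complete-case analysis, the following uses scikit-learn's IterativeImputer to generate several completed datasets, a MICE-like approximation of multiple imputation. The data are simulated, and the choice of imputer and number of imputations are illustrative only.

```python
# MICE-style alternative to discarding incomplete cases: repeat the
# imputation with different random seeds to obtain several completed
# datasets rather than dropping any row with a missing value.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))             # stand-in multimodal features
X[rng.random(X.shape) < 0.3] = np.nan     # roughly 30% missing values

imputations = [
    IterativeImputer(sample_posterior=True, random_state=seed).fit_transform(X)
    for seed in range(5)
]
# For illustration we simply average the completed datasets; in practice a
# model would be fit on each one and the estimates pooled (Rubin's rules).
X_completed = np.mean(imputations, axis=0)
```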

A related question is how to address the limitations of cross-sectional snapshots of data in a decades-long disease process, and how best to use relatively short-term follow-up data. While inference on cross-sectional measures would be ideal in terms of providing information to a patient expediently, many confounds and covariates can contribute added variability, making classification tasks less accurate, particularly in the early stages of the disease. Longitudinal data, particularly in imaging modalities, tend to reduce the influence of these confounds, and within-subject change can be more sensitive for determining the prognosis of individual trajectories. However, requiring longitudinal data for a classification task is undesirable for patients, who would receive no information until they return for additional testing in a year or two. Thus, machine learning research could investigate whether a hybrid approach might be more powerful: triaging first with cross-sectional data and requiring longitudinal data only in cases where inference cannot be made with a high degree of confidence at the baseline assessment.
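A minimal sketch of such a triage rule might look as follows; the model, data, and confidence threshold are entirely illustrative. The baseline prediction is acted upon only when the model is sufficiently confident, and longitudinal follow-up is requested otherwise.

```python
# Toy illustration of hybrid triage: decide from baseline data when the
# model is confident, defer to longitudinal follow-up otherwise.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def triage(baseline_model, x_baseline, threshold=0.9):
    """Return ('decide', probabilities) when the baseline model is confident,
    otherwise ('defer', probabilities) to request longitudinal follow-up."""
    prob = baseline_model.predict_proba(x_baseline.reshape(1, -1))[0]
    action = "decide" if prob.max() >= threshold else "defer"
    return action, prob

# Example usage with simulated baseline data and labels
rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 10)), rng.integers(0, 2, 100)
model = RandomForestClassifier(random_state=0).fit(X, y)
action, prob = triage(model, X[0])
```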

A second challenge is to extend machine learning approaches to datasets that are more reflective of standard clinical settings. This refers not only to the type of data that is collected but also to the conditions under which, and the populations in which, the data are acquired. Clinical research studies conducted at research institutions often include advanced data acquisitions that are costly and time-consuming, making them impractical to translate to wider communities and developing countries. There is thus a mismatch between the quality of the data acquired in research settings and that of the data acquired in day-to-day clinical environments. For example, only a subset of patients suspected of having dementia undergo medical imaging, and of this subset only a fraction are offered MRI scans. In the majority of cases, CT scans are acquired, mostly to rule out other causes of impairment, such as space-occupying lesions. As a result, a large number of CT scans are collected, and they could potentially be an important resource for developing computer-assisted tools able to reach a larger population [168]. Even when the same data type is acquired in clinical and research settings, there can be a considerable mismatch in data quality and homogeneity. For instance, clinical routine MRI is of extremely variable quality and is usually acquired using non-harmonized protocols. Similarly, with the current democratization of wearable and smartphone data collection, which lends itself to processing with machine learning, there are opportunities to develop novel frameworks assisting patients, carers, and clinicians.Footnote 13,Footnote 14 Another aspect of this challenge to widespread translation is that clinical research studies typically involve a disproportionate number of affluent Caucasian individuals of European descent, meaning that we do not yet have enough data to fully quantify the heterogeneity observed in other ethnic groups. While large-scale cohort studies are looking to address this issue, assembling training and testing sets that represent the diversity of the population will be an important element for improving machine learning performance in the future.

Effective partnerships with the clinicians who use the data are another key challenge for incorporating machine learning into a wider clinical setting [169, 170]. Clinicians must believe in the added value that novel machine learning approaches can provide in order to incorporate them into their clinical workup and decision-making. One approach to achieving this buy-in is a push toward "explainable AI," such that the results of machine learning algorithms make intuitive sense and clinicians can better understand how an algorithm came to its decision. While there are certainly concerns that the opaqueness of some algorithms could lead to overfitting or spurious results, an insistence on explainable AI may also restrict the development of better algorithms that provide more value; in some cases, it may reduce a complex multivariate pattern to an understandable summary measure, throwing away valuable information in the process. What is likely more important is that best practices are followed in terms of training, testing, model development, and validation, such that clinicians may not necessarily understand how an algorithm achieved a specific result but are convinced by the evidence of the value it provides.

The final, and likely most significant, challenge is how best to characterize heterogeneity and mixed pathology. Classification within well-characterized cohorts of clear-cut AD cases and normal controls provides little benefit, particularly given the rise of accurate, increasingly cheap, and accessible plasma tests. Dementia covers a wide range of symptoms caused by a myriad of pathologies and etiologies unfolding over a decades-long process, and correctly identifying the underlying disease will be critical for treatment planning as disease-modifying therapies become available. There is thus a natural inclination to subdivide and characterize a number of distinct, discrete disorders, which has led to a focus on machine learning algorithms that aid in differential diagnosis. However, there are often no clear boundaries between the phenotypic profiles of these disorders, and mixed pathologies are common. Therefore, machine learning should shift away from classifying between normal aging and a single disease, or from differential diagnosis tasks aimed at dichotomizing between two or more disorders, toward probabilistic frameworks that allow multiple pathologies to coexist. With the advances in big data analysis, access to clinical care data, and innovative machine learning, everything is in place for this shift to be achieved.
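One possible, and deliberately simple, way to frame such a framework is multi-label classification, in which each pathology receives its own probability rather than competing for a single mutually exclusive label. The sketch below uses simulated data and scikit-learn for illustration only; the pathology labels are hypothetical.

```python
# Multi-label framing: one independent probability per pathology (e.g., AD,
# vascular, Lewy body), so several pathologies can be flagged as co-existing,
# instead of forcing a single mutually exclusive diagnostic class.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))          # stand-in multimodal features
Y = rng.integers(0, 2, size=(500, 3))   # one column per pathology; rows may contain several 1s

model = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
# predict_proba returns one (n_samples, 2) array per pathology; keep P(present).
probs = np.column_stack([p[:, 1] for p in model.predict_proba(X[:5])])
print(np.round(probs, 2))  # each row: probability of each pathology being present
```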

6 Conclusion

Dementia in all its forms will be one of the biggest global health challenges of the coming decades. Improving our ability to characterize the disease at an early stage, and providing an accurate prognosis that allows doctors to devise effective treatment plans and individuals to make informed decisions about how to manage their affairs, will be critical to reducing the distress and burden experienced by people living with these diseases and by their families. Machine learning will need to play a key role in achieving these goals.