Introduction

Twenty-one years ago, a paper on Alzheimer’s disease (AD) stated that there is “growing evidence of genetic causes of AD” [8]. Before the 1990s, Parkinson’s disease (PD) was widely considered to have no genetic contribution. We have come a long way since then. In 2012 not only are there confirmed hereditary causes of AD, PD, frontotemporal lobar degeneration (FTLD), and amyotrophic lateral sclerosis (ALS), but additional genetic risk factors have also been identified for all these diseases. It took several years from the time tau was identified as the pathological hallmark aggregate in cases of FTLD for mutations in MAPT to be identified as the underlying genetic cause of disease. In contrast, with the advent of powerful new sequencing technologies at the exome or genome level, the current pace of genetic discoveries is very rapid. Many of the initial gene discoveries, made by either candidate gene or linkage analysis approaches, came from single or small research groups. Genome-wide association (GWA) studies of today, by comparison, require large consortia and collaborations to collect the large number of control and disease cases needed for success. AD, FTLD, ALS, and PD were once considered distinct and unrelated diseases, but we now know that they share many clinical, pathologic, and genetic characteristics. Dementia is not only a key feature of AD and FTLD, but is also seen in conjunction with PD and ALS. Frontotemporal dementia (FTD) can occur alone, and also in combination with motor neuron disease or Parkinsonism. FTLD and ALS, which share TDP-43 pathology and C9orf72 expansion mutations, are now considered to be part of a disease spectrum rather than two distinct diseases. Times have changed, the tools have changed, and our understanding of these diseases has improved greatly, leading us forward on the path to personalized medicine.

In this issue of Acta Neuropathologica, we present a cluster of articles that review the genetics and neuropathology of four neurodegenerative disorders: AD, PD, ALS, and FTLD. Each article in the series provides a summary of what is currently known about both Mendelian disease genes and susceptibility loci for each disease and correlates these genetic associations with clinical and neuropathologic features. Genetic discoveries for neurodegenerative diseases have been very helpful in delineating molecular mechanisms of disease. This knowledge in turn has had a huge impact in many areas of research and clinical medicine, and has spawned numerous clinical trials that provide a realistic hope for therapeutic advances in the not too distant future. Continued advances will not only require large consortia, but also the cooperation and collaboration of neuropathologists, geneticists, clinicians, basic scientists, and bioinformaticians as well as the support of patients, families and advocacy groups. However, to put these genetic advances into practice it is important to consider the significance and utility of the genetic information, the importance of the correct use of terminology and nomenclature, challenges in assessing the pathogenicity of identified variants, ethical issues that may arise with all this information, and resources that are available to access related information.

What is the significance of genetic information?

The identification of a Mendelian cause of disease in an affected patient can help confirm the diagnosis and provide information to family members about genetic risk that can be used for genetic counseling [2, 16, 17, 27]. Genetic counseling is a process by which patients and families are educated about a disease and its genetic risks, generally by a person trained in genetic counseling. Genetic counseling may include education about general genetics, the genetics of the patient’s disease, and associated inheritance patterns. A multi-generation family history and pedigree are obtained and risks to specific individuals in the family are discussed based on available information. Genetic testing options, including both research and clinical testing, are reviewed. Genetic counseling is most important for individuals with a family history of disease, but useful for all persons with a concern for genetic risk or with an interest in genetic testing.

While AD, PD, FTLD, and ALS each have unique genetic characteristics relating to counseling, penetrance, and prognosis, they also share many elements such as a predominant pattern of autosomal dominant inheritance with adult onset [2, 14, 16, 17, 27]. Presymptomatic testing, that is, genetic testing of at-risk asymptomatic adult family members, is only an option if a mutation has previously been identified in an affected family member. In the context of presymptomatic testing of individuals at risk, genetic counseling before and after testing is imperative because of the real and potential risks for psychological impact [14, 17]. A limitation of presymptomatic testing for many neurodegenerative disease genes is the inability to accurately predict whether, when, and how disease will manifest, owing to variable clinical presentations, ages of onset, and penetrance. Nevertheless, some patients and family members may decide to pursue genetic testing to reduce uncertainty and to aid in making life plans [27]. In some cases, preimplantation genetic diagnosis can be performed in conjunction with in vitro fertilization for family planning purposes. In the United States, testing of human samples for clinical use, including genetic testing, must be done in a laboratory that is certified under the Clinical Laboratory Improvement Amendments of 1988 (CLIA’88). This requirement is one factor that distinguishes clinical genetic testing from research testing. Clinical testing in a CLIA-certified lab is available for nearly all Mendelian disease genes for AD, PD, FTLD, and ALS in academic and/or commercial laboratories (Table 1).

Table 1 Online resources with relevance to the genetics of neurodegenerative disease

Identification of a mutation in a neurodegenerative disease-associated gene is of particular importance for research as it enables neuropathologists and clinical researchers to perform correlations with pathology, clinical features, biomarkers, and imaging. One goal of such studies is to develop methods to enable an earlier diagnosis that is accurate, more sensitive, and more specific. In addition, knowledge about molecular pathogenesis has led to the development of animal models of disease and has opened the path toward discovery of molecularly targeted therapies. Many patients not interested in clinical testing are interested in participating in research testing in order to help others. This may include providing a DNA sample, plasma, CSF, imaging, neurological and neuropsychological testing, and enrolling in an autopsy program. A well-characterized patient cohort with genetic information can be used to help better understand disease pathogenesis and progression as well as help to select patients for clinical trials [5]. Clinical trials for treatment and prevention are most advanced for AD. While autosomal dominantly inherited AD may not be identical to non-hereditary forms it can serve as a model with a ready source of presymptomatic individuals. Such a population of patients is being studied longitudinally in the international Dominantly Inherited Alzheimer’s Network (DIAN) [4]. The ability to predict individuals at risk of disease with genetics enables the design of clinical trials for interventions aimed at preventing or delaying the onset of cognitive decline.

Non-Mendelian risk factors identified from GWA studies in neurodegenerative diseases do not yet have clinical relevance, except for APOE in AD. APOE genotype explains more of the population-based risk than all other identified risk factors combined and is becoming an important factor in clinical trials and determination of prognosis. The contribution of the non-APOE risk factor genotypes to disease diagnosis and prediction is negligible. This is in part because GWA studies identify common variations that tag some nearby or linked gene with a biological effect without specifically identifying what that effect actually is. Certainly the nearby genes are suspects for some biological relevance, but their specific role must be proven through further research. Nevertheless, these “hits” are exciting because they open the door to new discoveries and the prospect of finding new disease mechanisms, pathways, and drug targets. The initial risk gene discoveries are based on factors that distinguish between disease and control cases; the next phase of discovery may be to correlate genetic risk factors with neuropathology, disease phenotypes, imaging, and CSF or plasma biomarkers. For example, an international collaboration is underway to use existing GWA data for FTLD and ALS to help elucidate genetic risk factors underlying the variable clinical phenotype of C9orf72 mutations, which cause both FTLD and ALS.

A cautionary note about using the term “sporadic”

The assessment of family history is an important component of genetic counseling. On an elementary level, a sporadic case is one in which no family history is evident. This could be due to true absence of any family members with disease. Alternatively, a family history of disease may not be evident because the family is small or uninformative (for example, owing to early death of parents, misattributed parentage, or unrecognized adoption), because variable phenotypes are not recognized as being related to the proband’s condition, or because of variable penetrance of disease. In the literature, an autosomal dominant disease-associated mutation found in a “sporadic” case is often simply referred to as a sporadic mutation. But this mutation had to come from somewhere: it is either a de novo mutation, or it was present in a parent but went unrecognized, either because penetrance was decreased so that no symptoms were evident or because information about the status of one or both parents was lacking [7]. If the parents are available this can be assessed definitively by testing them. If the mutation is absent in both (and parentage testing confirms the relationship) then it is a true de novo or sporadic mutation. It is a completely different situation if the mutation is present in an unaffected parent, such that the source of the “sporadic” mutation is lack of penetrance. Unfortunately, when evaluating adult-onset neurodegenerative disease cases, one or both parents are frequently unavailable. Nevertheless, testing both parents is an ideal to strive for if possible. A solution is to limit the use of the term “sporadic” and alternatively use “apparent sporadic” if the true nature of the source of the mutation has not been confirmed. From a counseling perspective, distinguishing these possibilities is very important, although children of a person with the mutation are at 50 % risk regardless of whether the mutation is de novo or inherited.
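The reasoning in this section can be summarized as a small decision rule. The following Python sketch is illustrative only: the function name, its parameters, and the returned labels are hypothetical, and it simply encodes the three situations described above (confirmed de novo, inherited from an unaffected carrier parent, and unresolved origin).

```python
from typing import Optional


def classify_mutation_origin(
    mother_carries: Optional[bool],
    father_carries: Optional[bool],
    parentage_confirmed: bool = False,
) -> str:
    """Label the origin of a dominant mutation found in a case without family history.

    Use None for a parent who could not be tested. All names and labels here
    are hypothetical illustrations of the reasoning in the text.
    """
    # Mutation found in a parent: the case only looked sporadic
    # (e.g. reduced penetrance or missing clinical information).
    if mother_carries or father_carries:
        return "inherited"
    # Absent in both tested parents AND parentage confirmed: true de novo.
    if mother_carries is False and father_carries is False and parentage_confirmed:
        return "de novo"
    # Anything else (untested parent, unconfirmed parentage) stays unresolved.
    return "apparent sporadic"


print(classify_mutation_origin(False, False, parentage_confirmed=True))
print(classify_mutation_origin(None, False))
```

Under these assumptions, a case is only labeled “de novo” when both parents test negative and parentage is confirmed; every unresolved combination falls back to “apparent sporadic”.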

Is that variant pathogenic?

Finding and publishing new genes for neurodegenerative diseases has certainly helped advance our understanding of disease pathogenesis. However, in some cases, genes or gene mutations may be misclassified as disease-associated or pathogenic when in fact they are neither. This can cause confusion and potentially incorrect conclusions about mechanisms of disease or genotype-phenotype correlations. The first time a gene mutation is identified in association with a disease, the bar is frequently set quite high for evidence to prove that the variant(s) identified does indeed cause the disease. After a gene has been associated with a condition, new variants may continue to be identified in that gene. In these cases, less evidence for pathogenicity is often accepted in publications because the gene has already been associated with the disease or a related disease. The level of scrutiny depends on the journal and on how much the reviewers and authors know about genetics. Case reports with little supporting evidence for pathogenicity may be published, and as a result a variant that may or may not be pathogenic is classified as pathogenic. This finding may then become dissociated from the evidence and be perpetuated in subsequent publications as a reference, without anyone going back to the primary source to confirm it.

There are examples of this for many diseases and genes. In cystic fibrosis (CF), a recessive disease, the p.I148T mutation was initially found in CF patients and thought to be pathogenic. The p.I148T mutation was included in a panel for population-based carrier screening for CF composed of mutations with allele frequencies >0.1 %, as recommended by several organizations in 2001. After several years of population screening of numerous unaffected individuals, it was noted that the variant was found at a much higher rate than expected and was found along with other pathogenic mutations in asymptomatic individuals. Upon further investigation, the true CF mutation was found to be an in-frame deletion in cis with p.I148T on some chromosomes [28]. Until this was discovered, however, population-based CF carrier testing had been performed for about 3 years (2001–2004), meaning that a large number of individuals, mostly women, were classified as CF carriers when in fact only some of them were actually carriers. The practice of genetic testing laboratories and manufacturers of assays took even longer to change after the published testing guidelines were changed. This example demonstrates the potential societal impact of an incorrect assignment of pathogenicity. For presymptomatic testing in adult-onset autosomal dominant diseases the consequences would be devastating.

Non-pathogenic variants or risk factors for neurodegenerative diseases have also been mischaracterized. For example, the PSEN1 p.E318G variant was initially classified as a disease-associated mutation for early-onset AD; subsequently, however, it was identified in controls and late-onset AD cases and was later reclassified as a benign variant or at most a risk factor [3]. For neurodegenerative diseases, proof of pathogenicity can be difficult to achieve for two reasons. Older generations are frequently not available for testing to confirm segregation of a variant with disease. Unaffected individuals in younger generations, meanwhile, are typically uninformative, since there is no way to know if an unaffected individual with a mutation will be affected in the future, especially given the wide ranges of onset for some diseases even in the same family.

A variety of guidelines are available to help classify the pathogenicity of variants. Supporting evidence may come from studies of a large number of ethnically matched controls, segregation in multiple affected family members, presence of the variant in other cases, and in vitro testing [9, 21]. In silico models evaluate the likelihood of an effect on protein structure and function based on structural models, conservation across species, and splicing predictors [32]. These models help determine the likelihood of an effect, but are not definitive proof. Until enough evidence accumulates, caution should be urged in interpreting new variants as pathogenic or benign [31]. This is particularly important because a misclassified variant can lead to a wrong conclusion about disease mechanism or phenotype.

A published algorithm for evaluation of variants for AD, which can serve as a model for other neurodegenerative diseases, defined the categories of “definite pathogenicity,” “probable pathogenicity,” and “possible pathogenicity” based on the level of evidence [18]. If limited evidence is available, the term variant of uncertain significance (often abbreviated “VOUS”) is used to convey a level of uncertainty until further proof is obtained. In some cases a variant may be classified as a risk factor if some, but not all, lines of evidence suggest that the mutation may have some association with disease. Efforts are underway in the neurodegenerative disease community to “weed Mendel’s garden” and identify and reclassify dubious associations [12]. This will continue to be an important endeavor as huge amounts of sequencing data reveal a large number of novel or private variants in each individual [24]. Along these same lines, as population-based sequence data are generated it will be important to curate databases with clinical phenotype and age information. In particular, sequence information from elderly normal individuals will be important for the assessment of variants identified in studies of neurodegenerative diseases.

What if a mutation pathogenic for one disease is identified as a risk factor for another?

This is exactly the situation with glucocerebrosidase (GBA) mutations and PD. Mutations in GBA cause Gaucher disease (GD), an autosomal recessive lysosomal storage disease with variable phenotypes ranging from a perinatal-lethal disorder to an asymptomatic form. Recently, mutations in GBA have also been associated with about a fivefold increased risk of PD. This raises important ethical questions for genetic testing for GBA mutations, which are often included in the Ashkenazi Jewish carrier screening panels commonly offered by laboratories. When a clinician orders a GBA mutation test for GD screening, what are the clinician’s responsibilities in informing the patient about the risk of PD (a direct, albeit low, risk to the person tested rather than a carrier risk) [13]? As GWA studies and next generation sequencing methods identify new genetic associations, the waters may become muddied about how to deal with the information from a patient and population perspective.

Why should we care about nomenclature?

With all these new variations being identified and studied, it is clearer than ever that written communication between researchers is important for scientific progress. For communication of genetic information, the language used is called nomenclature. There is specific nomenclature for gene names as well as for genetic variation and mutations. For gene names, the Human Genome Organisation (HUGO) Gene Nomenclature Committee (HGNC) assigns standardized nomenclature for human genes [29]. Each gene has a unique short symbol and a longer descriptive name that is listed in an online repository (Table 1). Standardized nomenclature is important for unambiguous communication in scientific journals and online databases. In many cases, the standardized nomenclature is determined after one or more seminal publications have appeared. This sometimes leads to multiple versions of nomenclature in the literature, depending on how long the non-standard or legacy term was used before the new system was implemented. Some gene names are well established in the basic science and clinical lexicon, like HER2/neu, which is now known as ERBB2 but in 2012 still appears frequently in the literature as HER2/neu. An example in neurodegenerative diseases is progranulin, which initially was published as PGRN, but the standardized nomenclature selected by HGNC is GRN. The genenames.org website lists the standard short and long names, as well as all previous names and synonyms (Table 1).

Variations or mutations in genes are perhaps even more susceptible to miscommunication than gene names because (1) a protein change does not define the specific underlying nucleotide change, (2) it is hard to know for sure which nucleotide is being referred to if different reference sequences are used or the reference is unknown, and (3) an A at position 37 could be an adenine at nucleotide 37 or an alanine at codon 37. Fortunately, there are guidelines to follow in naming variations that are maintained by the Human Genome Variation Society (HGVS) [11]. Standardized nomenclature guidelines are provided at the DNA, RNA, and protein level. Nevertheless, there is variability in the nomenclature, particularly for mutations that were discovered and published before standardized nomenclature was available. Such is the case for two genes associated with neurodegenerative diseases: SOD1 for ALS and GBA, a risk factor for PD and disease-causing gene for Gaucher disease. In both cases, mutations in the initial reports were named based on the post-translationally processed protein: after removal of methionine 1 for SOD1 and after removal of 39 amino acids for GBA. This can lead to confusion for a lab that wants to test for one of these mutations. The common A4V SOD1 mutation is correctly described as p.Ala5Val, and N370S in GBA is correctly referred to as p.Asn409Ser. Making the switch to new nomenclature is not trivial, as many labs have procedures, databases, labeled reagents, and other places where these names may appear. This is particularly true for the common N370S mutation since it is pervasive in the genetics literature. Furthermore, while describing the protein effect (if known) is important to get a better sense of the potential functional consequences of a mutation, it is equally important, at least once per publication, to describe the mutation at the nucleotide level, whenever possible using a RefSeqGene ID as the reference sequence. For genomic level nomenclature, the reference sequence and version are both important.
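The renumbering described above is simple arithmetic: the legacy position is shifted by the number of residues that were missing from the processed protein (1 for SOD1, 39 for GBA). The following Python sketch illustrates this for simple missense changes only. The function name and the offset table are hypothetical conveniences, not part of any HGVS tool, and any converted name should still be checked against the appropriate reference sequence.

```python
import re

# One-letter to three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Residues absent from the legacy (processed-protein) numbering, as stated
# in the text: methionine 1 for SOD1 and 39 amino acids for GBA.
LEGACY_OFFSET = {"SOD1": 1, "GBA": 39}


def legacy_to_hgvs(gene: str, legacy: str) -> str:
    """Convert a legacy missense name (e.g. 'A4V') to HGVS-style protein notation."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", legacy)
    if match is None:
        raise ValueError(f"not a simple missense name: {legacy!r}")
    ref, position, alt = match.groups()
    if ref not in THREE_LETTER or alt not in THREE_LETTER:
        raise ValueError(f"unknown amino acid code in {legacy!r}")
    if gene not in LEGACY_OFFSET:
        raise ValueError(f"no legacy offset recorded for gene {gene!r}")
    # Shift the position to count from the initiator methionine.
    new_position = int(position) + LEGACY_OFFSET[gene]
    return f"p.{THREE_LETTER[ref]}{new_position}{THREE_LETTER[alt]}"


print(legacy_to_hgvs("SOD1", "A4V"))   # p.Ala5Val
print(legacy_to_hgvs("GBA", "N370S"))  # p.Asn409Ser
```

A lookup of this kind only handles the protein-level name; it does not recover the underlying nucleotide change, which still has to be stated separately against a versioned reference sequence.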

One final point of potential confusion concerns exon numbering. As a result of standardization of exon numbering in the NCBI database, the exon numbers of some genes differ from those found in the literature. GRN is a perfect example. The original publications started the transcribed region with exon 0. Currently, however, in the NCBI database this is listed as exon 1, meaning that all subsequent exons are offset by one from the original publication. As a general rule it is best to switch to current standard nomenclature; however, this can be difficult for laboratories where all the primers and databases include old nomenclature. Awareness and vigilance are the best approach for addressing all types of nomenclature issues.
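For a laboratory that must keep legacy primer names while reporting in current nomenclature, the exon offset can be made explicit. The following Python sketch is a minimal illustration; the function name and its `first_legacy_exon` parameter are hypothetical, and it assumes only what is stated above, namely that the legacy GRN numbering starts at exon 0 while the NCBI numbering starts at exon 1.

```python
def legacy_exon_to_ncbi(legacy_exon: int, first_legacy_exon: int = 0) -> int:
    """Map a legacy exon number to NCBI numbering, which starts at exon 1.

    first_legacy_exon is the number the original publications gave to the
    first transcribed exon (0 for GRN, per the text).
    """
    if legacy_exon < first_legacy_exon:
        raise ValueError("legacy exon number precedes the first legacy exon")
    return legacy_exon - first_legacy_exon + 1


# GRN: the exon published as "exon 0" is listed as exon 1 by NCBI,
# and every later exon is shifted by one.
for legacy in (0, 1, 5):
    print(f"legacy exon {legacy} -> NCBI exon {legacy_exon_to_ncbi(legacy)}")
```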

Where can I look up information related to genetic variation?

A variety of curated databases and resources related to the genetics of neurodegenerative disease are available online and are listed in Table 1. For AD and FTLD, PD, and ALS there are mutation databases that curate published pathogenic or coding sequence mutations primarily from the literature [1, 10, 25]. In addition, for AD, ALS, and PD there are also online resources that collect and summarize GWA study findings. These mutation databases are complementary to dbSNP, which catalogs a large number of non-coding variations (for example, in introns and intergenic regions) in addition to some clinically relevant variants. It is hard to generalize about the use of standardized nomenclature across all disease-specific databases; therefore, the user is advised to check that HGVS nomenclature is provided, at least in addition to legacy nomenclature. In addition to disease or locus-specific databases, the NCBI maintains a well-annotated list of variations for some genes in the Variation Viewer database. Generally, mutation and variant information may be given at the level of the coding nucleotide, protein, and gene. Clinically relevant information about some genetic diseases is available in the form of GeneReviews [26]. Laboratories that offer genetic testing for specific genes may be listed in the NCBI Genetic Testing Registry.

What is in store for the future?

No new Mendelian genes have been discovered for AD since PSEN2 in 1995. Does this mean that no others exist? This is unlikely since some familial forms of AD are not explained by mutations in APP, PSEN1, or PSEN2. One possibility is that, as exemplified by the discovery of C9orf72 as a gene for ALS and FTLD, the types of mutations identified are directly tied to the method used to identify them. Repeat expansions, as in the trinucleotide repeat diseases and C9orf72-associated FTLD and ALS, serve as examples of the need to think beyond small mutations that can be detected by sequencing. The availability of technologies for copy number variation analysis and other genomic technologies may contribute to the identification of new disease genes and mechanisms. The frequency and breadth of mutations in known genes may also increase as whole exome and genome technologies are increasingly applied. Next generation sequencing will also be used to identify rare large-effect alleles, whereas GWA studies have been designed only to identify common small-effect variants.

Not all genetic associations that have been identified are risk factors for disease. Recently, an APP variant which may actually reduce the risk for AD was identified by genome sequencing in an Icelandic population [20]. Levels of β-amyloid protein are reduced about 40 % in individuals with this APP variant possibly by interfering with the ability of β-secretase to cleave APP. The finding that lower β-amyloid levels may be associated with a lower risk for AD is a proof of principle finding that can help invigorate drug development and emphasizes the need to continue studying the genome.

As genetic and environmental risk factors for neurodegenerative diseases are better studied and combined with biomarker data, new models for diagnosis and prognosis will emerge. Further, advances in technology and bioinformatics capabilities will enable the use of these models in clinical medicine for diagnosis and therapeutic decision making. Multimodal assessments of disease will enable appropriate selection of patients for clinical trials to delay the onset or halt the progression of disease. While neuropathology is a critical tool for studying and understanding why and how diseases occur and progress, for the patient, diagnosis at autopsy is too late. Therefore, for the future, it is hoped that what we learn from correlating neuropathology with genetics, combined with clinical data and endophenotypes, will enable personalized medicine for neurodegenerative diseases based on risks and prognosis.