Introduction

Alzheimer disease (AD) is the most common neurodegenerative disease and the predominant cause of dementia worldwide. Up to 10 % of AD patients are diagnosed with early-onset AD (EOAD), manifesting symptoms before the age of 65 years [18]. EOAD has a very strong genetic component, with a heritability estimate of 92–100 % [53]. A positive family history of AD is present in 35–60 % of EOAD patients, with up to 15 % of familial EOAD cases showing autosomal dominant inheritance [8, 19]. However, mutations in the known causal genes, encoding Amyloid Precursor Protein (APP), Presenilin 1 and 2 (PSEN1 and PSEN2), explain only 5–10 % of EOAD cases [4, 19, 53].

Rare variants in the sortilin-related receptor (SORL1) gene have been shown to contribute to early-onset as well as late-onset familial AD [35, 37, 48]. SORL1 was originally identified as a risk gene for AD in a candidate-gene based association study [42]. Early replication studies showed discrepant findings, possibly due to allelic heterogeneity, locus heterogeneity or lack of statistical power due to small cohort size. Nonetheless, the association was subsequently confirmed in meta-analyses [20, 24, 39, 50] and in genome-wide association studies (GWAS) including Korean, Japanese and Caucasian individuals [24, 33, 39]. The protein encoded by SORL1 is a type-1 transmembrane, mosaic protein showing homology to the vacuolar protein sorting 10 (Vps10p) family, and the lipoprotein receptor-related proteins (LRP) [52]. The SORL1 protein is unique among the Vps10p-family proteins as it contains additional ligand-binding structures within the LRP domains, including a β-propeller domain, a low-density lipoprotein receptor class A domain, and a fibronectin type-3 domain [2, 16]. The SORL1 protein interacts directly with the APP protein through its complement-type repeats within the low-density lipoprotein receptor class A domain [1, 2], and via a six amino acid-stretching FANSHY motif located in the cytoplasmic tail of SORL1 [14]. Interaction with APP results in sequestering of APP away from the secretase cleavage route, inhibiting formation of the amyloid-β (Aβ) peptide [2, 14, 32, 36]. Functional characterization of downstream effects of variants identified in familial early-onset and late-onset AD patients elucidated a protective role for SORL1 in the amyloidogenic pathway. Investigation of the functional implications of the familial variant, p.Gly511Arg, showed disrupted interaction of the Vps10p domain with amyloid-β monomers, resulting in reduced lysosomal targeting of Aβ peptide by SORL1 [7].
Two additional rare variants, p.Glu270Lys and p.Thr947Met, were reported in familial late-onset AD patients of Caribbean-Hispanic origin. Both increased Aβ1-40 and Aβ1-42 secretion, and APP levels at the cell surface in transfected cell lines [48].

In this study, we investigated the contribution of genetic variants in the SORL1 coding region to the occurrence of AD in pan-European cohorts of 1255 early-onset AD patients and 1938 age-matched non-affected control individuals.

Materials and methods

Study population

The cohort under study consisted of 1255 EOAD patients originating from Flanders-Belgium (n = 312), Spain (n = 342), Portugal (n = 106), Italy (n = 205), Sweden (n = 183), Germany (n = 100), and Czech Republic (n = 7), and 1938 age-matched European control individuals originating from Flanders-Belgium (n = 748), Spain (n = 306), Portugal (n = 130), Italy (n = 444), Sweden (n = 303), and Czech Republic (n = 7) (supplementary table 1a). An additional set of patients (n = 30), from the same source population, carrying a known pathogenic mutation in APP, PSEN1 or PSEN2, were not included in the study cohort, but used for comparison of clinical characteristics. Mean onset age of the patient cohort was 59.0 ± 6.2 years. Mean age at inclusion for the control cohort was 66.4 ± 9.8 years. In both the patient and the control cohort, 60 % was female. In the patient cohort, information on familial history of AD was available for 759 (60 %) individuals. A positive familial history (defined as presence of at least one first-degree relative with AD) was present for 327 (43 %) individuals, while 432 (57 %) individuals were considered sporadic patients. DNA and medical/demographic information on patients and control individuals from Spain, Portugal, Italy, Sweden, Germany, and Czech Republic was ascertained through the EU EOD consortium as previously described (details are provided in supplementary table 1b) [5, 11, 45, 46]. Consensus diagnosis of possible, probable or definite AD was given according to the National Institute of Neurological and Communicative Disorders and Stroke-Alzheimer Disease and Related Disorders Association (NINCDS-ADRDA) [29] and/or the National Institute on Aging-Alzheimer’s Association (NIA-AA) diagnostic criteria [17, 30]. Belgian patients were ascertained at the memory clinics of Middelheim and Hoge Beuken, Hospital Network Antwerp (ZNA), Antwerp [12], and the University Hospitals of Leuven (UHL), Leuven [30]. 
Belgian control individuals were either recruited from partners of patients and screened for neurological or psychiatric antecedents or neurological complaints or organic disease involving the central nervous system, or community-recruited control individuals who were included after interview concerning medical and familial history and cognitive screening by means of the Mini Mental State Examination (MMSE > 26) [15].

SORL1 sequencing

Sequencing of SORL1 exons 2–48, and at least 15 nt of each exon–intron flanking region, was performed by target enrichment using MASTR technology (Multiplicom, Niel, Belgium). PCR primers flanking each target region were designed using mPCR software (Multiplicom, Niel, Belgium). Target region size for amplification was set at 500 nt. In total, all target regions were covered by 46 amplicons in nine multiplex PCR reactions. Subsequent indexing and sequencing was performed with extension of target-specific primer sequences with universal tag sequences (5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-Fwd and 5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-Rev). Optimal annealing temperature and relative amounts of PCR primers for all targets were established for uniform amplification of each target in the multiplex reaction. Multiplex PCR reactions were performed on 20 ng genomiphied DNA (Illustra GenomiPhi V2; Thermo Fisher, MA, USA). Amplification quality and efficacy were verified by fragment analysis on an ABI 3730 automated sequencer (Applied Biosystems, CA, USA). Subsequently, multiplex PCR amplicons of each individual were pooled to obtain equimolar concentrations of all amplicons. Library purification was performed with AMPureXP beads (Beckman Coulter, CA, USA). Amplicon-specific barcodes (Nextera XT; Illumina, CA, USA) were incorporated in a universal PCR step on the pooled libraries. Barcoded samples were subjected to bridge amplification and bead purification prior to sequencing. Sequencing was performed on the Illumina MiSeq platform, using the Illumina reagent kit v2, generating 2 × 250 bp paired-end reads. Trimming of Illumina adapters from raw sequencing Fastq files was performed by Fastq-mcf. Reads were aligned to the complete reference genome hg19 using the Burrows–Wheeler Aligner [26]. Variant calling was performed using GATK version 2.2 [28], and variants were annotated using the Genomecomb software pipeline [40].
Variants with a read depth below 20 reads or with an imbalanced reference/variant allele read depth exceeding 3:1 were considered false calls. All remaining variants with predicted effect on protein sequence were included in subsequent manual read inspection using the Integrative Genomics Viewer software [41]. In total, 92 % of the SORL1 target sequence was sequenced at >20× read depth for all included individuals.
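
The two quality filters described above can be expressed as a short sketch. This is not the authors' pipeline code: the `VariantCall` record and the function name are hypothetical, total depth is assumed to be the sum of reference and variant reads, and the 3:1 imbalance rule is assumed to apply in either direction to heterozygous calls.

```python
from dataclasses import dataclass

MIN_DEPTH = 20          # calls covered by fewer than 20 reads are rejected
MAX_ALLELE_RATIO = 3.0  # allele read-depth imbalance beyond 3:1 is rejected


@dataclass
class VariantCall:
    """Hypothetical record holding per-allele read counts for one call."""
    name: str
    ref_reads: int
    alt_reads: int


def passes_filters(call: VariantCall) -> bool:
    """Return True if the call survives the depth and allele-balance filters."""
    depth = call.ref_reads + call.alt_reads  # assumption: depth = ref + alt
    if depth < MIN_DEPTH:
        return False
    low, high = sorted((call.ref_reads, call.alt_reads))
    if low == 0:
        # Assumption: a call with no reads for one allele counts as imbalanced.
        return False
    return high / low <= MAX_ALLELE_RATIO


# Illustrative read counts (invented, not taken from the study).
calls = [
    VariantCall("balanced", 30, 25),
    VariantCall("low_depth", 10, 5),
    VariantCall("imbalanced", 80, 20),
]
kept = [c.name for c in calls if passes_filters(c)]
```

Calls that pass such a filter would still go on to manual read inspection, as described in the text.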

Due to high GC content (74 %), SORL1 exon 1 was sequenced using simplex PCR amplification followed by Sanger sequencing using the BigDye termination cycle sequencing kit v3.1 on the ABI 3730 DNA Analyzer. Sequences were analyzed using Seqman (DNAstar, WI, USA) and NovoSNP software [51]. Rare variant validation was performed on genomic DNA by Sanger sequencing, as was segregation analysis of variant p.Tyr1816Cys. Variant position on genomic level was based on Genbank accession number NC_000011.9, transcript position was based on NM_003105.5, and protein-level position on NP_003096.1.

In silico prediction

Putative pathogenic effects of coding SORL1 variants were predicted using Polymorphism Phenotyping software version 2 (PolyPhen2, http://genetics.bwh.harvard.edu/pph2/), Sorting Intolerant from Tolerant (SIFT, http://sift.jcvi.org), SIFTindel for frameshift variants (http://sift-dna.org/), and MutationTaster (http://mutationtaster.org/). Identified variants were checked for previous reports by comparison against public databases, including the Database of Single Nucleotide Polymorphisms 141 (http://www.ncbi.nlm.nih.gov/SNP/), the Exome Variant Server (http://evs.gs.washington.edu/EVS/), the International HapMap Project (http://hapmap.ncbi.nlm.nih.gov/), the 1000 Genomes Project (http://www.1000genomes.org/), and the Exome Aggregation Consortium database (http://exac.broadinstitute.org/). For missense variants located within the Vps10p domain, protein stability predictions were performed using the FoldX free energy prediction tool [47], implemented within the YASARA molecular graphics suite (http://www.yasara.org/).

RNA sequencing

RNA sequencing data were generated for the p.Gly447Argfs*22 frameshift variant carrier. Total RNA was isolated from Epstein–Barr virus immortalized lymphoblast cells derived from whole blood lymphocytes. RNA isolation was performed using 1.0 × 10⁷ lymphoblast cells with the RNeasy mini kit (Qiagen Inc., Valencia, CA, USA) according to manufacturer’s protocol. Depletion of genomic DNA from the RNA sample was performed by turboDNase treatment (Life Technologies, Carlsbad, CA, USA). RNA quality control to determine RNA concentration and RIN value was performed using the Agilent Technologies 2100 Bioanalyzer. RIN value was measured at 9.3 in a concentration of 86 ng/µl total RNA. The sequencing library was constructed using TruSeq stranded mRNA Library Prep Reagent Set (Illumina, San Diego, CA, USA). Library preparation was performed using 1 mg total RNA and included poly-A selected RNA extraction, RNA fragmentation, and random-hexamer-primed reverse transcription. Sequencing of prepared libraries was performed using an Illumina HiSeq 2000 sequencer, generating 126,949,218 101-nucleotide paired-end sequence reads. Data analysis was performed using an in-house developed processing pipeline. Removal of read adapters and trimming of read ends was performed using Trimmomatic [3]. Trimmed reads were mapped against the UCSC human reference genome hg19 [43] using the Bowtie short read aligner integrated in Tophat2 [21]. Post-alignment QC and filtering of mapped reads was performed with RSeQC [49]. Variant calling was performed by employing GATK [28], VarScan [23] and VEP [31] software.

Nonsense-mediated mRNA decay

Nonsense-mediated mRNA decay (NMD) was investigated for p.Gly447Argfs*22. NMD was inhibited in Epstein–Barr virus immortalized lymphoblast cell lines derived from the p.Gly447Argfs*22 carrier and two non-carrier controls with 150 μg/mL cycloheximide (CHX; Sigma, St Louis, MO, USA) at 37 °C for 4 h, as previously described [10]. After incubation, RNA was isolated using the RNeasy mini kit. Depletion of genomic DNA from the RNA sample was performed by turboDNase treatment. Subsequent cDNA synthesis was performed using the SuperScript III first-strand cDNA kit with oligo(dT) and random hexamer primers (Life Technologies, Carlsbad, CA, USA). Real-time quantitative PCR was performed to investigate the effect of p.Gly447Argfs*22 on SORL1 expression using SYBR Green technology (Life Technologies, Carlsbad, CA, USA). SORL1 expression levels were measured in triplicate, with three measurements per experiment in two separate experiments. Expression of SORL1 in untreated lymphoblast cells was quantified and analyzed with qBasePlus (Biogazelle, Ghent, Belgium). Effect of CHX incubation on SORL1 expression in the p.Gly447Argfs*22 carrier and two non-carrier controls was quantified using the \(2^{-\Delta\Delta \text{C}_{\text{T}}}\) (Livak) method [27].
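
As an illustration of the Livak calculation named above, the following minimal sketch computes the fold change of a target gene after treatment relative to the untreated sample. The function name and the Ct values are invented for illustration and are not measurements from this study; the method also assumes roughly equal, near-100 % amplification efficiency for the target and reference genes.

```python
def livak_fold_change(ct_target_treated: float,
                      ct_ref_treated: float,
                      ct_target_untreated: float,
                      ct_ref_untreated: float) -> float:
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per condition
    ddCt = dCt(treated) - dCt(untreated)
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_untreated = ct_target_untreated - ct_ref_untreated
    dd_ct = d_ct_treated - d_ct_untreated
    return 2.0 ** (-dd_ct)


# Invented Ct values: the target crosses threshold one cycle earlier after
# treatment while the reference gene is unchanged, i.e. a two-fold increase.
fold = livak_fold_change(
    ct_target_treated=24.0, ct_ref_treated=20.0,
    ct_target_untreated=25.0, ct_ref_untreated=20.0,
)
```

A fold change above 1 after CHX treatment means that more transcript is detected once NMD is blocked.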

Statistical analysis

Low-frequency (MAF between 0.01 and 0.05) and common (MAF ≥ 0.05) SORL1 coding variants were tested for deviations from Hardy–Weinberg equilibrium using PLINK [38]. Allele frequencies of common and low-frequency variants in patients and controls were compared by χ² statistics. Odds ratios and 95 % confidence intervals were calculated by logistic regression modeling, corrected for gender and APOE ε4 allele carrier status using PLINK. Nominal p values were corrected for the number of variants tested using Bonferroni correction. SORL1 variants with MAF <0.01 were included in rare variant burden analysis for individuals originating from Spain, Italy, Portugal, Sweden, and Belgium. Individuals originating from Czech Republic (7 patients, 7 controls) and Germany (100 patients, 0 controls) were excluded from the analysis based on cohort size. Rare variant burden analysis was performed by collapsing alleles of all rare coding variants across the full SORL1 coding sequence or separately for each functional protein domain using an optimized sequence kernel association test (SKAT-O test), adjusted for sample size <2000. Rare variant association tests were performed using the R package SeqMeta [9]. SKAT-O meta-analysis was performed using standard beta-weights, with correction for gender and APOE ε4 carrier status of included individuals. Presented SKAT-O meta p values represent minimal p values over ρ as proposed by Lee et al. [25]. Correction for multiple testing was performed using Šidák correction. Functional protein domains were determined according to Pottier et al. [37]. Differences in relative lymphoblast SORL1 expression were calculated using an unpaired nonparametric (Mann–Whitney) test.
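
For readers less familiar with these statistics, the sketch below shows an allelic χ² test on a 2 × 2 table of allele counts, together with the Bonferroni and Šidák corrections. It is a plain-Python illustration under stated assumptions, not a reproduction of the PLINK or SeqMeta analyses: the function names and the allele counts are invented, no continuity correction is applied, and the covariates used in the study (gender, APOE ε4 status) are ignored.

```python
import math


def allelic_chi_square(case_minor: int, case_total: int,
                       control_minor: int, control_total: int):
    """Pearson chi-square (1 df) comparing minor-allele counts between groups.

    Totals are numbers of alleles (2 x individuals); both alleles are
    assumed to be observed, so no expected count is zero.
    """
    table = [
        [case_minor, case_total - case_minor],
        [control_minor, control_total - control_minor],
    ]
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_sums)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def bonferroni(p: float, n_tests: int) -> float:
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p * n_tests)


def sidak(p: float, n_tests: int) -> float:
    """Sidak-adjusted p value for n independent tests."""
    return 1.0 - (1.0 - p) ** n_tests


# Invented counts: 20 of 100 alleles in cases vs. 10 of 100 in controls.
stat, p = allelic_chi_square(20, 100, 10, 100)
p_bonferroni = bonferroni(p, n_tests=11)  # 11 is an illustrative test count
p_sidak = sidak(p, n_tests=11)
```

The Šidák adjustment is never larger than the Bonferroni adjustment for the same number of tests, and the two are close when p values are small.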

Results

SORL1 mutation screening

We analyzed the coding sequence of SORL1 in 1255 European early-onset AD patients and 1938 origin-matched control individuals and identified 92 rare frameshift, nonsense and nonsynonymous variants (MAF < 0.01) in a total of 219 individuals, of whom 111 (51 %) were patients (Fig. 1; Supplementary tables 2, 3, 4). In addition, the coding region harbored 102 rare synonymous variants, five low-frequency variants (MAF 0.01–0.05; three missense and two synonymous), one common missense and five common synonymous variants (MAF ≥ 0.05) (Supplementary tables 5, 6).

Fig. 1

Non-synonymous rare SORL1 variants identified in EOAD patients and control individuals. Patient-only variants denote variants present only in the patient cohort. Shared variants denote variants present in both the patient and control cohort. Control-only variants denote variants present only in the control cohort. Functional domains are adapted from [37], and based on UniProt information. Protein-level variant position was based on NP_003096. Vps10p vacuolar protein sorting 10 domain, LDLR low-density lipoprotein receptor domain, TM transmembrane domain

The observed rare variants included eight mutations introducing a premature termination codon (PTC): frameshift mutations p.Thr659Serfs*30, p.Cys752Serfs*21, p.Tyr350fs*, p.Gly447Argfs*22, p.Cys1103Valfs*4, p.Val1747fs*, and nonsense variants p.Arg416* and p.Arg1442* (Table 1), predicted to result in haploinsufficiency due to NMD. All PTC mutations were private variants and exclusive to the patient cohort [8/1255 (0.64 %) patients vs. 0/1938 controls]. For one of the frameshift mutation carriers, p.Gly447Argfs*22, biomaterials were available for investigation of the predicted mRNA decay. RNA sequencing on lymphoblast cells demonstrated that the alternative allele (insertion of A) was called, but only in a minority of reads (6.8 %) compared to the reference allele. Quantitative RT-PCR on lymphoblast cells showed reduced SORL1 expression levels in the p.Gly447Argfs*22 variant carrier compared to non-carrying control individuals (Mann–Whitney p value <0.001) (Fig. 2a). Blocking of nonsense-mediated decay by CHX treatment resulted in a significant increase of SORL1 expression in the p.Gly447Argfs*22 carrier compared to non-carrying control individuals (Mann–Whitney p value 0.03) (Fig. 2b).

Table 1 Premature termination codon variants identified in EOAD patients
Fig. 2

SORL1 expression and investigation of NMD. a SORL1 expression in lymphoblast cell lines of AD patient carrying SORL1 frameshift variant p.Gly447Argfs*22 and non-carrying control individuals. Measurements per sample were conducted in triplicate, with three measurements per experiment in two separate experiments. Y-axis indicates the relative expression quantities of SORL1. Error bars correspond to the standard error of the mean (SEM). Normalization was carried out against the housekeeping gene YWHAZ. Unpaired nonparametric Mann–Whitney test was performed to compare SORL1 expression of the p.Gly447Argfs*22 variant carrier with the control individuals. b SORL1 expression in lymphoblast cell lines of AD patient carrying SORL1 frameshift variant p.Gly447Argfs*22 and non-carrying controls. Black bars represent SORL1 expression in untreated samples (reference, set to 1); grey bars represent SORL1 expression after cycloheximide (CHX) treatment (relative to the non-treated sample). Error bars correspond to the standard error of the mean (SEM). Unpaired nonparametric Mann–Whitney test was performed to compare the effect of CHX incubation on SORL1 expression of the p.Gly447Argfs*22 variant with the control individuals

In addition to these 8 PTC mutations, we observed 84 rare missense variants in the patient/control cohort. Of the total identified rare missense variants and PTC mutations, 44 (48 %) were only observed in the patient cohort (supplementary table 2). In addition, 19 rare missense variants (21 %) were present in both patients and controls, and 29 variants (32 %) were only observed in controls (supplementary tables 3, 4). One patient carried a frameshift (p.Cys1103Valfs*4) and a missense variant (p.Asp2065Val); three patients and two controls carried double missense variants. Of the rare variants observed in this study, 30 (33 %) were not previously reported in any of the screened databases, the majority of which [22 (73 %)] were only observed in the patient cohort, while seven (23 %) were only found in controls, and one novel variant was detected in patients as well as controls (supplementary tables 2, 3, and 4). Of the variants only observed in the patient cohort, 35 of 44 (80 %) were predicted to be pathogenic by at least two of three prediction tools, whereas 20 of 29 (69 %) variants only observed in the control cohort, and 12 of 19 (63 %) shared variants, were predicted pathogenic (supplementary tables 2, 3, and 4). The effect of variants on protein stability could be modeled for eight patient-only variants and seven control-only variants located in the Vps10p domain. Predicted destabilization (∆∆G-value above 1.0) was shown for four out of eight patient-only variants against one out of seven control-only variants (supplementary table 7).
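
The two classification rules used in this paragraph (pathogenic if at least two of three tools agree; destabilizing if ∆∆G exceeds 1.0) can be written down as a small sketch. The function names, the dictionary keys and the example values are hypothetical; the actual per-variant predictions are in the supplementary tables.

```python
from typing import Mapping


def predicted_pathogenic(tool_calls: Mapping[str, bool],
                         min_votes: int = 2) -> bool:
    """True if at least `min_votes` tools call the variant damaging."""
    return sum(bool(v) for v in tool_calls.values()) >= min_votes


def predicted_destabilizing(ddg: float, threshold: float = 1.0) -> bool:
    """True if the predicted free energy change exceeds the threshold."""
    return ddg > threshold


# Hypothetical variant: two of three tools predict a damaging effect.
example_calls = {"PolyPhen2": True, "SIFT": True, "MutationTaster": False}
is_pathogenic = predicted_pathogenic(example_calls)
is_destabilizing = predicted_destabilizing(1.5)  # invented value
```

A majority rule of this kind tolerates a single dissenting tool, which is one reason in silico calls alone do not establish pathogenicity.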

Clinicopathological characteristics

All patients carrying a PTC (n = 8) or a patient-only missense (n = 39) variant received a probable (n = 44) or definite (n = 3) AD diagnosis. The mean onset age (OA) of the PTC carriers was 58.6 ± 5.2 years, with an age range of 15 (50–65) years, and a mean disease duration of 11.0 ± 5.0 years (Fig. 3a). The missense carriers had a mean OA of 57.9 ± 6.5 years, with a wide age range of 34 (35–69) years, and mean disease duration of 12.5 ± 5.5 years. In comparison, the mean OA was 52.4 ± 10.9 years for PSEN1 carriers (n = 23), 49.5 ± 1.5 years for PSEN2 carriers (n = 2) and 53.0 ± 6.8 years for APP carriers (n = 5) in the EOD cohort. A positive familial history was reported in 71.4 % (5/7) of the SORL1 PTC carriers, and in 43.5 % (10/23) of the SORL1 missense carriers. For the PSEN1 carriers a familial history was reported in 88.9 % (16/18), and in 100 % of the PSEN2 (2/2) and APP (5/5) carriers (Fig. 3b). For one variant, located in the fibronectin type III domain (p.Tyr1816Cys), DNA of relatives was available. The variant was also present in an affected sister, and not present in an unaffected sister (supplementary figure 1).

Fig. 3

Clinical characteristics of mutation carriers. a Scatter plot showing the onset ages for the SORL1 PTC and patient exclusive missense carriers versus those of PSEN1, PSEN2 and APP carriers. Mann–Whitney U test p value 0.016. b The proportion of SORL1 PTC, SORL1 missense and PSEN1, PSEN2 and APP carriers with a sporadic, unknown or positive familial history for AD

Additional clinical information was available for 6 PTC carriers. All presented with an insidious memory dysfunction. In one carrier (p.Cys752Serfs*21), disease onset was also accompanied by apathy. Further progression of disease in the carriers was typical of AD, with progression to a global cognitive deterioration and functional dependence. Of note, in patient DR12.1 (p.Gly447Argfs*22), the onset of visual hallucinations, a fluctuating extrapyramidal syndrome and a REM-sleep behavior disorder, after a disease duration of 11 years, led to the suspicion of a concomitant Lewy body pathology.

Neuropathological examination was not available for SORL1 PTC carriers, but has been performed in 3 SORL1 missense carriers [DR112.1 (p.Leu762Pro), CS540 (p.Ala1548Thr) and CS770 (p.Gly1447Ser)]. All three had high-level AD neuropathologic changes (A3B3C3) [34], confirming the clinical AD diagnosis. Neuronal loss, gliosis and abnormal protein deposition—mostly in the form of senile plaques and neurofibrillary tangles (Fig. 4)—were most pronounced in the neocortical areas, amygdala, hippocampus and parahippocampal cortex, while the striatum, thalami, brainstem and cerebellum were more spared. A diffuse amyloid angiopathy, in DR112.1 most pronounced in the occipital cortex and cerebellum, was present in all three patients. Hippocampal sclerosis was absent. Isolated α-synuclein immunoreactive Lewy bodies and Lewy neurites were observed in the amygdala of CS770, but absent from CS540. No α-synuclein immunohistochemistry was performed in DR112.1.

Fig. 4

Neuropathology of SORL1 missense carrier CS540. Neuropathological brain examination of a SORL1 missense carrier showing cortical thinning and superficial spongiosis in the frontal cortex, where pyramidal neurons contain very large tangles and abundant lipofuscin (a). Frequent mature and, to a lesser extent, diffuse beta-amyloid plaques are observed in the neocortical regions (b), as well as the cingulum and hippocampus. Frequent hyperphosphorylated tau immunoreactive (AT8) threads and large globose neurofibrillary tangles are present in neocortical areas (c) and cingulum

Rare variant association analysis

The frequency of rare PTC mutations and missense variants in SORL1 was 8.8 % (111 carriers/1255 patients) in the overall patient cohort and 3.7 % (47/1255) for patient-only variants. Mutation frequency in the overall control cohort was 5.6 % (108/1938), and 1.7 % (33/1938) for control-only variants. Mutation frequency in patients with known familial history of AD was 8.9 % (29/327), and 4.0 % (13/327) for patient-only variants.

SKAT-O meta-analysis was performed using all country cohorts except Germany and Czech Republic, which did not meet inclusion criteria for association analysis, resulting in a total of n = 1085 patients and n = 1752 controls. This meta-analysis confirmed significant enrichment of rare PTC mutations and missense variants in patients [SKAT-O p value 0.0001; rare allele frequency in patients 5.0 % (108/2170), rare allele frequency in control individuals 2.8 % (98/3504)] (Table 2). The most significant enrichment of these variants in patients was found for the fibronectin type III protein domain (SKAT-O p value 0.01) (Supplementary table 8). The fibronectin type III domain is the largest protein domain of the SORL1 protein, spanning amino acids 1527–2108 [37, 44]. The cumulative minor allele frequency was largest for this domain, yet variants were identified in each of the SORL1 functional protein domains (Fig. 1). When excluding PTC variants from this analysis, findings remained the same, with association over the full protein (p value 0.0007), and strongest association for the fibronectin type III domain (p value 0.013).
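
The rare allele frequencies quoted above follow from dividing the rare allele count by the number of chromosomes, which is twice the number of individuals for an autosomal gene. A short worked check, using only the counts given in the text (the function name is ours):

```python
def rare_allele_frequency(rare_alleles: int, n_individuals: int) -> float:
    """Rare allele count divided by the number of chromosomes (2 x N)."""
    return rare_alleles / (2 * n_individuals)


# Counts reported for the SKAT-O meta-analysis cohorts.
freq_patients = rare_allele_frequency(108, 1085)  # 108 / 2170
freq_controls = rare_allele_frequency(98, 1752)   # 98 / 3504

pct_patients = round(100 * freq_patients, 1)
pct_controls = round(100 * freq_controls, 1)
```

These are allele frequencies, so they are not directly comparable with the carrier frequencies (carriers per individual) reported for the full cohort.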

Table 2 SKAT-O meta-analysis of rare variant burden

Single variant analysis of low-frequency and common variants

We identified six common (MAF ≥ 0.05) variants in the SORL1 coding sequence, including one missense variant p.Ala528Thr, and five synonymous variants (p.His269His, p.Thr833Thr, p.Ser1187Ser, p.Asn1246Asn, and p.Ala1584Ala). In addition, we identified five low-frequency variants (MAF 0.01–0.05), including three missense variants and two silent variants. Low-frequency missense variant p.Glu270Lys was previously associated with AD in Caribbean-Hispanic familial late-onset AD patients and Northern-European sporadic late-onset AD patients with MAF below 0.01, and was shown to segregate within affected Caribbean-Hispanic families [48]. Fixed-effects meta-analysis showed no significant association for this variant with AD in our cohort [OR 0.75 (95 % CI 0.51–1.12), p value 0.17] (Supplementary table 5). Association of missense variant p.Ala528Thr has been demonstrated in Caribbean-Hispanic familial late-onset AD patients at a MAF of 0.16. Fixed-effects meta-analysis showed no significant association for this variant with AD in our cohort [OR 1.22 (95 % CI 0.94–1.59), p value 0.14] (Supplementary table 6). Although one synonymous variant showed nominal significance, none of the low-frequency and common variants showed significant association with EOAD after correction for multiple testing.

Discussion

We performed a systematic screening of the complete coding sequence of SORL1 in a large EOAD patient/control cohort in the frame of the BELNEU and EU EOD consortia. We found an increased burden of rare PTC and non-synonymous variants in the EOAD patients, of whom 8.8 % carried one or more SORL1 variants. These independent findings corroborated previous reports of an increased frequency of rare SORL1 variants in EOAD [35, 37].

Strikingly, PTC mutations were exclusively observed in patients. These variants most likely lead to a significant loss of SORL1 protein due to NMD of the mutant transcript. Indeed, we observed reduced SORL1 expression in lymphoblast cells of the p.Gly447Argfs*22 carrier, which increased upon blocking of NMD, indicative of haploinsufficiency. Further, the mode of action of these predicted loss-of-function mutations is in line with the observation of reduced SORL1 expression in post-mortem brain [6] and in human neuronal stable cell lines [1] leading to increased amyloid load. In addition, overexpression of SORL1 cDNA showed decreased amyloid-β secretion in induced human neuronal cells [54]. At a frequency of 0.64 % in the European EOAD cohort, SORL1 PTC mutations are rare. In familial patients, the frequency of SORL1 PTC mutations is increased to 1.5 %, which appears in line with reports of SORL1 PTC mutations in other AD cohorts. Eleven SORL1 PTC mutations have previously been reported, of which 8 were identified in 484 (1.7 %) familial EOAD patients from France [35, 37] and 3 in 154 (1.9 %) familial LOAD patients of Caribbean-Hispanic origin [48]. We observed a higher frequency of positive family history of AD in PTC variant carriers (71.4 %) compared to carriers of missense variants (43.5 %). Combined with the notion that SORL1 PTC mutations have not been observed in healthy controls to date, this suggests that PTC variants may have a high disease penetrance, but samples of affected relatives were not available to explore this further. Further evidence is needed to draw inferences on clinical relevance. Compared to carriers of an established pathogenic mutation in one of the three causal genes for EOAD (PSEN1, PSEN2 and APP), who had a positive familial history in 92 % of the patients (23/25), familial history of SORL1 PTC carriers was somewhat lower.
In addition, the mean onset age of the SORL1 PTC carriers (58.6 ± 5.2 years) was higher when compared to the PSEN1, PSEN2 and APP carriers (52.3 ± 9.8 years), suggesting a less aggressive disease process. Because our study is limited to EOAD, the upper limit of onset age reported here is determined by the clinical criteria of EOAD, but this does not exclude a role for rare SORL1 variants in LOAD. In fact, rare SORL1 variants have previously been associated with familial LOAD by Vardarajan et al. [48].

In contrast to PTC mutations, the frequency of rare missense variants in healthy controls was non-negligible (5.6 %), and included a substantial proportion (69 %) of predicted pathogenic missense variants. This can in part be explained by a lower penetrance of SORL1 missense variants compared to PTC variants, and/or a variable degree of pathogenic relevance of the identified missense variants for AD. Pathogenicity may differ depending on parameters like the nature of the amino acid substitution, or location of the mutation in specific protein domains, at methylation sites or within adapter protein binding motifs. This necessitates functional follow-up to investigate effects of SORL1 variants, e.g., on APP trafficking, amyloid-β formation and clearance, to define functional relevance of each rare missense variant. Whereas others have reported that missense variants in SORL1 may lead to autosomal dominant AD, the relatively high frequency of predicted pathogenic variants in healthy controls in our study indicates that in the absence of functional evidence of pathogenicity, the observation of a SORL1 missense variant should be interpreted with caution. This caveat notwithstanding, meta-analysis showed a significant enrichment of rare missense variants in patients, which remained significant after exclusion of PTC mutations. This adds to the growing evidence that SORL1 missense variants may play a role in AD susceptibility. Of note, we obtained evidence of rare variant association in this hypothesis-driven, single gene resequencing study, but in the context of a whole exome sequencing study, this finding would not have survived multiple testing correction, illustrating the need for large sample sizes in hypothesis-free rare variant studies.

We observed missense variants in SORL1 throughout the different protein coding domains from Vps10p to FANSHY motif, only sparing the propeptide (Fig. 1). One of the missense variants, exclusively found in the patient cohort, p.Gly511Arg, had been detected in two affected relatives of a French autosomal dominant EOAD family [37]. This missense variant was shown to disrupt APP sorting from the trans-Golgi network to the lysosomal degradation pathway through abolished interaction of SORL1 with amyloid-β [7]. We observed p.Gly511Arg in a sporadic patient from Italy with an age at onset of 55 years. We could not perform segregation analysis for this variant due to absence of DNA of relatives. For missense variant p.Tyr1816Cys, located in the fibronectin type III domain, and detected in a patient from Italy with an age at onset of 63 years and a reported familial history of AD, we demonstrated the presence of the variant in an affected relative and its absence from an unaffected relative. The elucidation of the crystal structure of the Vps10p and β-propeller domains suggested that amyloid-β monomers are bound by the SORL1 Vps10p domain through beta-sheet interaction, binding amyloid-β inside a tunnel structure formed by a 10-bladed beta-sheet propeller [22]. Rare variants located in the Vps10p domain, such as p.Gly511Arg, putatively affect SORL1-amyloid-β interaction by destabilization of the SORL1 beta-sheet structure or disruption of the amyloid-β binding motif. Interestingly, patient-only missense variants affecting the Vps10p domain showed the strongest Gibbs free energy changes, indicating the strongest effects on SORL1 protein stability.

An alternative functional consequence of rare coding variants involves disruption of the anti-amyloidogenic APP trafficking pathway mediated by SORL1. A binding region for APP at the SORL1 protein is located at the cytoplasmic tail of the protein, where a six amino acid-stretching FANSHY motif is involved in binding the retromer adapter complex. We identified one patient-only missense variant, p.Asn2174Ser, in an Italian sporadic patient with onset age 57 years, altering the third amino acid in the FANSHY sequence from asparagine to serine. The retromer complex functions as the seed of direct interaction of SORL1 with APP. Site-directed mutagenesis disrupting the FANSHY motif in vitro has been shown to be amyloidogenic by ablation of the sequestering of APP by SORL1 to the trans-Golgi network [13, 14].

In contrast with previous investigations of SORL1 coding variants in late-onset AD cohorts, we could not identify a significant association of common and low-frequency variants (MAF > 0.01) with disease status. Initial association of variants p.Ala528Thr and p.Glu270Lys was reported for familial late-onset cases of Caribbean-Hispanic origin [48]. Discrepancies between variant frequency and direction of effect could be due to cohort ethnicity and founder effects. Absence of significant association for these variants in our EOAD patient/control cohort analysis might also reflect reduced pathogenic relevance of these variants in EOAD compared to late-onset AD.

In conclusion, this study represents one of the largest systematic screenings of SORL1 in EOAD patients and control persons. PTC variants were identified exclusively in patients, and their mode of action corresponds with evidence on the inverse relation between SORL1 expression and amyloid-β formation from in vitro functional studies of SORL1 in AD. The increased proportion of familial disease among PTC variant carriers is indicative of a strong effect on AD pathogenesis. Rare missense variants were associated with increased risk of early-onset AD. Some of these rare missense variants may also exert a strong effect on individual and familial risk of AD. The substantial frequency of (predicted pathogenic) variants in healthy controls, however, necessitates further research on the functional impact of the identified rare SORL1 variants to elucidate the affected pathways.