Introduction

The identification of mutations in the leucine-rich repeat kinase 2 (LRRK2) gene as a cause of autosomal dominant Parkinson’s disease (PD) opened a new era in the understanding of the causes and mechanisms of this disorder [1, 2]. LRRK2 (particularly the Gly2019Ser mutation) provided proof of principle of a common genetic cause of typical late-onset PD. It also yielded the first variant, Gly2385Arg, a frequent polymorphism in different Asian populations, that acts as a risk factor for the common, sporadic form of PD in those populations [3].

In search of population-specific, biologically relevant variants, we sequenced the complete LRRK2 coding region in a number of PD patients from Taiwan [4]. This work led to our initial identification of the association of Gly2385Arg with PD, a finding that has been rapidly and consistently replicated in Chinese populations from Singapore [5], Taiwan [6, 7], and mainland China [8, 9], and in the Japanese population [10].

Very recently, a different LRRK2 missense variant, Arg1628Pro, was proposed as a second risk factor for sporadic PD in the Han Chinese population (OR 1.84, 95% CI 1.20–2.83, p = 0.006) [11]. In this study, we report the results of the analysis of the Arg1628Pro variant in a large, independent sample of PD cases and controls of Han Chinese ethnicity.

Materials and methods

The study sample consisted of 1,377 subjects of Han Chinese ethnicity, ascertained at a single referral center in Taiwan, and included 834 patients with a clinical diagnosis of idiopathic PD and sporadic pattern of presentation (343 women, 491 men) and 543 control subjects (319 women, 224 men). For the patients, the average ages at last examination and at disease onset were 65.7 ± 11.8 years (range 24–97) and 55.6 ± 12.1 years (range 12–94), respectively. The average disease duration was 10.1 ± 6.24 years (range 1–30). The clinical diagnosis of PD was established according to published criteria [12]. A first subgroup of 608 PD cases was previously screened by us for the LRRK2 Ile2012Thr, Gly2019Ser, and Ile2020Thr mutations and proved to be negative [13]. In cases with disease onset before the age of 40 years, the parkin and PINK1 genes were screened by single-strand conformation polymorphism, and no mutations were found. The control subjects were free from PD, matched to the patients by ethnicity, collected at the same center, and originating from the same geographical areas. They were ascertained among the spouses of PD patients, spouses of patients with unrelated diseases, and healthy volunteers. The mean age at sampling for controls was 51.9 ± 18.4 years (range 11–87), which is slightly but significantly younger than the average age at onset in the cases. The project was approved by the local ethical authorities, and written informed consent was obtained from all subjects.

Genomic DNA was isolated from peripheral blood using standard protocols. In the whole sample of cases and controls, the LRRK2 c.G4883C variant in exon 34 (single nucleotide polymorphism [SNP] accession no. rs33949390, predicted protein effect: Arg1628Pro) was screened by a TaqMan allelic discrimination method and Assays-by-Design on an Applied Biosystems 7300 Real Time PCR System. Primers, probes, and experimental conditions are available on request.

For genotyping quality control, multiple positive and negative controls for the c.G4883C variant were included in each polymerase chain reaction plate. Furthermore, 176 cases and 181 controls were also genotyped for Arg1628Pro and other variants located in exon 34 by direct sequencing in both strands, as described previously [14], yielding 100% concordance with the results of the TaqMan assay for the Arg1628Pro variant.

For haplotype analysis, we first examined the complete coding region of LRRK2 and exon–intron boundaries, sequenced previously by us in two heterozygous carriers of the c.G4883C (Arg1628Pro) variant [4]. All the variants detected were tested in one additional heterozygous and two homozygous Arg1628Pro carriers detected in this study.

Furthermore, five coding SNPs located in exon 5 (Leu153Leu, rs10878245), exon 34 (Gly1624Gly, rs1427263; Lys1637Lys, rs11176013; Ser1647Thr, rs11564148), and exon 49 (Met2397Thr, rs3761863) were typed by direct sequencing in 46 PD patients who were carriers of the Arg1628Pro variant, as well as in 87 Taiwanese subjects (44 PD patients and 43 controls) who were not carrying Arg1628Pro. Haplotype analysis was performed using Haploview ver. 1.4 [15]. Lastly, the c.G4883C (Arg1628Pro) variant was typed by direct sequencing in 202 Caucasian subjects (106 PD patients and 96 controls, total 404 chromosomes). Sequencing was performed using Big Dye Terminator chemistry ver.3.1 (Applied Biosystems). Fragments were loaded on an ABI3100 and analyzed with DNA Sequencing Analysis (ver.3.7) and SeqScape (ver.2.1) software (Applied Biosystems). All variants are named according to the LRRK2 complementary DNA sequence deposited in GenBank (accession no. NM_198578).

For predicting the secondary structure of the region of the LRRK2 protein containing the Arg1628 residue, we used the PSIPRED ver.2.6 [16] and NPS [17] programs. Statistical analyses included contingency tables and Student’s t-test as appropriate. For the calculation of the population attributable risk (PAR), the following formula was used: PAR = P(OR − 1)/[1 + P(OR − 1)], where P is the frequency of the risk genotype (heterozygous Arg1628Pro) in the controls, considered to approximate the frequency of the risk genotype in the general population, and OR is the odds ratio obtained in the case-control comparison, used to estimate the relative risk of PD for the carriers of the risk genotype.

Results

The results of the association analysis are reported in Table 1. Controls and cases were both in Hardy–Weinberg equilibrium. The Arg1628Pro variant was detected in 62 cases (7.4%, two homozygous and 60 heterozygous) and in 20 controls (3.7%, all heterozygous). The Arg1628Pro allele was significantly more frequent among patients (p = 0.004, OR 2.13, 95% CI 1.29–3.52; Table 1). Using this observed value of OR to estimate the relative risk, and the frequency of the risk genotype (heterozygous Arg1628Pro) among controls as an estimate of its frequency in the general population, yields a PAR of ~4%.
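The allelic odds ratio and the Hardy–Weinberg check can be approximately reproduced from the genotype counts given above. The following is a minimal sketch, not the authors' analysis: it assumes a Woolf (log-normal) confidence interval and a 1-degree-of-freedom chi-square goodness-of-fit test, neither of which is specified in the text, and all function names are hypothetical. The non-carrier counts (772 cases, 523 controls) are obtained by subtraction from the sample sizes.

```python
import math


def allele_counts(ref_hom, het, alt_hom):
    """Return (risk-allele count, reference-allele count) from genotype counts."""
    return het + 2 * alt_hom, 2 * ref_hom + het


def allelic_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Odds ratio with a Woolf (log-normal) 95% CI; the CI method is an assumption."""
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    se_log_or = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))


def hardy_weinberg_chi2(ref_hom, het, alt_hom):
    """Chi-square goodness-of-fit test (1 df) against Hardy-Weinberg proportions."""
    n = ref_hom + het + alt_hom
    q = (het + 2 * alt_hom) / (2 * n)   # risk-allele frequency
    p = 1 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (ref_hom, het, alt_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value


# Genotype counts as (non-carriers, heterozygous, homozygous), from the text.
cases = (772, 60, 2)      # 834 patients
controls = (523, 20, 0)   # 543 controls

case_alt, case_ref = allele_counts(*cases)
ctrl_alt, ctrl_ref = allele_counts(*controls)
odds_ratio, ci_low, ci_high = allelic_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref)
print(f"OR = {odds_ratio:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")

for label, genotypes in (("cases", cases), ("controls", controls)):
    chi2, p_value = hardy_weinberg_chi2(*genotypes)
    print(f"HWE {label}: chi2 = {chi2:.3f}, p = {p_value:.2f}")
```

The Woolf interval comes out close to, but not exactly at, the reported 1.29–3.52, which suggests a slightly different interval method was used. Because the expected number of homozygotes is very small in both groups, the chi-square test is only a rough check here, and an exact Hardy–Weinberg test would be preferable.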

Table 1 Distribution of G4883C (Arg1628Pro) allelic and genotypic frequencies

The sequencing of LRRK2 exons and exon–intron boundaries in two homozygous and three heterozygous carriers of the Arg1628Pro variant revealed several previously reported intronic and exonic SNP variants (Fig. 1a). Haplotypes in the two homozygous patients are identical (Fig. 1a), and those in the three heterozygous cases are also compatible with the presence of the same shared haplotype across the whole LRRK2 gene. Furthermore, in 46 Taiwanese patients who carried the Arg1628Pro variant, the analysis by Haploview showed that this variant was always associated with the same haplotype (formed by the C-A-G-A-C alleles of the Leu153Leu, Gly1624Gly, Lys1637Lys, Ser1647Thr, and Met2397Thr variants). On the other hand, this specific C-A-G-A-C haplotype was estimated by Haploview at a frequency of 0.283 and 0.291 among Taiwanese PD patients and controls, respectively, who did not carry the Arg1628Pro variant.

Fig. 1

a Haplotype containing the Arg1628Pro variant. Exonic variants are bolded and two non-synonymous variants are indicated by arrowheads. The position of the Arg1628Pro variant is indicated by an arrow in the haplotype and in the LRRK2 domain structure shown below. ank ankyrin-like repeat region, LRR leucine-rich repeat region, ROC Ras in complex proteins, COR C-terminal of Roc, WD40 WD40 repeat-like domain. b Alignment of LRRK2 protein homologues in the region corresponding to the human Arg1628 residue

All the patients who were carriers of the Arg1628Pro variant were tested for the presence of the Gly2385Arg variant, by direct sequencing of LRRK2 exon 49. Only one of the carriers of the Arg1628Pro variant (heterozygous) was also a carrier of the heterozygous Gly2385Arg variant. We did not detect the c.G4883C (Arg1628Pro) variant in any of 404 Caucasian chromosomes (212 from PD patients and 192 from controls).

The average onset age of PD in the 60 heterozygous carriers of the Arg1628Pro variant was 55.4 ± 10.3 years (range 35–80), while in the 772 non-carriers, the onset age was 55.6 ± 12.2 years (range 12–94; p = NS). Disease duration was 10.9 ± 7.1 years (range 1–30) and 10.1 ± 6.2 years (range 1–30) in carriers and non-carriers, respectively (p = NS). The two patients who were homozygous carriers of Arg1628Pro had PD onset at ages 38 and 55 and disease duration of 15 and 18 years, respectively.

The mean age at examination in the 20 controls who were heterozygous carriers of the Arg1628Pro variant was 49.2 ± 18.3 years (range 22–72), which is younger than, but not significantly different from, the mean onset age of PD in the heterozygous carriers (p = 0.064, Student’s t-test); however, half of the control carriers were younger than the mean onset age of PD in heterozygous carriers.
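The comparison above can be checked from the reported summary statistics alone. The sketch below assumes a two-sided, pooled-variance Student's t-test (the text names the test but not its variant) and uses hypothetical function names; the p-value is obtained by numerically integrating the t density with Simpson's rule so that only the standard library is needed.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3   # P(-|t| < T < |t|)
    return 1 - central_mass


# Onset age in heterozygous PD carriers vs. age at examination in control carriers.
t_stat, df = pooled_t_from_summary(55.4, 10.3, 60, 49.2, 18.3, 20)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, two-sided p = {p_value:.3f}")
```

Under these assumptions the result is close to the reported p = 0.064; an unequal-variance (Welch) test would give a different value, since the two standard deviations differ considerably.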

Discussion

Our results support the contention that the LRRK2 Arg1628Pro variant is a second risk factor for the common forms of sporadic PD in the Chinese population (after the Gly2385Arg variant). We neither detected the Arg1628Pro variant in any of the 404 Caucasian chromosomes, nor was the variant reported in the other LRRK2 studies in Caucasians. Similar to Gly2385Arg [3], Arg1628Pro might be therefore considered as a PD-associated population-specific allele in the Han Chinese population. Interestingly, in the previous report [11], the Arg1628Pro variant was not detected among Japanese subjects, but the sample size was limited (N = 246), and further studies are warranted.

The results of our haplotype analyses, in keeping with the data from the previous study [11], strongly suggest that most carriers of the Arg1628Pro variant share an extended haplotype, indicative of a founder effect.

Any given disease-associated variant might merely represent a marker of another, biologically relevant variant, which lies in linkage disequilibrium on the risk allele. While Arg1628Pro is a non-conservative change in a highly conserved residue (Fig. 1b), the associated haplotype includes 11 intronic, five exonic silent, and two conservative variants (Ser1647Thr and Met2397Thr; Fig. 1a). However, all these variants display much higher allele frequencies than Arg1628Pro (our data not shown), and, more importantly, these variants are also present in Caucasian populations, where association with PD was not detected in previous studies examining common LRRK2 variants using haplotype-tagging SNPs [18, 19]. We also explored the frequency of three variants located in exon 34: Gly1624Gly (rs1427263), Lys1637Lys (rs11176013), and Ser1647Thr (rs11564148) in 176 cases and 181 controls from this study (more than 700 chromosomes), but we detected no association with PD (Gly1624Gly minor allele frequency [MAF] 0.45 in cases and 0.53 in controls; Lys1637Lys MAF 0.44 in cases and 0.49 in controls; Ser1647Thr MAF 0.33 in cases and 0.32 in controls). From all these arguments, we conclude that the Arg1628Pro variant is most likely the biologically relevant variant, responsible for the association with PD detected here and in the previous, independent study [11].

Results of allelic association studies might be biased, mainly by the use of small sample sizes, population stratification, and genotyping errors [20], and replication in independent, large samples is important. In order to minimize these sources of bias, we tested a large sample of ethnically matched cases and controls, recruited at only one referral center; we adopted strict PD diagnostic criteria and excluded familial PD cases; and genotyping was performed in only one laboratory using standard and uniform genotyping platforms and rigorous systematic quality controls. These considerations argue against the existence of the above-mentioned biases. Furthermore, as some of the control carriers of the variant might still develop PD at later ages, the actual effect size (OR) of the Arg1628Pro variant might be higher.

Contrary to Gly2385Arg, the Arg1628Pro variant replaces a highly conserved residue in the COR domain of the LRRK2 protein (Fig. 1a,b). The COR domain (C-terminal of Roc) is unique to the LRRK2 protein family [21], where it is thought to mediate intramolecular signaling between the N-terminal Roc-GTPase and the C-terminal kinase domain of LRRK2 (Fig. 1a). Only one definitely PD-causing mutation is known so far in the COR domain (Tyr1699Cys). The Arg1628Pro variant replaces the positively charged arginine with a nonpolar proline. Moreover, proline might change the local folding, as it typically breaks α-helical conformation in proteins. Analyses using the PSIPRED and NPS programs indicate that Arg1628 and the following 7–10 amino acids are likely to adopt an α-helical conformation. Loss of the net positive charge, change in α-helical content, or both might affect the structure and function of the COR domain, with the final effect of altering the regulation of the downstream kinase activity of LRRK2. Functional studies are warranted to understand the mechanisms of action of the Arg1628Pro variant. As is the case for the other PD-associated variant in LRRK2 (Gly2385Arg), onset age of PD does not differ between cases who are carriers and non-carriers of the Arg1628Pro variant, but further study might identify differences in other clinical or pathophysiological features.

With only one exception, in our experience, the cases that carry Arg1628Pro do not carry Gly2385Arg and vice versa. These two variants are therefore located on different alleles, and their impacts at the population level are expected to add to each other. The patient who carries both LRRK2 risk variants has a 16-year history of tremor-predominant PD, which started at the age of 65 and responded well to l-dopa, but with l-dopa-induced psychosis developing 7 years after PD onset. DAT scan (99mTc-TRODAT SPECT) showed decreased dopamine transporter activity. Before the onset of motor symptoms, the patient developed anosmia but no rapid eye movement (REM) sleep behavior disorder. We also detected two cases who are homozygous carriers of Arg1628Pro. These cases had PD onset at ages 38 and 55 and disease duration of 15 and 18 years, respectively. The first has tremor-predominant PD, while the other has akinetic-rigid-predominant PD. Both responded well to l-dopa, but l-dopa-induced psychosis developed 13 years after onset of PD in the second. DAT scan (99mTc-TRODAT SPECT) showed decreased dopamine transporter activity in both. Neither of these patients developed hyposmia or REM sleep behavior disorder. Further homozygous cases need to be identified in order to determine whether carrying two copies of the Arg1628Pro variant yields a younger-onset or more aggressive phenotype.

The PAR is the percentage of cases in a given population that are attributable to the effect of a given etiologic factor (in this case, the risk allele) and that could be prevented if the risk factor could ideally be removed from the population. Using the observed frequency of the risk genotype (heterozygous Arg1628Pro) among controls (0.037) to estimate the risk genotype frequency in the general population and the observed value of OR (2.1) to estimate the relative risk, one can estimate a PAR of ~4%. In the most populous nations of the world, the number of PD cases among individuals over age 50 is projected to more than double, from the current approximately four million to nine million by the year 2030, with the Chinese population contributing more than 50% of these patients [22]. The Gly2385Arg variant is present in ~5% of controls from Taiwan, and its effect size (OR) as a risk factor for PD is around 2.2 in our Taiwanese study [4] and in replication studies [3], allowing an estimate of PAR of ~6% for the Gly2385Arg variant. Therefore, considering the Arg1628Pro and Gly2385Arg variants together, the coding variability in LRRK2 might account for a PAR of ~10% for this disease in China. The identification of Arg1628Pro as a second risk factor for PD therefore constitutes a further important step forward in the dissection of the causes and mechanisms of PD. Additional risk variants in LRRK2 or in different genes might exist in other populations, and further, extensive analyses are warranted.
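The PAR arithmetic in this paragraph can be worked through as follows. This is an illustrative sketch with a hypothetical function name; it uses the formula PAR = P(OR − 1)/[1 + P(OR − 1)] from the Methods, with the genotype frequencies and odds ratios quoted above as inputs, and it follows the text in simply summing the two estimates, which is an approximation.

```python
def population_attributable_risk(risk_genotype_freq, odds_ratio):
    """PAR = P(OR - 1) / [1 + P(OR - 1)], with OR standing in for relative risk."""
    excess = risk_genotype_freq * (odds_ratio - 1)
    return excess / (1 + excess)


# Inputs quoted in the text: control carrier frequency and OR for each variant.
par_arg1628pro = population_attributable_risk(0.037, 2.1)
par_gly2385arg = population_attributable_risk(0.05, 2.2)
combined = par_arg1628pro + par_gly2385arg  # simple sum, as in the text

print(f"Arg1628Pro PAR: {par_arg1628pro:.1%}")
print(f"Gly2385Arg PAR: {par_gly2385arg:.1%}")
print(f"Combined (sum): {combined:.1%}")
```

The individual estimates round to the ~4% and ~6% given above, and their sum to roughly 10%.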