The hexanucleotide repeat expansion (HRE) in the C9orf72 gene is the most common genetic cause of amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) in populations of European ancestry and is especially common in Finland [1, 2]. Whether the C9orf72 locus harbors risk alleles or haplotypes other than the HRE is an important question, because the putative non-expansion risk alleles are more common in the general population than the HRE.

We have previously reported that carriership of two copies of intermediate-length alleles (IAs) is associated with ALS when one of the alleles has ≥ 17 repeats [3]. De Boer et al. could not confirm this association in a large multi-center study [4]. We believe that our findings are essentially true, and that the lack of confirmation may be caused by an imprecisely assigned repeat allele threshold, haplotype heterogeneity, population stratification, or IA calling inconsistencies.

Van der Zee et al. [5] were the first to report that homozygosity for a single-nucleotide polymorphism (SNP) (rs2814707) was associated with FTD in a Flanders-Belgian case-control study after excluding expansion carriers [odds ratio (OR) 1.75, 95% CI 1.02–3.0, p = 0.04]. This SNP tags the HRE and IAs with ≥ 7 repeats. In Belgian ALS and FTD–ALS patients, a significantly increased risk was found for carriers of two copies of the IAs (OR 2.08, 95% CI 1.04–4.1, p = 0.04); tagging SNPs were not reported [6].
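As a rough consistency check on the two Belgian estimates quoted above, the p-value implied by an OR and its 95% CI can be approximated as follows. This is a minimal sketch that assumes the intervals are Wald intervals, symmetric on the log-odds scale; the cited studies may have used other methods, and the function name `p_from_or_ci` is ours.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided p-value implied by an OR and its 95% CI.

    Assumption: the CI is a Wald interval, i.e. symmetric around
    log(OR) with half-width z_crit * SE.
    """
    # Standard error of log(OR), recovered from the CI width.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    # Wald z statistic for the null hypothesis OR = 1.
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

p_snp = p_from_or_ci(1.75, 1.02, 3.0)  # rs2814707 homozygosity, FTD [5]
p_ia = p_from_or_ci(2.08, 1.04, 4.1)   # two IA copies, ALS and FTD-ALS [6]
print(f"{p_snp:.3f} {p_ia:.3f}")
```

Under this assumption both values round to 0.04, in line with the reported p-values; both lower confidence bounds lie just above 1.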

We reported that homozygosity for the IAs (≥ 7/≥7 genotype, OR 1.93, 95% CI 1.19–3.02) and for the tagging SNP (A/A genotype of rs3849942, OR 1.89, 95% CI 1.20–2.99) were associated with ALS with similar magnitude in the Finnish population [3]. In our data, we did not detect any significant ALS risk with IA genotypes 7–16/7–16 (OR 1.57, 95% CI 0.90–2.62), while an increasing risk of ALS was found with genotypes ≥ 7/≥17, ≥ 7/≥21 and ≥ 7/≥24. Thus, IA length influenced the magnitude of the risk, and the threshold for significant risk was at the genotype ≥ 7/≥17. IA length also increased the correlation with the tagging SNP in our data [3] and in the UK birth cohort [7]. Thus, the haplotype becomes more homogeneous with increasing IA length.
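The genotype notation used above can be made explicit with a short sketch. It assumes that "≥ 7/≥17" means that both alleles carry at least 7 repeats and the longer allele at least 17, and that expansion carriers have already been excluded (no upper limit for an IA is defined here). The function names are ours, for illustration only.

```python
def is_ia_genotype(allele_a, allele_b, long_cutoff=7, ia_min=7):
    """True if both alleles have >= ia_min repeats and the longer
    allele has >= long_cutoff repeats.

    Assumption: expansion carriers were removed beforehand, so no
    upper bound on repeat length is applied.
    """
    shorter, longer = sorted((allele_a, allele_b))
    return shorter >= ia_min and longer >= long_cutoff

def is_short_ia_genotype(allele_a, allele_b):
    """True for the 7-16/7-16 class: both alleles within 7..16 repeats."""
    return all(7 <= n <= 16 for n in (allele_a, allele_b))

# Repeat counts per individual (hypothetical examples).
print(is_ia_genotype(8, 17, long_cutoff=17))   # both >= 7, longer >= 17
print(is_ia_genotype(2, 20, long_cutoff=17))   # shorter allele below 7
print(is_short_ia_genotype(8, 12))             # both within 7..16
```

With these definitions the classes ≥ 7/≥17, ≥ 7/≥21 and ≥ 7/≥24 are nested, so raising `long_cutoff` selects progressively smaller subsets of the ≥ 7/≥7 carriers.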

Two of the previous studies reported an association of ALS or FTD with tagging SNP homozygosity [3, 5] and two with intermediate-length allele homozygosity [3, 6]. Now, the largest study to date has not confirmed the reported risk conferred by IA homozygosity. What might explain these varying results?

First, the cutoff (≥ 17 repeats) for the IAs may be nonoptimal. De Boer et al. used this cutoff (genotype ≥ 7/≥17) for the analysis of longer IAs; results for the cutoffs ≥ 21 and ≥ 24 were not reported. We have found that the frequency of alleles with ≥ 20 repeats is about twofold higher among Finns (0.89%) than in other populations [8], including the UK 1958 birth cohort (0.42%). However, according to published data [7, 8], alleles with 17–19 repeats are more common in the UK birth cohort (0.53%) than in the Finnish population (0.30%). Thus, based on the geographic enrichment of longer IAs (≥ 20 repeats) in Finland and the increasing haplotypic homogeneity associated with longer IAs, it is possible that these alleles/haplotypes are relevant for disease risk. There is also biological evidence from transfected cells (11, 20, 22 and 41 repeats tested) indicating that instability of the repeat is length-dependent and that replication fork stalling is observed at 20 repeats [9]. In the future, a higher cutoff should be tested. Somatic instability and the resulting mosaicism should also be analyzed in ALS and FTD patients with ≥ 20 repeat alleles.

Second, as discussed by de Boer et al., haplotype heterogeneity is one possibility, and further studies are needed to explore which part(s) of these haplotypes play a role in ALS/FTD risk. This risk may be conferred by the IAs as such, by other variation on the haplotype, or by their combination. Haplotype heterogeneity regarding the best expansion-tagging SNPs has already been reported between Finnish and UK populations [10], and the above-mentioned differences in the frequencies of the 17–19 and ≥ 20 repeat alleles may reflect differences in the haplotype background of these alleles.

Third, all three studies that reported associations with ALS or FTD were single-population studies [3, 5, 6]. In the study of de Boer et al., data were collected from six different publications, with three cohorts from the Netherlands, two from the UK, two from Italy, one from Spain and one from North America. Of the controls, 80% were derived from the UK 1958 birth cohort, while 82% of the ALS patients and 37% of the FTD patients (the largest fraction) were from the Netherlands. Risk haplotypes may be differentially distributed in these cohorts, which may affect the results; tagging SNPs were not available to control for haplotype heterogeneity. In the future, single-population analyses of ALS/FTD with combined IA and SNP genotyping should be performed to replicate the reported findings. It will also be important to replicate the findings in Finnish FTD cases.

Fourth, genotyping methodology may influence allele calling. In our study, all C9orf72 repeat allele lengths were assessed in the same laboratory, contrary to what was mentioned in the letter by de Boer et al. (rs3849942 genotyping was performed at multiple sites). We also performed additional over-the-repeat PCRs to verify the larger IA lengths. In the cohorts reported by de Boer et al., slightly different RP-PCR methodologies were used, and the largest ALS cohort, which contributed 82% of the ALS cases, was genotyped in silico from short-read whole-genome sequencing data with the ExpansionHunter algorithm. With short-read sequencing, sizing of the longer IAs is less accurate than expansion detection. Future studies aiming to replicate the previous findings should preferably use a similar IA genotyping methodology.

These varying results should stimulate the field to define the C9orf72 haplotypes in more detail (e.g. by the use of long-range sequencing) to uncover the putative expansion-independent genetic effect of this ALS/FTD locus.