Background

Spinal muscular atrophy (SMA) is characterised by degeneration of the alpha motor neurons in the anterior horn of the spinal cord, leading to progressive proximal muscle weakness and atrophy and, in the most severe types, paralysis.

The clinical phenotype of SMA is heterogeneous, ranging from severe to mild. It is generally divided into three main subtypes: type I (also called Werdnig Hoffmann disease), type II, and type III (also called Kugelberg Welander disease). However, these phenotypes are better regarded as a continuum than as distinct subtypes, and further subtypes are sometimes observed at both ends of the spectrum. SMA type 0 is a very severe form with onset in utero, reduced or absent movements, contractures, a requirement for mechanical ventilation support at birth, and death before six months of age, while SMA type IV is a mild late (adult) onset form with a normal life expectancy [1, 2]. An overview of the different subtypes is given in Table 1.

Table 1 Clinical classification of spinal muscular atrophy

SMA is inherited in an autosomal recessive manner. In most cases it is caused by mutations in the survival motor neuron 1 (SMN1, SMN T, telomeric) gene, located on chromosome 5q13.2 [3]. In rare cases (~4%) SMA is caused by a mutation in another gene (non-5q SMA). The majority of patients (92%) have a homozygous deletion of SMN1. In the remaining patients, small mutations that abolish the production of the SMN protein are found, mostly in combination with an SMN1 deletion (~4%) [4, 5]. A centromeric homologue of the gene, SMN2 (previously also called SMN C or C BCD541), is present in humans. SMN2 differs from SMN1 by five nucleotides, of which only one (an 840C➔T transition in exon 7) lies in the coding sequence and is translationally silent. This change and a change in intron 7 cause exon 7 of the SMN2 transcript to be poorly recognized by the splicing machinery, resulting in the skipping of this exon in the majority of transcripts. This results in a frame-shift and production of a protein with a different C-terminal end, which is unstable and non-functional [3, 6]. Since exon 7 is sometimes included in SMN2 transcripts, some full-length SMN protein can be produced, albeit at very low levels (~10–20%) that are insufficient to prevent disease. The number of SMN2 copies varies within the general population and is inversely associated with disease severity, as having more SMN2 copies means that the absolute amount of SMN protein produced is higher. Notably, SMN2 defects in isolation do not seem to cause the disease [7,8,9]. Other modifiers that might play a role are NAIP, H4F5, GTF2H2 and PLS3 [10,11,12,13,14,15]. NAIP, H4F5 and GTF2H2 are thought to be modifiers due to their proximity to the SMN1 gene, and NAIP also shows homology to apoptosis inhibitory proteins [12, 14, 16]. PLS3 restores the function of the neuromuscular junction by stabilizing F-actin-dependent endocytosis [17].

The first therapy for SMA, Spinraza (IONIS-SMNRx, nusinersen), has recently been approved by the Food and Drug Administration (FDA) in the US [18] and by the European Medicines Agency (EMA) in Europe [19]. Clinical trials for other potential therapies are progressing. As such, the knowledge about the frequency of the disease becomes even more important. This review provides an overview of what is currently known about the prevalence, incidence and carrier frequency of SMA.

Methods

Published literature on prevalence, incidence or carrier frequency of SMA was identified through PubMed searches. Search terms were ‘spinal muscular atrophy’ OR ‘Werdnig Hoffmann’ OR ‘Kugelberg Welander’ AND ‘prevalence’ OR ‘incidence’, OR ‘carrier frequency’. No language restrictions were applied; however, articles in languages other than English may have been missed because English search terms were used. Retrieved literature was scanned and all available articles reporting a prevalence, incidence or carrier frequency study were used for this review. Additional publications were identified from references in the articles. Available literature published through 6th December 2016 was taken into account; no start date was used. All prevalence and incidence studies included had the determination of prevalence and/or incidence as their primary goal. For carrier frequency, studies in which carrier frequency was determined for other purposes were also included. All articles were appraised critically for accurate use of terminology and were reassigned if needed. For detailed methods on the analysis of carrier frequency differences between ethnic groups see Additional file 1.

Prevalence and incidence of SMA

To date, only a few studies have been performed to assess the prevalence and incidence of SMA. Most of these were conducted before 1995, when the disease-causing gene was identified, and therefore used clinical rather than genetic diagnosis as an inclusion criterion. An incidence for all types of SMA of around 10 in 100,000 (1 in 10,000) live births is generally cited [20, 21].

Prevalence

Prevalence is the number of living individuals with a disease at a given time. An overview of the studies examining the prevalence of SMA is provided in Table 2.

Table 2 Overview prevalence of SMA by subtype

When all types of SMA are examined together, most studies report a prevalence of around 1–2 per 100,000 persons. Some studies observed a somewhat higher prevalence. A study from Bologna, Italy, in 1992 calculated a prevalence of 6.56 per 100,000 persons aged less than 20 years [22]. Three studies in Scandinavia showed a prevalence of 4.18 per 100,000 persons aged 18 years or less, and 3.23 and 2.78 per 100,000 persons aged below 16 years [23,24,25]. This could indicate regional differences in the incidence of SMA, for example due to differences between gene pools. However, several other factors may account for this observation. First, all of these studies were performed in small regions and therefore in small populations. For rare diseases like SMA, a small error in the number of cases detected can have a large impact on the estimated prevalence (sample bias). Secondly, these studies only took children into account, which is likely to push the numbers upwards. Furthermore, in the case of Sweden, higher prevalence rates have also been observed in studies of other neuromuscular disorders, which could be due to greater awareness and a good health system in Sweden, making it easier to identify patients for such a study [26,27,28]. A study in Northeast Saudi Arabia also found a very high prevalence rate. Although the prevalence of SMA might differ between the Middle East and Europe, parental consanguinity was observed in more than half of the cases, which could at least partially explain the high prevalence [29].
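The effect of a single missed case on a prevalence estimate can be illustrated with a short calculation. The following is a minimal Python sketch; the population sizes and case counts are hypothetical and are not taken from any of the cited studies.

```python
def prevalence_per_100k(cases: int, population: int) -> float:
    """Point prevalence expressed per 100,000 persons."""
    return cases / population * 100_000


# Hypothetical populations (not from the cited studies), each with a
# true prevalence of 2 per 100,000.
for population, cases in [(250_000, 5), (5_000_000, 100)]:
    base = prevalence_per_100k(cases, population)
    one_missed = prevalence_per_100k(cases - 1, population)
    change = (base - one_missed) / base * 100
    print(f"population {population:>9,}: {base:.2f} -> {one_missed:.2f} "
          f"per 100,000 if one case is missed ({change:.0f}% lower)")
```

In the smaller hypothetical population one missed case lowers the estimate by a fifth, whereas in the larger one the change is about one percent.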

Prevalence by SMA subtype

Although SMA type I is expected to account for more than half of all new SMA cases [30], the studies that examined SMA type I alone showed a prevalence of only 0.04 to 0.28 per 100,000 [24, 25, 31,32,33,34], which is much lower than the 1–2 per 100,000 persons noted for all SMA. Due to the severity of the disease, patients with SMA type I have a short life expectancy. Therefore, often no or only a few patients are alive on the date of the study, which could account for this lower prevalence. Nowadays, a median life expectancy of around one year of age is estimated for type I patients [35,36,37], whereas in type II 75–93% of patients survive beyond 20 years of age [37,38,39,40] and life expectancy for type III is thought to be close to that of the normal population [20, 39].

The combined prevalence of SMA types II and III has been estimated at around 1.5 per 100,000 [31, 32, 41,42,43]. Of three studies that investigated type II and type III separately, two found a higher prevalence of type III than of type II [24, 32]. This may be explained by the longer life expectancy of type III patients compared to type II patients.

Incidence

Incidence is the number of new cases of a disease in a particular time period. In the case of SMA, the genotype is present at birth; a more precise term is therefore birth prevalence. Since newborn screening is not widely performed, the number of patients expressing the phenotype is used instead to estimate the incidence. An overview of the studies examining the incidence is given in Table 3.
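Incidence figures in this review are quoted both per 100,000 live births and as "1 in N". The conversion between the two can be sketched as follows; the cohort in the example is hypothetical and the function names are illustrative only.

```python
def incidence_per_100k(new_cases: int, live_births: int) -> float:
    """Birth prevalence ("incidence") per 100,000 live births."""
    return new_cases / live_births * 100_000


def one_in_n(rate_per_100k: float) -> float:
    """Convert a rate per 100,000 into the N of a '1 in N' statement."""
    return 100_000 / rate_per_100k


# Hypothetical cohort: 12 affected children among 150,000 live births.
rate = incidence_per_100k(12, 150_000)
print(f"{rate:.1f} per 100,000 live births, about 1 in {one_in_n(rate):,.0f}")
```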

Table 3 Overview incidence of SMA by subtype

When evaluating the incidence of all types of SMA combined, on average an incidence of around 8 per 100,000 live births is found (~1 in 12,000). Some studies show a somewhat lower or higher incidence. In a study in Iceland an incidence of 13.7 per 100,000 live births was found. This study was carried out on an island with a relatively small population, where it might be easier to identify all patients. A study in Slovakia found a high incidence of 17.8 per 100,000, but details of the number of patients or population size were unavailable, making these findings difficult to interpret [44]. In a recent study in Cuba a lower incidence of 5.0 per 100,000 was seen [45]. Patients were detected via an obligatory governmental registry and approximately 70% of the patients were genetically confirmed. This study also examined the ethnicity of the SMA type I patients. The majority of these patients were White (30/36), 5/36 were of mixed race and 1/36 was Black. Although this could be partially explained by the racial composition of the Cuban population, White people were still affected relatively more often. Several reasons could account for this. First, there might be a difference in incidence between ethnicities; lower SMA carrier frequencies among Hispanics have also been reported [46, 47]. However, there could also be differences in access to health care between ethnicities. In a small study among 75,000 persons in Libya, a high incidence (24 per 100,000 live births) was found, which may partially be explained by a high degree of consanguinity [48].

Incidence by subtype

In 1991, Alan Emery published a review estimating the incidence of SMA type I to be around 4–6 in 100,000 (1 in 25,000–1 in 16,667) live births [49], based on only three studies [50,51,52]. We identified 17 studies, which taken together indicated an SMA type I incidence of approximately 6 per 100,000. In a study in the USA (North Dakota) that pre-dated genetic testing, a high incidence was observed (14.9 per 100,000); however, this study was performed in a very small population, so any error in case identification may have contributed to the high incidence. All patients studied were Caucasian and no consanguinity was observed [53]. In a regional study in Germany, a higher incidence of 9.8 per 100,000 was found [33]. In Libya, the high incidence found for total SMA was not observed among type I patients (8.0 per 100,000) [48]. This is again based on a small population and could be due to a lack of awareness of SMA at the time the study was conducted. Furthermore, SMA type I patients might have been missed due to their short life span. In two small communities a very high incidence was observed. On Reunion Island, a founder effect was clearly seen in a European community, leading to an incidence of 79 per 100,000. (A founder effect is the loss of genetic variation that occurs when a new population is established by a very small number of individuals; it can lead to a high incidence if one of the founders carried a mutation.) In an Egyptian Karaite community in Israel, where consanguinity was observed in more than half of the affected families, an incidence as high as 250 per 100,000 live births was found.

For type II and III, a high incidence of both types combined was observed (10.6 per 100,000) in a German study in the same region as the previously mentioned type I study that partially covered the same time period [33, 43]. The healthcare system in Germany may partly explain these observations. Furthermore, there might be regional differences in SMA incidence. The authors suggest that SMA might be more prevalent in central and Eastern Europe than in Western Europe. For type II and type III SMA the highest occurrence was observed in Libya (16 per 100,000) [48].

One study not included in Table 3 is that of Kurland et al. in Rochester, USA, covering the period 1945–1954. This study found only one SMA type I patient, and the incidence was calculated from the total population size rather than the number of live births. Furthermore, this total population consisted of only 30,000 persons [54].

The epidemiologic burden of SMA is not equally divided over the subtypes. In 2004 Ogino et al. reviewed several studies and calculated incidence rates of 5.83 per 100,000 live births for SMA type I, 2.66 per 100,000 live births for type II and 1.20 per 100,000 live births for type III. This implied that SMA type I, II and III constituted 60%, 27% and 12% of all SMA cases, respectively [30]. This overview included the study of Radhakrishnan et al. in Libya, in which parental consanguinity was observed for half of the families [48]. In our analysis, we calculated the percentages in two ways, which yielded nearly identical results: first, by only taking into account studies in which all types of SMA were studied separately, as this makes a direct comparison possible; and secondly, by taking all studies presented into account. In both cases this resulted in incidence rates of around 5.5, 1.9 and 1.7 per 100,000 for type I, II and III, respectively. This yields a percentage of around 60% for the incidence of SMA type I, with the remaining 40% of the cases roughly equally divided between type II and type III. This indicates that SMA type I indeed makes up the largest proportion of all SMA.
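The subtype percentages quoted above follow directly from the incidence rates. A minimal Python sketch of that arithmetic, using only the rates given in the text (the function and variable names are illustrative):

```python
def subtype_shares(rates):
    """Return each subtype's share (%) of the summed incidence."""
    total = sum(rates.values())
    return {subtype: rate / total * 100 for subtype, rate in rates.items()}


# Incidence per 100,000 live births, as quoted in the text.
ogino_2004 = {"type I": 5.83, "type II": 2.66, "type III": 1.20}
this_review = {"type I": 5.5, "type II": 1.9, "type III": 1.7}

for label, rates in [("Ogino et al.", ogino_2004), ("this review", this_review)]:
    shares = subtype_shares(rates)
    print(label, {k: round(v) for k, v in shares.items()})
```

With the rates from this review, type I accounts for about 60% and types II and III for about 21% and 19%.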

Considerations for comparing studies

To date, there are few studies of the prevalence and/or incidence of SMA, and only a small number of these are recent. Most of the studies have been carried out in Europe. Furthermore, four of the ten studies done outside of Europe were performed in countries with high consanguinity or in small communities, and are therefore not considered representative of the overall SMA prevalence and incidence. No worldwide studies have been published to date.

A number of limitations should be taken into account when estimating the prevalence/incidence of SMA and comparing the presented studies. Most studies were performed before 1995, when the genetic cause of SMA, deletion of the SMN1 gene, was identified [3] and genetic diagnosis was subsequently implemented. Therefore, most studies rely on the less accurate clinical diagnosis of SMA. This increases the chance that diseases with clinical features similar to SMA were misdiagnosed as SMA. Another difficulty in comparing studies is that the classification of SMA has changed slightly over the years and it is not always clear which classification system has been used. For example, in the studies of John Pearn in Northeast England patients were defined as SMA type I if they had an onset of symptoms before the age of 12 months, so this might also include some early diagnosed SMA type II patients [41, 52]. Chronic SMA was classified as patients living beyond 18 months of age. However, in the study in West-Thüringen, Germany, patients had to survive until at least four years of age to be classified as chronic SMA [43]. This is further exemplified by the study of Spiegler et al. in Warsaw, Poland. In this study type Ib patients are mentioned, and are defined as patients diagnosed at birth or in the first months of life and living up to 30 years, whereas type II SMA was described as having an onset from the age of one year onwards [42]. In the study of Zellweger et al. in Switzerland it is not clearly specified which definitions were used, but it is conceivable that some type II patients are included in the numbers of type I patients [55]. Currently, the classification into the main subtypes I, II and III (and sometimes IV), as described in Table 1, is used.

Another factor that should be taken into account is that the studies were performed in different time periods. The natural history of SMA has changed over the years, as the standards of care and associated outcomes have improved considerably in recent years. For example, for type I, a comparison of studies showed that the mean age of death increased from 8.8–10 months in studies performed before 1995 to between 10.4 months and 4 years in studies performed after 2000 [35, 36]. This is partly due to the availability of assisted ventilation (non-invasive or through tracheostomy) and of tube feeding through a gastrostomy [36].

Lastly, most of the studies have been performed in small geographical areas, thereby including a relatively small study population. One or two patients more or less in a small patient population will have a strong effect on the calculated prevalence or incidence. All these factors make a comparison between the studies and the interpretation of the findings difficult.

In conclusion, few prevalence and incidence studies have been performed for SMA; most are based on clinical diagnosis and were performed in European countries or regions, using small study populations. In addition to prevalence and incidence studies, carrier frequencies can provide useful additional information about, for example, ethnic subpopulations.

Carrier frequency in SMA

Since SMA is a recessive disease, there are also unaffected, heterozygous carriers of the disease. Carriers fall into four main groups of genotypes (Fig. 1). The most common one is the ‘1 + 0’ genotype (one normal, functional allele and one SMN1-deleted disease allele). A much less common category is the ‘2 + 0’ genotype, with two functional genes on one chromosome and none on the other. Furthermore, there are also ‘1 + 1D’ and ‘2 + 1D’ genotypes, which have one or two functional genes on one chromosome and a non-functional gene, due to either a point mutation or a microdeletion, on the other. These last two genotypes are very rare [56, 57]. Four or even more copies of the SMN1 gene have also been found, indicating a ‘2 + 2’ or possibly a ‘3 + 1’ genotype. This suggests that ‘3 + 0’ or ‘3 + 1D’ carrier genotypes might also be possible; however, these will be even rarer.
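The genotype notation above can be made concrete with a small data structure. The following Python sketch is illustrative only: the class and field names are hypothetical, and it assumes that a copy-number (dosage) assay counts a copy inactivated by a small mutation as present and reports a carrier only when exactly one copy is measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chromosome:
    intact: int        # functional SMN1 copies
    mutated: int = 0   # copies inactivated by a small mutation ("1D")


@dataclass(frozen=True)
class Genotype:
    label: str
    a: Chromosome
    b: Chromosome

    def is_carrier(self) -> bool:
        # Carrier: exactly one chromosome lacks a functional SMN1 copy.
        return (self.a.intact == 0) != (self.b.intact == 0)

    def measured_copies(self) -> int:
        # Assumption: a dosage assay counts mutated copies as present.
        return self.a.intact + self.a.mutated + self.b.intact + self.b.mutated

    def flagged_by_dosage(self) -> bool:
        # Assumption: only a one-copy result is reported as "carrier".
        return self.measured_copies() == 1


GENOTYPES = [
    Genotype("1 + 1", Chromosome(1), Chromosome(1)),
    Genotype("2 + 2", Chromosome(2), Chromosome(2)),
    Genotype("1 + 0", Chromosome(1), Chromosome(0)),
    Genotype("2 + 0", Chromosome(2), Chromosome(0)),
    Genotype("1 + 1D", Chromosome(1), Chromosome(0, mutated=1)),
    Genotype("2 + 1D", Chromosome(2), Chromosome(0, mutated=1)),
]

for g in GENOTYPES:
    print(f"{g.label:7} carrier={g.is_carrier()!s:5} "
          f"copies={g.measured_copies()} flagged={g.flagged_by_dosage()}")
```

Under these assumptions only the ‘1 + 0’ genotype is flagged, while ‘2 + 0’, ‘1 + 1D’ and ‘2 + 1D’ carriers show two or three measured copies and would be missed by a test that only counts copies.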

Fig. 1 Most common SMA genotypes among non-carriers and carriers

No signs of disease have been associated with being a carrier for SMA. However, some studies suggest abnormal SMN1 copy numbers (either deletions or duplications) may increase the risk and severity of sporadic amyotrophic lateral sclerosis (ALS), although other studies have been unable to confirm this association (for a review see Butchbach et al., 2016 [58]). Furthermore, it was suggested that in the rare disorder progressive muscular atrophy (PMA) SMN1 duplications might be associated with a more severe clinical phenotype [59].

After the discovery of mutations in SMN1 as the cause of SMA, several studies into the carrier status of SMA have been performed. In contrast to the prevalence/incidence studies, most of these studies have been performed outside of Europe. Some are population screening programmes, whereas others are large samples of the general population [46, 60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81]. There are also studies in which small population samples were analysed or the carrier frequency was estimated from healthy controls screened for SMN1 for other purposes [8, 30, 82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99]. As mentioned before, frequencies estimated from a small population sample are less accurate. An overview of all studies is given in Additional file 2.

Subpopulation differences

Some of the studies have examined differences between ethnic groups within their study population [46, 62,63,64,65, 77, 80]. The main finding was that SMN1 copy numbers were significantly higher in Black (Sub-Saharan African ancestry) people. This was seen in African Americans [46, 62, 77], as well as in Black Africans [66], and would indicate a higher proportion of 2-copy (duplication) alleles, thereby suggesting a higher number of ‘2 + 0’ carriers. This could account for a lower detection rate (around 70% for Black people versus 90–95% for other ethnicities), leading to a high number of false negatives. The study in Africa found a significantly lower carrier frequency compared to Eurasians [66]. Lower carrier frequencies were also seen in a study comparing Black and White people in South Africa and in a study of samples from the 1000 Genomes Project [65, 80]. However, these studies could not detect ‘2 + 0’ carriers, which could reduce the observed differences. Some studies also found lower carrier frequencies in Hispanics [46, 77], but this was not seen in other studies [62, 69, 80]. Lastly, Luo et al. identified a specific haplotype, present in Ashkenazi Jews and Asians and detectable by microsatellite analysis, that could distinguish duplication alleles (present in ‘2 + 0’ carriers) from normal ‘1 + 1’ genotypes [77].

We carried out an analysis of differences between ethnic groups and studies. Fig. 2 shows a comparison of all studies described in Additional file 2 (ethnicities are indicated). The grey area indicates the 95% confidence interval based on the average carrier frequency of all studies combined (0.019). Most studies fall within this area, indicating no large differences in carrier frequency. Two populations (a Muslim Arab village in Israel and a specific group of Hutterites in South Dakota, USA) showed a particularly high carrier frequency. However, these are isolated populations with a high degree of inbreeding [81, 89]. A higher carrier frequency was also seen in an Iranian population (1 in 20). However, this is based on one study with a small sample size; furthermore, consanguineous marriages are common in Iran [91]. Combined estimates of carrier frequencies for ethnic groups were calculated (large symbols in Fig. 2 and Table 4).

Fig. 2 Carrier frequency studies for SMA. The grey area represents the 95% confidence interval based on the average carrier frequency (0.019) of all individuals (except those from the isolated Muslim Israeli Arab village and the Schmiedeleut Hutterites). Small dots represent individual studies. Where studies reported groups separately, these are depicted as separate dots. Large symbols represent pooled estimates for different ethnic groups
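One way to picture a band such as the grey area in Fig. 2 is as the range within which a study of a given size is expected to fall if its true carrier frequency equals the pooled value. The sketch below uses a normal approximation to the binomial distribution; this is an assumption made for illustration and is not necessarily the method used for the figure, and the sample sizes are hypothetical.

```python
import math


def interval_around_pooled(p: float, n: int, z: float = 1.96):
    """Normal-approximation 95% interval for a study of size n,
    centred on the pooled carrier frequency p."""
    se = math.sqrt(p * (1 - p) / n)
    return max(0.0, p - z * se), p + z * se


POOLED = 0.019  # average carrier frequency quoted in the text

for n in (200, 1_000, 10_000, 100_000):  # hypothetical sample sizes
    low, high = interval_around_pooled(POOLED, n)
    print(f"n={n:>7,}: {low:.4f} to {high:.4f}")
```

The interval narrows as the sample size grows, which is why small studies can deviate substantially from the pooled frequency without indicating a real difference.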

Table 4 Carrier frequencies for SMA per ethnicity

The results show that the highest frequencies are found in Caucasian and Asian populations (around 1 in 50) and the lowest in Black (1 in 100) and Hispanic (1 in 76) populations. However, it is important to note that Hispanics are genetically a very mixed group, making generalizations difficult. This is also demonstrated by the fact that some studies among Hispanics found much higher frequencies [69, 80], while others found lower frequencies [46].

SMN1 copy number differences between populations

In 2014, MacDonald et al. performed a meta-analysis comparing the SMA carrier frequency among different ethnicities. Their analysis included 14 studies in which ethnicities were described and results were broken down by SMN1 copy number [47]. They took the different carrier genotypes described above into account and determined the carrier rates in the ethnic groups. Furthermore, they calculated the reduced risk of being a carrier if a 2- or 3-copy result was found. This again showed a substantially higher carrier risk with a 2-copy test result for Black people. In addition, a very high carrier risk and 2-copy risk was found in Iranians. However, this is based on one study only [91].
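The idea of a reduced (residual) carrier risk after a negative copy-number result can be sketched with Bayes' rule. This is a simplified illustration and not the calculation used in the meta-analysis: it assumes that non-carriers never produce a one-copy result, it treats all results other than one copy as a single "negative" category, and the prior carrier frequency of 1 in 50 is chosen purely for illustration. The detection rates are those mentioned earlier in the text (around 70% versus 90–95%).

```python
def residual_carrier_risk(prior: float, detection_rate: float) -> float:
    """Carrier probability after a negative dosage result (Bayes' rule).

    Assumption: non-carriers never produce a one-copy result, so every
    non-carrier tests negative.
    """
    missed = prior * (1 - detection_rate)   # carrier, but tests negative
    true_negative = 1 - prior               # non-carrier, tests negative
    return missed / (missed + true_negative)


PRIOR = 1 / 50  # illustrative prior carrier frequency

for detection in (0.95, 0.90, 0.70):
    risk = residual_carrier_risk(PRIOR, detection)
    print(f"detection {detection:.0%}: residual risk about 1 in {1 / risk:,.0f}")
```

Under these assumptions, a lower detection rate leaves a considerably higher residual risk after a negative result.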

Additional file 3 shows all studies that examined SMN1 copy number status. None of the studies among Arabian populations performed this analysis; therefore, this group has not been included in the table. SMN1 allele frequencies were determined for each group (Table 5) using copy numbers (for methods and calculations see Additional file 1).

Table 5 SMN1 allele frequencies per ethnicity

The frequency of the 0-copy allele (the carrier allele) is lower in Blacks and Hispanics. Whilst the frequency of the 2-copy allele does not differ greatly between the other ethnicities, it is much higher in the Black population. As is seen in Table 6, this indicates a higher number of hidden carriers (‘2 + 0’ genotype), which decreases the sensitivity of most carrier tests used, as these only measure copy numbers. Therefore, it is important to take ethnicity into account when performing population screening or genetic counselling and to consider a different method to reduce the chance of false negative results. In Table 6, disease frequencies are also estimated by combining the copy number results with an estimated small mutation (1D) frequency of 4% [4, 5] and an estimated de novo mutation frequency of 2% [100]. Thereafter, incidence rates were estimated using these frequencies (Table 7).

Table 6 Carrier, SMN1 copy number 2 carrier and disease frequencies per ethnicity
Table 7 Estimated incidence from carrier frequency per ethnicity

The incorporation of estimated carrier risks for people with a 2-copy result generates only a slightly lower carrier frequency (~1 in 54) for the Black populations compared to most other populations (~1 in 45), due to the presence of a much higher number of multiple SMN1 copy number alleles in this population. The estimated combined carrier frequency in Hispanics is lower than in other populations (1 in 65), as was also seen in the previous estimations. It must be noted, however, that only a subset of studies is used here compared with the comparison of all studies (Fig. 2 and Table 4), which can also contribute to differences in the estimates.

The combined results lead to the highest incidence estimates, of around 1 in 8000, in Asians and Caucasians, whilst lower incidences of around 1 in 20,000 are estimated in the Black and Hispanic populations.
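The order of magnitude of these incidence estimates can be checked with a simple Hardy-Weinberg calculation. The sketch below is a simplification and not the method of Additional file 1: it treats all carriers as heterozygous for a single disease allele, so it ignores the distinction between ‘1 + 0’ and ‘2 + 0’ carriers, small mutations and de novo mutations, and it will therefore not reproduce the lower estimates for populations with many 2-copy alleles. The `penetrance` parameter is included for illustration and defaults to complete penetrance.

```python
import math


def disease_allele_frequency(carrier_frequency: float) -> float:
    """Solve 2q(1 - q) = carrier frequency for the disease-allele frequency q."""
    return (1 - math.sqrt(1 - 2 * carrier_frequency)) / 2


def expected_incidence(carrier_frequency: float, penetrance: float = 1.0) -> float:
    """Expected frequency of affected births (q squared, scaled by penetrance)."""
    q = disease_allele_frequency(carrier_frequency)
    return q * q * penetrance


# Carrier frequency of about 1 in 45, as quoted in the text.
incidence = expected_incidence(1 / 45)
print(f"about 1 in {1 / incidence:,.0f}")
```

For a carrier frequency of about 1 in 45 this gives an incidence close to 1 in 8000, in line with the estimate for Asians and Caucasians.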

In Caucasians, the incidence rate estimated from carrier frequencies is higher than the incidence rates observed in studies (Table 3, ~1 in 11,000). Carrier frequency estimates are based solely on genetic studies, whilst most incidence studies were based on clinical diagnosis and are mostly much older. However, incidence estimates derived from carrier frequencies could overestimate the true incidence due to reduced penetrance. Here a penetrance of 100% is assumed. If the penetrance is decreased by 10% (i.e. penetrance of 90%) the incidence would also decrease by 10%. It might be that some cases of SMA are so severe that they lead to death in utero. SMN2 is absent in 10–15% of the general population [101], and deletions of both SMN1 and SMN2 are embryonically lethal. Furthermore, increased awareness could lead to more genetic counselling of couples at risk, certainly of couples who have previous children or family members with SMA. In addition, sporadic cases of unaffected individuals without functional SMN1 have been described [96, 102,103,104,105,106,107,108,109]. This might be due to high copy numbers of SMN2, since, as mentioned before, SMN2 copy number influences the severity of the disease [7,8,9]. Therefore, it is important to take SMN2 copy number into account when performing newborn screening.

Conclusions

SMA is a severe, heterogeneous, neuromuscular disorder. The few available prevalence and incidence studies mainly predate genetic testing and were performed in small geographical areas, mainly in Europe. This highlights the need for larger, more generalizable prevalence studies.

Recently, the carrier frequency of SMA in healthy populations has been studied quite extensively, indicating differences between ethnicities not only in carrier frequency but also in copy number status. In some groups this decreases the sensitivity of commonly used carrier testing methods. This emphasizes the need for methods that can detect carriers with two SMN1 copies on one chromosome and none on the other.

Good epidemiological data are needed to gain insight into health care needs and for research studies and clinical trials. This is especially important in rare diseases, where clinical trials require careful planning. Furthermore, newborn screening will become increasingly important, especially now that a drug has been approved and other new therapies are in advanced clinical trial stages. The introduction of new therapies is also likely to affect the prevalence of SMA and as such may have significant resource implications for health care planning.