Background

Propionic acidemia (PA) (Online Mendelian Inheritance in Man (OMIM) number #606054) is a serious, life-threatening, inherited metabolic disorder caused by deficiency of the mitochondrial enzyme propionyl-coenzyme A (CoA) carboxylase (EC 6.4.1.3), which results in the accumulation of toxic metabolites such as propionic acid and 2-methylcitrate [1, 2]. Onset of PA occurs most frequently in the neonatal period, but a rarer late-onset form also exists [3]. The clinical manifestations include episodic life-threatening metabolic decompensations, growth impairment, movement disorders, seizures, basal ganglia lesions, pancreatitis, and cardiomyopathy [4]. The disease can lead to severe intellectual disability (IQ < 70) and speech delay, such that the majority of patients with PA require special education [5, 6]. The prognosis of PA is generally poor; patients with severe disease forms may die in the newborn period or later due to metabolic decompensations, cardiac complications (cardiomyopathy, arrhythmias) or basal ganglia stroke [4, 7, 8]. Milder or asymptomatic disease forms also exist; in these cases the prognosis may be more favorable [9].

There are no approved therapies that address the underlying root cause of PA. Current management of the disorder is limited to strict dietary management, carnitine supplementation, antibiotics such as metronidazole to reduce propionate production by intestinal bacteria, and ammonia scavengers such as carglumic acid to control episodes of hyperammonemia [4, 10]. Liver transplant as an approach to increase enzyme activity is a potential treatment option for severely affected individuals [4, 10].

Newborn screening for PA is performed in the United States, Australia and in several European and Asian countries [11]. Early detection by newborn screening is an effective approach to identify late-onset cases [12, 13] and has been associated with decreased short-term mortality in PA [12, 14]; however, the impact on the long-term clinical course of the disorder is less clear [12,13,14]. PA cases can be detected in the neonatal period using acylcarnitine analysis by tandem mass spectrometry (MS/MS) on dried blood spots. Neonatal testing reveals elevated propionylcarnitine (C3) levels, and other secondary markers (methionine, C3/C2, and C3/C16 ratios) can help to increase diagnostic accuracy [4]. Demonstration of deficient activity of propionyl-CoA carboxylase (PCC) or detection of pathogenic mutations in either the PCCA (Mendelian Inheritance in Man (MIM) number 232000) or the PCCB (MIM 232050) gene establishes the definitive diagnosis [10].

Although several studies have reported results of newborn screening for PA in different regions, a systematic literature review on disease epidemiology has not been performed to date. The primary objective of this study was to conduct a systematic literature review and meta-analysis on the epidemiology of PA.

Methods

Systematic literature review

The literature search was performed covering Medline, Embase, Cochrane Database of Systematic Reviews, Centre for Reviews and Dissemination (CRD) Database, Academic Search Complete, Cumulative Index to Nursing and Allied Health Literature (CINAHL) and PROSPERO databases. Websites of rare disease organizations were also searched for eligible studies. Detailed search strategies with the date of search and number of hits are summarized in Additional file 1: Table S1. The exclusion criteria of the title/abstract screening and full-text reviews are summarized in Fig. 1 and are detailed in Additional file 1: Table S2. A snowball method was also used to identify further relevant studies within the citations of full text papers.

Fig. 1 Flow of information diagram

Data extraction was performed by two independent researchers, and conflicts were resolved by discussion until a consensus was reached. During the full-text review, studies not reporting on a representative population for the given country or region were excluded. Reports on national screening programs with ~ 100% population coverage and analyses of national statistics were considered to provide the most accurate data on disease epidemiology. Reports on screening programs not covering ~ 100% of the population were considered eligible if a relatively large, random sample was used or the screening program had a multicenter design. Studies reporting on selected patient populations (e.g. patients with clinical suspicion of an inborn error of metabolism) were excluded. Risk of bias was evaluated using the tool developed by Hoy et al. (2012), which is designed to assess the methodological quality of prevalence studies [15].

Meta-analysis

Studies with a high risk of bias were excluded from the quantitative synthesis. Overlap among the patient populations of multiple studies was rigorously investigated by reviewing countries, study periods, data sources and patient cohorts. Where populations overlapped, only the publication with the more complete dataset was included in the meta-analysis. A random effects meta-analysis was performed including all identified studies presenting birth prevalence, lifetime risk and cumulative incidence data. Heterogeneity between the individual study estimates was assessed with the heterogeneity chi-squared test and the I-squared (I²) statistic. The metaprop module for STATA was used to perform all meta-analyses on STATA SE 15.0. This routine provides procedures for pooling proportions (in our case prevalence and cumulative incidence) in a meta-analysis of multiple studies. The confidence intervals of the individual study estimates are based on the exact binomial (Clopper-Pearson) procedure [16]. Confidence intervals for the pooled estimates were calculated after Freeman-Tukey double arcsine transformation.
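The pooling steps described above can be illustrated with a minimal Python sketch. It is not the metaprop implementation: the DerSimonian-Laird between-study variance estimator and the simple back-transformation sin²(t/2) are assumptions made here for illustration, since the text only states that a random effects model and the Freeman-Tukey double arcsine transformation were used. The function names are hypothetical. The two (cases, screened) pairs are the Korean and Australian figures quoted in the Results; pooling only these two does not reproduce any pooled estimate reported in this paper.

```python
import math

def ft_double_arcsine(cases, n):
    """Freeman-Tukey double arcsine transform of a proportion cases/n.

    Returns the transformed value and its approximate variance 1/(n + 0.5).
    """
    t = (math.asin(math.sqrt(cases / (n + 1)))
         + math.asin(math.sqrt((cases + 1) / (n + 1))))
    return t, 1.0 / (n + 0.5)

def back_transform(t):
    """Simple inverse sin^2(t/2), an assumption; t is clamped to [0, pi]."""
    t = min(max(t, 0.0), math.pi)
    return math.sin(t / 2.0) ** 2

def pool_random_effects(studies, z=1.96):
    """Random effects pooling of proportions (DerSimonian-Laird, assumed).

    studies: list of (cases, n) tuples. Returns a dict with the pooled
    proportion, its confidence limits, Cochran's Q, tau^2 and I^2 (%).
    """
    transformed = [ft_double_arcsine(x, n) for x, n in studies]
    ts = [t for t, _ in transformed]
    vs = [v for _, v in transformed]

    # Fixed effect (inverse variance) step, needed for Q.
    w = [1.0 / v for v in vs]
    sw = sum(w)
    t_fixed = sum(wi * ti for wi, ti in zip(w, ts)) / sw
    q = sum(wi * (ti - t_fixed) ** 2 for wi, ti in zip(w, ts))
    df = len(studies) - 1

    # Between-study variance and heterogeneity statistic.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random effects weights include tau^2.
    w_re = [1.0 / (v + tau2) for v in vs]
    sw_re = sum(w_re)
    t_re = sum(wi * ti for wi, ti in zip(w_re, ts)) / sw_re
    se = math.sqrt(1.0 / sw_re)

    return {
        "pooled": back_transform(t_re),
        "lower": back_transform(t_re - z * se),
        "upper": back_transform(t_re + z * se),
        "Q": q,
        "tau2": tau2,
        "I2": i2,
    }

# (cases, screened newborns) as quoted in the Results for Korea and Australia.
example = pool_random_effects([(4, 79179), (6, 847418)])
print({k: round(v * 1e5, 2) if k in ("pooled", "lower", "upper") else v
       for k, v in example.items()})
```

Pooling on the transformed scale stabilizes the variance when the proportions are close to zero, which is the situation for a disorder with a few cases per 100,000 newborns.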

The analysis was performed separately for the following regions: North America, Europe, Asia-Pacific, Middle-East and North Africa. A time-specific subgroup analysis was also performed in order to observe the potential changes in disease occurrence throughout the years. The following two time periods were studied separately: 1981–2000 and 2001-present. A sensitivity analysis was also undertaken aiming to decrease the heterogeneity of epidemiological measures by omitting studies not presenting birth prevalence data.

Results

After duplicates were removed, 2338 records were screened by their titles and abstracts from which 129 articles qualified for a full-text review. The snowball method identified 59 extra articles. In total, 188 articles were assessed for eligibility in full text and from these, 43 studies reported on the epidemiology of the disease (see Fig. 1). Among the 43 articles there were 11 overlapping studies and one using a different calculation method than the remaining articles, thus these 12 studies were further excluded from the quantitative analysis.

The largest share of publications originated from Europe, followed by the Asia-Pacific region. In the American continent, the United States was the most frequently investigated area, while in the Middle-East studies from Saudi Arabia were in the majority.

Large heterogeneity was observed regarding the epidemiological terms used in the identified papers. Therefore, the reported measures were recategorized based on their calculation methods according to the scientifically acceptable definitions of epidemiological terms (see Additional file 1: Table S3).

The vast majority of the articles reported on newborn screening programs providing estimates on the birth prevalence of the disease, defined as the number of affected newborns divided by the total population screened. Three articles followed a specific birth cohort over time and counted the number of diagnoses over the follow-up period, providing estimates on the cumulative incidence in the birth cohort [17,18,19]. In seven cases, the authors divided the number of diagnosed patients by the number of live births during the same period; this measure aims to estimate the lifetime risk at birth [20], a special case of cumulative incidence in which the period studied is the entire remaining lifetime [21,22,23,24,25,26,27]. Although the calculation methods differ, the difference in the results is small if it is assumed that PA appears early in life, the disease occurrence is more or less constant, the size of birth cohorts and the diagnostic methods did not change significantly over time, and all patients who have the underlying mutations will present with clinical symptoms over their lifetime. Based on these assumptions, we use the term “detection rate” for the three above-mentioned measures throughout the paper. Only one study calculated the proportion of affected patients within the total population, providing the point prevalence of the disease [28]. Point prevalence is not comparable with the other frequency measures; therefore, this publication was excluded from the quantitative synthesis.
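For orientation, the four measures discussed above can be written as simple ratios. The wording of the numerators and denominators below is a paraphrase of the definitions in the text, not a quotation from Additional file 1: Table S3, and all rates are expressed per 100,000.

```latex
\begin{align*}
\text{birth prevalence} &= \frac{\text{affected newborns identified by screening}}{\text{newborns screened}} \times 100{,}000 \\[4pt]
\text{cumulative incidence} &= \frac{\text{diagnoses in the birth cohort during follow-up}}{\text{size of the birth cohort}} \times 100{,}000 \\[4pt]
\text{lifetime risk at birth} &\approx \frac{\text{patients diagnosed during a period}}{\text{live births during the same period}} \times 100{,}000 \\[4pt]
\text{point prevalence} &= \frac{\text{patients alive at a given time}}{\text{total population at that time}} \times 100{,}000
\end{align*}
```

The first three share a denominator of births, which is why they can be grouped under one term; point prevalence has the whole population as its denominator and depends on survival, so it is not on the same scale.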

Epidemiological data on PA – By territory

In North America, detection rates of PA ranged between 0.20 (US, California) and 1.35 (Canada, Ontario) per 100,000 newborns [29, 30] (see Fig. 2) [17,18,19, 21,22,23,24,25,26,27, 29,30,31,32,33,34,35,36,37,38,39,40,41,42, 43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60]. The pooled point estimate indicated a detection rate of 0.33 per 100,000 newborns (CI: 0.11–0.63) in North America (see Table 1). All 8 articles originating from the US – except Zytkovicz et al. (2001) with 1.22 per 100,000 newborns – indicated a detection rate below 1 per 100,000 newborns. Subgroup analysis by time periods revealed some decrease in the detection rate between the periods of ‘1981–2000’ and ‘2001-present’; the detection rate decreased from 0.56 (CI: 0.23–1.01) to 0.26 (CI: 0.00–1.01) per 100,000 newborns, but the confidence intervals were largely overlapping.

Fig. 2 Estimates of birth prevalence of propionic acidemia in the different countries and geographical regions

Table 1 Base case and sensitivity analysis by geographic area (sensitivity analysis includes only studies with birth prevalence estimates derived from newborn screening studies) (detection rate per 100,000 newborns)

In Europe, the detection rates varied between 0.32 and 2.20 per 100,000 newborns [31, 32]. The pooled point estimate indicated a rate of 0.33 per 100,000 newborns (CI: 0.15–0.57) in Europe. Screening programs with 100% population coverage were identified in Austria, Italy, Spain and Portugal, where detection rates were 1.29 (Austria), 1.25 (Italy), 0.95 (Spain) and 0.32 (Portugal) per 100,000 newborns [32,33,34,35]. The largest reference population was identified in Italy, where the authors analyzed aggregate statistics from the period of 1985–1997 (n = 7,173,959 births) and found a detection rate of 0.42 per 100,000 newborns [21]. The subgroup analysis showed no difference between the two time periods.

Detection rates in the Asia-Pacific region were between 0.09 and 5.05 per 100,000 newborns [24, 36]. The meta-analysis yielded a pooled point estimate of 0.29 per 100,000 newborns (CI: 0.03–0.74) for the Asia-Pacific region (see Table 1). The highest estimate from the region originates from Korea, where Yoon et al. (2005) identified 4 cases among the 79,179 screened newborns (detection rate: 5.05/100,000 newborns) [36]. Occurrence was also relatively high in Japan, where detection rates ranged between 2.64 and 4.89 per 100,000 newborns [37, 38]. Subgroup analysis by time periods indicated a possible increase in detection rate over the years: 0.08 (CI: 0.01–0.22) vs. 0.45 (CI: 0.02–1.25) per 100,000 newborns in the time periods of ‘1981–2000’ and ‘2001-present’, respectively.
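As a worked illustration of how a single study's detection rate and its exact binomial (Clopper-Pearson) confidence interval can be obtained, the sketch below uses the Korean figures quoted above (4 cases among 79,179 screened newborns). It is a standard-library Python sketch with hypothetical function names, not the routine used in the analysis; it sums the binomial tail directly and is therefore only intended for small case counts such as those seen in rare diseases.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)

def _solve_decreasing(f, target, iters=200):
    """Bisection on [0, 1] for f(p) = target, where f is decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k events in n trials."""
    if k == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= k) = alpha / 2, i.e. P(X <= k - 1) = 1 - alpha / 2.
        lower = _solve_decreasing(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    if k == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= k) = alpha / 2.
        upper = _solve_decreasing(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

cases, screened = 4, 79179            # figures quoted in the text for Korea
rate = cases / screened * 100_000     # detection rate per 100,000 newborns
low, high = clopper_pearson(cases, screened)
print(f"{rate:.2f} per 100,000 (95% CI {low * 1e5:.2f} to {high * 1e5:.2f})")
```

With so few cases the interval is wide, which is one reason single-study estimates in Fig. 2 can differ markedly from the pooled regional estimate.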

Epidemiology studies performed in the Middle East and North Africa (MENA) region showed significantly increased detection rates compared to other regions. Out of the 8 identified articles, 6 reported detection rates over 3 per 100,000 newborns (range: 3.62 to 8.14 per 100,000 newborns) [17, 23, 27, 39,40,41]. The pooled point estimate was also relatively high, 4.24 per 100,000 newborns (CI: 2.53–6.31) (see Table 1) without considerable change over the years (4.11 (CI: 2.82–5.63) vs 4.48 (CI: 1.34–9.00) per 100,000 newborns in the time periods of ‘1981–2000’ and ‘2001-present’).

The only article that estimated (point) prevalence data originated from Oman where authors reported a prevalence of 0.40 per 100,000 inhabitants [28].

Epidemiological data on PA subtypes and ethnicities

Only one study was identified that reported the proportion of PCCA-deficient and PCCB-deficient patients and was representative at a regional or country level [42]; therefore, no quantitative analysis could be conducted on the PA subtypes. The described newborn screening program identified 6 PA cases among the 847,418 screened newborns in Australia during 2002–2014. Gene analysis was conducted on only 3 PA patients and identified 2 PCCA-deficient cases and 1 PCCB-deficient case (detection rates of 0.24 and 0.12 per 100,000 newborns, respectively) [42].

Disease prevalence by ethnicity was investigated by Feuchtbaum et al. (2012) in California, United States [30]. Only Native Americans were characterized by a significantly higher detection rate (6.7 per 100,000 newborns) than the overall rate (0.2 per 100,000 newborns). Black and Hispanic ethnic groups showed detection rates of 0.8 and 0.3 patients per 100,000 newborns, respectively, but these differences did not reach statistical significance. No PA cases were identified among other ethnicities.

Discussion

Pooled point estimates of detection rates remained below 1 per 100,000 newborns in all regions except the MENA region, where the results were significantly higher. This is in line with the findings of Chapman et al. (2018), who also identified a higher birth prevalence in Kuwait than in the US or Southwest Germany; reported detection rates were 0.41 in the United States, 0.35 in Southwest Germany and 1.68 in Kuwait per 100,000 newborns [61]. Alfadhel et al. (2016 and 2017) explained the high numbers of metabolic disorders in Saudi Arabia by the frequent consanguineous marriages in Saudi society [17, 39]. Al-Thihli et al. (2014) found that 95% of the investigated patients with inborn errors of metabolism (n = 229) were from consanguineous parents, while Moammar et al. (2010) detected a consanguinity rate of 100% among the affected patients [23, 28]. Epidemiology studies from Japan also reported higher detection rates, ranging between 2.64 and 4.89 per 100,000 newborns. According to Yamaguchi et al. (2008), Shigematsu et al. (2002) and Yorifuji et al. (2003), the higher occurrence can be explained by a mutation (p.Y435C) in the PCCB gene that accounts for a mild form of PA [37, 38, 62]. Founder effects, and thus higher detection rates of PA, can also be found in communities such as the Amish and Mennonite communities [63], Galicians in Spain [33], and the Inuit in Greenland [64].

The epidemiological terminology used in the identified studies was heterogeneous and inconsistent. An added value of our study is a recategorization and harmonization of all published epidemiological measures (see Additional file 1: Table S3).

In most of the meta-analyses performed, the I² statistic indicated substantial heterogeneity across the studies, which underlines the necessity of a random effects meta-analysis. Subgroup analysis by two time periods did not reveal a substantial change in disease frequency throughout the years. Pooled point estimates remained under 1 per 100,000 newborns in both periods (‘1981–2000’ and ‘2001-present’) in all regions except the MENA region, similar to the main analysis. Sensitivity analysis indicated that performing the meta-analyses using only birth prevalence data resulted in slightly higher estimates than the base case analysis. This might imply that newborn screening results in a slight overestimation of the clinically relevant incidence, since not all identified cases will necessarily develop clinical symptoms later on [22]. These patients without clinical symptoms might have a milder form of disease that may remain undiagnosed without systematic screening.

Due to the scarcity of studies with representative reference populations conducting PCCA- and PCCB-deficient subtype analysis, the current systematic literature review could not draw conclusions on the relative detection rates of these subtypes. However, the PCCA-deficient and PCCB-deficient subtypes are reported to be approximately equally frequent [4], and no differences in severity or outcome have been described between the two subtypes.

Due to the rarity of PA, broadly targeted population-based prevalence studies are not available. However, reports on the results of newborn screening programs provided valuable, high-quality data on the birth prevalence of the disease. Nonetheless, differences in case definitions and cut-off values, reference population size and the screening methods used, as well as incomplete reporting, may all have an influence on the number of identified and reported cases. In many cases the diagnostic tool and related cut-off values were not reported. In addition, studies did not always provide the rate of population coverage, which prevented the assessment of potential selection bias. Where screening programs reported on the number of false positive and negative findings, the number of positive cases was adjusted accordingly. However, follow-up time was not always long enough to adequately assess the screening performance. To summarize, a newborn screening program that includes limited gene sequencing and applies appropriate follow-up can be the “gold standard” for measuring the prevalence of most metabolic disorders and possibly non-metabolic genetic disorders as well.

Despite all the limitations mentioned above, our results indicated similar disease occurrence to the systematic literature review by the Spanish Health Technology Assessment Agency, conducted with the purpose of evaluating the clinical effectiveness of newborn screening programs [11]. Compared to this review, our research was not restricted to screening programs, therefore, it provides a more comprehensive overview on disease epidemiology.

Conclusion

Implementation of newborn screening programs has allowed the estimation of the birth prevalence of PA across multiple geographic regions. However, an evidence gap remains, as epidemiological studies from South America, South Africa, Eastern Europe or Russia were not identified by our literature search. Our systematic literature review and meta-analysis confirm that PA is an ultra-rare disorder, with similar detection rates across all regions with the exception of the MENA region, where the disease, similar to other inherited metabolic disorders, is more frequent.